Thanks @kaby76. I just released a new version with the additional exports.

For the inclusion in main ANTLR: I asked Ter to wait, but Eric rushed in his type definition stuff and I had no chance. That's not a big deal, actually, since I have long planned to fork ANTLR4 and create a new tool (ANTLRng) by porting it to TypeScript in a way that hopefully makes it easy for others to port it to their preferred language. I want to allow ANTLR itself to progress again, introduce new features and so on. Some of the things you do in your tools would be highly welcome!

I want to fix most of the open bugs of my ANTLR VS Code extension now, and then start the next step: porting the runtime tests to TypeScript in a way that allows much quicker execution (currently they take between 2:30 and 10:00 minutes for me, which is way too slow for safe refactoring). The slow execution was a major pain in the port from JS to TS. During this step I will lay the groundwork for a new testing framework, which moves responsibility from the tool to the target authors, who can then optimize test execution to fit their platform better (e.g. avoid frequent builds for the C++ target). Neither the targets nor their testing (and release process) should be the responsibility of the ANTLR tool.

I love the work you have done to verify the new runtime with various grammars from the grammar directory. That gives additional confidence that it is good for production use. I'm not sure if we can fix the failing grammars from the tool side, but maybe it would help to define a test harness that all grammars have to pass before they are added to the grammar directory. Because of actions, though, that might be difficult.

I also plan to replace the STG-based code generation with a plugin-based approach (I have no details yet), where target writers get code completion and inline help for each of the template rules that must be implemented. This could then also check for bad words, according to the target language.

I have now written two targets and still don't know all of the mandatory target STG rules, which ones are optional or not used at all, and in which context. That's somewhat scary. A template editor should provide a list of these names and their parameters with types, and help with the overall syntax.
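To make the plugin idea a bit more concrete, here is a minimal sketch of how typed template rules could look in TypeScript. Nothing in it comes from ANTLR or antlr4ng: `TargetPlugin`, `RuleInfo`, `renderRuleFunction`, `renderFileHeader` and `findBadWords` are all hypothetical names, chosen only to show how an interface separates mandatory from optional rules and lets an editor offer completion for them.

```typescript
// Hypothetical sketch only: none of these names exist in ANTLR or antlr4ng.

/** Minimal description of a parser rule handed to a target plugin. */
interface RuleInfo {
    name: string;
}

/** What a target author would implement instead of an STG file. */
interface TargetPlugin {
    /** Words that must not appear as identifiers in the target language. */
    readonly reservedWords: ReadonlySet<string>;

    /** Mandatory template rule: emit the code for one parser rule. */
    renderRuleFunction(rule: RuleInfo): string;

    /** Optional template rule: emit a header for the generated file. */
    renderFileHeader?(grammarName: string): string;
}

/** Returns the rule names that collide with the target's reserved words. */
function findBadWords(target: TargetPlugin, rules: readonly RuleInfo[]): string[] {
    return rules.map((rule) => rule.name).filter((name) => target.reservedWords.has(name));
}

// A toy target that implements only the mandatory rule.
const demoTarget: TargetPlugin = {
    reservedWords: new Set(["constructor"]),
    renderRuleFunction: (rule) => `public ${rule.name}(): void {}`,
};
```

With such an interface the compiler reports a missing mandatory rule, while the optional one can simply be left out, which is the distinction that is hard to see in an STG file.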
-
@kaby76 I just got an idea for the conflicting-rule-names problem: what if we had an option for the ANTLR tool to specify a common prefix or suffix for all generated rules? This could be
-
Excellent work! This runtime should be the default for the "official" Antlr. Updating the .d.ts files in https://github.com/antlr/antlr4/tree/dev/runtime/JavaScript/src/antlr4 with the correct type declarations has been a pain.
Please publish a new version soon. ConsoleErrorListener is not exported in the published version 2.0.1 package, but it is available in the current version in the repo.
I tested the runtime with a large subset of the grammars in grammars-v4, specifically those without actions. I've added templates to trgen for the target "Antlr4ng".
Of the 340 grammars in grammars-v4, 251 were tested with your runtime, using this script:

for i in $(find . -name desc.xml | grep -v Generated); do
  # Print the grammar directory, then remember it.
  dirname $i
  x=$(dirname $i)
  # Skip the grammar if any of its .g4 files contains an action block.
  no=0
  for j in $x/*.g4; do
    got=$(trparse -t ANTLRv4 $j | trxgrep ' //actionBlock' | trtext -c)
    if [ $got -gt 0 ]; then no=1; fi
  done
  if [ $no -eq 0 ]; then
    pushd $x; trgen -t Antlr4ng --force; cd Generated-Antlr4ng; make; make test; popd
  fi
done

Of those, almost all passed. For the 10 or so that failed, the problems fall into one of four categories.
The first category of failures involves "symbol conflicts". One particularly annoying conflict involves the parser start symbol, e.g., the start symbol module of grammar clu. The "symbol conflict avoidance" implementation renames "module" to "module_" in the generated parser code. Unfortunately, there is no way to predict a priori that the start name has to be "module_". The compilation fails because I have no way to know when the start symbol in the driver must be renamed to "module_". I noted the problem long ago, but no solution has been suggested. Other grammars with this problem are: esolang, haskell, kuka, and oberon (found via
for i in `find . -name desc.xml | grep -v Generated`; do trparse `dirname $i`/*.g4 | trxgrep ' /grammarSpec[grammarDecl[not(grammarType/LEXER)]] //parserRuleSpec[ruleBlock//TOKEN_REF/text()="EOF"]/RULE_REF[text() = "module"]' | trtext; done
).
In addition, the parser symbol "constructor" cannot be used with your runtime. It causes a compilation error, so this symbol needs to be entered in the code generator's table of symbols to rename.
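The start-symbol problem can be sketched as follows: a driver generator can only emit the right call if it sees the same rename table as the code generator. This is a rough illustration, not how trgen or the ANTLR tool actually work. The table contents, `generatedMethodName` and `startRuleCall` are assumptions made for the example; only the trailing-underscore renaming of "module" and the need to add "constructor" are taken from the discussion above.

```typescript
// Assumption: an illustrative rename table. The real list belongs to the
// target's code generator and is not visible to driver generators today.
const renamedSymbols: ReadonlySet<string> = new Set(["module", "constructor"]);

/** Maps a grammar rule name to the method name emitted in the parser. */
function generatedMethodName(
    ruleName: string,
    table: ReadonlySet<string> = renamedSymbols,
): string {
    // Conflicting names get a trailing underscore, as "module" -> "module_".
    return table.has(ruleName) ? `${ruleName}_` : ruleName;
}

/** Builds the start-rule call that a generated driver would contain. */
function startRuleCall(parserVariable: string, startRule: string): string {
    return `${parserVariable}.${generatedMethodName(startRule)}()`;
}
```

For the clu grammar, `startRuleCall("parser", "module")` would yield `parser.module_()`, while a non-conflicting start rule keeps its name. Without access to the table, the driver has to guess.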
The second category involves differences in error reporting output, in kotlin. I haven't yet tried to figure out why there is a diff, but it looks less like a runtime issue than an issue with the driver I create in trgen.
The third category consists of grammars that never really worked (pike). These grammars should be fixed.
The fourth category is a problem with compiling the generated code for rego. Somehow RegoLexer.ts does not contain a definition for channel COMMENTS_AND_FORMATTING. This seems to be a problem with codegen in the Antlr tool templates for the target.