An out-of-the-box generator for pest grammar. #29
Comments
Looks pretty good. Yes, we can possibly put it under pest-parser or see how else it can be incorporated.
Hi @TheVeryDarkness, I had a more detailed look and I like it, but I feel it can't easily be incorporated into pest 2.X in a backward-compatible way because of the many breaking changes. Overall, though, I think it's very promising and it fits very well with some of the original ideas for pest3: pest-parser/pest#416 (comment) (plus, for better errors, see pest-parser/pest#885 (reply in thread)).
I started trying to extract pest-typed into pest3 locally, but I don't have much time these days and my progress is slow. Would you like to try adding it to pest3? The repo is here: https://github.com/pest-parser/pest3 so you can just open a PR with your changes there. Besides a basic parser of the surface grammar, it's incomplete, so you'd need to adapt the pest-typed generator etc. to the new pest3 syntax, but other than that, you have complete freedom there.
There are also a few more ideas for pest3, so there may be other changes, but we can perhaps start with the surface syntax parser in the repository, add the typed AST generation, and iterate from there. What do you think?
@TheVeryDarkness FYI regarding trivia, currently the intended semantics / usage is this:
A)
A) |
@tomtau I'll have some time in one or two weeks. I still have some exams and assignments right now. And I guess it won't take too long to implement them.
Can I jump in here and ask some general questions about ASTs in Rust and pest? From my point of view, there are two important questions about generating an AST from the grammar.
In general, the question is who should generate the AST, because an AST is always an abstracted view of the parsing result. So, depending on what you want to do, your AST may look different and may be implemented differently.
AST abstraction
If you look at the language workbenches mentioned above, they typically do the following when translated to pest:
This would be somewhat simpler than the current procedure in TheVeryDarkness/pest-typed.
Data Structure
One problem in Rust is references to AST nodes. In dynamic languages it is easy to have references inside a tree structure and to tree nodes. But in Rust it is more complicated, especially if you have mutable references.
So we could construct our AST based on
We experimented with all the approaches and ended up with an arena AST. @TheVeryDarkness What were your thoughts behind your approach?
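To make the arena idea above concrete, here is a minimal sketch. It is not the design used by pest-typed or by the workbench mentioned in this comment; the names `Arena`, `NodeId`, `Node` and `NodeKind` are hypothetical, and the tiny expression language is only for illustration. Nodes live in one `Vec`, and parent/child links are plain indices, so no lifetimes or `Rc<RefCell<_>>` are needed to mutate a node that other nodes point to.

```rust
/// Hypothetical handle to a node: an index into the arena, not a reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(usize);

/// Hypothetical node kinds for a tiny expression language.
#[derive(Debug)]
pub enum NodeKind {
    Number(i64),
    Add,
}

#[derive(Debug)]
pub struct Node {
    pub kind: NodeKind,
    pub parent: Option<NodeId>,
    pub children: Vec<NodeId>,
}

/// All nodes are owned by a single vector.
#[derive(Debug, Default)]
pub struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Stores a new node and records it as the parent of each child.
    pub fn alloc(&mut self, kind: NodeKind, children: Vec<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        for child in &children {
            self.nodes[child.0].parent = Some(id);
        }
        self.nodes.push(Node { kind, parent: None, children });
        id
    }

    pub fn get(&self, id: NodeId) -> &Node {
        &self.nodes[id.0]
    }

    /// Mutable access is an ordinary borrow of the arena.
    pub fn get_mut(&mut self, id: NodeId) -> &mut Node {
        &mut self.nodes[id.0]
    }

    /// Evaluates the subtree rooted at `id`.
    pub fn eval(&self, id: NodeId) -> i64 {
        let node = self.get(id);
        match &node.kind {
            NodeKind::Number(n) => *n,
            NodeKind::Add => node.children.iter().map(|&c| self.eval(c)).sum(),
        }
    }
}
```

The trade-off is that a `NodeId` is only meaningful together with the arena that produced it, which the type system does not check in this sketch.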
@marcfir Thanks for sharing your opinions. It would be a great idea to use an arena or an allocator here, considering performance. And the lifetime can be omitted if we create
And I'm also considering supporting something like hooks or visitors, which should be able to cover use cases such as validation and symbolling.
I adopted my current approach mainly because the approach you listed above may not extend trivially to arbitrary rules. For example, if we have a complex sequence that contains repetitions of a compound parsing expression, just like
And generating
And
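As a rough illustration of the visitor idea mentioned above, here is a minimal sketch. The `Expr` type stands in for a generated typed tree, and `Visitor`, `walk` and `SymbolCollector` are hypothetical names; nothing here is an existing pest or pest-typed API. The visitor lives entirely outside the grammar, so it does not run arbitrary code during parsing.

```rust
/// Hypothetical typed tree, standing in for generated node types.
pub enum Expr {
    Ident(String),
    Number(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical visitor trait; default methods do nothing,
/// so implementors only override what they care about.
pub trait Visitor {
    fn visit_ident(&mut self, _name: &str) {}
    fn visit_number(&mut self, _value: i64) {}
}

/// Depth-first traversal that calls back into the visitor at each leaf.
pub fn walk<V: Visitor>(expr: &Expr, visitor: &mut V) {
    match expr {
        Expr::Ident(name) => visitor.visit_ident(name),
        Expr::Number(value) => visitor.visit_number(*value),
        Expr::Add(lhs, rhs) => {
            walk(lhs, visitor);
            walk(rhs, visitor);
        }
    }
}

/// Example visitor: collects each distinct identifier once, in first-seen order.
#[derive(Default)]
pub struct SymbolCollector {
    pub symbols: Vec<String>,
}

impl Visitor for SymbolCollector {
    fn visit_ident(&mut self, name: &str) {
        if !self.symbols.iter().any(|s| s.as_str() == name) {
            self.symbols.push(name.to_owned());
        }
    }
}
```

A validation pass could be written the same way, as another `Visitor` implementation that records errors instead of symbols.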
There's a PR pest-parser/pest#815 about hooks.
There were a few interesting comments in the pest3 discussion regarding typed AST:
There are no concrete decisions yet. Regarding "hooks" (as in arbitrary Rust code executed within a grammar), I spoke to @dragostis about it and our consensus was that they'd bring extra complexity and drawbacks: for example, the grammar-rewriting optimizer would need to give up when it encounters hooks, and it wouldn't be possible or easy to have the embeddable pest_vm (which is used for the online scratchpad editor on the main website). Given pest's focus on accessibility, correctness, and performance, instead of adding hooks we'd prefer to add self-contained grammar features that address the main use cases people would want "hooks" for. As for my own preferences:
@TheVeryDarkness The problem with a rule like this
I also hit the problem that macros cannot share state. There is a related issue: rust-lang/rust#44034.
I am currently working on a language workbench in Rust based on pest. It's not feature complete and still very unstable, so we haven't released it yet. Our starting point was automatic AST generation, so I think it makes sense to at least show the AST generation. I released the code at pestast. If I have more time, I can also commit the arena-tree-based AST.
@tomtau I didn't have the discussions in mind. So maybe we can continue there, at least for pest3.
@marcfir Yes, it's practical if rule components are well tagged (though tags are only available under the grammar-extras feature). I think we can provide a typed AST in both styles. By the way,
Yes, for greater visibility, it's better to use the pest3 brainstorming discussion instead. Once @TheVeryDarkness adds the
@tomtau Sorry, I have some questions about the definition of trivia:
Are there any differences between them? And should (or could) both of them be defined in the same grammar? I thought there was only one form of trivia and only one way to define it.
Another issue is whether we should keep skipped trivia inside our generated data structure. It may be useful in some cases, such as implementing a formatter, but that can also be solved by defining the rule explicitly. So I may drop skipped trivia in my PR, unlike what I've done in
By the way, another question is whether silent rules should be inlined or ignored in the data structure.
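To illustrate the trade-off raised above, here is a minimal sketch of a node that keeps its skipped trivia. The `Assignment` type, its fields, and the `name = value` toy syntax are all hypothetical and hand-written; this is not what pest-typed generates. Keeping the trivia makes it possible to reproduce the source exactly, while a formatter can ignore it and emit canonical spacing.

```rust
/// Hypothetical typed node for `name = value` that retains skipped trivia.
#[derive(Debug, PartialEq)]
pub struct Assignment<'i> {
    pub name: &'i str,
    /// Whitespace skipped between the name and `=`.
    pub trivia_before_eq: &'i str,
    /// Whitespace skipped between `=` and the value.
    pub trivia_after_eq: &'i str,
    pub value: &'i str,
}

impl<'i> Assignment<'i> {
    /// Hand-written stand-in for a generated parser.
    pub fn parse(input: &'i str) -> Option<Self> {
        let (left, right) = input.split_once('=')?;
        let name = left.trim_end();
        let value = right.trim_start();
        if name.is_empty() || value.is_empty() {
            return None;
        }
        Some(Self {
            name,
            trivia_before_eq: &left[name.len()..],
            trivia_after_eq: &right[..right.len() - value.len()],
            value,
        })
    }

    /// Lossless: reproduces the input exactly because trivia was kept.
    pub fn to_source(&self) -> String {
        format!(
            "{}{}={}{}",
            self.name, self.trivia_before_eq, self.trivia_after_eq, self.value
        )
    }

    /// Formatter output: discards the original trivia.
    pub fn formatted(&self) -> String {
        format!("{} = {}", self.name, self.value)
    }
}
```

If the trivia fields were dropped, `formatted` would still work, but `to_source` could no longer round-trip the input.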
Yes, it's mentioned below: ~ means that there's optional trivia (0 or more) between items, and ^ means that there's mandatory trivia (1 or more).
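The difference between the two operators described above can be sketched with two hand-written matchers. This only illustrates the stated "0 or more" versus "1 or more" semantics and is not pest3 code; the function names are hypothetical, and trivia is assumed to be ASCII whitespace only (no comments).

```rust
/// Number of leading trivia bytes. Assumption: trivia is ASCII whitespace only.
fn skip_trivia(input: &str) -> usize {
    input.len() - input.trim_start_matches(|c: char| c.is_ascii_whitespace()).len()
}

/// Sketch of `a ~ b`: optional trivia (0 or more) between the two items.
/// Returns the remaining input on success.
pub fn seq_optional_trivia<'i>(input: &'i str, a: &str, b: &str) -> Option<&'i str> {
    let rest = input.strip_prefix(a)?;
    let rest = &rest[skip_trivia(rest)..];
    rest.strip_prefix(b)
}

/// Sketch of `a ^ b`: mandatory trivia (1 or more) between the two items.
/// Returns the remaining input on success.
pub fn seq_mandatory_trivia<'i>(input: &'i str, a: &str, b: &str) -> Option<&'i str> {
    let rest = input.strip_prefix(a)?;
    let skipped = skip_trivia(rest);
    if skipped == 0 {
        // No trivia at all, so the mandatory form fails here.
        return None;
    }
    rest[skipped..].strip_prefix(b)
}
```

With these definitions, the input `ab` is accepted by the `~` form and rejected by the `^` form, while `a b` is accepted by both.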
I'm not sure. I'd need to revisit some of the older issues and discussions or ask people for more feedback; maybe it'll make sense to keep it in the mandatory one, but not the optional one? Anyway, for the initial prototype, you can just pick whatever is easiest to go with for the start.
Maybe silent rules could be ignored, while the meta-rules/template functions could be inlined?
Does that mean we should write code like the below?
I guess * and + may be redundant in that definition, as they're implied by the operator, so perhaps just: comment | whitespace.
@TheVeryDarkness how's it going with that pest-typed port to pest3? Do you need any help?
I have a fork here and I'm working on it.
No problem, I'll check out the current fork!
I've written some code for generating a concrete syntax tree from a pest grammar.
Maybe that repository can be merged into some repository under pest-parser in the future if there is interest.
Examples can be found on docs.rs or in the examples directory.
An issue in pest suggested that I ask for feedback here :)