making built-in parser #1

Open
masatake opened this issue Nov 29, 2016 · 11 comments

Comments

@masatake
Contributor

Hi,

Thank you for using optlib, which I designed. As far as I know, you are the first person to use optlib seriously.
I had been disappointed that no one was using it. But today I found your project!

How would you feel about becoming a member of the @universal-ctags project to improve your Elm regex parser?

The project offers at least the following two good things:

Thank you.

@bitterjug
Owner

Hi Thanks!
I think optlib is an interesting idea. I guessed there weren't many users because: 1) universal-ctags isn't yet ready for prime time. 2) Pretty much all the languages you'd want to tag have native parsers. But those are guesses. Yes. I'd love to join @universal-ctags, thanks, especially if it means I get some encouragement to improve the Elm regex parser. I spotted that you have a testing framework. Didn't know there was a translator though.

I've been using my regex tagger for a while now with vim-tagbar and I'm fairly satisfied with it. There are some other bits and pieces to look into, however. While my own experience is good enough, it's tempting for me to skip the 80% work needed to achieve the last 20% of results, and spend more time writing Elm code.

In particular:

  • Could the module tag be a scope?
  • Is there a way to make multi-line block comments work as a scope so that within that scope other rules don't match? Then when I comment out a block of functions they disappear from the tags?
  • Are there any clever ways to use regex to match significant white-space languages? (Haskell ended up with its own tagging library -- I wonder how far we can get with regex for elm)
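
To make the significant-whitespace question concrete, here is a minimal Python sketch of one possible approach: capture the leading indentation along with the name, and treat column 0 as "top level". The pattern, the names DEF and definitions, and the sample input are all made up for illustration. It is not how the Elm optlib parser or ctags works, and it would misfire on real Elm constructs such as type declarations.

```python
import re

# Hypothetical pattern: optional indentation, a lower-case name, then
# anything up to a single "=" on the same line ("==" is rejected).
DEF = re.compile(
    r"^(?P<indent>[ \t]*)(?P<name>[a-z]\w*)\b[^=\n]*=(?!=)",
    re.MULTILINE,
)

def definitions(source):
    """Return (name, indentation width) for each definition-like line."""
    return [(m.group("name"), len(m.group("indent")))
            for m in DEF.finditer(source)]

SAMPLE = (
    "area r =\n"
    "    let\n"
    "        piApprox = 3.14\n"
    "    in\n"
    "    piApprox * r * r\n"
)

for name, indent in definitions(SAMPLE):
    where = "top level" if indent == 0 else "nested"
    print(name, indent, where)
```

The indentation width is the only layout information a single regex can see, so telling which nested definition belongs to which parent would still need state kept between lines.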

@masatake
Contributor Author

masatake commented Dec 3, 2016

Hi Thanks!
I think optlib is an interesting idea. I guessed there weren't many users because: 1) universal-ctags isn't yet ready for prime time. 2) Pretty much all the languages you'd want to tag have native parsers. But those are guesses. Yes. I'd love to join @universal-ctags, thanks, especially if it means I get some encouragement to improve the Elm regex parser. I spotted that you have a testing framework. Didn't know there was a translator though.

I opened universal-ctags/ctags#1213.

If possible, I would like you to make a pull request for merging the Elm parser into u-ctags.

  1. Could you add a copyright notice to your parser file? The LICENSE file exists separately, but I would like to put the notice in
    elm.conf as well. # can be used as the comment starter. https://github.com/universal-ctags/ctags/blob/master/optlib/man.ctags is an example.

  2. Fork universal-ctags/ctags to make your own ctags repository.

  3. I will make a pull request to your ctags repository.
    I will pick up elm.conf and put it in my pull request. I may modify the makefiles, too.

  4. Merge the pull request mentioned in step 3.

  5. Add a test case to your ctags repository.
    This step is important because I don't know Elm.

  6. Could you make a pull request to universal-ctags/ctags?

  7. The other u-ctags members may review the PR.

  8. By that time you will be a member of the u-ctags org, so you can merge your PR yourself.

I will add a --copyright-<LANG>=... option on the ctags side, so that your name and license policy can be printed
in ctags --version and in the pseudo tags printed at the top of the tags file. Do you allow me to implement this option AFTER step 8?

Thank you for reading.

@masatake
Contributor Author

masatake commented Dec 3, 2016

Could the module tag be a scope?

Don't push/pop help?
You use clear frequently, but I think some of those uses are not needed.
https://github.com/universal-ctags/ctags/tree/master/Units/regex-with-scope.d

Is there a way to make multi-line block comments work as a scope so that within that scope other rules don't match? Then when I comment out a block of functions they disappear from the tags?

Yes, this may be the feature most requested by developers of regex parsers.
Currently there is no way to do it. The C side (u-ctags people call it the "main" part) needs to be improved.

One approach is to implement multiple tables in the regex parser engine:

--regex-elm=/{-///{goto=comment}
--regex-elm=<comment>/.*-}///{goto=main}

Are there any clever ways to use regex to match significant white-space languages? (Haskell ended up with its own tagging library -- I wonder how far we can get with regex for elm)

I don't understand this. I need an example.

@bitterjug
Owner

bitterjug commented Dec 3, 2016 via email

@masatake
Contributor Author

masatake commented Dec 4, 2016

What's the {goto=} feature?

This is not implemented. It is just an idea.
Currently a regex parser can have only one table where patterns are defined.
My idea allows a parser to have multiple tables and to switch between them according to the input context (source code top level, comment, literal string, inside a function, inside a record type, and so on). goto is an imaginary flag that triggers the switch. I'm not sure whether such a complicated feature is really wanted.
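
To illustrate the idea outside ctags, here is a minimal Python sketch of a scanner with several pattern tables, where a matching rule may switch the active table. Everything in it (TABLES, scan, the rule layout, the sample input) is hypothetical and only mimics the imaginary goto flag. It is not ctags code, and it does not handle nested block comments.

```python
import re

# Hypothetical layout: table name -> list of (pattern, kind, goto).
# kind=None: emit no tag. goto=None: stay in the current table.
TABLES = {
    "main": [
        (re.compile(r"\{-"), None, "comment"),
        # A definition-like line: name at column 0, then "=" on the same line.
        (re.compile(r"^([a-z]\w*)\b[^=\n]*=", re.MULTILINE), "function", None),
    ],
    "comment": [
        (re.compile(r"-\}"), None, "main"),
    ],
}

def scan(text, tables, start="main"):
    """Walk the text, trying only the patterns of the active table."""
    tags, table, pos = [], start, 0
    while pos < len(text):
        for pattern, kind, goto in tables[table]:
            m = pattern.match(text, pos)
            if m and m.end() > pos:
                if kind is not None:
                    tags.append((m.group(1), kind))
                if goto is not None:
                    table = goto          # the imaginary {goto=...} step
                pos = m.end()
                break
        else:
            pos += 1                      # nothing matched: skip one character
    return tags

SAMPLE = "add x y = x + y\n{-\nhidden a = a\n-}\ndouble n = n * 2\n"
print(scan(SAMPLE, TABLES))
```

Because the comment table contains no tagging rule, the definition inside the block comment produces no tag, which is the behaviour asked about earlier in this thread.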

I wrote a page about the regex parser features I added: http://docs.ctags.io/en/latest/optlib.html
You may be also interested in this: universal-ctags/ctags#1224

@bitterjug
Owner

Okay that explains why I didn't find {goto} in your excellent docs.

It's a way to get separate sets of patterns to match under different conditions. I had been wondering about this. Instead of switching explicitly between tables, I was wondering if it made sense to add a long flag that enables a pattern to match the text only if some condition is true of the scope stack. For example:

  • If a particular scope is at the top of the stack
  • If a particular scope is not at the top of the stack
  • If a particular scope is somewhere on the stack
  • If a particular scope is not on the stack at all

Would something like this work?
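
As a rough sketch of what those four conditions mean, here they are as Python predicates over a scope stack (a list with the top of the stack last). The condition names "top", "not-top", "any" and "none" and the function rule_enabled are invented for illustration. They are not existing ctags flags.

```python
def top_is(stack, scope):
    """The scope is at the top of the stack."""
    return bool(stack) and stack[-1] == scope

def top_is_not(stack, scope):
    """The scope is not at the top of the stack (true for an empty stack)."""
    return not top_is(stack, scope)

def on_stack(stack, scope):
    """The scope is somewhere on the stack."""
    return scope in stack

def not_on_stack(stack, scope):
    """The scope is not on the stack at all."""
    return scope not in stack

# Hypothetical condition names a long flag could carry.
CONDITIONS = {
    "top": top_is,
    "not-top": top_is_not,
    "any": on_stack,
    "none": not_on_stack,
}

def rule_enabled(condition, scope, stack):
    """Decide whether a guarded pattern may be tried against the input."""
    return CONDITIONS[condition](stack, scope)

print(rule_enabled("none", "comment", ["module"]))
```

A rule guarded with ("none", "comment") would then be skipped whenever a comment scope is open, without needing a separate table.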

@masatake
Contributor Author

masatake commented Dec 6, 2016

I'm not sure. Prototypes are needed. To make a prototype, three things are needed:

  • target input
  • expected output
  • imaginary long flag specification

--regex-elm=/{-///{goto=comment}
--regex-elm=<comment>/.*-}///{goto=main}

is an example of an "imaginary long flag specification".

If a particular scope is at the top of the stack
If a particular scope is not at the top of the stack
If a particular scope is somewhere on the stack
If a particular scope is not on the stack at all

I see. How would you represent these condition/action pairs on the command line?

@bitterjug
Owner

Okay, here is a stab at adding Elm as a built-in optlib for u-ctags. The tests pass! And I can now get the exe to output Elm tags without pointing it to my optlib, so I guess I compiled it in. I've not modified news.rst or the win32 stuff. If you're still on for making a PR with the appropriate updates, that'd be great.

Merry Xmas @masatake

@masatake
Contributor Author

masatake commented Jan 2, 2017

Thanks.

I am making one pull request that I would like to merge into your change.

Could you update news.rst and the win32 stuff? (The win32 stuff is not sorted, so you can insert your .c anywhere you want. AppVeyor, the CI for Windows, may notify us when we do something wrong in the win32 stuff.)

I have a bug comment about how to handle import.
I will write about it tonight. I caught a cold.

Thank you again.

@masatake
Contributor Author

masatake commented Jan 2, 2017

BTW, I recommend making a topic branch when adding a feature or fixing a bug.

@bitterjug
Owner

bitterjug commented Jan 2, 2017 via email
