Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation Fault and crashed neovim #29

Open
Mofiqul opened this issue May 29, 2022 · 39 comments
Open

Segmentation Fault and crashed neovim #29

Mofiqul opened this issue May 29, 2022 · 39 comments

Comments

@Mofiqul
Copy link

Mofiqul commented May 29, 2022

Crashing neovim completely, Segmentation Fault (Core dumped).

@rlch
Copy link

rlch commented May 31, 2022

+1, I can replicate when attempting to type at the bottom of the file.

@genesistms
Copy link

Related open issue in neovim

nvim-treesitter/nvim-treesitter#2972

Workaround
For now add this commit

f95876f0ed3ef207bbd3c5c41987bc2e9cecfc97

to lockfile.json in nvim-treesitter plugin. And TSInstall dart

@Kavantix
Copy link

Kavantix commented Jun 1, 2022

The issue is in scanner.c somewhere, reverting it to the version at f95876f0ed3ef207bbd3c5c41987bc2e9cecfc97 fixes the crash

@RobertBrunhage
Copy link

I made a quick fork where I changed the version (feel free to use it until it's fixed) https://github.com/RobertBrunhage/nvim-treesitter

@CarlosNihelton
Copy link

Just creating an empty dart file and adding one non-white-space character without a trailing semicolon is enough to cause the segmentation fault. If we write the semicolon first and then insert anything before it the parser won't crash.

As simple as this:

printf "p" > konstant.dart && nvim konstant.dart => [SIGSEV]

printf "p;" > konstant.dart && nvim konstant.dart => [works just fine]

@ErikReider
Copy link

Same issue here: Reverting to the older revision works :)

@Kavantix
Copy link

Kavantix commented Jun 9, 2022

@UserNobody14 I've played around a bit with the scanner and the only way so far I have been able to prevent a segfault without a semi colon at the end is by changing the last return false; in tree_sitter_dart_external_scanner_scan to return lexer->lookahead == '\0'.
This however really sounds like a bug in tree sitter itself we stumbled upon, or it could even be in the grammar.

@CarlosNihelton
Copy link

@UserNobody14 I've played around a bit with the scanner and the only way so far I have been able to prevent a segfault without a semi colon at the end is by changing the last return false; in tree_sitter_dart_external_scanner_scan to return lexer->lookahead == '\0'. This however really sounds like a bug in tree sitter itself we stumbled upon, or it could even be in the grammar.

Hey! I don't have idea of why, but it worked for me as well!
@UserNobody14 is this a real fix or would it be just masking some other issue?
Unfortunately the commit that introduced the bug is very big to allow attempts to fix it without a deep understanding of tree-sitter and Dart language specification.

@UserNobody14
Copy link
Owner

Let me try adding that fix and running the grammar on a large list of files again, as far as I know it shouldn't cause a big issue. If it works, I'll merge it

@UserNobody14
Copy link
Owner

Hmmm. After looking into the problem I can't seem to either replicate it or get the solution you posted to work with our tests. I don't have neovim setup with dart, so I tried running all of the tests I traditionally use, like parsing files and such. Sadly I was not able to recreate any of these issues, which I believe indicates it only happens on reruns of the grammer. Unfortunately without an easy way to replicate it I just don't know what I should do.

Near as I can tell the "lookahead == '\0'" line causes it to return true while the string has not yet ended, but this causes nearly all of our tests to fail, and on other projects' scanner.c files, the return false line is still present, so I don't see why it should be causing such an issue.

I found that this configuration below successfully runs the tests, but without an easy way to see if it segfaults, i have no way of knowing if it actually fixes the problem. Does anyone have some easy steps for reproducing the issue?

                                                  const bool *valid_symbols) {
  // bool ret = false;
  // if (lexer->lookahead == '/') {
  //   return scan_multiline_comments(lexer);
  // }
  if (lexer->lookahead == '\0') {
    return false;
  }
  if (
      valid_symbols[TEMPLATE_CHARS_DOUBLE] ||
      valid_symbols[TEMPLATE_CHARS_SINGLE] ||
      valid_symbols[TEMPLATE_CHARS_DOUBLE_SINGLE] ||
      valid_symbols[TEMPLATE_CHARS_SINGLE_SINGLE]
  ) {
    return scan_templates(lexer, valid_symbols);
  }
  if (valid_symbols[AUTOMATIC_SEMICOLON]) {
      bool ret = scan_automatic_semicolon(lexer);
      return ret;
  }
  while (iswspace(lexer->lookahead)) lexer->advance(lexer, true);

  if (lexer->lookahead == '/') {
    return scan_multiline_comments(lexer);
  }
  return false;
}```

@theHamsta
Copy link
Contributor

theHamsta commented Jun 11, 2022

You don't need a Neovim setup. Just let the tree-sitter parser run. Instructions ikatyang/tree-sitter-markdown#14 (CC needs to be clang and python in the script python2). I've provided sample inputs that let the parser crash in the original issue, but I would recommend running the fuzzer (maybe even on CI)

@theHamsta
Copy link
Contributor

If this is really a bug in tree-sitter, you could have a look at our open issue with tree-sitter-prisma which crashes without a custom parser @Kavantix . There is even a closed issue for tree-sitter-pug which hd the same issue that could be fixed by a check for \0

@Kavantix
Copy link

@theHamsta could you link the issues you mentioned?

@UserNobody14 I reproduced it by running tree-sitter highlight on a file containing a few random characters and no semicolon at the end. E.g. dhdh. So no need for using vim while testing

@theHamsta
Copy link
Contributor

theHamsta commented Jun 13, 2022

@Kavantix here you go: nvim-treesitter/nvim-treesitter#2972 (comment) As you say, no need to use vim. I bisected using git rebase -exec '...' using the setup in ikatyang/tree-sitter-markdown#14 . There are some problems with the queries in this repo, so I had to modify the fuzzing main a bit to skip checking the queries. (basically skip all usages of https://github.com/tree-sitter/tree-sitter/blob/b729029a403046375a77a8e320211da64da00629/test/fuzz/fuzzer.cc#L7 in this file

I should really open a PR to document how to setup fuzzing in upstream tree-sitter. We found quite a lot parser issue with tree-sitter's fuzzing setup.

@ghost
Copy link

ghost commented Jul 7, 2022

Is the current workaround is the one offered by @genesistms , to revert to SHA f95876f of this repo?

This is a manual step so all new users of Neovim + TS + Dart will have to perform it on their own.
Would it be possible to revert the changes done on this repo to that SHA instead?

nvim-treesitter/nvim-treesitter#2972 which was referenced earlier as the matching TS issue is now marked as closed.

@vigoux
Copy link

vigoux commented Jul 7, 2022

Coming to this from nvim-treesitter.

Adding the following to test/corpus/dart.txt replicates the issue with tree-sitter test:

=====================
Weird file
=====================


d

---------------

This should at least help debugging the test.

The reason of the segfault is still a mistery to me, and the scanner.c is cryptic to me, especially scan_templates which does some kind of weird loop things.

@Kavantix
Copy link

Kavantix commented Jul 7, 2022

Changing the scanner to always return false immediately gives the same crash, so it is not because of the scanner

@vigoux
Copy link

vigoux commented Jul 7, 2022

Now that's interesting...

@rphuber
Copy link

rphuber commented Jul 8, 2022

Any chance to revert back to the working commit until the problem is solved?

@theHamsta
Copy link
Contributor

nvim-treesitter now uses an older working version for now: nvim-treesitter/nvim-treesitter@d99b4cd

@ahelwer
Copy link

ahelwer commented Jul 13, 2022

@theHamsta a fuzzing setup guide would be tremendously appreciated, I'd love to fuzz my grammar.

@theHamsta
Copy link
Contributor

@ahelwer Can't count how often I linked this post ikatyang/tree-sitter-markdown#14 (comment)

@kyazdani42
Copy link

@theHamsta maybe we could host a wiki page for this, although the comment is good, debugging the error is a bit more complicated than just running the fuzzer :)

@theHamsta
Copy link
Contributor

Ideally such a documentation is contributed to tree-sitter upstream. On how to create safe, fuzzed parsers.

@kyazdani42
Copy link

I guess so. I will look at this next week

@UserNobody14
Copy link
Owner

Alright I got a commit working with @Kavantix 's highlight dhdh file as well as @vigoux 's "weird file" test (thanks btw!). Unfortunately it's still hard for me to be sure it won't crash again. Let me know if it keeps happening with the latest changes.

I'm not a genius user of scanner.c, sadly, I just kinda copied and pasted from various other tree sitter repos until I got it working. If anyone knowledgeable of C and tree-sitter wants to take a crack at it, I'd be happy to assist/merge your commits.

@theHamsta
Copy link
Contributor

theHamsta commented Aug 1, 2022

Seems to be ok now. Fuzzed for 55mins. Can the others please confirm that this is gone now also in Neovim?

@fitrh
Copy link

fitrh commented Aug 2, 2022

I can confirm that this issue has been fixed on neovim, although it seems that the speed for the first load is back to normal

@ayoubelmhamdi
Copy link

also, the flutter project does not crash now for me

@Kavantix
Copy link

For me the crash still happens on 53485a8

@johnstef99
Copy link

yeap still happening :(

@TimWhiting TimWhiting reopened this Sep 27, 2022
@UserNobody14
Copy link
Owner

Can anyone provide a test or a specific file that crashes? I've tested it with empty files and files with a single variable but no dice.

@theHamsta
Copy link
Contributor

theHamsta commented Oct 12, 2022

This is for querying (it uses the highlight queries in this project repo). Let me check whether it also crashes for parsing.

=================================================================
==18396==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a1549d7674 at pc 0x55a1549141c2 bp 0x7fff81d95910 sp 0x7fff81d95908
READ of size 2 at 0x55a1549d7674 thread T0
    #0 0x55a1549141c1 in ts_tree_cursor_current_status (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x4741c1) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #1 0x55a154816478 in ts_query_cursor__advance query.c
    #2 0x55a154811196 in ts_query_cursor_next_match (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x371196) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #3 0x55a1546fecad in LLVMFuzzerTestOneInput (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x25ecad) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #4 0x55a15460d250 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x16d250) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #5 0x55a15460c9b5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x16c9b5) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #6 0x55a15460e1a5 in fuzzer::Fuzzer::MutateAndTestOne() (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x16e1a5) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #7 0x55a15460eda5 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x16eda5) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #8 0x55a1545fcef0 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x15cef0) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #9 0x55a154626412 in main (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x186412) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)
    #10 0x7f3b6302350f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #11 0x7f3b630235c8 in __libc_start_main csu/../csu/libc-start.c:381:3
    #12 0x55a1545f1ae4 in _start (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x151ae4) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe)

0x55a1549d7674 is located 0 bytes after global variable 'ts_field_map_entries' defined in '/home/stephan/projects/tree-sitter/test/fixtures/grammars/dart/src/parser.c:3390' (0x55a1549d7360) of size 788
SUMMARY: AddressSanitizer: global-buffer-overflow (/home/stephan/projects/tree-sitter/out/dart_fuzzer+0x4741c1) (BuildId: dc3168fc09ef0bd840db576e17453b5329cf40fe) in ts_tree_cursor_current_status
Shadow bytes around the buggy address:
  0x55a1549d7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x55a1549d7600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[04]f9
  0x55a1549d7680: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x55a1549d7700: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x55a1549d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x55a1549d7880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==18396==ABORTING
MS: 2 CrossOver-InsertByte-; base unit: 5c10b5b2cd673a0616d529aa5234b12ee7153
[crash-6b5a6e5f767759c97ee0cb4ff08b35c32f40b808.txt](https://github.com/UserNobody14/tree-sitter-dart/files/9768311/crash-6b5a6e5f767759c97ee0cb4ff08b35c32f40b808.txt)
808
0xac,0x2c,
\254,
artifact_prefix='./'; Test unit written to ./crash-6b5a6e5f767759c97ee0cb4ff08b35c32f40b808
Base64: rCw=

@maxbrunsfeld do you have an idea why the fuzzer could be crashing when querying but not on parsing?

@theHamsta
Copy link
Contributor

@theHamsta
Copy link
Contributor

It does not seem to crash when just using the fuzzer to parse. Just when the fuzzer script also tries to query using highlight.scm

@maxbrunsfeld
Copy link

If you can find a minimal query and source file that reproduces the error (via the tree-sitter query CLI command), could you open a Tree-sitter issue?

@removed-usr
Copy link

It just happened to me

@TimWhiting
Copy link
Collaborator

I'm assuming people are still seeing this. I am working on updating this library for newer dart syntax, and have been trying to clean up the grammar slightly as I go. However, it seems like the issue stems not from the grammar but rather from the highlights file. The highlights currently only have a few test cases. I'll see what I can do to get a more comprehensive testing setup there, which will help diagnose this issue. In the meantime if someone wants to try the latest commits in this repo and determine whether the issue is no longer present that would be great. (I do not personally use NeoVim or the highlights currently). Additionally anyone who would like to contribute to the test cases for highlighting I'd be happy to review PRs.

CC @theHamsta

@mjcurciojr
Copy link

@TimWhiting Can confirm, I am using the latest commit and still experiencing crashes when using with neovim and dartls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests