Packrat Parsing Implementation
As discussed in #74, this pull request implements packrat parsing using an internally specified set of memoization functions in `./src/cache.ts`.

1. Checklist
2. Discussion
Packrat parsing [1] is a technique for enabling linear-time backtracking in top-down parsers. A common issue with recursive descent parsing is exponential blowup: the time it takes to parse a string can scale exponentially with input length, because the parser re-parses the same positions after backtracking.

Our implementation uses a memoization wrapper around the incremental state update function for each parser. The full, global cache state is cleared on every run.
[1] Bryan Ford, *Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking*, MIT, 2002.
2.1. Race Condition Risk
If two instances of `run` are running in parallel, this could lead to a poisoned cache and incorrect parse results. Since the parser does not operate on the basis of Promises, it is unclear to me whether this is an important risk.

2.2. Incremental Parsing
This implementation resets the global cache on every `run` and `fork` call. This is necessary to avoid errors in parsing: because the cache is parameterized only on `state`, and not on `target`, the cache can become poisoned unless it is refreshed on every individual run.

On our fork, I think we will attempt to parameterize the state update cache with `target`, so that the cache may persist across multiple runs, thereby implementing efficient incremental parsing. A configurable LRU cache will be used to prevent unbounded memory usage.

I have avoided implementing this more advanced form in this pull request, as it is a lot more experimental, opinionated, and risky. As yet, I am not convinced it is even possible to do easily: if `target` always corresponds to the full input, then parameterizing the cache on `target` won't implement incremental parsing.

3. Suggested release notes
Versioning is left up to maintainer discretion, but for convenience here is my suggested changelog.
4. Questions to resolve

Should there be unit tests for `cache.ts`? I did not want to unnecessarily bloat the pull request, but if we want them I can write some.