Implement right contexts (lookahead) #37

osa1 · 2021-12-10T13:07:56Z

Fixes #29

osa1 · 2021-12-11T15:51:14Z

I think I will have to generate separate NFA/DFAs for right contexts. Currently in DFA simulation I use this to run right contexts:

// Similar to `simulate`, but does not keep track of the last match as we don't need "longest
// match" semantics and backtracking
fn simulate_right_ctx<A>(
    dfa: &DFA<StateIdx, A>,
    init: StateIdx,
    accept: StateIdx,
    mut char_indices: std::str::CharIndices,
) -> bool {
    if init == accept {
        return true;
    }

    let mut state = init;

    while let Some((_, char)) = char_indices.next() {
        match next(dfa, state, char) {
            None => {
                // Stuck
                return false;
            }
            Some(next_state) => {
                if next_state == accept {
                    return true;
                }

                state = next_state;
            }
        }
    }

    match next_end_of_input(dfa, state) {
        None => false,
        Some(next_state) => next_state == accept,
    }
}

In the generated code, next above becomes a match state { ... } with a match char { ... } in each alternative. For each right context, these matches duplicate the code of the main DFA's next method. That's a lot of duplication.

If we maintain separate DFAs for right contexts we can generate smaller code for next that doesn't have states of the main DFA.
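To make the size argument concrete, here is a minimal sketch of what a dedicated right-context DFA could look like once it is generated on its own. Everything in it is hypothetical: the names right_ctx_0_next, RIGHT_CTX_0_ACCEPT and run_right_ctx_0, the u8 state type, and the example right context "ab" are illustration only, not the code this PR generates. Unlike simulate_right_ctx above, it skips the init == accept and end-of-input cases because this toy DFA has neither.

```rust
/// Hypothetical transition function for a single right-context DFA that
/// recognizes the lookahead "ab". Only the states of this small DFA appear
/// here; none of the main DFA's states are repeated.
fn right_ctx_0_next(state: u8, c: char) -> Option<u8> {
    match (state, c) {
        (0, 'a') => Some(1),
        (1, 'b') => Some(2),
        // No transition: the right context cannot match.
        _ => None,
    }
}

/// Accepting state of the hypothetical right-context DFA.
const RIGHT_CTX_0_ACCEPT: u8 = 2;

/// Runs the right context on the input that follows the current match.
/// Returns as soon as the accepting state is reached, so there is no
/// "longest match" bookkeeping and no backtracking.
fn run_right_ctx_0(rest: &str) -> bool {
    let mut state = 0u8;
    for c in rest.chars() {
        match right_ctx_0_next(state, c) {
            None => return false,
            Some(next_state) => {
                if next_state == RIGHT_CTX_0_ACCEPT {
                    return true;
                }
                state = next_state;
            }
        }
    }
    // Input ended before reaching the accepting state.
    false
}
```

With this shape, the transition function contains only the arms for the right context's own states, which is where the code size saving would come from.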

osa1 · 2022-01-31T14:53:38Z

One of the tests I added the last time I worked on this is:

lexer! {
    Lexer -> u32;

    // Per longest match, "a" should return 2, not 1
    'a' = 1,
    'a' > $ = 2,
}

let mut lexer = Lexer::new("a");
assert_eq!(next(&mut lexer), Some(Ok(2)));
assert_eq!(next(&mut lexer), None);

However, as I think about this again now, I realize that this is not a good idea. For these semantics we need to run all right contexts of a state, even after finding one that matches. I'm not sure these semantics are useful, and they can certainly be slower than needed. Instead I think it should be good enough to try the rules in order and run the semantic action of the first one that matches.

This means if there's a rule without a right context in the same state, then the rules with a right context that come after it will never be tried. We should probably start generating warnings in these cases, but maybe not in this PR.
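The "first rule that matches" semantics can be sketched as below. This is only an illustration under assumptions: the Rule struct, its fields, at_end_of_input, and select_rule are hypothetical names, and a right context is modelled as a plain function over the remaining input rather than as a DFA.

```rust
/// Hypothetical description of one rule attached to an accepting state.
struct Rule {
    /// Value the rule's semantic action would return.
    token: u32,
    /// Optional right context, checked against the input after the match.
    right_ctx: Option<fn(&str) -> bool>,
}

/// Stand-in for the `$` (end of input) right context.
fn at_end_of_input(rest: &str) -> bool {
    rest.is_empty()
}

/// Tries the rules in declaration order and returns the token of the first
/// rule that either has no right context or whose right context holds.
/// Later rules are not examined once one matches.
fn select_rule(rules: &[Rule], rest: &str) -> Option<u32> {
    rules
        .iter()
        .find(|rule| rule.right_ctx.map_or(true, |check| check(rest)))
        .map(|rule| rule.token)
}
```

Under this sketch, a rule list that starts with a rule without a right context always selects that rule, which is the shadowing case where a warning would help.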

osa1 added 6 commits December 9, 2021 13:48

Implement right context type and parsing

4a7da01

Implement NFA right context simulation -- not tested

5f2f0ac

Implement DFA right context simulation -- not tested

c5ec566

Start testing

663fd7e

Fix NFA right context simulation, update NFA debug output

d3d1440

Merge remote-tracking branch 'origin/main' into right_context

5ba5a08

osa1 mentioned this pull request Dec 10, 2021

Lookahead could be useful #29

Closed

Fix DFA simulation

babb2d7

osa1 added 17 commits December 12, 2021 13:50

Merge remote-tracking branch 'origin/main' into right_context

ac1d4a2

Start implementing separate DFAs for right contexts

e178bdd

Enable right ctx tests

38167dd

WIP

a905ba2

Start simplifying right ctx DFAs

a4caf92

WIP: Start implementing codegen for right contexts

785f1d7

Fix a few bugs, start adding tests

5b0c0ae

Start handling right contexts in codegen, add more tests

5550e35

Make iter field public

bf81d48

More right ctx tests

7a942f0

Add a failing test

cbd1e1a

Merge remote-tracking branch 'origin/main' into right_context

9bb4f97

Implement helper fn make_if

0789752

Run right contexts in rest of the cases

326c9e0

Remove duplicate code

eb390a1

Remove duplicate code

b1ffac6

Remove unused stuff

5badb4e

osa1 added 3 commits January 31, 2022 18:04

Merge remote-tracking branch 'origin/main' into right_context

fb795e1

Fix a lint, remove invalid test

2fa142b

Add more tests, last one reveals a compilation error

efa8cfd

osa1 added 5 commits January 31, 2022 18:24

Generate right ctxs before search tables

5bdd0a8

More tests

6072e3a

Add more tests

177955e

Update CHANGELOG, README

64a37f9

Typo fix

9439412

osa1 marked this pull request as ready for review January 31, 2022 17:10

osa1 merged commit ed05fec into main Jan 31, 2022

osa1 deleted the right_context branch January 31, 2022 17:11
