Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow referring to elided fields values in later fields' parsers with seq! #406

Open
2 tasks done
DJDuque opened this issue Dec 27, 2023 · 11 comments
Open
2 tasks done
Labels
A-combinator Area: combinators C-enhancement Category: Raise on the bar on expectations

Comments

@DJDuque
Copy link
Contributor

DJDuque commented Dec 27, 2023

Please complete the following tasks

winnow version

0.5.31

Describe your use case

Having _temporary variables that don't make it into the struct (but also don't get completely ignored like _) could be beneficial for cases like this.
The above seq! call could look something like:

 pub(crate) fn event_view<'a>(
    endian: Endianness,
) -> impl Parser<&'a [u8], EventView<'a>, ContextError> {
    seq! {EventView {
        event_id: u16(endian),
        trigger_mask: u16(endian),
        serial_number: u32(endian),
        timestamp: u32(endian),
        _event_size: u32(endian).verify(|&n| n >= 8),
        _banks_size: u32(endian).verify(&n| n == _event_size - 8),
        bank_views: dispatch! {u32(endian);
                1 => length_and_then(success(_banks_size), repeat_till0(padded_bank(bank_16_view(endian)), eof)),
                17 => length_and_then(success(_banks_size), repeat_till0(padded_bank(bank_32_view(endian)), eof)),
                49 => length_and_then(success(_banks_size), repeat_till0(padded_bank(bank_32a_view(endian)), eof)),
                _ => fail,
            }.map(|(bank_views, _)| bank_views),
    }}
}

which is easier to follow and understand.

This will also allow us to use this macro to do stuff similar to nom-derive's anonimous fields mentioned here, e.g.

seq! {MyStruct {
    _b0: be_u8,
    li: success(_b0).map(|n| n >> 6),
    version: success(_b0).map(|n| (n >> 3) & 0b111),
    // ... etc
}}
@DJDuque DJDuque added the C-enhancement Category: Raise on the bar on expectations label Dec 27, 2023
@epage epage changed the title Add _temporary variables to seq! Allow referring to elided fields with seq! Dec 27, 2023
@epage epage added the A-combinator Area: combinators label Dec 27, 2023
@epage epage changed the title Allow referring to elided fields with seq! Allow referring to elided fields values in later fields' parsers with seq! Dec 27, 2023
@epage
Copy link
Collaborator

epage commented Dec 27, 2023

I would love to support this but I'm unsure of a way to do this with macro_rules which I'd really like to stick to with seq!.

@himikof
Copy link

himikof commented Dec 28, 2023

I would love to support this but I'm unsure of a way to do this with macro_rules which I'd really like to stick to with seq!.

Maybe not exactly as proposed, but can the following syntax idea work as an implementable alternative?

seq! {MyStruct {
    #[ignore] b0: be_u8,
    li: success(b0).map(|n| n >> 6),
    version: success(b0).map(|n| (n >> 3) & 0b111),
    // ... etc
}}

@DJDuque
Copy link
Contributor Author

DJDuque commented Feb 19, 2024

@epage What do you think about @himikof's suggestion?

I love the seq macro, and I keep using it for basically everything. Having a #[ignore] or #[temporary] would be a huge improvement imo.

I am a bit busy with work right now, but if you like this suggestion, I can try to implement it.

@epage
Copy link
Collaborator

epage commented Feb 19, 2024

Sorry, lost track of this.

If we do this, I'd at least like to see mirrored support for tuples as well.

My main concern is with what to name it to make the intent clear. We aren't ignoring the parser, only the container initialization. Maybe #[temp] works for that? Any other thoughts?

@DJDuque
Copy link
Contributor Author

DJDuque commented Apr 25, 2024

I think I got this working for structs in #511, but I don't really know how to implement this for tuples.

Currently we can do something like (from the docs):

fn point(input: &mut &[u8]) -> PResult<(u32, u32)> {
    let num = dec_uint::<_, u32, ContextError>;
    seq!(num, _: (space0, b',', space0), num).parse_next(input)
}

Do we expect something like: seq!(var: num, empty.value(var)) to be possible?
And then #[temp] e.g. seq!(#[temp] half_var: num, empty.value(half_var * 2), empty.value(half_var * 2))?

@epage
Copy link
Collaborator

epage commented Apr 25, 2024

@DJDuque thank for that work! I keep going back and forth on how we should handle this and if it should be handled. Especially seeing it in action, #[temp] feels weird and I worry that it makes the intent of the code less clear. This makes me wonder if we're trying to force too much through seq! and it would be better to instead encourage people to write their parser more explicitly in these cases. Macros are already a bit magical and without care they can be hard understand how to write them from the docs since the "signature" is opaque, how to read them in code if too much is done, how to get them to compile due to the signature problems and little traps like handling of commas.

This isn't too say "no" but I want to exercise some caution here and would appreciate input on managing the balance with seq!.

@DJDuque
Copy link
Contributor Author

DJDuque commented Apr 25, 2024

I agree that managing things can get out of hand if we try to force too much through macros, but I also think that referring to previous fields is common enough that it is reasonable for seq! to handle this.

The current seq! macro is so convenient that I think not having #[temp] (or an equivalent) encourages people to work around it with e.g. chained flat_maps (or other alternatives less readable than #[temp]) instead of falling back to write the full parser in a more explicit way.

It would be good to have the opinion of other seq! users. In my opinion, this is a huge quality-of-life improvement parsing binary data, and it warrants the extra macro complexity.

@bbb651

This comment was marked as off-topic.

@epage

This comment was marked as off-topic.

@epage
Copy link
Collaborator

epage commented Jan 3, 2025

#655, specifically 04e8444, added placeholder names for elided fields (e.g. _1). However, macro hygiene prevents users from accessing those identifiers.

@epage
Copy link
Collaborator

epage commented Jan 9, 2025

One idea is using @ like match (though I feel its backwards

 pub(crate) fn event_view<'a>(
    endian: Endianness,
) -> impl Parser<&'a [u8], EventView<'a>, ContextError> {
    seq! {EventView {
        event_id: u16(endian),
        trigger_mask: u16(endian),
        serial_number: u32(endian),
        timestamp: u32(endian),
        event_size @ _ : u32(endian).verify(|&n| n >= 8),
        banks_size @ _: u32(endian).verify(&n| n == event_size - 8),
        bank_views: dispatch! {u32(endian);
                1 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_16_view(endian)), eof)),
                17 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_32_view(endian)), eof)),
                49 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_32a_view(endian)), eof)),
                _ => fail,
            }.map(|(bank_views, _)| bank_views),
    }}
}

another option is to create our own syntax with as

 pub(crate) fn event_view<'a>(
    endian: Endianness,
) -> impl Parser<&'a [u8], EventView<'a>, ContextError> {
    seq! {EventView {
        event_id: u16(endian),
        trigger_mask: u16(endian),
        serial_number: u32(endian),
        timestamp: u32(endian),
        _ as event_size u32(endian).verify(|&n| n >= 8),
        _ as banks_size: u32(endian).verify(&n| n == event_size - 8),
        bank_views: dispatch! {u32(endian);
                1 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_16_view(endian)), eof)),
                17 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_32_view(endian)), eof)),
                49 => length_and_then(success(banks_size), repeat_till0(padded_bank(bank_32a_view(endian)), eof)),
                _ => fail,
            }.map(|(bank_views, _)| bank_views),
    }}
}

The proposed is keyword also puts the binding on the left...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-combinator Area: combinators C-enhancement Category: Raise on the bar on expectations
Projects
None yet
Development

No branches or pull requests

4 participants