Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miette or Ariadne integration, at least in documentation #104

Closed
2 tasks done
epage opened this issue Jan 30, 2023 · 6 comments
Closed
2 tasks done

Miette or Ariadne integration, at least in documentation #104

epage opened this issue Jan 30, 2023 · 6 comments
Labels
A-docs Area: Documentation A-error Area: error reporting C-enhancement Category: Raise on the bar on expectations

Comments

@epage
Copy link
Collaborator

epage commented Jan 30, 2023

Please complete the following tasks

winnow version

0.2.0

Describe your use case

Good error messages are important Users should easily get messages like this

Describe the solution you'd like

Unsure

Alternatives, if applicable

No response

Additional Context

No response

@epage epage added A-error Area: error reporting A-docs Area: Documentation C-enhancement Category: Raise on the bar on expectations labels Jan 30, 2023
@epage epage added this to the 0.3 milestone Jan 30, 2023
@epage epage changed the title Ariadne integration, at least in documentation Miette or Ariadne integration, at least in documentation Feb 14, 2023
@epage
Copy link
Collaborator Author

epage commented Feb 14, 2023

Updated as Miette seems to be the more popular one, by download count

@epage epage modified the milestones: 0.3, 0.4 Feb 14, 2023
@epage epage modified the milestones: 0.4.x, 0.5.x Mar 22, 2023
@epage
Copy link
Collaborator Author

epage commented Mar 22, 2023

I'm assuming this should build on #103

@epage epage removed this from the 0.5.x milestone Jun 20, 2023
@dnaka91
Copy link
Contributor

dnaka91 commented Aug 2, 2023

I'm currently experimenting with combining miette and winnow, to create nice error messages. Maybe this can help as a starting point and basis for further discussion on rough edges.

My parser is for a custom schema language that is very Rust inspired. I'm doing a few things to create some context, to have something to print in case of an error.

  • Most parser functions add a StrContext::Label context, to create some sort of scope, a tree-structure that describes the structure that's being parsed.
  • In addition, I add a StrContext::Expected with StrContextValue::Description where it makes sense, to roughly describe what content was expected.

My integration with miette highly depends on these two context values being present, so without those, there isn't much to print.

Please suggest alternatives if this is not the way to use the StrContext and there are better alternatives (like custom errors maybe?).

use miette::{IntoDiagnostic, LabeledSpan, MietteDiagnostic, Report, Result, Severity};
use winnow::{error::StrContext, Parser};

impl<'a> Schema<'a> {
    pub fn parse(input: &'a str) -> Result<Self> {
        use winnow::Parser;

        parser::parse_schema.parse(input).map_err(|e| {
            let offset = e.offset();
            let inner = e.inner();

            let scopes = inner
                .context()
                .filter_map(|ctx| {
                    if let StrContext::Label(label) = ctx {
                        Some(label)
                    } else {
                        None
                    }
                })
                .collect::<Vec<_>>();

            // Very specific formatting of the labels, this forms messages like 
            // - "invalid struct name"
            // - "invalid enum name"
            // ...
            let message = match scopes.as_slice() {
                [&"name", element, ..] => format!("invalid {element} name"),
                [first, ..] => format!("invalid {first}"),
                [] => "invalid schema".to_owned(),
            };

            let mut diagnostic = MietteDiagnostic::new(message)
                .with_severity(Severity::Error)
                .with_code("stef::parse::invalid_schema");

            // Just take the most "inner" label at the moment
            // Would love to add them all, but still have to find a way
            // to carry around the offsets or byte ranges
            let label = inner.context().find_map(|ctx| {
                if let StrContext::Expected(value) = ctx {
                    Some(LabeledSpan::at_offset(offset, format!("expected {value}")))
                } else {
                    None
                }
            });

            if let Some(label) = label {
                diagnostic = diagnostic.with_label(label);
            }

            Report::new(diagnostic).with_source_code(input.to_owned())
        })
    }
}

This is one of my schemas, with an invalid struct name. It must start with an uppercase ASCII letter, but this doesn't.

struct sample {
    options: [bool; 5] @1,
}

Then I get this nice error output with the above logic (originally colored):

Error: parse::invalid_schema

  × failed parsing schema file
  ╰─▶ invalid struct name
   ╭─[1:1]
 1 │ struct sample {
   ·        ▲
   ·        ╰── expected uppercase ASCII letter, optionally followed by alphanumeric ASCII characters
 2 │     options: [bool; 5] @1,
   ╰────

I only have played with this for 1–2 hours now. Hopefully this gives some starting point for the docs or some deeper integration even.

A few points that I struggle with, currently:

  • It would be great to have an easy way to have some form of auto-context for parsers. For example, taken the above field definition options: [bool; 5] @1. Ff the colon : is missing, the field is invalid, but the default parsers won't report any context.
    • Would be superb to have some default context that says something like expected char : here.
    • Currently, it seems as I would have to wrap every default parser function from winnow, to provide some context, creating a lot of boilerplate.
    • Obviously, this could get very opinionated quickly. Which wording is best, which parsers should emit and which shouldn't?
    • There is the risk of having too contexts auto-generated, that be hard to filter out later.
    • Maybe a generic helper function could help just simplify the boilerplate?
  • I can't quite see yet where to fit in the offsets of parent parsers. For example, in the above schema, I'd like to add 2 labels, one for the whole struct sample { line that says this definition is somewhat wrong, and then the exact location for the wrong character.
    • It seems like some sort of custom context + playing with the Stream::checkpoint API might work? Meaning to create some context type that can carry some static label + the offset of the parser when it started.
    • How would I manage to get the offset when an error happened? The context function needs its input upfront, but eventually I want the offset when the error happened. Well, I guess that's the final offset that I can extract at the very end?

@epage
Copy link
Collaborator Author

epage commented Aug 2, 2023

Thats an interesting idea to use the pattern of Labels for providing a more specific message.

I only have played with this for 1–2 hours now. Hopefully this gives some starting point for the docs or some deeper integration even.

Thanks! When someone (me?) does write the example and put it in the Special Topic, I'm thinking it'll be json since that is a commonly understood format and can easily be compared to the existing implementations.

It would be great to have an easy way to have some form of auto-context for parsers. For example, taken the above field definition options: [bool; 5] @1. Ff the colon : is missing, the field is invalid, but the default parsers won't report any context.

If I'm understanding correctly, you are wanting one_of(':') to implicitly have a .context(StrLabel(":"))?

In addition to some of the thoughts you had on this,

  • This would dramatically slow down parsing because every error now allocates which you'll then throw away whenever you backtrack and try a new parser
  • Winnow is too generalized to do that. one_of doesn't even really know its working with char and can't then convert that into a &'static str

In toml_edit, I tend to do ':'.cut().context(...) where I can but likely haven't done it enough and likely have more places where I can't know for sure I can cut so I don't want to add context.

I can't quite see yet where to fit in the offsets of parent parsers. For example, in the above schema, I'd like to add 2 labels, one for the whole struct sample { line that says this definition is somewhat wrong, and then the exact location for the wrong character.

I agree. In toml, I want to have a span for both the element being parsed and the exact spot that failed. I was thinking if playing with that with #96 since some of that extra context is associated with where I'd most likely be doing error recovery at.

@dnaka91
Copy link
Contributor

dnaka91 commented Aug 3, 2023

This would dramatically slow down parsing because every error now allocates which you'll then throw away whenever you backtrack and try a new parser

I wasn't away it allocates. I thought, as it takes a &'static str and wraps the original parser, it would all live on the stack until an actual error happens. But good to know!

Winnow is too generalized to do that. one_of doesn't even really know its working with char and can't then convert that into a &'static str

True. I was just hoping to keep my parsers compact and easy to grep with the current version. Once I add some context to every little thing it'll be much harder to comprehend. But if that's the price to get nice detailed errors, it's probably okay.

In toml_edit, I tend to do ':'.cut().context(...) where I can but likely haven't done it enough and likely have more places where I can't know for sure I can cut so I don't want to add context.

Oh that is a nice alternative over my current cut_err wrapping that I do in many places. Will definitely give that a try.


So it seems like my best bet in extending the error reporting might be a custom error?

Currently, I'm not tracking any spans, but I'll have to attach them to each element at some point anyway. Then I have to wrap the parser in a Located to get spans and accept the performance loss (roughly 20-25% lats time I experimented with it).

My idea is to actually not use the context wrapper method, but instead build a custom error and collect all needed information in a map_err call, where I do pretty much nothing on a Backtrace error and only collect extra info on Cut errors.

I'll experiment with that over the next days and let you know if I could succeed in that 😄

@epage
Copy link
Collaborator Author

epage commented Aug 3, 2023

I wasn't away it allocates. I thought, as it takes a &'static str and wraps the original parser, it would all live on the stack until an actual error happens. But good to know!

To support multiple Context values being stored, ContextError stores them in a Vec. Wrapping values requires wrapping types which would be burdensome for the user to be typing out

Currently, I'm not tracking any spans, but I'll have to attach them to each element at some point anyway. Then I have to wrap the parser in a Located to get spans and accept the performance loss (roughly 20-25% lats time I experimented with it).

Was the performance loss with 0.4 or 0.5? 0.5 dramatically improved performance for Located though I can see there still being a loss

My idea is to actually not use the context wrapper method, but instead build a custom error and collect all needed information in a map_err call, where I do pretty much nothing on a Backtrace error and only collect extra info on Cut errors.

Be sure to watch the performance of your custom error. If you add too many fields, it can lead to a performance hit and you might end up needing to put some things on the heap to overcome it.

epage added a commit to epage/winnow that referenced this issue Jun 6, 2024
@epage epage closed this as completed in 6268cf4 Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation A-error Area: error reporting C-enhancement Category: Raise on the bar on expectations
Projects
None yet
Development

No branches or pull requests

2 participants