-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement load_from_bytes #156
base: master
Are you sure you want to change the base?
Conversation
Blocked on #139 |
src/yaml.rs
Outdated
// detect_utf16_endianness. | ||
let (res, _) = encoding::types::decode( | ||
&buffer, | ||
encoding::DecoderTrap::Replace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkmik I'd like to understand why this decoder trap is sufficient in all cases. I was expecting it to be necessary to allow the user to choose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point; probably strict is a better default and probably the user would want to be able to override this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(changed to Strict)
@glyn what's the sanest way to design an API that would let a user change the decoder error trapping behaviour?
a. Should we make it a mandatory parameter?
b. An Option<>?
c. Two methods, one with defaults and one with some config structure that includes the encoding trap?
d. a "decoder" struct with a "decode" method and a builder pattern to set options:
Decoder::read(file).encoding_trap(mytrap).decode()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite like (d) as it gives a clean API when you only want the default (strict) behaviour. I could live with (a) or (c). I don't like (b) as it requires an ugly None
parameter for the default behaviour.
since this MR introduce a new crate dependency, and users can easily implement this helper method in their code. I'm not sure wether we should provide this in yaml crate |
I think it would be worth including this to ensure consistency of behaviour between users. Also, providing it here makes it more likely that users will consider handling an initial BOM in the first place (rather than finding/fixing a bug). |
yes; more explicitly: it's likely that if this library doesn't do it, the users of this library won't do it either and thus effectively their application won't be compliant with the YAML specification. The reason I think that's likely, is that in my experience most people either are not aware of this detail of the spec and/or don't think UTF-16 or UTF-8 with BOMs is important at all (e.g. "I never saw some UTF-16 for years, so clearly nobody is using it"). I think adding a small crate dependency is a small price to pay for ensuring that the ecosystem built around this library follows the spec. |
pls rebase this MR to fix CI |
Closes chyh1990#155 Also helps in some cases with chyh1990#142, when the BOM is at the beginning of the file (common), but not in corner case where the BOM is at the start of a document which is not the first one.
@glyn I implemented (d), PTAL |
Looks reasonable. One downside of this approach now becomes apparent. In:
|
I'm not a rust expert but that's not how this builder pattern works, I followed https://doc.rust-lang.org/1.0.0/style/ownership/builders.html#consuming-builders: which doesn't use references. this means that I cannot call
using
but then I saw the official documentation example with |
CI fails because of a transient error
amending and force pushing to trigger a new CI run |
@mkmik
would have issues because self can't necessarily be copied. In particular, the |
Sorry, I don't understand. This code works:
if I understand correctly, it's not copying self, but it's transferring ownership to 'new', which is mutable so I can now modify the trap field and return the modified value. The caller of (sorry if I'm missing the point; I'm unfamiliar with rust in particular, although I have some familiarity with type systems and compiler internals, so almost kinda sorta grasp how this things works, but mostly I have no idea about what's "idiomatic") |
@mkmik I'm just getting back into Rust after a long break, so apologies for my lack of understanding. I'm not sure about:
You're quite right that |
src/yaml.rs
Outdated
} | ||
} | ||
|
||
pub fn encoding_trap(mut self, trap: encoding::types::DecoderTrap) -> YamlDecoder<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkmir @17cupsofcoffee suggested something like the following might be better for a builder:
pub fn encoding_trap(mut self, trap: encoding::types::DecoderTrap) -> YamlDecoder<T> { | |
pub fn encoding_trap(&mut self, trap: encoding::types::DecoderTrap) -> &mut Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks; PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable, thanks. I think it's now up to the project maintainers to decide whether they like the style.
Is this still being worked on? |
@SenseTime-Cloud @dtolnay Is anyone looking to merge PRs these days? |
Not me — this crate is no longer used by serde_yaml. |
FWIW I've merged (and various other PRs) this into my fork: https://github.com/davvid/yaml-rust/ |
Closes #155
Also helps in some cases with #142, when the BOM is at the beginning of the file (common),
but not in the corner case where the BOM is at the start of a document which is not the first one.
CC @glyn