diff chunks for certain unicode edits incorrect #5

Closed
kitsonk opened this issue Jan 18, 2021 · 2 comments
kitsonk commented Jan 18, 2021

I am getting an unexpected result from diff() with some Unicode characters. Specifically, I think the problem is with 🇺🇸, which is a sequence combining 🇺 Regional Indicator Symbol Letter U and 🇸 Regional Indicator Symbol Letter S.

let a = " 🦕🇺🇸👍 ";
let b = " 🇺🇸👍 ";
let actual = diff(a, b);
assert_eq!(actual, vec![
  Chunk::Equal(" "),
  Chunk::Delete("🦕"),
  Chunk::Equal("🇺🇸👍 "),
]);

Which fails with:

thread panicked at 'assertion failed: `(left == right)`
  left: `[Equal(" "), Delete("🦕🇺"), Equal("🇸👍 ")]`,
 right: `[Equal(" "), Delete("🦕"), Equal("🇺🇸👍 ")]`'

I haven't dug into the code to see why it is breaking "incorrectly" across the unicode sequence, but wanted to raise the issue first.
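
For context, here is a minimal sketch of why a split inside the flag is possible at all. In Rust the flag is two separate `char` values (Unicode scalar values), so any diff that operates on scalar values rather than grapheme clusters would see a valid boundary between 🇺 and 🇸. The helper `scalar_values` is hypothetical and not part of the library; the sketch uses only the standard library and assumes nothing about how diff() is actually implemented.

```rust
// Hypothetical helper (not part of the library under discussion):
// list each Unicode scalar value in a string together with its
// UTF-8 encoded length in bytes.
fn scalar_values(s: &str) -> Vec<(u32, usize)> {
    s.chars().map(|c| (c as u32, c.len_utf8())).collect()
}

fn main() {
    // The flag is one visible symbol but two scalar values.
    let flag = "\u{1F1FA}\u{1F1F8}"; // 🇺🇸
    for (code_point, len) in scalar_values(flag) {
        println!("U+{:04X} ({} bytes in UTF-8)", code_point, len);
    }
    // prints:
    // U+1F1FA (4 bytes in UTF-8)
    // U+1F1F8 (4 bytes in UTF-8)
}
```

So `Delete("🦕🇺")` followed by `Equal("🇸👍 ")` is still split on a `char` boundary; it just is not a grapheme cluster boundary, which is what the expected output in this report assumes.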

@derekperkins

Looks like this is also an issue with the original lib? google/diff-match-patch#59

@dtolnay
Owner

dtolnay commented Feb 16, 2023

Fixed in 1.0.6.

dtolnay closed this as completed Feb 16, 2023