-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xgettext: Avoid collisions when splitting .pot files #243
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Ah, the PR is somwhat large so let me know if you want me to split it into 2-3 commits. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #243 +/- ##
==========================================
+ Coverage 84.87% 85.20% +0.33%
==========================================
Files 15 16 +1
Lines 3366 3455 +89
Branches 3366 3455 +89
==========================================
+ Hits 2857 2944 +87
Misses 422 422
- Partials 87 89 +2 ☔ View full report in Codecov by Sentry. |
PR google#77 added the first half of a functionality requested in google#67, i.e. being able to split the catalog across a tree of POT files. However, there are edge cases where the generated files can have identical path relative to the po/ directory, with the effect of collapsing messages for unrelated parts or chapters in the same POT fie. This commit handles deduplication through a "black box" struct `UniquePathBuilder`, and adds a doctest and a unit test for collision scenarios. The new version is backward-compatible for cases that are collision-free in the old version. Contextually, simplify the recursion logic in `get_subcontent_for_chapter`.
5ecba91
to
aa7f82f
Compare
i18n-helpers/src/xgettext.rs
Outdated
catalogs.insert(summary_destination, catalog); | ||
let mut unique_path_builder = UniquePathBuilder::new(depth); | ||
unique_path_builder.maybe_push("summary"); | ||
unique_path_builder.maybe_push("summary"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually intended, to keep the previous behavior: for depth 2, the summary is stored within summary/summary.pot
(so the first push is for the directory, the second for the file name). For depth 1 instead this results in a top-level summary.pot
.
i18n-helpers/src/xgettext.rs
Outdated
None | Some(&0) => self.current_path.clone(), | ||
Some(&cnt) => { | ||
let mut file_name = self.current_path.file_name().unwrap().to_os_string(); | ||
file_name.push(format!("_{}", cnt)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer -
over _
in filenames, where possible. So could we do it like this instead:
file_name.push(format!("_{}", cnt)); | |
file_name.push(format!("-{cnt}")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, changed!
i18n-helpers/src/xgettext.rs
Outdated
use std::collections::HashMap; | ||
use std::path::PathBuf; | ||
|
||
pub struct UniquePathBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this idea, very elegant!
Two small things: could the maybe_push
and maybe_pop
methods be called just push
and pop
? I don't think why the maybe_
prefix is necessary since the push/pop calls have a real effect on the internal state, even if it might not be visible in the resulting string right away.
Another thing, could you attach the big doc string on the impl
block below to the struct here? That way the struct is nicely document.
A final small thing: please move the inline paths
module to it's own file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All done, thanks for the suggestions. 🙂
@cip999, could you rebase this PR now? There are some conflicts. |
Hey, I've been looking into #185 and its predecessor #67. While thinking of ways to implement the "reconstruction" of the catalog from smaller pieces, I noticed that the code doesn't handle well books where several part titles and/or chapters have the same name (or, more precisely, are equivalent up to
slug
ging).For example, consider the following
SUMMARY.md
:With a
depth
of 1, the preprocessor generates two POT filesfoo.pot
andbar.pot
(besidesummary.pot
), and the contents of the two Part Foo's (including Chapters 1 and 3) are merged. Since those are unrelated, if not for having the same title, the expected behavior would be to put them in separate files. Note that the titles don't have to be exactly equal, for example the same would happen withPart Foo 1
andPart Foo 2
(since numbers are discarded).This PR should fix this issue. I hope doc comments are explicative enough, otherwise feel free to request clarifications. :)