Make the language type within each tag independently determined to decide whether or not to translate the content inside that tag block. #514
Please describe your use case and give a detailed description of how you expect it to work. Answer the following questions:
What problem do you have?
When a page contains more than one language at the same time, text that is already in the target language gets translated again. For example, for YouTube:
Translated original comments and recommended video names (there is a loss of information; you can see that some words that were already in Simplified Chinese are missing):
Why do you need this change and how will this change solve your problem?
Repeated translation can sometimes change the meaning of the text (although translating from Traditional Chinese to Simplified Chinese usually does not change the meaning).
What exact behavior do you expect? Describe as many details as possible.
I hope that the parts of YouTube's comment text and recommended video names that are already in the target language will not be translated. I think the way to achieve this is a feature that lets the user specify web element tag names or web element classes for specific websites. The language of the text inside those elements is then detected separately, the text is translated separately from the existing process, and the result is assembled back into the overall translation.
For example, the YouTube comment element in the picture has the tag name "ytd-comment-thread-renderer" and the class "style-scope ytd-item-section-renderer". We may be able to use this information to locate a specific element on a specific website (for example, https://www.youtube.com/*), first check what language the text in that element is, and then translate the text. After all elements marked for this special treatment have been processed, exclude them from Linguist's existing workflow, as if they never existed on the page, and then run what Linguist does now.
The configuration I am currently using (English->Chinese):
Extension:
Asking the user to press F12, find the web element that contains the text of interest, and enter its tag name or class may not be user-friendly. So Linguist could learn from uBlock Origin and add a web element locator to make this process user-friendly, although this is not essential.
Screenshot of uBlock Origin's "web element locator":
Then all comments really disappear, which means that this rule can indeed target all similar web elements:
As shown above, Linguist may also be able to add a UI to store rules like this, and then re-check the language of the text matched by those rules. Note, however, that the element locator seems to do extra work so that the element is located accurately, so its rules are not exactly the same as a plain web element tag name or class.
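The per-site rule idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ElementRule` shape, the `globToRegExp` helper, and `selectorsForUrl` are hypothetical names, not part of Linguist, and the URL pattern syntax (a `*` wildcard) is assumed from the `https://www.youtube.com/*` example.

```typescript
// Hypothetical rule shape: which elements on which site need a separate language check.
interface ElementRule {
  urlPattern: string; // e.g. "https://www.youtube.com/*" ("*" matches any characters)
  selector: string;   // CSS selector, e.g. a tag name or ".class"
}

// Convert a simple "*" wildcard pattern into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");                 // "*" becomes "match anything"
  return new RegExp(`^${escaped}$`);
}

// Return the selectors of all rules whose URL pattern matches the current page.
function selectorsForUrl(rules: ElementRule[], url: string): string[] {
  return rules
    .filter((rule) => globToRegExp(rule.urlPattern).test(url))
    .map((rule) => rule.selector);
}

const rules: ElementRule[] = [
  { urlPattern: "https://www.youtube.com/*", selector: "ytd-comment-thread-renderer" },
];
```

In a content script, the returned selectors could then be passed to `document.querySelectorAll` to collect the elements whose text should be language-checked before translation.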
Ok, let's clarify whether I understand you correctly.
Is this correct? If yes, then I have a question about your opinion on alternative ways to solve the problem. As a maintainer I have to prefer the simplest and most straightforward solutions, so we should talk about all possible ways to solve the problem and then pick the best one. In any case, the idea of a rules picker may be implemented in the future as part of other features, thanks for the idea.
So what do you think, maybe there are other ways to solve the problem? For example, it is technically possible to implement an option "do not translate texts on target language" that will force language detection on every text element; if the detected language is the same as the target language, the text element will be ignored by the translator. Keep in mind that we use the language detection implemented by the browser, so it may work badly for short texts. In that case the language will not be detected and the translator will treat such texts as subjects of translation.
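The decision logic of such an option could look roughly like the sketch below. It is not Linguist's implementation: `DetectionResult` and `shouldTranslate` are hypothetical names, the result shape is only loosely modeled on what browser language detectors return, and comparing language tags case-insensitively without any script or region handling is a simplifying assumption.

```typescript
// Hypothetical detection result; a real detector's output may differ.
interface DetectionResult {
  isReliable: boolean;
  languages: { language: string; percentage: number }[];
}

// Decide whether a text element should be sent to the translator.
// Texts are skipped only when detection is reliable and the dominant
// language equals the target language; anything uncertain is translated,
// which matches the fallback described for short texts.
function shouldTranslate(detection: DetectionResult | null, targetLang: string): boolean {
  if (detection === null || !detection.isReliable || detection.languages.length === 0) {
    return true; // language unknown: treat the text as a subject of translation
  }
  // Pick the language with the highest share of the text.
  const dominant = detection.languages.reduce((best, current) =>
    current.percentage > best.percentage ? current : best
  );
  // Assumption: plain case-insensitive tag comparison ("zh-CN" equals "zh-cn").
  return dominant.language.toLowerCase() !== targetLang.toLowerCase();
}
```

Keeping the decision separate from the detector call makes the fallback explicit: an unreliable or empty result always leads to translation, so short texts behave as they do today.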
@XM-8JD2 OK, then the current issue will be a feature request to implement the option "do not translate texts on target language". I think we may make this option enabled by default. As for the question in your last sentence, please create a new issue to improve the problem's visibility and tracking.
Describe the enhancement
Make the language type within each tag independently determined to decide whether or not to translate the content inside that tag block.
For example, on YouTube, the language of each comment may not be the same, and the language of each recommended video may also differ. By independently identifying their language based on their separate ID blocks or class blocks, it can prevent the unnecessary translation of languages I am familiar with and avoid potential information loss caused by this.
The same page has different languages distributed in different blocks:
Each comment is in a different block:
Perhaps it could learn from uBlock Origin by adding a feature that allows selecting a block within the page and adjusting its coverage area, with the ability to quickly undo changes if a mistake is made.