English and Spanish optimized? #40
Comments
No, at least not that I know of. But this shouldn't be too hard to optimize. English character frequency is well documented. A big Spanish corpus was cleaned up and analyzed by Ian Doug with my help; see details here: #21. So by weighting the frequencies of characters, bigrams, and trigrams, you could use Arno's code to optimize a keyboard for, say, 50/50 English/Spanish, or some other ratio.
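A minimal sketch of the weighting idea, for illustration only. It is not Arno's code: `blend_frequencies` is a hypothetical helper, and the two tiny tables hold made-up placeholder numbers, not measured English or Spanish frequencies. The same blending would apply to bigram and trigram tables.

```python
def blend_frequencies(tables, weights):
    """Mix per-language n-gram frequency tables into one table.

    tables:  {language: {ngram: relative frequency}}
    weights: {language: weight}; normalized here, so 1/1 means 50/50.
    """
    total_weight = sum(weights.values())
    blended = {}
    for language, table in tables.items():
        share = weights[language] / total_weight
        for ngram, freq in table.items():
            # An n-gram missing from a language simply contributes 0 for it.
            blended[ngram] = blended.get(ngram, 0.0) + share * freq
    return blended


# Placeholder numbers, NOT real corpus statistics.
toy_tables = {
    "english": {"e": 0.5, "t": 0.3, "a": 0.2},
    "spanish": {"e": 0.4, "a": 0.4, "ñ": 0.2},
}

mixed = blend_frequencies(toy_tables, {"english": 1, "spanish": 1})
for ngram, freq in sorted(mixed.items(), key=lambda kv: -kv[1]):
    print(f"{ngram}: {freq:.2f}")
```

Changing the weights to, for example, `{"english": 3, "spanish": 1}` would give a 75/25 mix; the blended table would then be fed to whatever optimizer is used in place of the single-language table.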
Is there a Spanish letter frequency similar to the English "Letter frequencies (Norvig, 2012)"?
Derived from https://zenodo.org/record/5501931
Hi, masters3d,
Edit: moved this to a dedicated issue.
An alternative solution: excerpt from my original comment
Full original comment
Hello, there! I've been a user of the (programmer's) Dvorak layout for almost a decade now, and learning it was a huge improvement over good ol' QWERTY. However, while it is really widespread and readily available on most current systems, its performance for the English language is sub-optimal. Also, its variations for languages with similar alphabets, like my dear Portuguese, are still "super-terrible" (a bit less terrible than QWERTY due to the vowels on the left home row).

The elephant in the room
I took a look at some of these newer designs, including yours. Congratulations, by the way! Amazing work. But the OP touched on a very important point that is still unaddressed by all of these: we live in an international, interconnected world now. Until the early 2000s, it wasn't a problem to have totally different keyboard layouts for every language. We even used different, incompatible text encodings! But now the most used encoding on both new devices and the Internet is Unicode. I believe the same transition should happen to keyboard layouts. But is there a need for it? Well, most professionals who type a lot (journalists, academics, programmers, etc.) will need to either create content in more than one language, usually their native one and English, or at least communicate often with foreigners through text. This applies even to countries that have English as their primary language, like the US, where more and more people speak Spanish as a primary or secondary language each year (> 50 million today).

Is an "international" keyboard layout possible?
I know that many languages use completely different alphabets and, even when they use similar ones (like variations of the Latin or Cyrillic scripts), they have extra characters and wildly varying letter/n-gram frequencies. Therefore, there can't be a truly international base layout for keyboards. But can we do better?
Starting from English, the de facto international language, a non-monolingual layout can't stray far from ASCII. Looking at the most-spoken languages in the world that use a Latin-script alphabet, we have in the top positions (Wikipedia/Ethnologue 2022):
I think it would be feasible to analyze these 5 languages, from two branches of the same language family (you already did it for two), and find a design that isn't awesome for one of them but sucks for all the others...

A "Latin" or "Romance-Germanic" base keyboard layout
For whoever is interested, I propose the development of a base layout using the Latin alphabet that is optimized for all 5 of these languages. It wouldn't be a simple weighted optimization, though. What I would expect to achieve with this design is:
Steps necessary to achieve these goals:
Advantages
I'm seriously considering learning a new keyboard layout once more, but it would have to be a killer layout. It would have to be one to rule them all. I am willing to dedicate some time to this idea if others are interested. If not, maybe I'll end up trying to create my own Portuguese or Portuguese-English Engram layout. Greetings from Brazil! 🇧🇷
I think that this issue could be well addressed by an English-Spanish-French key layout. See: binarybottle/engram-v2#58 (comment)
I’m wondering, for folks who write English and Spanish, if there is a common version that they can use.