Merge Test (#118)
* Yiddish transliteration via submodules.

* Update checkout workflow.

* Change refs for Yiddish submodules.

* Fix WORKDIR in Dockerfile

* Do not remove yiddish module.

* Manually add yiddish submodules.

* Use git clone instead of submodule.

* Move ext checkout to github actions.

* Chinese numerals (#97)

* WIP Parse Chinese numerals.

* WIP complete number parsing.

* Complete Chinese numerals:

* Use standard table override instead of pre-config hooks.
* Add few test strings.

* Complete numerals:

* Transliterate all numeric examples correctly
* Modify hook return logic for consistency
* WIP partial spacing fix.

* Some cleanup; upgrade docker OS.

* Add dependency for uwsgi.

* Squashed commit of the following: (#98)

commit 30859a5
Author: scossu <[email protected]>
Date:   Wed Feb 28 22:17:36 2024 -0500

    Move ext checkout to github actions.

commit 6d8da6d
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:45:01 2024 -0500

    Use git clone instead of submodule.

commit ade9da5
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:42:45 2024 -0500

    Manually add yiddish submodules.

commit 77cb9ef
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:23:37 2024 -0500

    Do not remove yiddish module.

commit e405b36
Author: scossu <[email protected]>
Date:   Wed Feb 28 09:11:41 2024 -0500

    Fix WORKDIR in Dockerfile

commit 95445ba
Author: scossu <[email protected]>
Date:   Wed Feb 28 09:07:50 2024 -0500

    Change refs for Yiddish submodules.

commit 208ea09
Author: scossu <[email protected]>
Date:   Wed Feb 28 08:45:58 2024 -0500

    Update checkout workflow.

* Add debug output to /trans response.

* Split docker files and requirements.

* Add bad request debug handler.

* Adjust CI workflows.

* Fix image name typo.

* Refine triggers.

* Fix typo on test workflow trigger.

* Use JSON in POST body.

* Also use JSON in feedback request; update docs.

* Return json data in 400 debug.

* Update Aksharamukha.

* Add new set of languages; separate pre and post options in Aksharamukha. (#102)

* Add all remaining Devanagari scripts. (#107)

* Add R2S for Kurdish, Persian, Pushto, Urdu, and bidirectional Divehi.

* Add R2S for Kurdish, Persian, Pushto, Urdu, and bidirectional Divehi. (#108)

* Fix YAML syntax errors.

* P3 legacy mappings (#109)

* Add R2S for Kurdish, Persian, Pushto, Urdu, and bidirectional Divehi.

* Fix YAML syntax errors.

* Fix table section for Divehi.

* P3 legacy mappings (#110)

* Add R2S for Kurdish, Persian, Pushto, Urdu, and bidirectional Divehi.

* Fix YAML syntax errors.

* Fix table section for Divehi.

* Fix mapping for Divehi.

* Add Thai from Randy's table

* Fix YAML errors in Thai alt.

* Fix Tamil YAML.

* Fix Malayalam config.

* Fix Greek numerals logic; add test strings.

* Add Malayalam to index.

* Better exception handling.

* Add CORS to all routes.

* Add MARC codes to language index.

* Fix Greek S2R table.

* Tolerate and normalize nested tokens.

* Add Assamese table.

* Fix char index misalignment after ignoring.
scossu authored Jul 24, 2024
1 parent f74b1ba commit 0146fce
Showing 15 changed files with 5,742 additions and 68 deletions.
74 changes: 74 additions & 0 deletions legacy/ScriptShifter and MARC language codes - Sheet1.csv
@@ -0,0 +1,74 @@
ScriptShifter,MARC,Notes
abkhaz_cyrillic,abk,
altai_cyrillic,alt,
arabic,ara,S2R
armenian,arm,
asian_cyrillic,"abk, ady, alt, ava, bak, che, chv, dar, ale, esk, kbd, xal, krc, kaa, krl, kom, kum, lez, lit, chm, nog, oss, rum, rom, sel, udm, sah","No MARC codes found for: Abaza, Aisor, Altai, Azeri, Balkar, Buryat, Chukchi, Dungan, Even, Evenki, Gagauz, Ingush, Inuit, Karachay, Khakass, Khanty, Komi-Permyak, Koryak, Lak, Lapp, Mansi, Molodstov, Mordvin, Nanai, Nenets, Nivkh, Permyak, Shor, Tabasaran, Tat, Tuva, Udekhe"
azerbaijani_cyrillic,aze,
bashkir_cyrillic,bak,
belarusian,bel,
bengali,ben,
bulgarian,bul,
buriat,bua,
burmese,bur,
chinese,chi,
chukchi_cyrillic,?,
church_slavonic,chu,
chuvash_cyrillic,chv,
devanagari,"hin, san",Need to get complete list of languages
dungan_cyrillic,?,
ethiopic,"amh, eth",
even-evenki_cyrillic,?,
gagauz_cyrillic,?,
georgian,geo,
greek_classical,grc,
greek_modern,gre,
gurmukhi,pan,Punjabi (Gurmukhi script)
hebrew,heb,
hindi,hin,
hiragana,jpn,Hiragana
kalmyk_cyrillic,xal,
kara-kalpak_cyrillic,kaa,
karachai-balkar_cyrillic,krc,
karelian_cyrillic,krl,
katakana,jpn,Katakana
kazakh_cyrillic,kaz,
khakass_cyrillic,?,
khanty_cyrillic,?,
komi_cyrillic,kom,
korean_names,kor,Korean S2R for strings ONLY containing personal names formatted as last + first name. Separate multiple names with a comma or a center-dot (U+00B7).
korean_nonames,kor,Korean S2R for strings NOT containing any personal names.
koryak_cyrillic,?,
kyrgyz_cyrillic,kir,
lithuanian_cyrillic,lit,
macedonian,mac,
mansi_cyrillic,?,
moldovan_cyrillic,mol,
mongolian_cyrillic,mon,Cyrillic
mongolian_mongol_bichig,mon,Mongol bichig
mordvin_cyrillic,?,
nenets_cyrillic,?,
ossetic_cyrillic,oss,
pulaar,?,
romani_cyrillic,rom,
russian,rus,
serbian,srp,
shor_cyrillic,?,
syriac_cyrillic,syc,
tajik_cyrillic,tgk,
tamil,tam,
tamil_brahmi,tam,
tamil_extended,tam,
tatar-kryashen_cyrillic,?,
tatar_cyrillic,tat,
thai,tha,
tibetan,tib,
turkmen_cyrillic,tuk,
tuvinian_cyrillic,tyv,
udmurt_cyrillic,udm,
uighur_cyrillic,uig,
ukrainian,ukr,
uzbek_cyrillic,uzb,
yakut_cyrillic,sah,
yiddish,yid,
yuit_cyrillic,?,
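
The CSV above maps each ScriptShifter table name to one or more MARC language codes, with multi-code cells quoted and `?` used where no code has been identified. A minimal sketch of reading it into a lookup follows, assuming the three-column header shown in the diff. The function name `load_marc_index` and the choice to treat `?` as an empty list are illustrative assumptions, not part of ScriptShifter, and the sample is a short excerpt rather than the full file.

```python
import csv
import io

# Excerpt copied from the CSV in this commit; the real file has more rows.
SAMPLE = '''ScriptShifter,MARC,Notes
arabic,ara,S2R
devanagari,"hin, san",Need to get complete list of languages
chukchi_cyrillic,?,
ethiopic,"amh, eth",
'''


def load_marc_index(fh):
    """Map each ScriptShifter table name to a list of MARC codes.

    Hypothetical helper: assumes the header row
    'ScriptShifter,MARC,Notes' seen in the diff.
    """
    index = {}
    for row in csv.DictReader(fh):
        raw = row["MARC"].strip()
        # Assumption: "?" (or an empty cell) means no MARC code is known yet.
        if raw in ("", "?"):
            codes = []
        else:
            # Quoted cells such as "hin, san" hold several comma-separated codes.
            codes = [code.strip() for code in raw.split(",")]
        index[row["ScriptShifter"]] = codes
    return index


marc_index = load_marc_index(io.StringIO(SAMPLE))
print(marc_index["devanagari"])  # ['hin', 'san']
```

To read the committed file instead of the excerpt, open it with `newline=""` as the `csv` module recommends and pass the file handle to the same function.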