Changes related to no-gloss, empty-gloss and no-sense tags. This should
now mark words created from Thesaurus without gloss with no-gloss.
Eliminated no-sense tag and fixed markings to use no-gloss where applicable.

Defined a number of new regional tags for various language variants.
Also defined some mappings for new ways of writing language variants.
tatuylonen committed Jul 4, 2022
1 parent dab04d7 commit bec533c
Showing 4 changed files with 77 additions and 16 deletions.
12 changes: 12 additions & 0 deletions TODO
@@ -449,3 +449,15 @@ PLAN for 7/2022:
"intelligence -- see intelligence" in wit/English/Translations)
- make it easier to control tag bleed in inflection tables
- code refactoring to make maintenance easier (especially page.py)

Moving tags to external files:
- create directory wiktextract/wiktextract/data
- language-specific data would go under
wiktextract/wiktextract/data/<editioncode> (e.g., "en" is an <editioncode>)
- non-language-specific data (e.g., tags) would go under
wiktextract/wiktextract/data/shared
- above directories need to be included in pypi distributions and
must be installed; when used, code must properly search for them
using pkg_resources package (see luaexec.py in wiktextract for an example)

-
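
The directory layout planned in the TODO entry above could be searched roughly as follows. This is only a sketch: the function name find_data_file, the data_root parameter, and the fallback from the edition directory to "shared" are assumptions made for illustration and are not part of this commit. In the installed package, data_root would presumably be resolved with pkg_resources (for example via pkg_resources.resource_filename), as the TODO suggests.

```python
from pathlib import Path
from typing import Optional

SHARED_DIR = "shared"  # name taken from the TODO entry


def find_data_file(data_root: Path, filename: str,
                   edition: Optional[str] = None) -> Optional[Path]:
    """Locate a data file under data_root (hypothetical helper).

    Looks in data_root/<edition> first when an edition code such as
    "en" is given, then in data_root/shared. The fallback order is an
    assumption, not something the TODO specifies.
    """
    candidates = []
    if edition is not None:
        candidates.append(data_root / edition / filename)
    candidates.append(data_root / SHARED_DIR / filename)
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Returning None instead of raising leaves it to the caller to decide whether a missing data file is an error.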
8 changes: 6 additions & 2 deletions wiktextract/page.py
@@ -958,7 +958,7 @@ def push_sense():
tags = sense_data.get("tags", ())
if (not sense_data.get("glosses") and
"translation-hub" not in tags and
-"no-senses" not in tags):
+"no-gloss" not in tags):
return False

if (("participle" in sense_data.get("tags", ()) or
@@ -979,6 +979,10 @@ def push_sense():
data_extend(ctx, sense_data, "alt_of", lst)
data_extend(ctx, sense_data, "tags", tags)

if (not sense_data.get("glosses") and
"no-gloss" not in sense_data.get("tags", ())):
data_append(ctx, sense_data, "tags", "no-gloss")

pos_datas.append(sense_data)
sense_data = {}
return True
@@ -1777,7 +1781,7 @@ def outer_template_fn(name, ht):
push_sense() # Make sure unfinished data pushed, and start clean sense
if not pos_datas:
data_extend(ctx, sense_data, "tags", common_tags)
-data_append(ctx, sense_data, "tags", "no-senses")
+data_append(ctx, sense_data, "tags", "no-gloss")
push_sense()

def parse_inflection(node, section, pos):
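
The effect of the page.py changes on a single sense can be sketched in isolation. The function below is a simplified stand-in, not the real push_sense(): the name finalize_sense is hypothetical, plain dict operations replace data_append/data_extend, and the participle and alt_of handling between the two hunks is left out.

```python
def finalize_sense(sense_data):
    """Simplified sketch of the gloss checks in push_sense().

    Returns None when the sense would be dropped, otherwise a copy of
    the sense with "no-gloss" added if it has no glosses.
    """
    tags = sense_data.get("tags", ())
    # First hunk: a sense without glosses is kept only if it is a
    # translation hub or is already tagged "no-gloss".
    if (not sense_data.get("glosses")
            and "translation-hub" not in tags
            and "no-gloss" not in tags):
        return None
    result = dict(sense_data)
    result["tags"] = list(tags)
    # Second hunk: any surviving sense without glosses gets "no-gloss".
    if not result.get("glosses") and "no-gloss" not in result["tags"]:
        result["tags"].append("no-gloss")
    return result
```

Under these assumptions, a sense tagged only "translation-hub" and lacking glosses survives the first check and then receives "no-gloss" from the second, while a sense with no glosses and no tags is dropped.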