Fix a stupid bug regarding "sounds" and pos
Is there anything worse than seeing a piece of code, thinking to yourself
"that's not right, that can't work?", searching for a keyword in there to
figure out what the original commit was that introduced this code, and then
seeing your own name on that commit?

Yeah, this piece of code has a small error that basically invalidated the
whole purpose of the changes made. Happily, the error was to let everything
through, which is still *ok*, but the desired result was to filter out
certain things.

The error was: I created a function to remove "pos" fields from "sounds"
data when adding that data to a list. The first part of the block was about
filtering out "sounds" data that didn't match the word or the forms of the
word being processed, and the *second* part of the block was about
filtering for wrong "pos" data and then removing the "pos" sections. But I
used the pos-removing function in the first part of the block, which meant
there were no pos-sections left to compare in the second. Because there was
no pos-data, all the sounds were let through.
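
To make the ordering problem concrete, here is a minimal, self-contained sketch. It is not the code from page.py: `filter_sounds_buggy`, `filter_sounds_fixed`, the `Sound` alias and the sample entry are made up for illustration, and unlike the real helper this version copies each dict before popping so the sample data can be reused.

```python
from typing import Any

Sound = dict[str, Any]  # hypothetical stand-in for the project's WordData


def complementary_pop(pron: Sound, key: str) -> Sound:
    """Remove `key` from the dict if present and return the dict."""
    pron.pop(key, None)
    return pron


def accepted_forms(data: dict[str, Any]) -> list[str]:
    """The word itself plus every listed form."""
    return [data["word"]] + [f["form"] for f in data.get("forms", [])]


def filter_sounds_buggy(data: dict[str, Any]) -> list[Sound]:
    """Old order: "pos" is stripped during the form filter."""
    accepted = accepted_forms(data)
    sounds = [
        complementary_pop(dict(s), "pos")  # bug: "pos" removed too early
        for s in data["sounds"]
        if "form" not in s or s["form"] in accepted
    ]
    # No entry carries "pos" any more, so this filter accepts everything.
    return [s for s in sounds if "pos" not in s or s["pos"] == data["pos"]]


def filter_sounds_fixed(data: dict[str, Any]) -> list[Sound]:
    """New order: filter by form, then by "pos", and only then strip "pos"."""
    accepted = accepted_forms(data)
    sounds = [
        s for s in data["sounds"] if "form" not in s or s["form"] in accepted
    ]
    return [
        complementary_pop(dict(s), "pos")
        for s in sounds
        if "pos" not in s or s["pos"] == data["pos"]
    ]


# Hypothetical entry: a noun whose "sounds" list also holds a verb pronunciation
# and a pronunciation for a form that is not listed.
sample = {
    "word": "record",
    "pos": "noun",
    "sounds": [
        {"ipa": "noun-pron", "pos": "noun"},
        {"ipa": "verb-pron", "pos": "verb"},
        {"ipa": "other-form-pron", "form": "recorded"},
    ],
}

print(filter_sounds_buggy(sample))  # the verb pronunciation slips through
print(filter_sounds_fixed(sample))  # only the noun pronunciation remains
```

In both versions the form filter drops the entry for the unlisted form; only the fixed order also drops the entry whose "pos" does not match.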
kristian-clausal committed Jan 4, 2024
1 parent 1a4f1de commit 8b3fbbb
Showing 1 changed file with 12 additions and 11 deletions.
23 changes: 12 additions & 11 deletions src/wiktextract/extractor/en/page.py
@@ -773,7 +773,7 @@ def parse_language(
     have_etym = False
     stack: list[str] = [] # names of items on the "stack"
 
-    def merge_base(data, base):
+    def merge_base(data: WordData, base: WordData) -> None:
         for k, v in base.items():
             # Copy the value to ensure that we don't share lists or
             # dicts between structures (even nested ones).
@@ -785,17 +785,18 @@ def merge_base(data, base):
             if data[k] == v:
                 continue
             if isinstance(data[k], (list, tuple)) or isinstance(
-                v, (list, tuple)
+                v, (list, tuple) # Should this be "and"?
             ):
-                data[k] = list(data[k]) + list(v)
+                data[k] = list(data[k]) + list(v) # type: ignore
             elif data[k] != v:
                 wxr.wtp.warning(
                     "conflicting values for {} in merge_base: "
                     "{!r} vs {!r}".format(k, data[k], v),
                     sortid="page/904",
                 )
 
-        def complementary_pop(pron, key):
+        def complementary_pop(pron: WordData, key: str
+        ) -> WordData:
             """Remove unnecessary keys from dict values
             in a list comprehension..."""
             if key in pron:
@@ -806,19 +807,19 @@ def complementary_pop(pron, key):
         # does not match "word" or one of "forms"
         if "sounds" in data and "word" in data:
             accepted = [data["word"]]
-            accepted.extend(f["form"] for f in data.get("forms", ()))
+            accepted.extend(f["form"] for f in data.get("forms", dict())) # type:ignore
             data["sounds"] = list(
-                complementary_pop(s, "pos")
-                for s in data["sounds"]
-                if "form" not in s or s["form"] in accepted
+                s
+                for s in data["sounds"] # type:ignore
+                if "form" not in s or s["form"] in accepted # type:ignore
             )
         # If the result has sounds, eliminate sounds that have a pos that
         # does not match "pos"
         if "sounds" in data and "pos" in data:
             data["sounds"] = list(
-                s
-                for s in data["sounds"]
-                if "pos" not in s or s["pos"] == data["pos"]
+                complementary_pop(s, "pos") # type:ignore
+                for s in data["sounds"] # type: ignore
+                if "pos" not in s or s["pos"] == data["pos"] # type:ignore
             )
 
     def push_sense():
