Commit
Fix: clean_value returns "|foo" from link with 3+ parameters
Issue found when parsing "[[Fichier:...|...|...]]" image links from French WikiPEDIA. Because 'Fichier' was not recognised as a skipped prefix (this still needs to be addressed), the link was handled by the default regex handler. That handler never used the value of `m.group(4)`, because only File: and Image: 'links' actually have that third parameter, which is why the bug was not caught earlier.

XXX: Localized data to replace the "File:" and "Image:" prefixes with "Fichier:" etc.
XXX: A toggle in clean_value and clean_node for whether to collect image link data?
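A minimal sketch of the kind of handling described above, not the project's actual code. The pattern `LINK_RE_SKETCH` and the functions `replace_link_sketch` and `clean_links_sketch` are hypothetical names, and the group numbering is only arranged so that `m.group(4)` holds the third and later parameters. Keeping the last parameter as the display text is an assumption made for illustration.

```python
import re

# Hypothetical pattern (an assumption, not the real regex):
#   group(1) = link target
#   group(2) = "|" plus the second parameter (optional)
#   group(3) = the second parameter alone
#   group(4) = everything after the second parameter ("" if absent)
LINK_RE_SKETCH = re.compile(
    r"\[\[([^\[\]|]+)(\|([^\[\]|]*))?((?:\|[^\[\]|]*)*)\]\]"
)


def replace_link_sketch(m):
    """Pick display text for one link, taking group(4) into account."""
    target, second, rest = m.group(1), m.group(3), m.group(4)
    if rest:
        # Three or more parameters: assume the last one is the text to keep.
        return rest.rsplit("|", 1)[-1]
    if second is not None:
        # Ordinary two-parameter link: [[target|text]]
        return second
    # Bare link: [[target]]
    return target


def clean_links_sketch(text):
    """Replace every wiki link in text with its display text."""
    return LINK_RE_SKETCH.sub(replace_link_sketch, text)
```

With this sketch, a link such as "[[Fichier:a.jpg|thumb|foo]]" yields "foo" with no leading "|", while two-parameter and bare links behave as before.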