Handling data errors #17

Open
kasnerz opened this issue Dec 2, 2022 · 0 comments
Labels: bug (Something isn't working), data (Related to datasets)

kasnerz commented Dec 2, 2022

Due to data errors, some tables are not a perfect M×N rectangle.

We need to decide whether these cases should be fixed with heuristics and, if not, how to handle them during export.
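One way to find the affected tables before deciding on a fix is to lay the cells out on a grid and look for positions that no cell covers. The following is a minimal sketch, not the project's code. It assumes a table is a list of rows and each row is a list of cell dicts with `row_span` and `column_span` keys, as in the ToTTo excerpt in this issue. The names `find_incomplete_rows` and `make_cell` and the toy table are hypothetical.

```python
from collections import defaultdict


def make_cell(value, row_span=1, column_span=1):
    """Build a cell dict with the same keys as the ToTTo excerpt (hypothetical helper)."""
    return {"column_span": column_span, "is_header": False,
            "row_span": row_span, "value": value}


def find_incomplete_rows(table):
    """Map each non-rectangular row index to the grid columns no cell covers."""
    occupied = defaultdict(set)  # row index -> set of covered column indices
    for r, row in enumerate(table):
        col = 0
        for cell in row:
            # Skip columns already covered by a cell spanning down from above.
            while col in occupied[r]:
                col += 1
            for dr in range(cell.get("row_span", 1)):
                for dc in range(cell.get("column_span", 1)):
                    occupied[r + dr].add(col + dc)
            col += cell.get("column_span", 1)

    # The table width is taken to be the rightmost covered column plus one.
    n_cols = max((max(cols) for cols in occupied.values() if cols), default=-1) + 1
    incomplete = {}
    for r in range(len(table)):
        missing = sorted(set(range(n_cols)) - occupied[r])
        if missing:
            incomplete[r] = missing
    return incomplete


# Toy table: the last row lacks its final cell.
toy_table = [
    [make_cell("A"), make_cell("B"), make_cell("C")],
    [make_cell("X", row_span=2), make_cell("1"), make_cell("2")],
    [make_cell("3")],
]
print(find_incomplete_rows(toy_table))  # {2: [2]}
```

This only reports where the grid has holes. It cannot tell whether the missing cell belongs at that position or earlier in the row, so any automatic repair would still be a guess.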

See, e.g., example #20 in ToTTo, where a cell containing a dash with a column span of 2 is missing from the original raw data:

table (cf. the row "Neftekhimik Nizhnekamsk")
[screenshot: screen-2022-12-02-16-16-37]

ToTTo (excerpt from the example)

[{'column_span': 1, 'is_header': False, 'row_span': 3, 'value': 'Neftekhimik Nizhnekamsk (loan)'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '2012–13'},
  {'column_span': 1, 'is_header': False, 'row_span': 2, 'value': 'Russian FNL'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '6'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '0'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '0'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '0'},
  {'column_span': 2, 'is_header': False, 'row_span': 1, 'value': ''},
  # here another cell of column_span 2 is missing
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '6'},
  {'column_span': 1, 'is_header': False, 'row_span': 1, 'value': '0'}],

output HTML
[screenshot: screen-2022-12-02-16-19-34]
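If the errors are not repaired, one option during export is to expand all spans into a dense grid and pad uncovered positions with a filler value, so that every exported row has the same length. The following is a rough sketch under the same assumptions about the table format (a list of rows of cell dicts with `row_span`, `column_span` and `value` keys). The names `to_grid` and `make_cell` are hypothetical and not part of the project.

```python
def make_cell(value, row_span=1, column_span=1):
    """Build a cell dict with the same keys as the ToTTo excerpt (hypothetical helper)."""
    return {"column_span": column_span, "is_header": False,
            "row_span": row_span, "value": value}


def to_grid(table, fill=""):
    """Expand spans into a rectangular list of lists, padding holes with `fill`."""
    cells = {}  # (row, col) -> value
    for r, row in enumerate(table):
        col = 0
        for cell in row:
            # Skip positions already covered by a cell spanning down from above.
            while (r, col) in cells:
                col += 1
            for dr in range(cell.get("row_span", 1)):
                for dc in range(cell.get("column_span", 1)):
                    # setdefault keeps the first value if spans collide.
                    cells.setdefault((r + dr, col + dc), cell["value"])
            col += cell.get("column_span", 1)

    n_rows = max((r for r, _ in cells), default=-1) + 1
    n_cols = max((c for _, c in cells), default=-1) + 1
    return [[cells.get((r, c), fill) for c in range(n_cols)] for r in range(n_rows)]


# Toy table: the last row lacks its final cell, so it is padded on the right.
ragged = [
    [make_cell("A"), make_cell("B"), make_cell("C")],
    [make_cell("X", row_span=2), make_cell("1"), make_cell("2")],
    [make_cell("3")],
]
for grid_row in to_grid(ragged):
    print(grid_row)
```

Padding always lands where the hole is, which for a cell missing mid-row (as in the ToTTo example above) means the remaining cells shift left and the filler ends up at the end of the row. The output is rectangular but not necessarily aligned with the original table.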

kasnerz added the bug label on Dec 2, 2022
kasnerz mentioned this issue on Jan 5, 2023
kasnerz added the data label on Feb 20, 2023