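Description of the bug
Tables that span pages are currently parsed into several separate tables, and the tables produced from the continuation pages have no header row. Ideally a cross-page table would be output as a single table; if it has to be split into several tables, each one should carry the header.
How to reproduce the bug
Upload a document that contains a table spanning multiple pages.
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.10.x
Device mode
cuda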
This is the table content generated by MinerU. Page 1:
Page 2:
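I added a manual post-processing step that looks at adjacent tables: if nothing but newline characters separates two tables and their maximum column counts match, I treat them as one table and merge them. If possible, the check could also use model.json to confirm that the two tables are on different pages before merging. Still, it would be best if this were solved at the framework level.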
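The heuristic described above could be sketched roughly as follows. This is not the commenter's actual code and not part of MinerU; it assumes the generated Markdown contains tables as HTML `<table>` blocks, optionally wrapped in `<html><body>…</body></html>`, and that no table is nested inside another. The names `max_columns` and `merge_adjacent_tables` are hypothetical. Column counts ignore `colspan`, and the page check against model.json is left out because its schema is not given in this thread.

```python
import re

# Assumption: tables appear as HTML, optionally wrapped in <html><body>...</body></html>.
TABLE_RE = re.compile(
    r"(?:<html>\s*<body>\s*)?<table\b[^>]*>(.*?)</table>(?:\s*</body>\s*</html>)?",
    re.DOTALL | re.IGNORECASE,
)
ROW_RE = re.compile(r"<tr\b[^>]*>.*?</tr>", re.DOTALL | re.IGNORECASE)
CELL_RE = re.compile(r"<t[dh]\b", re.IGNORECASE)


def max_columns(table_inner):
    """Largest number of <td>/<th> cells in any row (colspan is ignored)."""
    rows = ROW_RE.findall(table_inner)
    return max((len(CELL_RE.findall(row)) for row in rows), default=0)


def merge_adjacent_tables(text):
    """Merge consecutive tables separated only by whitespace
    when their maximum column counts are equal."""
    pieces = []  # plain strings and table dicts, in document order
    pos = 0
    for match in TABLE_RE.finditer(text):
        gap = text[pos:match.start()]
        inner = match.group(1)
        cols = max_columns(inner)
        last = pieces[-1] if pieces else None
        if isinstance(last, dict) and gap.strip() == "" and last["cols"] == cols:
            # Continuation of the previous table: append its rows, drop the gap.
            last["rows"].extend(ROW_RE.findall(inner))
        else:
            pieces.append(gap)
            pieces.append({"rows": ROW_RE.findall(inner), "cols": cols})
        pos = match.end()
    pieces.append(text[pos:])
    # Tables are re-emitted as bare <table> elements; wrappers, attributes,
    # and <thead>/<tbody> grouping are not preserved in this sketch.
    return "".join(
        p if isinstance(p, str) else "<table>" + "".join(p["rows"]) + "</table>"
        for p in pieces
    )
```

Run over the Markdown output as a post-processing step, this keeps the header row of the first fragment at the top of the merged table. It will also merge two genuinely separate tables that happen to sit back to back with the same column count, which is why the extra different-page check is worth adding where page information is available.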
May I ask how you added this manual processing logic?