Add MarkItDown in the benchmark #14

Open
hongbo-miao opened this issue Jan 8, 2025 · 2 comments

Comments

@hongbo-miao

It would be great to add the new https://github.com/microsoft/markitdown to the benchmark, thanks! ☺️

@ouyanglinke
Collaborator

Hi, we tried to run model inference with MarkItDown but got empty results. Please let us know if there are any issues with the inference code.

Here is the model inference code:

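from markitdown import MarkItDown
import os

img_folder = './OmniDocBench/images'
save_path = './result0106/markitdown'
os.makedirs(save_path, exist_ok=True)  # create the output folder if it is missing
md = MarkItDown()

for img_name in os.listdir(img_folder):
    # Convert one page image and take the extracted Markdown text
    result = md.convert(os.path.join(img_folder, img_name))
    response = result.text_content
    # splitext handles any extension length (e.g. .jpeg), unlike img_name[:-4]
    out_name = os.path.splitext(img_name)[0] + '.md'
    with open(os.path.join(save_path, out_name), 'w', encoding='utf-8') as output_file:
        output_file.write(response)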
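
To narrow down the empty results, it may help to first check whether every image yields empty text or only some of them. The sketch below is a hypothetical helper, not part of MarkItDown or OmniDocBench: `find_empty_outputs` and its `convert` parameter are names introduced here for illustration. One possible cause, unconfirmed in this thread, is that MarkItDown does not run OCR on image files by default, so `text_content` could be empty for plain page images.

```python
import os
from typing import Callable, List


def find_empty_outputs(img_folder: str, convert: Callable[[str], str]) -> List[str]:
    """Return the names of files whose converted text is empty or whitespace.

    `convert` is any callable that maps a file path to text (hypothetical
    parameter, so the check does not depend on a specific converter).
    """
    empty = []
    for name in sorted(os.listdir(img_folder)):
        path = os.path.join(img_folder, name)
        if not os.path.isfile(path):
            continue  # skip sub-folders
        text = convert(path) or ""
        if not text.strip():
            empty.append(name)
    return empty


# Possible usage with the code above (assumes markitdown is installed):
# from markitdown import MarkItDown
# md = MarkItDown()
# empty = find_empty_outputs('./OmniDocBench/images',
#                            lambda p: md.convert(p).text_content)
# print(len(empty), 'files produced empty output')
```

If every file appears in the returned list, the issue is more likely in how the converter handles images than in the loop or the file writing.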

@hongbo-miao
Author

Hi @ouyanglinke, I haven't had a chance to try markitdown myself. It is quite new.
