Add MarkItDown in the benchmark #14

hongbo-miao · 2025-01-08T08:16:19Z

It would be great to add the new https://github.com/microsoft/markitdown to the benchmark, thanks! ☺️

ouyanglinke · 2025-01-09T09:16:47Z

Hi, we tried running inference with MarkItDown but got empty results. Please let us know if you see any issues in the inference code.

Here is the model inference code:

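import os

from markitdown import MarkItDown

img_folder = './OmniDocBench/images'
save_path = './result0106/markitdown'
os.makedirs(save_path, exist_ok=True)  # make sure the output folder exists
md = MarkItDown()

for img_name in os.listdir(img_folder):
    # Convert each page image and take the text MarkItDown returns
    result = md.convert(os.path.join(img_folder, img_name))
    response = result.text_content
    # Save as <image stem>.md; splitext handles extensions of any length
    stem = os.path.splitext(img_name)[0]
    with open(os.path.join(save_path, stem + '.md'), 'w', encoding='utf-8') as output_file:
        output_file.write(response)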
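
To narrow down where the empty results come from, a small diagnostic along the following lines might help: it records which images yield empty text instead of writing files. This is only a sketch. The helper name `find_empty_outputs` and the `IMAGE_EXTS` set are hypothetical and not part of MarkItDown; the only behavior assumed is what the loop above already relies on, namely that `convert(path)` returns an object with a `text_content` attribute.

```python
import os
from typing import Callable, List, Tuple

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTS = {'.jpg', '.jpeg', '.png'}


def find_empty_outputs(
    img_folder: str,
    convert: Callable[[str], object],
) -> Tuple[List[str], List[str]]:
    """Split image names into (empty, non_empty) by their converted text.

    `convert` is any callable that takes a file path and returns an object
    with a `text_content` attribute (hypothetical helper, for diagnosis only).
    """
    empty: List[str] = []
    non_empty: List[str] = []
    for img_name in sorted(os.listdir(img_folder)):
        # Skip anything that is not an image file.
        if os.path.splitext(img_name)[1].lower() not in IMAGE_EXTS:
            continue
        result = convert(os.path.join(img_folder, img_name))
        # Treat a missing attribute, None, or whitespace-only text as empty.
        text = getattr(result, 'text_content', None) or ''
        if text.strip():
            non_empty.append(img_name)
        else:
            empty.append(img_name)
    return empty, non_empty
```

With the objects from the loop above, this could be called as `empty, non_empty = find_empty_outputs(img_folder, md.convert)`. If every image ends up in `empty`, the issue is more likely in how the converter handles image inputs than in the file-writing code.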

hongbo-miao · 2025-01-20T03:10:24Z

Hi @ouyanglinke, I haven't had a chance to try MarkItDown myself yet. It is quite new.
