Writer adapter #77

Closed
jordane95 opened this issue Feb 1, 2024 · 4 comments

Comments

@jordane95
Contributor

Do you think it is a good idea to also add a writer adapter for the jsonl writer?

The reader already has this functionality, which greatly improves flexibility when working with jsonl data that uses different keys.

@guipenedo
Collaborator

Can you give me a specific use case?

@jordane95
Contributor Author

My training data uses a JSON format that differs slightly from the one datatrove writes, so the output of the current writer cannot be fed directly into my downstream pipeline.

For example, we may want to use the key 'meta' instead of 'metadata' and add a 'version' field to denote the version of the data.
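
As a rough illustration of the request, an adapter could be a plain function that maps each document to the dictionary that gets written out. The sketch below is not datatrove's API: the function names `meta_version_adapter` and `write_jsonl` are hypothetical, and it assumes each document is a dict with `text`, `id` and `metadata` keys.

```python
import io
import json


def meta_version_adapter(document: dict, version: str = "1.0") -> dict:
    """Hypothetical adapter: rename 'metadata' to 'meta' and add a 'version' field."""
    return {
        "text": document["text"],
        "id": document["id"],
        # Assumption: the input document stores its extra fields under 'metadata'.
        "meta": document.get("metadata", {}),
        "version": version,
    }


def write_jsonl(documents, output, adapter=meta_version_adapter):
    """Write one adapted JSON object per line to a file-like object."""
    for document in documents:
        output.write(json.dumps(adapter(document), ensure_ascii=False) + "\n")


# Usage: write a single document to an in-memory buffer and print the result.
buffer = io.StringIO()
write_jsonl([{"text": "hello", "id": "doc-0", "metadata": {"lang": "en"}}], buffer)
print(buffer.getvalue(), end="")
```

Passing the adapter as a parameter keeps the writer itself unchanged, so a different output schema only needs a different function.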

@guipenedo
Collaborator

Makes sense, will add.

@guipenedo
Collaborator

@jordane95, please let me know if this is what you'd like: #83
