Support idempotent batch uploads from a directory of files #14

Open
joshed-io opened this issue Sep 4, 2014 · 1 comment
Comments

@joshed-io
Contributor

A poor man's way to get checkpointing would be to chop large files into smaller files, each with fewer than --batch-size events. There are probably tools for this, or the splitting could be built into the CLI. All the small files would go into a directory. The directory would be passed to the CLI, and each file would be processed serially. After the API acks a file's events, the file could be deleted, or a marker file could be written to indicate that it should not be processed again. This would let a user re-run the same import command after errors and get an idempotent result.
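A rough sketch of the marker-file variant, to make the idea concrete. Everything named here is hypothetical and not part of the CLI: `process_directory`, the `.done` suffix, and the `send_batch` callable, which is assumed to raise if the API does not ack. It also assumes one event per line.

```python
from pathlib import Path
from typing import Callable, List

DONE_SUFFIX = ".done"  # hypothetical marker-file suffix, not an existing convention


def process_directory(directory: str,
                      send_batch: Callable[[List[str]], None]) -> List[str]:
    """Upload each unmarked file in `directory`; return the names uploaded in this run."""
    uploaded: List[str] = []
    # sorted() snapshots the listing, so markers created below are not visited.
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.name.endswith(DONE_SUFFIX):
            continue  # skip subdirectories and the marker files themselves
        marker = path.with_name(path.name + DONE_SUFFIX)
        if marker.exists():
            continue  # acked in an earlier run, do not send again
        # Assumption: one event per line; blank lines are ignored.
        events = [line for line in path.read_text().splitlines() if line.strip()]
        send_batch(events)  # assumed to raise if the API does not ack
        marker.touch()      # written only after the ack
        uploaded.append(path.name)
    return uploaded
```

If `send_batch` raises, the run stops without writing a marker for that file, so re-running the same command picks up at the failed file and skips everything already acked.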

If a file holds more events than the batch size, this strategy still mostly works, but duplicates would be added if a batch failed mid-file, because re-running would resend the batches from that file that had already succeeded.
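One way to avoid the mid-file case is to pre-split the input so that no file spans more than one batch. A minimal sketch, assuming newline-delimited events; `split_events_file` and the `.partNNNNN` naming are made up for illustration, and it allows up to `batch_size` events per file on the assumption that a file of exactly that size still fits in a single batch:

```python
from pathlib import Path
from typing import List


def split_events_file(source: str, out_dir: str, batch_size: int) -> List[Path]:
    """Split `source` into files in `out_dir` holding at most `batch_size` events each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    chunk: List[str] = []

    def flush() -> None:
        # Write the pending chunk, if any, as the next numbered part file.
        if chunk:
            part = out / f"{Path(source).name}.part{len(written):05d}"
            part.write_text("\n".join(chunk) + "\n")
            written.append(part)
            chunk.clear()

    with open(source) as src:
        for line in src:
            event = line.strip()
            if not event:
                continue  # ignore blank lines
            chunk.append(event)
            if len(chunk) == batch_size:
                flush()
    flush()  # write the final, possibly shorter, chunk
    return written
```

The zero-padded part numbers keep the files in their original order when the directory is listed alphabetically.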

@joshed-io
Contributor Author

Discussed further on this developer group post
