To get around file size limitations for uploading to and downloading from TDAI Data Commons, large files can be split into smaller files. For example, a 5 GB file called dummyfile_5GB.txt can be split into 100 MB chunks with:
split -b 100M dummyfile_5GB.txt dummyfile_5GB_part_
mkdir dummyfile_5GB_parts
mv dummyfile_5GB_part_* dummyfile_5GB_parts
Then upload the directory containing the split file parts to Data Commons:
dva upload dummyfile_5GB_parts <doi> --url https://datacommons.tdai.osu.edu/ # with API token as env variable
(The original file's md5 checksum should also be uploaded so the rejoined file can be verified.)
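One way to do that is to compute the checksum before uploading and place the checksum file next to the parts so it travels with them. A minimal sketch, assuming a GNU coreutils environment; the md5sum invocation and the .md5 filename are illustrative, not part of the original report:

# Illustrative only: record the original file's checksum alongside the parts
md5sum dummyfile_5GB.txt > dummyfile_5GB_parts/dummyfile_5GB.md5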
After a dataset consumer downloads the parts, they can be rejoined into the original file with cat dummyfile_5GB_part_* > dummyfile_5GB.txt, and a checksum can be calculated and compared.
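A consumer-side sketch of the rejoin and verification, assuming the checksum file above was downloaded alongside the parts:

# Rejoin the parts (split's default suffixes sort lexically, so the glob preserves order)
cat dummyfile_5GB_part_* > dummyfile_5GB.txt
# Verify against the uploaded checksum; prints "dummyfile_5GB.txt: OK" on success
md5sum -c dummyfile_5GB.md5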
In a test, this led to the following uninformative error after 2 parts had been uploaded:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Scripts\dva.exe\__main__.py", line 7, in <module>
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\cli.py", line 93, in main
    cli()
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\cli.py", line 74, in upload
    api.upload_file(doi, path)
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\api.py", line 67, in upload_file
    raise APIException(f"Uploading failed with status {status}.")
dva.api.APIException: Uploading failed with status ERROR.