The current implementation uses a single PUT operation to store a checkpoint blob in an S3 bucket. During initial testing with DEVNET data, storing checkpoint 0 failed: its size is around 12 GB, which exceeds the limit for a single PUT.
From the AWS documentation:
"Upload an object in a single operation by using the AWS SDKs, REST API, or AWS CLI – With a single PUT operation, you can upload a single object up to 5 GB in size."
https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html
To work around this limit we could use a multipart upload, which splits the checkpoint blob into separate chunks and sends each chunk to the S3 bucket as its own part.
The following AWS SDK example explores this solution: https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/rustv1/examples/s3/src/bin/s3-multipart-upload.rs
Since the current implementation uses the object_store crate as a unified interface, we could instead use its alternative put operation, which performs a multipart upload: https://docs.rs/object_store/latest/object_store/multipart/trait.MultipartStore.html#tymethod.put_part
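As a rough illustration of the chunking step, here is a minimal sketch that only plans the part boundaries and performs no upload. The helper name plan_parts and the 64 MiB part size used below are assumptions for illustration, not part of the current implementation. The limits it checks are the documented S3 multipart limits: parts of 5 MiB to 5 GiB (the last part may be smaller) and at most 10,000 parts per upload.

```rust
/// Minimum size S3 accepts for every part except the last one (5 MiB).
const MIN_PART_SIZE: u64 = 5 * 1024 * 1024;
/// Maximum size S3 accepts for a single part (5 GiB).
const MAX_PART_SIZE: u64 = 5 * 1024 * 1024 * 1024;
/// Maximum number of parts S3 accepts in one multipart upload.
const MAX_PARTS: u64 = 10_000;

/// Hypothetical helper: splits a blob of `total_len` bytes into
/// `(offset, length)` ranges of at most `part_size` bytes each.
/// Only the last range may be shorter than `part_size`.
fn plan_parts(total_len: u64, part_size: u64) -> Result<Vec<(u64, u64)>, String> {
    if part_size < MIN_PART_SIZE || part_size > MAX_PART_SIZE {
        return Err(format!(
            "part size {part_size} is outside the 5 MiB..=5 GiB range"
        ));
    }

    // Number of parts needed, rounding up for a trailing partial part.
    let count = total_len.div_ceil(part_size);
    if count > MAX_PARTS {
        return Err(format!("{count} parts exceeds the limit of {MAX_PARTS}"));
    }

    let mut parts = Vec::with_capacity(count as usize);
    let mut offset = 0u64;
    while offset < total_len {
        // The last part takes whatever is left over.
        let len = part_size.min(total_len - offset);
        parts.push((offset, len));
        offset += len;
    }
    Ok(parts)
}
```

Calling plan_parts with a 12 GiB blob and 64 MiB parts yields 192 ranges, each of which stays under the single-request limit. Each range could then be read from the checkpoint blob and handed to the multipart put operation linked above.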