record-stream-uploader periodically throws error "You did not provide the number of bytes specified by the Content-Length HTTP header.", OOMKilled #1131

Open
alex-kuzmin-hg opened this issue Jan 8, 2025 · 0 comments
Labels
Bug: An error that causes the feature to behave differently than what was expected based on design docs
P1: High priority issue. Required to be completed in the assigned milestone.
P2: Required to be completed in the assigned milestone, but may or may not impact release schedule.

Describe the bug

During a long load test, record-stream-uploader may be OOM-killed after logging a stream of "You did not provide the number of bytes specified by the Content-Length HTTP header." errors, as in the output below:

kubectl -n solo-alex-kuzmin-n2 logs network-node1-0 -c record-stream-uploader --previous

Cloud Copy Initiated [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_48_16.053066207Z.rcd_sig' ] 
Cloud Copy Initiated [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_48_16.053066207Z.rcd.gz' ] 
Cloud Copy Complete [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_47_50.040672600Z.rcd_sig' ] 
Cloud Copy Timing [ duration_ms = '30506.100', upload_duration_ms = '30316.840', service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_47_50.040672600Z.rcd_sig' ] 
Cloud Copy Complete [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_47_58.000914340Z.rcd_sig' ] 
Cloud Copy Error [ service = 'S3', watch_directory = '/opt/hgcapp/recordStreams', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_47_48.038146733Z.rcd_sig' ]: Failed to upload /opt/hgcapp/recordStreams/2025-01-08T02_47_48.038146733Z.rcd_sig to solo-streams/recordstreams/record0.0.3/2025-01-08T02_47_48.038146733Z.rcd_sig: An error occurred (IncompleteBody) when calling the PutObject operation: You did not provide the number of bytes specified by the Content-Length HTTP header. 
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/transfer.py", line 279, in upload_file
    future.result()
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (IncompleteBody) when calling the PutObject operation: You did not provide the number of bytes specified by the Content-Length HTTP header.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/mirror.py", line 516, in cloud_copy
    storage.put(local_path, key)
  File "/usr/local/bin/mirror.py", line 1555, in put
    self.__bucket.upload_file(local_path, key, Config=self.__transfer_config)
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 209, in bucket_upload_file
    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 131, in upload_file
    extra_args=ExtraArgs, callback=Callback)
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/transfer.py", line 287, in upload_file
    filename, '/'.join([bucket, key]), e))
boto3.exceptions.S3UploadFailedError: Failed to upload /opt/hgcapp/recordStreams/2025-01-08T02_47_48.038146733Z.rcd_sig to solo-streams/recordstreams/record0.0.3/2025-01-08T02_47_48.038146733Z.rcd_sig: An error occurred (IncompleteBody) when calling the PutObject operation: You did not provide the number of bytes specified by the Content-Length HTTP header.
Cloud Copy Complete [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_48_04.057291549Z.rcd_sig' ] 
Cloud Copy Complete [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_47_56.048000973Z.rcd_sig' ] 
Cloud Copy Complete [ service = 'S3', bucket = 'solo-streams', remote_path = 'recordstreams/record0.0.3', filename = '2025-01-08T02_48_06.033271868Z.rcd_sig' ] 
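
For triage, the uploader output above could be summarized with a small script. This is only a sketch that assumes the log format shown in this report (`Cloud Copy Initiated|Complete|Error|Timing [ key = 'value', ... ]`); the function name `summarize` and the shape of its result are hypothetical and not part of `mirror.py`.

```python
import re
from collections import Counter

# Matches lines such as:
#   Cloud Copy Complete [ service = 'S3', ..., filename = 'x.rcd_sig' ]
# Assumption: the bracketed section never contains a ']' character.
EVENT_RE = re.compile(r"^Cloud Copy (Initiated|Complete|Error|Timing) \[(.*?)\]")
# Matches the individual key = 'value' pairs inside the brackets.
FIELD_RE = re.compile(r"(\w+) = '([^']*)'")


def summarize(log_text):
    """Count uploader events and list files that failed or never completed."""
    counts = Counter()
    initiated, completed, failed = set(), set(), set()
    durations_ms = []

    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # traceback lines and other output are ignored
        event, body = match.groups()
        fields = dict(FIELD_RE.findall(body))
        counts[event] += 1
        name = fields.get("filename")
        if event == "Initiated":
            initiated.add(name)
        elif event == "Complete":
            completed.add(name)
        elif event == "Error":
            failed.add(name)
        elif event == "Timing" and "duration_ms" in fields:
            durations_ms.append(float(fields["duration_ms"]))

    return {
        "counts": dict(counts),
        "failed": sorted(failed),
        # Initiated within this log window but neither completed nor failed.
        "pending": sorted(initiated - completed - failed),
        "max_duration_ms": max(durations_ms) if durations_ms else None,
    }
```

Passing the text captured from the `kubectl ... logs ... --previous` command to `summarize` gives the number of failed uploads, the files still in flight at the end of the log, and the slowest reported copy, which may help show whether uploads were backing up before the container was killed.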

Describe the expected behavior

The uploader should not throw any errors and should not be OOM-killed.

To Reproduce

Run the 6-hour NFT test on Latitude.

Additional Context

No response

@alex-kuzmin-hg added the Bug and Pending Triage labels on Jan 8, 2025
@jeromy-cannon added the P1 and P2 labels and removed the Pending Triage label on Jan 9, 2025