Scylla version: 2025.1.0~dev-20250117.1ef2d9d07692 with build-id 4f745f675a915d8e8f5658b3cd49497f5ee65c50
Kernel Version: 6.8.0-1021-aws
Issue description
The error below occurred during the disrupt_decommission_streaming_err nemesis. The timeouts in the FailedDecommissionOperationMonitoring class need to be increased.
2025-01-19 00:28:25.850: (DisruptionEvent Severity.ERROR) period_type=end event_id=6def3916-9dd4-4e37-8431-e4a0cf85c2bc duration=51m18s: nemesis_name=DecommissionStreamingErr target_node=Node longevity-twcs-48h-master-db-node-e42ded24-4 [3.254.53.253 | 10.4.8.101] errors=Wait for: Waiting decommission is finished for longevity-twcs-48h-master-db-node-e42ded24-4...: timeout - 1500 seconds - expired
Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 4096, in start_and_interrupt_decommission_streaming
    ParallelObject(objects=[trigger, watcher], timeout=full_operations_timeout).call_objects()
  File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/common.py", line 524, in call_objects
    return self.run(lambda x: x(), ignore_exceptions=ignore_exceptions)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/common.py", line 503, in run
    raise ParallelObjectException(results=results)
sdcm.utils.common.ParallelObjectException: functools.partial(<bound method BaseNode.run_nodetool of <sdcm.cluster_aws.AWSNode object at 0x710f00114250>>, sub_cmd='decommission', timeout=900, warning_event_on_exception=(<class 'Exception'>,), long_running=True, retry=0):
Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/common.py", line 487, in run
    result = future.result(time_out)
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 460, in result
    raise TimeoutError()
concurrent.futures._base.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/wait.py", line 70, in wait_for
    res = retry(func, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 360, in iter
    raise retry_exc.reraise()
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 194, in reraise
    raise self
tenacity.RetryError: RetryError[<Future at 0x710eecd6b370 state=finished returned bool>]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 5497, in wrapper
    result = method(*args[1:], **kwargs)
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 4175, in disrupt_decommission_streaming_err
    self.start_and_interrupt_decommission_streaming()
  File "/home/ubuntu/scylla-cluster-tests/sdcm/nemesis.py", line 4090, in start_and_interrupt_decommission_streaming
    with ignore_stream_mutation_fragments_errors(), ignore_raft_topology_cmd_failing(), \
  File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/topology_ops.py", line 58, in __exit__
    wait_for(func=lambda: not self.is_node_decommissioning(), step=15,
  File "/home/ubuntu/scylla-cluster-tests/sdcm/wait.py", line 86, in wait_for
    raise raising_exc from ex
sdcm.exceptions.WaitForTimeoutError: Wait for: Waiting decommission is finished for longevity-twcs-48h-master-db-node-e42ded24-4...: timeout - 1500 seconds - expired
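To illustrate the kind of timeout involved, here is a minimal sketch of a poll-until-deadline loop, similar in shape to the wait_for(func=..., step=15, ...) call in the traceback. It is not SCT's implementation: poll_until, WaitTimeoutError, FakeClock, and the 1800-second decommission duration are hypothetical and chosen only for illustration. The fake clock lets the example run instantly instead of sleeping.

```python
import time
from typing import Callable


class WaitTimeoutError(Exception):
    """Hypothetical error raised when the condition is not met before the deadline."""


def poll_until(condition: Callable[[], bool], timeout: float, step: float,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> float:
    """Call `condition` every `step` seconds until it returns True.

    Returns the elapsed time in seconds, or raises WaitTimeoutError once
    `timeout` seconds have passed without the condition becoming true.
    """
    start = clock()
    while True:
        if condition():
            return clock() - start
        elapsed = clock() - start
        if elapsed >= timeout:
            raise WaitTimeoutError(f"timeout - {timeout} seconds - expired")
        # Never sleep past the deadline.
        sleep(min(step, timeout - elapsed))


class FakeClock:
    """Simulated clock so the example does not actually wait (illustration only)."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


if __name__ == "__main__":
    # Assumption: the decommission needs 1800 simulated seconds to finish.
    for timeout in (1500, 3600):
        fake = FakeClock()
        try:
            took = poll_until(lambda: fake.now >= 1800, timeout=timeout, step=15,
                              clock=fake.time, sleep=fake.sleep)
            print(f"timeout={timeout}: finished after {took} simulated seconds")
        except WaitTimeoutError as exc:
            print(f"timeout={timeout}: {exc}")
```

Under these assumed numbers, a wait limit shorter than the operation fails while a longer one succeeds, which is the effect raising the timeouts would be expected to have if the decommission does eventually finish.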
Impact
No direct impact, SCT issue
How frequently does it reproduce?
Describe the frequency with how this issue can be reproduced.
OS / Image: ami-0d3649fe0e81d5c8d (NO RUNNER: NO RUNNER)
Test: longevity-twcs-48h-test
Test id: e42ded24-b9d6-4a5b-a6af-eb412c12ce5c
Test name: scylla-master/tier1/longevity-twcs-48h-test
Test method: longevity_twcs_test.TWCSLongevityTest.test_custom_time
Test config file(s):
Installation details
Cluster size: 4 nodes (i3en.2xlarge)
Scylla Nodes used in this run:
Logs and commands
$ hydra investigate show-monitor e42ded24-b9d6-4a5b-a6af-eb412c12ce5c
$ hydra investigate show-logs e42ded24-b9d6-4a5b-a6af-eb412c12ce5c
Logs:
Jenkins job URL
Argus