Force Pyro to timeout quickly and introduce retries
We observe that Pyro remote calls sometimes complete with
unexpected time gaps between them. We suspect network issues:
the packets still arrive, but possibly only after TCP
retransmission.

Pyro has a timeout property which is not exposed publicly.
We set this property to a short value and retry the call in a
loop whenever a timeout error is raised, which works around
the issue.

Signed-off-by: Lukas Pukenis <[email protected]>
LukasPukenis committed Jan 31, 2025
1 parent d05c5b4 commit 7b94f4e
Showing 2 changed files with 27 additions and 7 deletions.
Empty file.
34 changes: 27 additions & 7 deletions nat-lab/tests/uniffi/libtelio_proxy.py
@@ -37,13 +37,33 @@ def __init__(self, name: str, object_uri: str, features: libtelio.Features):

     def _handle_remote_error(self, f):
         with Proxy(self._uri) as remote:
-            fn_res = f(remote)
-            if fn_res is None:
-                return None
-            (res, err) = fn_res
-            if err is not None:
-                raise Exception(err)
-            return res
+            # Pyro does not provide built-in options to configure timeouts and max retries.
+            # However, in some cases, Pyro may experience delays in reporting events.
+            # This is likely due to TCP retries, which can be triggered by changes in
+            # the network interface configuration. To mitigate this issue, we set a
+            # short timeout and a higher number of retries to bypass the exponential
+            # growth of TCP retry intervals.
+            remote._pyroTimeout = 1
+            total_retries = 10
+            for i in range(total_retries):
+                try:
+                    fn_res = f(remote)
+                    if fn_res is None:
+                        return None
+                    (res, err) = fn_res
+                    if err is not None:
+                        raise Exception(err)
+                    return res
+                except Pyro5.errors.TimeoutError:
+                    print(
+                        f"[{self._name}]: Pyro5 timeout. Retries left: {total_retries-i-1}"
+                    )
+
+                    # Sleep to avoid immediate retries and allow transient issues to resolve
+                    time.sleep(0.5)
+            raise RuntimeError(
+                f"Pyro couldn't complete a remote call after {total_retries} retries"
+            )

     @move_to_async_thread
     def shutdown(self, container_or_vm_name: Optional[str] = None):
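
The retry loop added in this hunk can be read in isolation as a small helper. The following is a minimal sketch only, not the code from the commit: `call_with_retries`, its parameters, and the `flaky` callable are hypothetical names introduced for illustration, and the built-in `TimeoutError` stands in for `Pyro5.errors.TimeoutError` so that the example runs without Pyro installed. It assumes the same convention as `_handle_remote_error`, where the callable returns either `None` or a `(result, error)` pair.

```python
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], Optional[Tuple[T, Optional[str]]]],
    total_retries: int = 10,
    delay_s: float = 0.5,
    timeout_error: type = TimeoutError,  # assumption: stand-in for Pyro5.errors.TimeoutError
) -> Optional[T]:
    """Run `call` until it finishes without raising `timeout_error`."""
    for attempt in range(total_retries):
        try:
            outcome = call()
            if outcome is None:
                return None
            result, err = outcome
            if err is not None:
                # Errors reported by the remote side are not retried;
                # only timeouts are.
                raise Exception(err)
            return result
        except timeout_error:
            print(f"timeout, retries left: {total_retries - attempt - 1}")
            # Short pause so a transient network issue has time to clear.
            time.sleep(delay_s)
    raise RuntimeError(f"call did not complete after {total_retries} retries")


# Hypothetical flaky callable: times out twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated timeout")
    return ("ok", None)


value = call_with_retries(flaky, delay_s=0.0)
```

With a short per-call timeout, a stalled call is abandoned and reissued after `delay_s` rather than left waiting on the transport, which is the effect the commit message describes.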