First of all, thanks for your paper. pFedSD is a great case for FKD. When I run your pFedSD code, it always shows errors about process communication, such as "Process Process-4:
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/content/pFedSD/run_gloo.py", line 82, in main
process.run()
File "/content/pFedSD/pcode/workers/worker_pFedSD.py", line 47, in run
self._send_model_to_master()
File "/content/pFedSD/pcode/workers/worker_base.py", line 304, in _send_model_to_master
dist.send(tensor=flatten_model.buffer, dst=0)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 1295, in send
default_pg.send([tensor], dst, tag).wait()
RuntimeError: [../third_party/gloo/gloo/transport/tcp/pair.cc:598] Connection closed by peer [172.28.0.12]:43185".
I would appreciate any tips on how to resolve this error. Thanks.