when doing exp 40-30-10-20 #26

Open
Ultimate-Storm opened this issue Feb 13, 2023 · 0 comments

Comments

@Ultimate-Storm
Collaborator

Since the dataset sizes across hosts are unbalanced, we should find a better way to do the sync: host3, which has the smallest dataset, always finishes its epochs faster and keeps looping over its small local dataset, causing overfitting.
Approaches could be:

  • Try different sync intervals according to the dataset size of each host
  • Try larger epochs for hosts with smaller datasets
  • Try different weighting per host: a host with a smaller dataset should weigh less (gain less weight) during local training
  • Move the sync from on_batch_end to on_epoch_end
  • It would be better if each host could wait for the others to reach the end of their epoch before doing the sync (currently I think a host initiates the sync as soon as it reaches the sync frequency, without waiting for the others); see the sketch after this list
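A minimal sketch of the last two ideas combined (syncing in on_epoch_end instead of on_batch_end, plus weighting each host by its dataset size), assuming a Keras-style training loop. The names wait_for_peers, gather_scaled_weights, n_local_samples and n_total_samples are hypothetical placeholders for the project's actual communication layer and dataset bookkeeping, not existing code.

```python
import numpy as np
import tensorflow as tf


class EpochEndSync(tf.keras.callbacks.Callback):
    """Sync weights at epoch end instead of every N batches.

    `wait_for_peers` and `gather_scaled_weights` are placeholders for
    whatever communication layer the project uses.
    """

    def __init__(self, n_local_samples, n_total_samples,
                 wait_for_peers, gather_scaled_weights):
        super().__init__()
        # A host's contribution is proportional to its share of the data,
        # so a host with a small local dataset gains less weight.
        self.scale = n_local_samples / n_total_samples
        self.wait_for_peers = wait_for_peers                  # barrier over all hosts
        self.gather_scaled_weights = gather_scaled_weights    # all-gather of scaled weights

    def on_epoch_end(self, epoch, logs=None):
        # Block until every host has finished its epoch, so the host with
        # the smallest dataset does not keep looping over its own data.
        self.wait_for_peers()
        scaled = [w * self.scale for w in self.model.get_weights()]
        # For each layer, collect the scaled weights from all hosts and sum
        # them; since the scales sum to 1, this is a size-weighted average.
        per_layer = self.gather_scaled_weights(scaled)
        self.model.set_weights([np.sum(layer, axis=0) for layer in per_layer])
```

Because each host scales its weights by its share of the total samples before the all-gather, summing the gathered tensors yields a dataset-size-weighted average, so the host with the smallest dataset cannot pull the global model toward its overfit local solution.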