
Problem with loss function when training with 512x512 data #56

Open
playmakerbugger opened this issue Jan 10, 2022 · 2 comments
@playmakerbugger

I am training on a 512x512 dataset using 4 GPUs. Training stalls during the loss computation: it locks up while the loss is being calculated, and I get the user warning "was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector."

Is there a way to split the loss computation across multiple GPUs with DataParallelWithCallback? I noticed that the loss is only being calculated on GPU 0.
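For anyone hitting the same thing, here is a minimal sketch of the usual pattern for computing the loss on every GPU rather than only on GPU 0: wrap the network together with its criterion in one module, so each replica returns its own scalar loss, then average the gathered vector. The `FullModel` wrapper, the toy network, and the criterion below are illustrative placeholders, not this repo's code; `DataParallelWithCallback` from sync_batchnorm inherits its gather behavior from `torch.nn.DataParallel`, so the same idea should apply.

```python
import torch
from torch import nn

class FullModel(nn.Module):
    """Wraps a network together with its criterion, so the loss is
    computed on each replica's own GPU rather than only on GPU 0."""
    def __init__(self, model, criterion):
        super().__init__()
        self.model = model
        self.criterion = criterion

    def forward(self, x, target):
        # Each replica returns its own scalar loss. DataParallel then
        # gathers these scalars into a (num_gpus,) vector on device 0,
        # which is exactly what the quoted warning is about.
        return self.criterion(self.model(x), target)

# Placeholder model and criterion, purely for illustration.
net = nn.Conv2d(3, 3, kernel_size=3, padding=1)
full_model = nn.DataParallel(FullModel(net, nn.MSELoss()),
                             device_ids=[0, 1, 2, 3]).cuda()

x = torch.randn(6, 3, 512, 512).cuda()
target = torch.randn(6, 3, 512, 512).cuda()

per_gpu_loss = full_model(x, target)  # shape (num_gpus,), one loss per replica
loss = per_gpu_loss.mean()            # reduce to a single scalar
loss.backward()
```

With this pattern the warning is benign: it only reports that the gathered per-replica losses were scalars and were stacked into a vector, which `.mean()` then reduces. One caveat: averaging per-replica means weights uneven chunks slightly unequally (a batch of 6 over 4 GPUs splits 2/2/1/1), though in practice this rarely matters.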

@AliaksandrSiarohin
Owner

I guess the batch size is too large.

@playmakerbugger
Author

playmakerbugger commented Jan 10, 2022

I was using a batch size of 6. Should I lower it? The warning appears even with a batch size as low as 2.
