Description:
The `PyTorchModel` class in `pfl.model.pytorch` automatically moves the model to CUDA whenever CUDA is available on the system, regardless of whether the user intends to train on CPU. This behavior is unexpected and deviates from standard PyTorch practice, where a model remains on CPU unless explicitly moved to CUDA.
Steps to Reproduce:
Ensure CUDA is available on your system.
Initialize a PyTorchModel instance:
```python
from torchvision.models import resnet18

from pfl.model.pytorch import PyTorchModel

pytorch_model = resnet18(pretrained=False)
pytorch_model.loss = None
pytorch_model.metrics = None
print(f"Original model device: {next(pytorch_model.parameters()).device}")

# Initialize the PFL PyTorch model
pfl_pt_model = PyTorchModel(pytorch_model,
                            local_optimizer_create=None,
                            central_optimizer=None)
print(f"PFL model device: {next(pfl_pt_model.pytorch_model.parameters()).device}")
```
Output of the above code:

```
Original model device: cpu
PFL model device: cuda:0
```
Observe that `pytorch_model` has been moved to CUDA even though the user never requested it.
Expected Behavior:
Similar to centralized training in PyTorch, the model should remain on CPU by default and only move to CUDA when the developer explicitly specifies.
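For comparison, this is how plain (centralized) PyTorch behaves — a minimal illustration, not pfl code:

```python
import torch
from torch import nn

# A freshly constructed module lives on the CPU by default.
model = nn.Linear(4, 2)
cpu_device = next(model.parameters()).device
print(cpu_device)  # cpu

# Moving to CUDA is always an explicit, developer-initiated step:
if torch.cuda.is_available():
    model = model.to("cuda")
```

No PyTorch API relocates a module to the GPU implicitly; the proposal below asks `PyTorchModel` to follow the same convention.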
Proposed Solution:
Introduce a `device` parameter to the `PyTorchModel` class, allowing users to specify the desired device (`'cpu'` or `'cuda'`). Additionally, provide a `.to(device)` method to facilitate moving the model as needed.

Affected Code: `pfl/model/pytorch.py` and `pfl/internal/ops/pytorch_ops.py`.

Suggested Fix: Modify `PyTorchModel` to accept a `device` parameter and adjust the `get_default_device` function to respect an explicit user preference.
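A sketch of the device-resolution logic the fix could use. The function name and signature are illustrative only; they are not pfl's actual API:

```python
def resolve_device(requested=None):
    """Resolve the device a PyTorchModel should use (hypothetical helper).

    Honor an explicit request from the user; otherwise default to 'cpu',
    mirroring vanilla PyTorch, instead of auto-selecting CUDA just because
    it happens to be available.
    """
    if requested is not None:
        return requested
    return "cpu"


# An explicit choice is respected; absent a choice, stay on CPU.
print(resolve_device("cuda"))  # cuda
print(resolve_device())        # cpu
```

The key design point is that CUDA availability alone no longer triggers a move; only a user-supplied `device` (or a later `.to(device)` call) does.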
Additional Information:
pfl version: 0.2.0
PyTorch version: 2.0.1
I am willing to work on a Pull Request to implement this fix if the maintainers agree.