
Regarding multiple augmentation processes server and client #4

Open

naba89 opened this issue Jul 15, 2021 · 0 comments

naba89 commented Jul 15, 2021

Taking serve-imagenet-shards as an example, I implemented the same for my own WebDataset. There are a few points I would like to highlight:

  • Using multiple workers in the dataloader and providing an address range such as `zpub://0.0.0.0:788{0..4}` results in a `daemonic processes are not allowed to have children` error. To circumvent this, I used `from concurrent.futures import ProcessPoolExecutor as Pool` instead of `multiprocessing.Pool`. This worked fine, though there is a risk of zombie processes on exit of the main script if the grandchild processes are still running.
  • On the client side, I am able to get the data from the address range; however, I am trying to run a multiprocess-based `webdataset.WebLoader` as below:
import webdataset as wds
import tensorcom


def identity(x):
    """Return the argument."""
    return x


dataset = wds.Processor(tensorcom.Connection("zsub://<server_name>:788{0..4}", converters="torch"), identity)
dataloader = wds.WebLoader(dataset, num_workers=1, batch_size=None)
collate_fn = MyCollate()  # custom collation function, defined elsewhere
dataloader = dataloader.unbatched().shuffle(1000).batched(batchsize=64, collation_fn=collate_fn, partial=False)

The above code hangs and is unable to get any data if I use `num_workers > 0`.

Is there a way to do this?
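For reference, the server-side workaround from the first bullet can be sketched as below. Note that `serve_shard` and `serve_all` are hypothetical stand-ins for the actual tensorcom serving code; only the `Pool` swap reflects what was actually done:

```python
from concurrent.futures import ProcessPoolExecutor as Pool


def serve_shard(port):
    """Hypothetical stand-in for the per-port serving loop; the real code
    would open a zpub:// socket on this port and stream augmented samples."""
    return f"zpub://0.0.0.0:{port}"


def serve_all(ports):
    # ProcessPoolExecutor starts its workers directly instead of going through
    # multiprocessing.Pool, which refuses to start children from a daemonic
    # process (e.g. a PyTorch DataLoader worker). The trade-off, as noted
    # above, is that these grandchild processes are not reaped automatically
    # and can linger as zombies if the main script exits early.
    ports = list(ports)
    with Pool(max_workers=len(ports)) as pool:
        return list(pool.map(serve_shard, ports))


if __name__ == "__main__":
    print(serve_all(range(7880, 7885)))
```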
