
Regarding multiple augmentation processes server and client #4

Open

naba89 opened this issue Jul 15, 2021 · 0 comments

naba89 commented Jul 15, 2021

Taking serve-imagenet-shards as an example, I implemented the same for my own WebDataset. There are a few points I would like to highlight:

  • Using multiple workers in the dataloader and providing an address range such as `zpub://0.0.0.0:788{0..4}` results in a `daemonic processes are not allowed to have children` error. To circumvent this, I used `from concurrent.futures import ProcessPoolExecutor as Pool` instead of `multiprocessing.Pool`. This worked fine, though there is a risk of zombie processes on exit of the main script if the grandchild processes are still running.
  • On the client side, I am able to get the data from the address range; however, I am trying to run a multiprocess-based `webdataset.WebLoader` as below:
import webdataset as wds
import tensorcom


def identity(x):
    """Return the argument."""
    return x


dataset = wds.Processor(tensorcom.Connection("zsub://<server_name>:788{0..4}", converters="torch"), identity)
dataloader = wds.WebLoader(dataset, num_workers=1, batch_size=None)
collate_fn = MyCollate()  # custom collation function, defined elsewhere
dataloader = dataloader.unbatched().shuffle(1000).batched(batchsize=64, collation_fn=collate_fn, partial=False)

The above code hangs and is unable to get any data if I use `num_workers > 0`.

Is there a way to do this?
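For reference, the server-side workaround from the first bullet can be sketched as below. Note that `serve_shard` and `serve_all` are hypothetical stand-ins for the actual tensorcom serving code; only the `Pool` swap reflects what was actually done:

```python
from concurrent.futures import ProcessPoolExecutor as Pool


def serve_shard(port):
    """Hypothetical stand-in for the per-port serving loop; the real code
    would open a zpub:// socket on this port and stream augmented samples."""
    return f"zpub://0.0.0.0:{port}"


def serve_all(ports):
    # ProcessPoolExecutor starts its workers directly instead of going through
    # multiprocessing.Pool, which refuses to start children from a daemonic
    # process (e.g. a PyTorch DataLoader worker). The trade-off, as noted
    # above, is that these grandchild processes are not reaped automatically
    # and can linger as zombies if the main script exits early.
    ports = list(ports)
    with Pool(max_workers=len(ports)) as pool:
        return list(pool.map(serve_shard, ports))


if __name__ == "__main__":
    print(serve_all(range(7880, 7885)))
```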
