fail to "vectorize" velocity_from_position #68
It should work if you apply the function to one trajectory at a time. I think the function could be made to correctly handle positions and times as 2-d arrays.
Indeed, it does work as shown above, but it should be able to be fed to
I haven't used
I am not sure that's the solution. The example linked above examines the case of a function that takes 1-d arrays as input, like our function.
The issue here when using our function is that
To be more useful and widely applicable, the function should take an argument indicating the dimension along which to compute the time derivative, so that it is applicable to arrays of any dimension. In addition, it should also be able to handle ragged arrays of position.
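A sketch of what such an axis argument could look like. The function name `velocity_from_position_nd` is hypothetical, and centered differences via `np.gradient` stand in for whatever difference scheme clouddrift actually uses:

```python
import numpy as np

def velocity_from_position_nd(lon, lat, time, axis=-1):
    # Hypothetical n-d variant: differentiate along `axis`.
    # np.gradient (centered differences) is only a stand-in for
    # the actual scheme used by clouddrift.
    dt = np.gradient(time, axis=axis)
    u = np.gradient(lon, axis=axis) / dt
    v = np.gradient(lat, axis=axis) / dt
    return u, v

# 3 trajectories x 5 observations, differentiated along the obs axis.
time = np.tile(np.arange(5.0), (3, 1))
lon = 2.0 * time               # 2 units of longitude per time unit
lat = np.zeros_like(time)
u, v = velocity_from_position_nd(lon, lat, time, axis=1)
```

The same call with `axis=0` would differentiate along the trajectory dimension instead, which is what makes the function shape-agnostic.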
It's been a few months since I played with it, but I believe the easiest would be to use Awkward Array and perform a simple loop:
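The code block of that comment did not survive the page capture, but the per-trajectory loop idea can be sketched like this, with ragged data held as a plain list of 1-d NumPy arrays (an Awkward Array would be iterated the same way) and a NumPy stand-in for `velocity_from_position`:

```python
import numpy as np

def velocity_from_position(lon, lat, time):
    # Stand-in for clouddrift's function: centered differences.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

# Ragged data: trajectories of different lengths.
trajectories = [
    {"lon": np.arange(4.0), "lat": np.zeros(4), "time": np.arange(4.0)},
    {"lon": np.arange(7.0), "lat": np.zeros(7), "time": np.arange(7.0)},
]

# Apply the 1-d function to one trajectory at a time.
u, v = [], []
for traj in trajectories:
    ui, vi = velocity_from_position(traj["lon"], traj["lat"], traj["time"])
    u.append(ui)
    v.append(vi)
```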
I'll try that. But will it be fast and use Dask/parallel computing? As an example, I have 5.5M 60-day-long hourly trajectories... so perhaps I should chunk per trajectory? In general, we need the analysis functions of clouddrift to be "vectorizable" in a seamless way for users of Xarray and Awkward.
I haven't used Dask, but this looks like it should do the job: https://examples.dask.org/applications/embarrassingly-parallel.html. On its own, I don't think Philippe's for-loop snippet will run in parallel.
More developments: I found that the following works if the input DataArrays are not chunked:
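The original snippet was lost in the page capture; a call of roughly that shape, with a hypothetical NumPy stand-in for `velocity_from_position` (which takes 1-d arrays), would look like this:

```python
import numpy as np
import xarray as xr

def velocity_from_position(lon, lat, time):
    # NumPy stand-in for clouddrift's function, which takes 1-d arrays.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

num_traj, num_obs = 4, 10
time = np.tile(np.arange(float(num_obs)), (num_traj, 1))
ds = xr.Dataset(
    {
        "lon": (("trajectory", "obs"), time.copy()),  # 1 unit per time step
        "lat": (("trajectory", "obs"), np.zeros_like(time)),
        "time": (("trajectory", "obs"), time),
    }
)

u, v = xr.apply_ufunc(
    velocity_from_position,
    ds.lon, ds.lat, ds.time,
    input_core_dims=[["obs"], ["obs"], ["obs"]],
    output_core_dims=[["obs"], ["obs"]],
    vectorize=True,  # loop the 1-d function over the trajectory dimension
)
```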
but if the DataArrays are chunked, then we need the following (note the option `dask="allowed"` and the `.load()` call):
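Again, the snippet itself was not preserved; my reading of the comment is a call of roughly this shape (a hypothetical reconstruction: `dask="allowed"` permits dask-backed inputs, and `.load()` pulls the chunked arrays into memory first):

```python
import numpy as np
import xarray as xr

def velocity_from_position(lon, lat, time):
    # NumPy stand-in for clouddrift's function, which takes 1-d arrays.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

num_traj, num_obs = 4, 10
time = np.tile(np.arange(float(num_obs)), (num_traj, 1))
ds = xr.Dataset(
    {
        "lon": (("trajectory", "obs"), time.copy()),
        "lat": (("trajectory", "obs"), np.zeros_like(time)),
        "time": (("trajectory", "obs"), time),
    }
).chunk({"trajectory": 1})  # dask-backed, one trajectory per chunk

u, v = xr.apply_ufunc(
    velocity_from_position,
    ds.lon.load(), ds.lat.load(), ds.time.load(),  # pull chunks into memory
    input_core_dims=[["obs"], ["obs"], ["obs"]],
    output_core_dims=[["obs"], ["obs"]],
    vectorize=True,
    dask="allowed",  # permit dask-backed inputs
)
```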
Can anyone explain why the load function is needed? Note that the above applies to a subset of my example dataset, 10000 out of 593297 trajectories. If I try to apply this to the entire dataset, the first option succeeds on my desktop, but the chunked option fails with a local cluster, which I find odd...
Note that the chunked case above works only if the chunk size in the trajectory dimension is 1, that is, I think, if the function is fed 1-d array arguments. So the way forward appears to be to modify our function to handle n-d arrays.
This is now implemented in the main branch. Give it a try. And I will try running directly with Dask (without the Xarray wrapper) and see if that produces the expected result.
FWIW, I ran
@milancurcic Can you share that test code, please?
I think it should be more efficient to use Dask.bag instead of delayed due to the high number of items. See "Avoid too many tasks" in the Dask best practices.
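A sketch of the `dask.bag` idea, with one bag item per trajectory and a NumPy stand-in for `velocity_from_position`. The synchronous scheduler is used here only to keep the example deterministic; drop that argument to use Dask's default parallel scheduler:

```python
import numpy as np
import dask.bag as db

def velocity_from_position(lon, lat, time):
    # NumPy stand-in for clouddrift's function.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

# One bag item per trajectory: (lon, lat, time) tuples of 1-d arrays.
trajs = [(np.arange(10.0), np.zeros(10), np.arange(10.0)) for _ in range(8)]

bag = db.from_sequence(trajs, npartitions=4)
results = bag.map(lambda t: velocity_from_position(*t)).compute(
    scheduler="synchronous"  # deterministic for this demo
)
u = np.stack([r[0] for r in results])
v = np.stack([r[1] for r in results])
```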
Here's my original snippet:

```python
from clouddrift.analysis import velocity_from_position
import dask
import dask.distributed
import numpy as np

num_obs = 1000
num_traj = 6000

lon = np.reshape(np.tile(np.linspace(-180, 180, num_obs), num_traj), (num_traj, num_obs))
lat = np.zeros(lon.shape)
for n in range(num_traj):
    lat[n] = n / num_traj * 60  # from 0 to 60N
time = np.reshape(np.tile(np.linspace(0, 1e7, num_obs), num_traj), (num_traj, num_obs))

client = dask.distributed.Client(threads_per_worker=6, n_workers=1)
velocity_from_position_parallel = dask.delayed(velocity_from_position)
res = velocity_from_position_parallel(lon, lat, time)
u, v = res.compute()
```

Too many tasks is not the problem here; actually the opposite: I make only one function call. I'll experiment with other Dask idioms as well as Dask+Xarray.
I initially thought you were calling the function once per trajectory. Something like this could replace the calculation part:
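The suggested snippet was not preserved; presumably it was along these lines: one `dask.delayed` call per trajectory rather than a single delayed call on the whole arrays (a NumPy stand-in replaces `velocity_from_position` here):

```python
import dask
import numpy as np

def velocity_from_position(lon, lat, time):
    # NumPy stand-in for clouddrift's function.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

num_traj, num_obs = 6, 10
time = np.tile(np.arange(float(num_obs)), (num_traj, 1))
lon = time.copy()
lat = np.zeros_like(time)

# One delayed task per trajectory instead of one task for everything,
# then reassemble the per-trajectory results.
tasks = [
    dask.delayed(velocity_from_position)(lon[i], lat[i], time[i])
    for i in range(num_traj)
]
results = dask.compute(*tasks, scheduler="threads")
u = np.stack([r[0] for r in results])
v = np.stack([r[1] for r in results])
```

With many tasks, the scheduler can actually spread the work across workers, which a single delayed call cannot.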
I don't have a super large dataset to test this with, but maybe @selipot could give it a try?
You're right. I naively assumed that Dask would chunk the data for me under the hood.
With
Yes, I think this issue is old, from before
The function is executed on a different thread for each trajectory and the results are reassembled afterwards. See this section:
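A minimal sketch of that pattern using the standard library's `concurrent.futures`. This is an illustration only, not the actual clouddrift implementation, and a NumPy stand-in replaces the real function:

```python
import concurrent.futures
import numpy as np

def velocity_from_position(lon, lat, time):
    # NumPy stand-in for clouddrift's function.
    dt = np.gradient(time)
    return np.gradient(lon) / dt, np.gradient(lat) / dt

trajs = [(np.arange(8.0), np.zeros(8), np.arange(8.0)) for _ in range(5)]

# One task per trajectory; map() returns results in submission order,
# so the reassembly below preserves the trajectory order.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as ex:
    results = list(ex.map(lambda t: velocity_from_position(*t), trajs))

u = np.stack([r[0] for r in results])
v = np.stack([r[1] for r in results])
```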
We can benchmark, but my understanding is that the concurrency (different from parallelism) used there is not quite the same thing. I think this issue intended to ask how to run things truly in parallel, i.e., distribute the computation over different CPUs (whether via Dask, MPI, multiprocessing, or otherwise). I don't think we have a solution yet, but getting it to run with Dask is probably the lowest-hanging fruit among the possible approaches. (And vectorization is yet a different concept from concurrency and parallelization; it's about putting multiple array elements into a long register and running one operation on all of them.)
I am attempting to apply `velocity_from_position` to xarray.DataArrays of lon, lat, and time. I have been following a tutorial for a similar situation. With the following `ds` Dataset, I can easily do:

```python
u, v = velocity_from_position(
    ds.lon.isel(trajectory=0), ds.lat.isel(trajectory=0), ds.time.isel(trajectory=0)
)
```

or

but the following fails:

and the bottom line of the error is

So yes, I get the error, but I don't understand whether the fix is to apply `ufunc` differently or to make `velocity_from_position` more flexible.