I often have workflows that consist mostly of thousands of reads from S3 for chunks of the same size. I can see some of the variability on the Dask dashboard, but I was wondering:
Is there a way to get the task metrics programmatically, as a dataframe or similar, so I could look at the distribution, the tails of the distribution, etc.?
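As a starting point (not Coiled-specific), a minimal sketch of pulling per-task timings from a plain Dask client is the `get_task_stream` context manager, which records the same events the dashboard's task-stream plot shows. The record fields used below (`key`, `worker`, `startstops`, `action`, `start`, `stop`) follow the current `distributed` format, and `my_computation` is a hypothetical workload standing in for the S3-reading graph:

```python
import pandas as pd
from distributed import Client, get_task_stream

client = Client()  # or the client attached to your Coiled cluster

# Record per-task events while the workload runs.
with get_task_stream(plot=False) as ts:
    result = my_computation.compute()  # hypothetical S3-reading workload

records = []
for task in ts.data:
    # Each task carries one or more timed phases ("compute", "transfer", ...).
    for phase in task["startstops"]:
        records.append(
            {
                "key": task["key"],
                "worker": task["worker"],
                "action": phase["action"],
                "duration_s": phase["stop"] - phase["start"],
            }
        )

df = pd.DataFrame(records)

# Distribution and tails of compute time per task:
compute = df[df["action"] == "compute"]["duration_s"]
print(compute.describe(percentiles=[0.5, 0.9, 0.99]))
```

This only captures whole-task phases, so the S3 read time is folded into the "compute" duration rather than broken out separately.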
rsignell changed the title from "Would there be a way to get a histogram of s3 access times from the information that Coiled collects?" to "Would there be a way to programmatically get task metrics from the information that Coiled collects?" on Jul 24, 2024.
Hi, @rsignell! I'm curious about the intent behind your request. What problem(s) would you like to solve using task metrics? Since there's a plethora of possible metrics, which ones would you be interested in?
I would like to look at the variability of the time it takes to retrieve many identically-sized chunks of data from S3.
Does that mean you'd be interested in the distribution of task durations for the tasks that read your chunks, or something else? Is there a specific problem that understanding the variability would help you solve?
Yes. I'm trying to figure out how many chunks I should request per task, and it would be good to know the distribution of S3 access times within each task.
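One way to get per-read timings within a task, without relying on what Coiled or the scheduler collects, is to time the S3 reads inside the task function itself and return the durations alongside the data. This is only a sketch under assumptions: `timed_read`, `url`, and `ranges` are hypothetical names, and it assumes `fsspec` with `s3fs` installed for `s3://` URLs:

```python
import time
import fsspec

def timed_read(url, ranges):
    """Hypothetical helper: read several byte ranges from one S3 object
    and return the chunks together with per-read wall-clock durations."""
    chunks, durations = [], []
    # anon=True is an assumption for a public bucket; pass real credentials otherwise.
    with fsspec.open(url, mode="rb", anon=True) as f:
        for start, length in ranges:
            t0 = time.perf_counter()
            f.seek(start)
            chunks.append(f.read(length))
            durations.append(time.perf_counter() - t0)
    return chunks, durations
```

Mapping this over the workload (e.g. `futures = client.map(timed_read, urls, range_lists)`) and gathering the duration lists client-side gives a flat array of S3 access times per chunk, which can then be histogrammed to compare different chunks-per-task settings.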