Index issue with generating data script [discussion] #7

Open
ncclementi opened this issue Sep 2, 2022 · 4 comments

Comments

@ncclementi
Contributor

See dask/dask#8983

@jrbourbeau
Member

Thanks for bringing this up, I had forgotten about this issue. Do you have a sense for how big a problem this is for us when running these benchmarks? My guess is that if, for example, partitions are all 103 MB instead of 100 MB, that won't impact us significantly.

@ncclementi
Contributor Author

When we looked at this with Ian, we saw a 5-10% increase in memory footprint. See this comment: dask/dask#8983 (comment)
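For anyone who wants to gauge this kind of overhead on their own data, here is a rough sketch of how the index's share of a partition's memory could be measured with pandas. The helper name `index_overhead` and the toy columns are made up for illustration, and comparing a `RangeIndex` against a materialized integer index is only an assumption about where the extra bytes might come from, not a statement about the actual cause in dask/dask#8983.

```python
import numpy as np
import pandas as pd


def index_overhead(df: pd.DataFrame) -> float:
    """Return index bytes divided by column bytes for a DataFrame.

    Hypothetical helper for illustration; not part of dask or the benchmark scripts.
    """
    usage = df.memory_usage(index=True, deep=True)
    index_bytes = usage["Index"]
    data_bytes = usage.drop("Index").sum()
    return index_bytes / data_bytes


n = 100_000
rng = np.random.default_rng(0)

# Toy partition with two 8-byte columns and the default RangeIndex,
# which stores only start/stop/step rather than one value per row.
df_range = pd.DataFrame({"id": np.arange(n, dtype="int64"), "v": rng.random(n)})

# Same data, but with an index that stores one int64 per row.
df_materialized = df_range.copy()
df_materialized.index = pd.Index(np.arange(n, dtype="int64"))

print(f"RangeIndex overhead:   {index_overhead(df_range):.2%}")
print(f"Materialized overhead: {index_overhead(df_materialized):.2%}")
```

The wider the partition (more columns, or string columns), the smaller the relative cost of a per-row index, so a toy frame like this one exaggerates the effect compared with the 5-10% mentioned above.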

@jrbourbeau
Member

Okay, good to know. My question is still: does this lead to problems when running h2o benchmarks? Do we need to devote time to resolving dask/dask#8983 in order to make progress on h2o benchmarks, or is it just a related issue to be aware of (i.e. a valid bug, but not mission critical)? I think it's not a blocking issue, but I could be wrong.

@ncclementi
Contributor Author

> My question is still does this lead to problems when running h2o benchmarks?

Hard to know: we haven't compared runs with and without this problem. Things will still run, as seen in the runtimes, but we don't know how this affects the results.

> Do we need to devote time to resolving dask/dask#8983 in order to make progress on h2o benchmarks, or is it just a related issue to be aware of (i.e. a valid bug, but not mission critical)? I think it's not a blocking issue, but I could be wrong

I don't think it's a blocking issue for getting started either, but it's something to be aware of.
