
First benchmarks for kmeans #1

Merged: 37 commits into main from first_kmeans_benchmarks on Oct 11, 2023

Conversation

@fcharras (Collaborator) commented Sep 12, 2023

This sets up the file tree for benchmarking KMeans using benchopt.

TODOs:

  • enable returning the OpenCL device in the result table (pyopencl should work for all devices, at the cost of making pyopencl a mandatory dependency of the project; lspci might be an alternative). Retained approach: record the device manually when aggregating the benchmark results
  • add kmeans_dpcpp benchmark
  • add cupy Kmeans benchmark
  • sklearn intelex: use public sklearnex frontend rather than calling kmeans from daal4py in sklearnex internals
  • sklearn intelex: investigate the newer SYCL code paths for KMeans, since we may currently be using deprecated implementations (possibly a future PR)
  • run the benchmarks on all available platforms and collect the benchopt parquet benchmark data
  • use the data to display a sortable / searchable table in a GitHub text file (see https://stackoverflow.com/a/73820969). Chosen solution: use gspread to synchronize the results with a Google spreadsheet
  • expose a command to easily increment the table from a new parquet file
  • setup CI that runs all benchmarks on CPU (except those that can't run on CPU)
    • sklearn numba dpex
    • scikit-learn
    • scikit-learn-intelex
    • sklearn-pytorch-engine
    • kmeans_dpcpp
  • write a README that explains how to install the environments, how to run the benchmarks, and how to add results to the table
  • address last review suggestion
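
The "expose a command to easily increment the table from a new parquet file" item could look roughly like the sketch below (hypothetical helper, not part of this PR; assumes pandas and a parquet file produced by benchopt):

```python
import pandas as pd

def append_benchmark_results(table: pd.DataFrame, new_results: pd.DataFrame) -> pd.DataFrame:
    """Concatenate freshly collected benchopt results into the aggregated
    table, dropping exact duplicate rows so that re-running the command
    with the same parquet file is idempotent."""
    merged = pd.concat([table, new_results], ignore_index=True)
    return merged.drop_duplicates(ignore_index=True)

# Typical use with a parquet file written by `benchopt run`:
# table = append_benchmark_results(table, pd.read_parquet("outputs/results.parquet"))
```

The dedup step keeps the command safe to re-run when the same results file is passed twice.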

Currently adding:

  • PCA benchmarks

In follow-up PRs:

  • matmul benchmark ?
  • topk benchmark ?
  • bfKNN benchmark ?

@fcharras force-pushed the first_kmeans_benchmarks branch from 4dcd241 to 27bd685 on September 12, 2023 18:23
@ogrisel (Contributor) left a comment

I think the CI should run the fast variant of the benchmark for all CPU-compatible engines.

EDIT: I see there is already an item in the TODO list:

setup CI that runs all benchmarks on CPU (except those that can't run on CPU)

Other than that, LGTM!

LICENSE.txt (outdated; review thread resolved)
benchmarks/kmeans/objective.py (outdated; review thread resolved)
setup.cfg (outdated):
[flake8]
# max line length for black
max-line-length = 88
target-version = ['py37']
Contributor:

I think we can already target 3.8 or 3.9 :)

Contributor:

Also, nowadays I would tend to use ruff instead of flake8, but it's no big deal, especially on a small code base.

Collaborator (author):

I mindlessly copied files that were originally taken from sklearn when sklearn-numba-dpex was created, but sklearn has evolved since then; we should indeed update these and apply ruff on all our repos now...

@fcharras force-pushed the first_kmeans_benchmarks branch from a23f683 to a345b04 on September 13, 2023 12:52
@fcharras force-pushed the first_kmeans_benchmarks branch 2 times, most recently from 677f367 to dd4a1a3 on September 14, 2023 08:45
@fcharras force-pushed the first_kmeans_benchmarks branch 11 times, most recently from 784002f to 75407b0 on September 14, 2023 15:00
@fcharras force-pushed the first_kmeans_benchmarks branch from 75407b0 to 5d9c3a0 on September 14, 2023 17:09
@fcharras force-pushed the first_kmeans_benchmarks branch 3 times, most recently from 8ad78fd to 6f9622b on September 15, 2023 14:18
@fcharras force-pushed the first_kmeans_benchmarks branch from 115e118 to cae8cbc on September 18, 2023 15:20
for running the benchmarks from a benchmark file tree, and refer to the documentation
of the dependencies of the solvers you're interested in running to gather prerequisite
installation instructions.

@ogrisel (Contributor), Sep 19, 2023:

I think you should give one or two examples of canonical commands (adapted to the folder structure of this repo) to get started here and then refer to the benchopt doc for variations.
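
For reference, canonical invocations for this repo's layout might look like the following (a sketch assuming a standard benchopt install and the `benchmarks/kmeans` folder of this repo; solver and dataset names are illustrative):

```shell
# Install the benchmark runner.
pip install benchopt

# Run every solver/dataset combination defined in the KMeans benchmark:
benchopt run ./benchmarks/kmeans

# Restrict the run to a single solver and dataset:
benchopt run ./benchmarks/kmeans -s scikit-learn -d simulated_blobs
```

See the benchopt documentation for the full set of `benchopt run` options.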

Collaborator (author):

I added a link to the GitHub workflow that tests on CPU; I think that works better as a complete practical guide?

README.md (outdated; review threads resolved)
@fcharras force-pushed the first_kmeans_benchmarks branch from 414b7d2 to 2854cb6 on October 7, 2023 08:59
@fcharras force-pushed the first_kmeans_benchmarks branch from 2854cb6 to 24bf8c0 on October 7, 2023 22:33
@ogrisel (Contributor) left a comment

Some feedback.

Also, please include scikit-learn's Array API support with PyTorch.


# if tol == 0:
# tol = 1e-16
# self.tol = tol
Contributor:

Can be removed.

benchmarks/pca/objective.py (outdated; review thread resolved)
parameters = dict(
n_components=[10],
whiten=[False],
tol=[0.0],
Contributor:

Which solver requires a tol parameter?

@fcharras (Collaborator, author), Oct 10, 2023:

All three current solvers expose the parameter, but it is only actually used by sklearn(/sklearnex)'s arpack solver and cupy's jacobi solver.
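
For illustration, on the scikit-learn side the parameter is only consumed on the arpack path (a minimal sketch, assuming scikit-learn and numpy are installed):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))

# With svd_solver="arpack", `tol` is forwarded to scipy's sparse
# eigensolver; with svd_solver="full" it is accepted but has no effect.
pca = PCA(n_components=10, svd_solver="arpack", tol=0.0)
pca.fit(X)
```

This is why `tol=[0.0]` can sit in the shared parameter grid without affecting the full-SVD solvers.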

random_state,
verbose,
):
if self.device == "cpu":
Contributor:

I don't think cuml.decomposition.PCA can ever run on CPU. It can accept host-allocated inputs but it will do the device-allocation + copy automatically in that case.

In my opinion, let's not waste benchmark time and reporting readability to measure this: for the cuml case, I would only run the device == "gpu" case and remove the fake device == "cpu" case.
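
Following this suggestion, the cpu case could be skipped in the style of benchopt's solver skip hook (a hedged sketch with a hypothetical helper name; the exact wiring into the solver class is left out):

```python
def skip_cuml_device(device):
    """Return (skip, reason), mirroring the (bool, str) contract of a
    benchopt Solver.skip hook, for a hypothetical cuml PCA solver."""
    if device == "cpu":
        # cuml would silently copy host data to the GPU, so the "cpu"
        # measurement would not reflect a real CPU implementation.
        return True, "cuml PCA always executes on GPU"
    return False, None
```

Skipping here keeps the fake cpu case out of both the benchmark time budget and the result table.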

@fcharras (Collaborator, author), Oct 10, 2023:

(I don't really know all the ins and outs, but cuml is introducing experimental CPU/GPU device selection and PCA is compatible, so maybe at some point it could be included.)

benchmarks/pca/solvers/scikit_learn_intelex.py (outdated; review thread resolved)
benchmarks/pca/datasets/simulated_blobs.py (outdated; review thread resolved)

parameters = dict(
svd_solver=["full", "arpack", "randomized"],
power_iteration_normalizer=["QR", "LU", "none"],
Contributor:

I would rather not bench all those combinations.

For randomized we could set power_iteration_normalizer="LU" when using numpy and power_iteration_normalizer="QR" when using Array API (PyTorch).
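
The suggested pruning could be expressed with a small helper instead of benchmarking the full grid (hypothetical function name, just sketching the rule above):

```python
def pick_power_iteration_normalizer(array_namespace):
    """Pick the normalizer for the randomized SVD solver:
    "LU" for plain numpy inputs, "QR" when the data goes through
    the Array API (e.g. PyTorch tensors), per the review suggestion."""
    return "LU" if array_namespace == "numpy" else "QR"
```

This collapses the `power_iteration_normalizer` axis of the grid to a single, namespace-dependent value.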

benchmarks/pca/solvers/scikit_learn_intelex.py (outdated; three review threads resolved)
@fcharras (Collaborator, author) commented Oct 11, 2023

For KMeans, the result spreadsheet is starting to look good after the latest round of debugging.

Still missing:

  • sklearn_pytorch_engine on cpu and xpu
  • sklearn_pytorch_engine on M1 for large dataset (10_000_000 and 50_000_000)
  • benchmarks on my laptop (8 cores / iGPU)

@fcharras merged commit 88357ca into main on Oct 11, 2023
@fcharras deleted the first_kmeans_benchmarks branch on October 24, 2023 13:16