
DensMAP support #2946

Open · wants to merge 43 commits into base: main

Conversation


@keller-mark commented Mar 22, 2024

I tried to follow the feedback described in a previous PR that contributed DensMAP (#2684 (comment)), but re-implemented it on top of the current scanpy main branch.

I did not add release notes because the contributor guide says to wait for PR feedback: https://scanpy.readthedocs.io/en/latest/dev/documentation.html#adding-to-the-docs

  • Release notes not necessary because:


codecov bot commented Mar 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.47%. Comparing base (05ab694) to head (1e4ac26).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2946      +/-   ##
==========================================
+ Coverage   75.43%   75.47%   +0.04%     
==========================================
  Files         112      114       +2     
  Lines       13235    13262      +27     
==========================================
+ Hits         9984    10010      +26     
- Misses       3251     3252       +1     
Files with missing lines Coverage Δ
src/scanpy/tools/__init__.py 91.30% <ø> (ø)
src/scanpy/tools/_types.py 100.00% <100.00%> (ø)
src/scanpy/tools/_umap.py 73.52% <100.00%> (+4.17%) ⬆️

... and 2 files with indirect coverage changes

@keller-mark changed the title from “DensMAP support 2” to “DensMAP support” on Dec 10, 2024
@keller-mark (Author)

Hi, it seems the “Check if milestone or "no milestone" label is present” GH Actions check fails, but I am not able to add a PR label.

@flying-sheep (Member) left a comment

Hi, thanks for your contribution! Please add a plot test that shows sufficiently different results from umap.

# Convenience function for densMAP


def densmap(
Member

I’d rather not maintain that. Just method='densmap' should be sufficient, but if densmap = partial(umap, method='densmap') works and displays in the docs as expected, I’d consider it.

If you want to make it more visible, you could add an example to the sc.pl.umap docs and maybe even a tutorial about what it’s for and how to best use it
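A minimal sketch of the suggested `partial` pattern, using a stand-in function rather than the real `sc.tl.umap` signature (names, parameters, and the docstring here are illustrative only):

```python
from functools import partial

# Stand-in for sc.tl.umap -- not the actual scanpy API.
def umap(adata, *, method: str = "umap", min_dist: float = 0.5) -> str:
    """Embed the neighborhood graph via UMAP."""
    return f"{method} embedding (min_dist={min_dist})"

# The suggested one-liner: densmap is just umap with a different default method.
densmap = partial(umap, method="densmap")
densmap.__doc__ = "Embed the neighborhood graph via densMAP."

print(densmap("adata"))  # densmap embedding (min_dist=0.5)
```

Whether this renders acceptably under Sphinx autodoc is exactly the open question discussed in the reply below the suggestion.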

Author

I tried the partial option, but there seem to be issues with inferring the types from the parent umap function when generating the Sphinx autodoc that are not straightforward to resolve.

@flying-sheep added this to the 1.11.0 milestone Dec 16, 2024
@ilan-gold (Contributor)

@keller-mark I'm not super familiar with Azure, but I know you can manually upload an artifact. Also, have you tried locally? In the hatch.toml you can find how to create a min environment on your machine.

@ilan-gold (Contributor)

Relatedly, I wonder if there is something different between your machine and the remote. You can output the densmap results as a zarr zip store and upload that as an artifact as well.

@keller-mark (Author)

Thanks, I installed the min-versions dependencies locally in a Python 3.10 environment and was able to reproduce the issue.

The same random_state parameter value (I tried both zero and non-zero values) produces different results between the min-versions environment and all others. I tried random_state=np.random.RandomState(1) based on the numpy legacy random docs, in case it was due to numpy v1 vs v2, but it did not resolve the issue.
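For context on the seeding pattern tried above: numpy's legacy RandomState is deterministic for a fixed seed within one environment, so a check like the following always passes locally; the divergence reported here is between environments, not between runs.

```python
import numpy as np

# Same seed, same environment: the legacy generator reproduces the stream exactly.
a = np.random.RandomState(1).standard_normal(3)
b = np.random.RandomState(1).standard_normal(3)

print(np.array_equal(a, b))  # True
```

This is why seeding alone cannot make results match across the min-versions and latest-versions CI jobs.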

Please add a plot test that shows sufficiently different results from umap.

Since ultimately the plotting is just using the standard sc.pl.embedding, is a visual test really necessary here? That code path is already extensively tested. Instead, I could add a numerical (as opposed to visual) test that checks that method="densmap" produces sufficiently different results from method="umap".

@keller-mark (Author)

This test also fails, confirming the issue is not specific to DensMAP: 8f8787a

@keller-mark mentioned this pull request Jan 21, 2025
@flying-sheep (Member)

It would be interesting to figure out which dependency causes this; then you could skip the image comparison if that dependency’s version is lower than the offending version. But that could be difficult.

If you’re not up to it, we could help, or just give you some marker for “this is the min-deps job” that you could use.

@ilan-gold (Contributor)

@flying-sheep totally agree

@ilan-gold (Contributor)

ilan-gold commented Jan 22, 2025

Thanks @keller-mark, I think you're right for sure - it's not numpy-2-specific.

The following requirements.txt causes the test to fail under python 3.10:

```
anndata==0.11.3 array-api-compat==1.10.0 asciitree==0.3.3 attrs==24.3.0 click==8.1.8 cloudpickle==3.1.1 contourpy==1.3.1 coverage==7.6.10 cycler==0.12.1 dask==2024.7.1 exceptiongroup==1.2.2 fasteners==0.19 fonttools==4.55.4 fsspec==2024.12.0 h5py==3.12.1 igraph==0.11.8 imageio==2.37.0 importlib-metadata==8.6.1 iniconfig==2.0.0 joblib==1.4.2 kiwisolver==1.4.8 lazy-loader==0.4 legacy-api-wrap==1.4.1 leidenalg==0.10.2 llvmlite==0.40.1 locket==1.0.0 matplotlib==3.10.0 natsort==8.4.0 networkx==3.4.2 numba==0.57.0 numcodecs==0.13.1 numpy==1.24.4 packaging==24.2 pandas==1.5.3 partd==1.4.2 patsy==1.0.1 pbr==6.1.0 pillow==11.1.0 pluggy==1.5.0 profimp==0.1.0 pynndescent==0.5.5 pyparsing==3.2.1 pytest==8.3.4 pytest-cov==6.0.0 pytest-mock==3.14.0 pytest-nunit==1.0.7 python-dateutil==2.9.0.post0 pytz==2024.2 pyyaml==6.0.2 scikit-image==0.22.0 scikit-learn==1.5.2 scipy==1.8.1 seaborn==0.13.2 session-info2==0.1.2 setuptools==75.8.0 six==1.17.0 statsmodels==0.14.4 texttable==1.7.0 threadpoolctl==3.5.0 tifffile==2025.1.10 tomli==2.2.1 toolz==1.0.0 tqdm==4.67.1 typing-extensions==4.12.2 tzdata==2025.1 umap-learn==0.5.1 zarr==2.18.3 zipp==3.21.0 -e file:///Users/ilangold/Projects/Theis/scanpy
```
But this one:
```
anndata==0.11.3 array-api-compat==1.10.0 asciitree==0.3.3 attrs==24.3.0 click==8.1.8 cloudpickle==3.1.1 contourpy==1.3.1 coverage==7.6.10 cycler==0.12.1 dask==2024.7.1 exceptiongroup==1.2.2 fasteners==0.19 fonttools==4.55.4 fsspec==2024.12.0 h5py==3.12.1 igraph==0.11.8 imageio==2.37.0 importlib-metadata==8.6.1 iniconfig==2.0.0 joblib==1.4.2 kiwisolver==1.4.8 lazy-loader==0.4 legacy-api-wrap==1.4.1 leidenalg==0.10.2 llvmlite==0.40.1 locket==1.0.0 matplotlib==3.10.0 natsort==8.4.0 networkx==3.4.2 numba==0.57.0 numcodecs==0.13.1 numpy==1.24.4 packaging==24.2 pandas==1.5.3 partd==1.4.2 patsy==1.0.1 pbr==6.1.0 pillow==11.1.0 pluggy==1.5.0 profimp==0.1.0 pynndescent==0.5.5 pyparsing==3.2.1 pytest==8.3.4 pytest-cov==6.0.0 pytest-mock==3.14.0 pytest-nunit==1.0.7 python-dateutil==2.9.0.post0 pytz==2024.2 pyyaml==6.0.2 scikit-image==0.22.0 scikit-learn==1.5.2 scipy==1.8.1 seaborn==0.13.2 session-info2==0.1.2 setuptools==75.8.0 six==1.17.0 statsmodels==0.14.4 texttable==1.7.0 threadpoolctl==3.5.0 tifffile==2025.1.10 tomli==2.2.1 toolz==1.0.0 tqdm==4.67.1 typing-extensions==4.12.2 tzdata==2025.1 umap-learn==0.5.1 zarr==2.18.3 zipp==3.21.0 -e file:///Users/ilangold/Projects/Theis/scanpy
```
causes the tests to pass (both installed as `uv pip install -r requirements.txt`).

So I've narrowed it down a bit...will start diffing now

EDIT: sorry about the details, here are the files:

constraints_failing.txt
constraints_working.txt

@ilan-gold (Contributor)

OK, the issue is numba and llvmlite: going from the versions in the passing requirements to those in the failing requirements causes the failure, so that is likely the cause.

@ilan-gold (Contributor)

Specifically, going from numba 0.60.0 to 0.61.0 does it.

@keller-mark (Author)

Thank you very much for debugging this, @ilan-gold. Should I add a flag to skip this image comparison test with numba < 0.61.0?

@ilan-gold (Contributor)

For now, let's see if that makes things go green here; in the meantime I'll do some more digging.

@ilan-gold (Contributor)

@keller-mark Something interesting: numba 0.61.0 did not exist until a week ago, but your densmap issue predates that. So either that is not the issue, or I am misreading the history of this issue.

@ilan-gold (Contributor)

ilan-gold commented Jan 22, 2025

@keller-mark I think that, like most plotting tests in this repo (unfortunately), this test is actually erroneously passing with the latest deps (including latest numba, for reasons I don't understand), at least for me locally. Here are the actual/expected images that "pass" as densmap for me locally:

actual
expected

@ilan-gold (Contributor)

@keller-mark I'm curious whether you're seeing locally what I'm seeing (#2946 (comment)) in the actual images produced.

@keller-mark (Author)

I do see the same actual/expected locally. I still cannot find how to download the generated actual images from Azure CI (#2946 (comment)).

@ilan-gold (Contributor)

I think something along the lines of

  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: '$(Pipeline.Workspace)/s/scanpy/tests/_images'
      artifactName: drop

worked for me. Have you tried this? Maybe it's a permissions thing?

@keller-mark (Author)

Thanks @ilan-gold for getting the image artifacts working on Azure CI. I now see that the actual/expected for both umap and densmap differ there despite passing, even in the Python 3.12 environment.

I believe having easy access to DensMAP in Scanpy will be valuable to the community and would love to see this feature merged.
Based on the discussion in #3422 (comment), I think it is unreasonable to expect random results, even with the same seed, to match between my local environment and the CI environment. Therefore, I reiterate my earlier request to replace the visual test with a numerical test (#2946 (comment)). Alternatively, we could store a precomputed DensMAP result and plot that, but that would only exercise the sc.pl.embedding code path, which is already tested and untouched in this PR.

@ilan-gold (Contributor)

Since ultimately the plotting is just using the standard sc.pl.embedding, is a visual test really necessary here? That code path is already extensively tested. Instead I could add a numerical (as opposed to visual) test that checks that the method="densmap" produces sufficiently different results than method="umap"

I would be happy with a numerical test
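A minimal sketch of such a numerical test (threshold, names, and the random stand-in data are all illustrative; in the real test the two arrays would presumably be the embeddings produced with method="umap" and method="densmap"):

```python
import numpy as np

def embeddings_differ(a: np.ndarray, b: np.ndarray, min_mean_shift: float = 1e-2) -> bool:
    """True if the mean per-cell displacement between two 2-D embeddings is non-trivial."""
    return float(np.linalg.norm(a - b, axis=1).mean()) > min_mean_shift

rng = np.random.default_rng(0)
umap_like = rng.normal(size=(50, 2))                             # stand-in for UMAP coordinates
densmap_like = umap_like + rng.normal(scale=0.5, size=(50, 2))   # shifted stand-in for densMAP

print(embeddings_differ(umap_like, densmap_like))  # True
print(embeddings_differ(umap_like, umap_like))     # False
```

A threshold on mean displacement avoids pixel-level image comparison entirely, sidestepping the cross-environment reproducibility issue discussed above.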

Labels: none yet
Projects: none yet
Successfully merging this pull request may close these issues: densmap
3 participants