
Test against mindeps #1314

Merged
merged 19 commits into from
Feb 16, 2024

Conversation

@ivirshup ivirshup (Member) commented Jan 16, 2024

TODO:

  • Get all tests running

Won't do in this PR:

  • GPU support (it's going to be more difficult with the self-hosted runner, plus we have to get CUDA versions working right)
  • Fancier minimum version detection

Current problems:

  • Warnings are different in earlier versions...
  • Pandas has different results for pd.value_counts depending on version
  • Should this be collecting coverage, to check on version-bounded things?

@ivirshup ivirshup added this to the 0.10.5 milestone Jan 16, 2024

codecov bot commented Jan 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b9f346c) 85.83% compared to head (9b8816b) 85.87%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1314      +/-   ##
==========================================
+ Coverage   85.83%   85.87%   +0.03%     
==========================================
  Files          35       35              
  Lines        5584     5584              
==========================================
+ Hits         4793     4795       +2     
+ Misses        791      789       -2     
Flag Coverage Δ
gpu-tests 52.38% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files Coverage Δ
anndata/experimental/merge.py 91.28% <ø> (ø)

... and 1 file with indirect coverage changes

@ivirshup ivirshup (Member Author) commented Jan 16, 2024

This last doctest is a pain (concat_on_disk), because pandas actually changed the output for value_counts between v1 and v2.

I'd rather not just skip it, but I'm not sure there's a good solution here. Ideally we'd just ignore the lines that differ, or skip this bit only when pandas is older.

We could maybe skip doctests for minimum dependencies, since I'd expect more things like this to come up (especially with warnings), but they are very good for catching bugs.
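One hypothetical shape for "skip doctests under minimum dependencies": have the mindeps CI job set an environment variable (the name `MINDEPS` is an assumption, not something from this PR) and deselect doctest items in `conftest.py` when it is present:

```python
# Sketch of gating doctest collection on a mindeps CI environment flag.
# MINDEPS is a hypothetical variable name the mindeps job would export.
import os


def mindeps_active(environ=os.environ) -> bool:
    """True when the (hypothetical) mindeps CI job is running."""
    return bool(environ.get("MINDEPS"))


# In conftest.py, one could then skip all collected doctests, e.g.:
#
#   def pytest_collection_modifyitems(config, items):
#       if mindeps_active():
#           marker = pytest.mark.skip(reason="doctests unreliable under minimum deps")
#           for item in items:
#               if item.__class__.__name__ == "DoctestItem":
#                   item.add_marker(marker)

print(mindeps_active({"MINDEPS": "1"}))  # True
print(mindeps_active({}))  # False
```

The trade-off is exactly the one noted above: doctests catch real bugs, so a blanket skip under mindeps trades coverage for less churn.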

@@ -1350,7 +1351,7 @@ def test_concat_size_0_dim(axis, join_type, merge_strategy, shape):
FutureWarning,
match=r"The behavior of DataFrame concatenation with empty or all-NA entries is deprecated",
)
if shape[axis] == 0
if shape[axis] == 0 and Version(pd.__version__) >= Version("2.1")

Suggested change:
- if shape[axis] == 0 and Version(pd.__version__) >= Version("2.1")
+ if shape[axis] == 0 and Version(version("pandas")) >= Version("2.1")

Same as above.
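The gating pattern in the diff above boils down to a version-dependent predicate; here is a minimal sketch where `needs_warning_check` is a hypothetical helper and the version string is passed in explicitly (the real test reads it via `importlib.metadata.version("pandas")`, as the suggestion proposes):

```python
# Sketch: only expect pandas' concat FutureWarning when the installed
# pandas is new enough to emit it (2.1+); older versions get no check.
from packaging.version import Version


def needs_warning_check(shape, axis, pandas_version):
    """True when concatenating along a size-0 axis should warn (pandas >= 2.1)."""
    return shape[axis] == 0 and Version(pandas_version) >= Version("2.1")


print(needs_warning_check((0, 5), 0, "2.1.2"))  # True
print(needs_warning_check((0, 5), 0, "1.4.4"))  # False
```

At the call site, the test would wrap the assertion in `pytest.warns(FutureWarning, match=...)` when this returns True and in `contextlib.nullcontext()` otherwise.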

@flying-sheep flying-sheep (Member)

This last doctest is a pain (concat_on_disk), because pandas actually changed the output for value_counts between v1 and v2. I'd rather not just skip it, but unsure if there's a good solution here.

What's the difference in the format? Maybe we can modify it in a way that smooths over the differences, e.g. dict(sorted(x.items())) if the order is different, or something.

@ivirshup ivirshup (Member Author)

The name of the series and the name of the index are different. For example:

>>> pd.DataFrame({"a": [1, 1, 2, 3, 4, 4, 4]})["a"].value_counts()
4    3
1    2
2    1
3    1
Name: a, dtype: int64
>>> pd.__version__
'1.4.4'
>>> pd.DataFrame({"a": [1, 1, 2, 3, 4, 4, 4]})["a"].value_counts()
a
4    3
1    2
2    1
3    1
Name: count, dtype: int64
>>> pd.__version__
'2.1.2'

I don't like the idea of modifying the command just to make it reproducible, since it's meant to be an example for the user, and it's a weird thing to select as an example without the context.

If we had a different example to show, that could be useful. We could even just remove this line.

@ivirshup ivirshup marked this pull request as ready for review January 16, 2024 14:52
@flying-sheep flying-sheep (Member)

I don't like the idea of modifying the command just to get it reproducible since it's meant to be an example for the user, and it's a weird thing to select as an example without the context.

Sure, only if it makes sense. A trailing .to_dict() might!
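For illustration, the trailing `.to_dict()` smoothing would look like this; the dict contents are identical on pandas 1.x and 2.x even though the Series repr differs (as shown in the transcripts above):

```python
# The Series repr of value_counts() changed between pandas 1.x and 2.x
# (index name and Series name), but converting to a plain dict yields
# the same result on both, making the doctest output version-stable.
import pandas as pd

counts = pd.DataFrame({"a": [1, 1, 2, 3, 4, 4, 4]})["a"].value_counts()
print(counts.to_dict())  # {4: 3, 1: 2, 2: 1, 3: 1}
```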

@ivirshup ivirshup marked this pull request as draft January 16, 2024 15:09
@ivirshup ivirshup (Member Author)

I'm converting back to draft while I get this working for scanpy. I think there will be a few changes that it'd be good to get in here too.

@flying-sheep flying-sheep (Member)

OK! Thanks for the hard work figuring out the deps.

I guess we should make the script a standalone package or something if you want to reuse it?

@flying-sheep flying-sheep linked an issue Jan 18, 2024 that may be closed by this pull request
@ivirshup ivirshup modified the milestones: 0.10.5, 0.10.6 Jan 26, 2024
@ivirshup ivirshup marked this pull request as ready for review February 15, 2024 16:45
@ivirshup ivirshup enabled auto-merge (squash) February 16, 2024 14:36
@ivirshup ivirshup merged commit 0de445a into main Feb 16, 2024
16 checks passed
@ivirshup ivirshup deleted the mindeps branch February 16, 2024 14:46
meeseeksmachine pushed a commit to meeseeksmachine/anndata that referenced this pull request Feb 16, 2024
ivirshup added a commit that referenced this pull request Feb 16, 2024

Successfully merging this pull request may close these issues.

Minimum dependencies test job