Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix redundancy in test WSI image #5878

Closed
wants to merge 7 commits into from

Conversation

bhashemian
Copy link
Member

Description

The test WSI image is downloaded to a file with redundant extension .tiff.tiff and also the code has unused variables defined like basename. This PR fix that and simplify the WSI name.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).

@bhashemian bhashemian requested review from wyli and Nic-Ma January 19, 2023 17:54
@bhashemian bhashemian requested a review from ericspod January 20, 2023 01:59
@bhashemian bhashemian enabled auto-merge (squash) January 20, 2023 02:00
@Nic-Ma
Copy link
Contributor

Nic-Ma commented Jan 20, 2023

Hi @drbeh ,

May I know which line in the original code causes "redundant extension .tiff.tiff" as you described?
I didn't find it immediately.

Thanks in advance.

@bhashemian
Copy link
Member Author

Hi @drbeh ,

May I know which line in the original code causes "redundant extension .tiff.tiff" as you described? I didn't find it immediately.

Thanks in advance.

Hi @Nic-Ma, this is the line that generates extra extension:

base_name, extension = os.path.basename(f"{FILE_URL}"), ".tiff"
FILE_PATH = os.path.join(os.path.dirname(__file__), "testing_data", FILE_NAME + extension)

os.path.basename return the base part of the URL and not the base part of filename. To separate extension, one should use os.path.splitext. Since the basename already have .tiff, the second line add another extension to it.

base_name, extension = os.path.splitext(os.path.basename(f"{FILE_URL}"))

But it is not needed here as we don't need to extract the extension on the first place.

@@ -29,9 +29,8 @@

FILE_KEY = "wsi_img"
FILE_URL = testing_data_config("images", FILE_KEY, "url")
base_name, extension = os.path.basename(f"{FILE_URL}"), ".tiff"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to me the filename ending with .tiff.tiff is somewhat a better test case than a regular image_name.tiff, showing that the reader's file extension matching works fine

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something, but in this new way also the extension is explicitly set:

FILE_NAME = f"temp_{FILE_KEY}.tiff"

Copy link
Member Author

@bhashemian bhashemian Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wyli reading your comment again, I realized what you mean and it's a good point. I think a better way to do it is to explicitly mentioning it in the FILE_KEY (i.e. wsi_img.tiff instead of wsi_img) and not adding any additional extension. In this way, the developer who is writing the test knows that it is using a tiff file. Otherwise, it is not visible to the developer what the actual format of the file is unless check the file that is created in the testing_data.

I will update the PR to address your point.

Copy link
Member Author

@bhashemian bhashemian Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR. Let me know if you have any comment. Thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wyli please let me know if you have any other comment. Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the updates, I checked out the code and ran the test cases, still I don't fully understand why these code changes are needed. were you seeing some unexpected outcomes of the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wyli I believe this PR now makes improvement to two concerns:

  1. the developer who knows what image is being used (CMU-1.tiff from Carnegie Melon University) doesn't expect to deal with .tiff.tiff (the original goal of this PR).
  2. the users that just use the testing_data_config API to write unittests don't know what's the format of wsi_img is. Also WSI images are in multiple format (.tiff, .svs, .mrxs, etc.) so making it explicit about wsi_image.tiff would be helpful to avoid confusions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I think in general it's good to simulate some corner cases as the user's input to detect potential issues. if multiple format support is important, shall we focus on generating test cases in different formats (and may be combined with different variants of file extension names)?

auto-merge was automatically disabled February 9, 2023 14:48

Merge queue setting changed

@wyli wyli mentioned this pull request Apr 3, 2023
6 tasks
@bhashemian
Copy link
Member Author

closing in favor of #6244.

@bhashemian bhashemian closed this Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants