-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tool for analysing images using Bioimage AI models #1391
Conversation
</command> | ||
<inputs> | ||
<param name="input_imaging_model" type="data" format="zip" label="BioImage model" help="Please load a BioImage model from file uploader"/> | ||
<param name="input_image_file" type="data" format="npz" label="Image to be analysed" help="Please provide an image"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NPZ is numpy isn't it? How do people create such image? Should we create an NPZ file from a tiff, png as part of this tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, NPZ is a numpy zipped file, I think. To show images as PNG or Tiff, we have to reduce the dimensionality of the original multi-dimensional image. Sometimes, these are 3 or 4 dimensional, containing several channels. To display as PNG, I am taking only the first channels where we lose some information from images. But my idea is to save these originally predicted matrices with all dimensions. I also need to think about taking PNG and Tiff as inputs and producing the original matrix along side displayable formats such as PNG, Tiff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@beatrizserrano do you know how these NPY input/output files are generated to be used with bioimage models (e.g. https://bioimage.io/#/?id=10.5281%2Fzenodo.5764892)? Do we use some Galaxy tool to generate it? I tried to use TIFF files as input to a few models, but they do not contain a pixel matrix of images in the correct dimensions for the model to take as input. However, NPY files work fine with the models but could be tricky for the end users of this tool to generate such test/input NPY files. Or do you know someone who worked on such models?
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chiming in on this on @beatrizserrano 's request:
I'm unaware of the context of this PR, but can report from the bioimage.io side: We demand .npy
test files to be provided alongside every model contributed to bioimage.io. Among the utility functions we provide in bioimageio.core there is
https://github.com/bioimage-io/core-bioimage-io-python/blob/f798344213c179a6c836938aff9e4f6c46f8d23c/bioimageio/core/utils/image_helper.py#L136
-- a util function using imageio to attempt to load any given image and then attempting to guess it's axes, see
https://github.com/bioimage-io/core-bioimage-io-python/blob/f798344213c179a6c836938aff9e4f6c46f8d23c/bioimageio/core/utils/image_helper.py#L42
Software using the model zoo (or even specifically bioimageio.core) should try to delegate this guesswork to the end user instead.
note: the links above are from a development branch. These updated functions will be available in the coming bioimageio.core release, currently this functionality is implemented here
Fix suggestion in name Co-authored-by: Leonid Kostrykin <[email protected]>
Fix suggestion to include edam annotation Co-authored-by: Leonid Kostrykin <[email protected]>
@anuprulez is this still WIP or can we merge it. @beatrizserrano needs it :) |
Guys, my impression of the comments above is that there has been some confusion. So in order to clear things up, I want to try to summarize the main concerns: Concern 1: Support of TIFF/PNG@bgruening was addressing the input file format, which currently is only NPZ. For those people who are the target audience of this tool, this isn't a well-established standard file format such as TIFF or PNG. However, @anuprulez explained the conversion to TIFF or PNG of the output files, pointing out that it is not straightforward. Lets consider the questions of the input file formats and the output file formats separately. 1.1) Inputs: The NPY/NPZ formats are more general than TIFF and PNG, since they can store arbitrary numpy arrays, with arbitrary data types (even mixed) and number of dimensions. Thus, conversion from TIFF/PNG to NPY/NPZ should be straightforward. Given the concerns above, I think the tool wrapper really should also accept TIFF and PNG input files and do this conversion automatically. This should be as simple as something like this: im = imread('tiff or png image file path')
np.save('input.npy', im) # to produce an NPY
np.savez('input.npz', im) # to produce an NPZ 1.2) Outputs: It was pointed out that the intention of outputting the originally predicted arrays as NPZ was that no information is lost. However, I'm wondering whether converting the data to TIFF is even capable of losing information? My gut feeling is that using TIFF as the output file format should be safe, at least as long as we restrict the wrapper to NPY instead of NPZ (see below). Concern 2: NPY or NPZ?Please keep in mind that NPY is "the standard binary file format in NumPy for persisting a single arbitrary NumPy array on disk", and NPZ is "the standard format for persisting multiple NumPy arrays on disk", which can be compressed, but not necessarily (docs). The key difference here is that NPY is for single arrays, NPZ is for multiple arrays. At this point I'm somewhat confused. I hope that @FynnBe can maybe add something regarding the following two concerns: 2.1) What happens if the NPZ contains more than one array? Do the bioimage.io models process them independently and yield another NPZ, which contains predictions for each array in the NPZ? Or do the models bluntly fail to process an NPZ with more than one array? 2.2.) As far as I understand the comment made by @FynnBe (link), the bioimage.io models are only guaranteed to work with NPY inputs. Doesn't that mean, that the wrapper should actually take/produce NPY instead of NPZ? If this is right, then we should forget about NPZ and restrict our discussions here to PNG, TIFF, and NPY. |
I was on holidays. I will have a look at it this week. Can you let me know when you require this feature, I can prioritise it @beatrizserrano thanks! @kostrykin thank you for the clarification. |
Thank you @anuprulez! 🤗 We want to develop tutorials at the Biohackathon in November, so it would be perfect to be able to test the tool well before that. Thanks again! |
@kostrykin thank you for the explanation of NPY and NPZ. I would like to elaborate the issue of using TIFF or PNG files as input. When these files are read (using OpenCV), they are represented as 2 dimensional image. Example from Neuron Segmentation in EM model TIFF input file shape: However, the NPY file has shape When I use the TIFF file as the test input for prediction, I receive the following error:
The model expects an input with 5 dimensions but the TIFF file is only 2 dimensional. For other models, the input shape varies - sometimes it is 3,4 or 5. When I use the NPY file as input, it works fine and the model predicts the segmentations. As suggested by @FynnBe, I tried to use The issue is to find a way to transform TIFF or PNG images to be represented in the valid dimensions so that the respective AI model can consume the input files and produce desired output. Any suggestions here would be really helpful! Thank you! |
Thanks for the follow-up @anuprulez I see the challenge that comes with conversion of PNG/TIFF to NPY. However, by accepting only NPY and dropping support for PNG/TIFF, we would simply shift that challenge to the user of the tool, which IMO we shouldn't do. The core issue is to determine the expected number of dimensions of the input, the conversion then simply is some reshaping, right? I see a couple of options here, in descending order of preference: Option 1. My suspicion is that the only way to get the dimensions of the TIFF right automatically is to rely on what the model declares as the required input dimension — if it does declare anything. You do Option 2. You quoted the error "RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 1, 3, 3, 3], but got 2-dimensional input of size [360, 360] instead", where is that raised from? By looking into the code for where this check is being performed, we might be able to reverse-engineer how the expected dimension of the models is stored. Option 3. In the worst-case (i.e. if Option 2 leads to no solution for some reason), I'd fallback to the documentation, like: Point the user to the documentation of the models, where, hopefully, the required number of dimensions is stated. Add an input field to the tool UI, where the user has to write down the number that was looked up from the model documentation. |
@kostrykin thank you for the interesting pointers. I will explore how to get the input shape of the models dynamically. |
Fix review comments Co-authored-by: Beatriz Serrano-Solano <[email protected]> Co-authored-by: Leonid Kostrykin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much thanks for adding the PNG support and everything!
Just very few comments left, sorry for being a bit picky maybe.
<token name="@VERSION_SUFFIX@">0</token> | ||
</macros> | ||
<creator> | ||
<organization name="BioImage.IO" url="https://bioimage.io/#/" email="" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you affiliated with BioImage.IO? @anuprulez Otherwise I wouldn't list them as a creator of this tool wrapper.
<param name="input_image_file" type="data" format="tiff,png" label="Input image" help="Please provide an input image for the analysis."/> | ||
<param name="input_image_input_size" type="text" label="Shape of the input image" help="Provide shape of input image. See chosen model's RDF file to find correct input shape. For example: for the BioImage.IO model MitochondriaEMSegmentationBoundaryModel, the input shape is 256 x 256 x 32 x 1. Enter the shape as '256,256,32,1'."/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix indentation, replace x
by ⨉
multiplication symbol, add some missing articles:
<param name="input_image_file" type="data" format="tiff,png" label="Input image" help="Please provide an input image for the analysis."/> | |
<param name="input_image_input_size" type="text" label="Shape of the input image" help="Provide shape of input image. See chosen model's RDF file to find correct input shape. For example: for the BioImage.IO model MitochondriaEMSegmentationBoundaryModel, the input shape is 256 x 256 x 32 x 1. Enter the shape as '256,256,32,1'."/> | |
<param name="input_image_file" type="data" format="tiff,png" label="Input image" help="Please provide an input image for the analysis."/> | |
<param name="input_image_input_size" type="text" label="Shape of the input image" help="Provide the shape of the input image. See the chosen model's RDF file to find the correct input shape. For example, for the BioImage.IO model MitochondriaEMSegmentationBoundaryModel, the input shape is 256 ⨉ 256 ⨉ 32 ⨉ 1. Enter the shape as '256,256,32,1'."/> |
I also believe that avoiding the genitive s is a good practice in technical writing, since it helps keeping things clearly structured, and the apostrophe can easily be misperceived as a quotation mark, so I suggested it in my previous review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I replaced the x
symbol by *
. For this sentence Enter the shape as '256,256,32,1'
, I have left out any quotation or symbols to avoid confusion. It should be entered just like any other text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anuprulez I just looked into the RDF file because I thought it would be good to use their notation and saw that they use the x
notation :) much sorry! I didn't know that you sticked to the RDF notation… so actually using the x
notation which you used originally, then is the best option, I think.
Fix review comments Co-authored-by: Leonid Kostrykin <[email protected]>
Fix review comments Co-authored-by: Leonid Kostrykin <[email protected]>
Add missing articles and replace shape by size Co-authored-by: Leonid Kostrykin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @anuprulez!
<creator> | ||
<organization name="BioImage.IO" url="https://bioimage.io/#/" email="" /> | ||
<person name="Beatriz Serrano-Solano" email="" /> | ||
<person name="Leonid Kostrykin" email="[email protected]" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anuprulez you forgot yourself here.
@beatrizserrano your email?
I would probably use <person givenName="" familyName="" email="" />
</xrefs> | ||
<requirements> | ||
<requirement type="package" version="3.9.12">python</requirement> | ||
<requirement type="package" version="2.3.1">pytorch</requirement> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<requirement type="package" version="2.3.1">pytorch</requirement> | |
<requirement type="package" version="@TOOL_VERSION@">pytorch</requirement> |
was that the intention behind the version number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I actually forgot to add this in my review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think tool's version should not replace Pytorch's version. Something else changes in the tool (making the tool's version to change) while Pytorch remains that same - could be problematic in future. Since, this is the first tool, I would keep the tool's version as 1.0.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wrapper is rather thin and is mostly "just" a wrapper for torch.load
and the invocation of the model (the rest is pre- and post-processing). This suggests that the wrapper should have the same version as the wrapped tool, PyTorch in this case. If the wrapper changes in the future, you would usually increment @VERSION_SUFFIX@
. This is also what I had suggested in #1391 (comment) and it is more in line with IUC recommendations https://galaxy-iuc-standards.readthedocs.io/en/latest/best_practices/tool_xml.html#tool-versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I misinterpreted the previous comment. It's restored based on your comment. Thanks!!
Fix review Co-authored-by: Björn Grüning <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor consistency issue
Co-authored-by: Leonid Kostrykin <[email protected]>
Thanks everyone! |
Thanks @bgruening for merging. Thank you very much @kostrykin for spending so much time reviewing it |
Test files:
https://bioimage.io/#/?id=10.5281%2Fzenodo.5764892
https://zenodo.org/api/records/6647674
Remaining tasks: