added the IO functions removed from PR #141 #150
base: main
Conversation
pyirf/io/gadf.py (outdated)

```python
for this_file in files:
    # [0] because there the IRFs are written as a single row of the table
    irfs_all.append(QTable.read(this_file, hdu=extname)[field_name][0])
```
The transpose is missing here; this is why you have to do non-trivial transposes in edisp.

This goes back to the point that I really don't understand why we need this read_irf_grid function. Simple functions that read a table once and return the arrays would suffice, and stacking can then just be `edisp = np.stack(edisps)`.
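A minimal sketch of that simpler approach (the function name, the `MATRIX` column, and the HDU name are assumptions for illustration, not actual pyirf API):

```python
import numpy as np
from astropy.table import QTable


def read_energy_dispersion(path, extname="ENERGY DISPERSION"):
    # hypothetical per-file reader: one read, one array, plus the metadata
    table = QTable.read(path, hdu=extname)
    # GADF stores the IRF as a single table row, hence the [0]
    return table["MATRIX"][0].T, table.meta


# one call per file, combination is then a one-liner
edisps, metas = zip(*(read_energy_dispersion(f) for f in files))
edisp = np.stack(edisps)
```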
Also, keeping the table around is good, since it contains the metadata. So this function is just not as general as you might think it is.
I switched to np.stack.
Please note @maxnoe that the function also performs the required transposition that is needed for all the IRFs and extracts the 0-th row, i.e. all the operations needed to convert between the FITS format and the one used inside pyirf. And it works fine on either individual files or lists of files, so I think it is very useful and it makes the custom functions that read individual IRFs simpler.
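For reference, the operations described above amount to roughly the following (a sketch only; the actual read_irf_grid in this PR may differ in details):

```python
import numpy as np
from astropy.table import QTable


def read_irf_grid(files, extname, field_name):
    # accept a single file name as well as a list of files
    if isinstance(files, str):
        files = [files]
    irfs_all = []
    for this_file in files:
        # [0]: the IRF is written as a single row of the table;
        # .T: FITS axis order is the reverse of the one used inside pyirf
        irfs_all.append(QTable.read(this_file, hdu=extname)[field_name][0].T)
    return np.stack(irfs_all)
```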
I agree that the table includes additional important information via its metadata, but I think it would be more useful to extract this metadata and return the concrete information. Nevertheless, first we would need to agree (in #126) what information should be put there, with what names, etc.
> Nevertheless, first we would need to agree (in #126) what information should be put there, with what names, etc.
Actually, no. Since pyirf itself will at no point use this metadata, users of pyirf can fill and read any metadata they like and use it for whatever purpose.
That's also the problem with this read_irf_grid function: you would have to open all the files again to also read the metadata.

So, please, write functions that read each IRF type, validating that it is the correct type by looking at the HDUCLASX header keywords and also returning the metadata. These functions can then easily be called for multiple files and the results stacked / compared.

I think the advantages of this far outweigh the little code repetition you avoid by using this read_irf_grid function.
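Such a per-type reader could look roughly like this (a sketch: the HDUCLAS values follow the GADF conventions, while the column and HDU names are assumptions):

```python
from astropy.table import QTable


def read_effective_area(path, extname="EFFECTIVE AREA"):
    table = QTable.read(path, hdu=extname)
    # validate that this HDU really contains an effective area
    expected = {"HDUCLAS1": "RESPONSE", "HDUCLAS2": "EFF_AREA"}
    for key, value in expected.items():
        if table.meta.get(key) != value:
            raise ValueError(
                f"{path}[{extname}]: expected {key}={value!r}, got {table.meta.get(key)!r}"
            )
    return table["EFFAREA"][0].T, table.meta
```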
> > Nevertheless, first we would need to agree (in #126) what information should be put there, with what names, etc.
>
> Actually, no. Since pyirf itself will at no point use this metadata, users of pyirf can fill and read any metadata they like and use it for whatever purpose.
Well, I would not be so sure about this. If we really have standardized metadata, we can use it inside pyirf, e.g. to do standardized interpolation over those values.
> That's also the problem with this read_irf_grid function: you would have to open all the files again to also read the metadata.
As I explained above, the metadata processing can also be added to this function. Please note that the metadata are likely the same for all types of IRFs (no matter if you read Aeff or the energy migration matrix, you still want to know at which zenith it was produced), so it makes perfect sense to read them in a standardized way.
> So, please, write functions that read each IRF type, validating that it is the correct type by looking at the HDUCLASX header keywords and also returning the metadata.
I just implemented checking of all the HDUCLASX headers.
Again, because one universal function is used, this was pretty easy, especially since many of those fields are the same for different IRFs.
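For illustration, the shared part of such a check could be factored out like this (the function name and the exact keyword set are assumptions, not the PR's actual code; the keyword values follow the GADF conventions):

```python
def check_hduclas(meta, hduclas2, hduclas4, path=""):
    # common GADF response keywords, plus the per-IRF-type parts
    # that each wrapper function passes in
    expected = {
        "HDUCLASS": "GADF",
        "HDUCLAS1": "RESPONSE",
        "HDUCLAS2": hduclas2,
        "HDUCLAS4": hduclas4,
    }
    for key, value in expected.items():
        if meta.get(key) != value:
            raise ValueError(f"{path}: expected {key}={value!r}, got {meta.get(key)!r}")
```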
> These functions can then easily be called for multiple files and the results stacked / compared. I think the advantages of this far outweigh the little code repetition you avoid by using this read_irf_grid function.
With everything that we discuss in this thread, the amount of code repetition in the approach you suggested would grow larger and larger. Already now we have three issues that are repeated for all IRFs:
- swapping of the axis sequence
- checks of the HDUCLASX headers
- combining IRFs produced for different parameters

And we have two IRFs already implemented, plus a few more that can be added easily. So in the end I think we are avoiding a lot of code repetition.
modified the tests of the reading functions so that they do not use actual data
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #150      +/-   ##
==========================================
+ Coverage   89.04%   89.72%   +0.68%
==========================================
  Files          42       42
  Lines        1579     1713     +134
==========================================
+ Hits         1406     1537     +131
- Misses        173      176       +3
```

Continue to review the full report at Codecov.
Hi @maxnoe, coming back to this old story of the I/O classes: I implemented your comments back in May. We have a difference of opinion about the code repetition / generic methods; please check my latest replies from May, and let's see if we can converge and merge the I/O classes.
@jsitarek I think we should revisit this in a more global refactoring, now that gammapy was chosen as the science tool for CTA. I think it makes sense to use the gammapy data structures inside pyirf; that will make many things a lot easier. As for the IO parts, LST is already using gammapy, so you should also just use gammapy to load the IRFs.
Hi @maxnoe, ok, so what is the proposal? Do you want to close the IO PR, or leave it open but without updates for the time being? As we were discussing some months ago, all those IRF developments were urgent in LST, hence the IO was implemented separately, but if the IO functions were inside pyirf, lstchain would definitely profit from them.
Since #141 is already merged, as announced there, I moved the IO functions into this separate PR.

The summary of the discussion with @maxnoe in the previous PR was that Max was not in favor of having functions that read arrays of IRFs, while I think this is very useful: the reading is not trivial, as some transformations have to be done (separate lo/high bins vs a single list of bin edges, change of the sequence of axes), and they are common to all the IRFs, so if not implemented this way they would result in quite some code repetition. There are two generic functions in this PR, read_irf_grid and read_fits_bins_lo_hi, which read the actual IRFs (of whatever type, including the transformations) and the bin values (including a consistency check), plus two wrapper functions for Aeff and the migration matrix.
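As an illustration of the lo/high-bin transformation mentioned above, a sketch of what read_fits_bins_lo_hi could do (details of the actual function in this PR may differ):

```python
import numpy as np


def read_fits_bins_lo_hi(table, col_name):
    # GADF stores bins as two columns of lower and upper edges,
    # each written as a single table row, hence the [0]
    lo = table[col_name + "_LO"][0]
    hi = table[col_name + "_HI"][0]
    # consistency check: the bins must be contiguous
    if not np.allclose(lo[1:], hi[:-1]):
        raise ValueError(f"{col_name}: LO/HI bins are not contiguous")
    # merge into a single list of bin edges
    return np.append(lo, hi[-1])
```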
Please let me know about any modifications to those functions that you would like.