-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support under-/overflows in YODA2 writer #56
Conversation
Hi Graeme, apologies for the delay, I had a few more urgent deadlines squeezing in, but I've finally managed to take a look at this. This changes set ensures that under/overflow bins are supported with the default YODA writer. For the YODA1 writer this is less trivial: In Rivet we'd work out the binning from the reference-data objects, so adding two more "dummy points" to the Scatters will likely cause mayhem. In the YODA1 series we would also typically drop the under/overflow bins from the MC histograms when converting them to Scatters, so not adding the overflows to the reference data seems more consistent with that, even though we'd be removing "good data points". This is what I've done in this PR now. Hopefully an additional incentive to upgrade to the Rivet4/YODA series where this is properly supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is great! After I merge this PR, I'll look into making similar changes for the CSV and ROOT writers, but I've also got other things to work on first, so it might take a while.
I made some separate comments about the naming of the new test functions and that it would be good to reduce the size of the test YAML data file. Looking at the YODA2 output, I don't see any explicit labels for the underflow/overflow bins, so how does the YODA code identify them? Does it just count the number of edges (n_e) and values (n_v) and assume that the first and last bins are underflow and overflow if n_v = n_e + 1? Does this mean that the case of an overflow bin without an underflow bin (or vice versa) is not supported (i.e. n_v = n_e)? If so, I think the YODA2 code needs to handle this case in the conversion from YAML, perhaps by just omitting the single underflow/overflow bin as in the YODA1 case or maybe by writing a value of zero for the missing underflow/overflow bin.
hepdata_converter/testsuite/testdata/yaml_inf/figure-aux_001.yaml
Outdated
Show resolved
Hide resolved
Hi Graeme, continuous axes in YODA2 will always range from -inf to +inf and the list of edges that define the visible range will then partition the infinitely long axis. As a result a continuous axis will always have an under- and overflow bins, irrespective of whether these are needed in the end, so the cases you're concerned about are implicitly supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to nitpick, but there is still an issue with the test function naming (see separate comments).
Also, I don't think you understood my previous comment, so let me be more explicit with a minimal example. Consider two YAML data files with only two bins:
- An underflow bin and a finite bin (0-100):
independent_variables:
- header:
name: x
values:
- low: -.inf
high: 0.0
- low: 0.0
high: 100.0
dependent_variables:
- header:
name: data
values:
- value: 0.1
- value: 0.2
- A finite bin (0-100) and an overflow bin.
independent_variables:
- header:
name: x
values:
- low: 0.0
high: 100.0
- low: 100.0
high: .inf
dependent_variables:
- header:
name: data
values:
- value: 0.1
- value: 0.2
I checked that both these examples give the same YODA output with the current code. I think it is a legitimate use case to give an overflow bin without an underflow bin (or vice versa) in the YAML format, so the YODA output would need to write a zero value
(with no errors
) for the missing underflow/overflow bin to remove the ambiguity.
Hi Graeme, I'm afraid I'm still a bit confused as to what the issue actually is, but it's probably just me. :-) Let me try elaborate on my original response: All continuous axes in YODA will always come with an underflow and overflow whether they're needed or not. For the reference data we use
Here, there first one corresponds to an object with underflow but no overflow (and hence the |
* test_parse_no_independent is more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I must have made a mistake when checking the YODA output yesterday. Checking again more carefully, the two YAML examples from my previous message give the following YODA output:
- An underflow bin and a finite bin (0-100):
BEGIN YODA_ESTIMATE1D_V3 /REF/RIVET_ANALYSIS_NAME/d01-x01-y01
IsRef: 1
Path: /REF/RIVET_ANALYSIS_NAME/d01-x01-y01
Title: "Figure-aux_001"
Type: Estimate1D
---
Edges(A1): [0.000000e+00, 1.000000e+02]
# value
1.000000e-01
2.000000e-01
nan
END YODA_ESTIMATE1D_V3
- A finite bin (0-100) and an overflow bin.
BEGIN YODA_ESTIMATE1D_V3 /REF/RIVET_ANALYSIS_NAME/d01-x01-y01
IsRef: 1
Path: /REF/RIVET_ANALYSIS_NAME/d01-x01-y01
Title: "Figure-aux_001"
Type: Estimate1D
---
Edges(A1): [0.000000e+00, 1.000000e+02]
# value
nan
1.000000e-01
2.000000e-01
END YODA_ESTIMATE1D_V3
So the nan
value is in a different position for the two cases as you kindly explained. Sorry, the misunderstanding was on my side!
I'll merge now. Many thanks again for your contribution!
Add support for under/overflow bins in YODA writer + unit test.
Skips points with infinity edges in YODA1 writer.
Needed for #55