-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RDataFrame to the primer #724
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build/724/primer/index.html:3417:<a href="https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html#parallel-execution">in the RDataFrame documentation</a>.</p>
build/724/primer/index.html:3430:<a href="https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html#python">in the RDataFrame documentation</a>.</p>
Found 2 links to root.cern.ch. Please change them to link to root.cern (no '.ch') instead.
I see the NTuple paragraph showing how to use TNtuple has been completely removed. I understand we want to remove the PROOF part (as PROOF is now old software and has been declared as such). But TNtuple has not been marked as legacy code and many tutorials are using it. Removing it from the primer is a "political" (I do not have better word) choice, which is perfectly understandable. I think it is fair this choice if done by the new young generation of the ROOT developers and not by an old developper like me. So I approve this PR. |
Also, the original markdown version of the primer is still in the ROOT repository. It allows to generate printable versions of the primer (pdf, epub ...). With this PR this markdown primer will not be in sync with the html one in the ROOT web site. My question is: Should we completely remove this markdown primer for the ROOT repo ? This PR removes it: root-project/root#9619 |
One more comment: Original Authors: D. Piparo, G. Quast, M. Zeise Given the major changes this PR provides, I guess it is worth mentioning the new authors (Ivan Enrico Vincenzo) |
There is also the version of the Primer (an "interactive one") which is also tout of sync. Should it be removed also ? |
You tell us 😬 |
I made two PRs to remove the Markdown version And, yes, I will be in favour to remove the "interactive one" also (need to check first from where it is accessible) |
4f2b280
to
576a767
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; just some minor comments.
primer/index.md
Outdated
RDataFrame reads collections as the special type | ||
[ROOT::RVec](https://root.cern/doc/master/classROOT_1_1VecOps_1_1RVec.html): | ||
for example, a column where each element is an array of floating point numbers | ||
can be read as a ROOT::RVecF. C-style arrays (with variable or static size), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we talked about the RVec
type (and not RVecF
), I would prefer to stick to that, i.e.
can be read as a ROOT::RVecF. C-style arrays (with variable or static size), | |
can be read as a `ROOT::VecOps::RVec<float>`. C-style arrays (with variable or static size), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should advertise the short aliases though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I can write a short sentence describing the existence of the aliases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a small sentence there. If it looks ok this can be merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to figure out what to do with the other versions of the primer before merging
Do we have a chance to keep a notebook version of the primer, maybe even sort of automatic? Or do yous think that it's not worth it? |
I think it will be much simpler to maintain only one version of the Primer. We have plenty of tutorials in Notebook format. The Primer is really a manual. I am not sure it makes sense to have it as a Notebook. Moreover, this PR highlight the fact the Primer is something important to have, it evolves, it is not frozen. Several versions will be a nightmare to maintain. I would go toward simplification. |
How so - it seems to have lot of code? My question isn't whether we can re-introduce duplication but whether we should really drop the notebook. I.e. can we either generate the notebook from the single source, or should the notebook be the single source? |
Yes but in the "Manual" and in the "reference Guide"there is a lot of code too. I must admit I do not really know how the Notebook is generated. I need to check. I do not even know where it can seen. I try to find it online yesterday without success. |
This commit adds RDataFrame content to the ROOT primer. It also removes obsolete content about TNTuple, TSelector and PROOF Lite to discuss data analysis in terms of RDataFrame. Parts of the old content that were referring to the underlying data format have been kept and reworded under a new section about the ROOT dataset Co-authored-by: Ivan Kabadzhov <[email protected]> Co-authored-by: Enrico Guiraud <[email protected]>
576a767
to
518fe39
Compare
Should this go in: https://github.com/root-project/NotebookPrimer ? |
This commit adds RDataFrame content to the ROOT primer. It also removes obsolete content about TNTuple, TSelector and PROOF Lite to discuss data analysis in terms of RDataFrame. Parts of the old content that were referring to the underlying data format have been kept and reworded under a new section about the ROOT dataset
Co-authored-by: Ivan Kabadzhov [email protected]
Co-authored-by: Enrico Guiraud [email protected]