Add RDataFrame to the primer #724

vepadulano · 2022-01-18T17:45:56Z

This commit adds RDataFrame content to the ROOT primer. It also removes obsolete content about TNtuple, TSelector and PROOF Lite to discuss data analysis in terms of RDataFrame. Parts of the old content that referred to the underlying data format have been kept and reworded under a new section about the ROOT dataset.

Co-authored-by: Ivan Kabadzhov [email protected]
Co-authored-by: Enrico Guiraud [email protected]

eguiraud

build/724/primer/index.html:3417:<a href="https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html#parallel-execution">in the RDataFrame documentation</a>.</p>
build/724/primer/index.html:3430:<a href="https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html#python">in the RDataFrame documentation</a>.</p>

Found 2 links to root.cern.ch. Please change them to link to root.cern (no '.ch') instead.
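
The requested change can be sketched as a small search-and-replace. This is only an illustration, not part of the ROOT website tooling: the names `LEGACY_HOST` and `fix_root_links` are hypothetical, and the sketch assumes the links are plain `http(s)://root.cern.ch/...` URLs like the two reported above.

```python
import re

# Hypothetical helper, not part of the ROOT web tooling.
# Matches the legacy host only when it is the complete host name of a URL,
# i.e. when it is followed by a path, query, fragment, quote, whitespace or end of text.
LEGACY_HOST = re.compile(r"(https?://)root\.cern\.ch(?=[/?#'\"\s]|$)")


def fix_root_links(text: str) -> str:
    """Rewrite links to root.cern.ch so that they point at root.cern instead."""
    return LEGACY_HOST.sub(r"\1root.cern", text)


if __name__ == "__main__":
    line = (
        '<a href="https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html#python">'
        "in the RDataFrame documentation</a>"
    )
    print(fix_root_links(line))
```

Links that already use `root.cern` are left untouched, so the helper can be run repeatedly over the same file.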

couet · 2022-01-19T09:06:19Z

I see the NTuple paragraph showing how to use TNtuple has been completely removed. I understand we want to remove the PROOF part (as PROOF is now old software and has been declared as such). But TNtuple has not been marked as legacy code and many tutorials are using it.

Removing it from the primer is a "political" (I do not have a better word) choice, which is perfectly understandable. I think it is fair that this choice is made by the new, young generation of ROOT developers and not by an old developer like me. So I approve this PR.

couet · 2022-01-19T09:30:19Z

Also, the original Markdown version of the primer is still in the ROOT repository. It allows generating printable versions of the primer (PDF, EPUB ...). With this PR, this Markdown primer will no longer be in sync with the HTML one on the ROOT website.

My question is: should we completely remove this Markdown primer from the ROOT repo?

This PR removes it: root-project/root#9619
This PR removes the Primer build: root-project/rootspi#108

couet · 2022-01-19T09:46:18Z

One more comment:
At the top of the html Primer we have:

Original Authors: D. Piparo, G. Quast, M. Zeise

Given the major changes this PR provides, I guess it is worth mentioning the new authors (Ivan, Enrico, Vincenzo).

couet · 2022-01-19T16:05:17Z

There is also the version of the Primer (an "interactive one"), which is also out of sync. Should it also be removed?
https://github.com/root-project/rootspi/tree/master/ROOT-Primer

eguiraud · 2022-01-19T16:15:53Z

> My question is: should we completely remove this Markdown primer from the ROOT repo?

> There is also the version of the Primer (an "interactive one"), which is also out of sync. Should it also be removed?

You tell us 😬

couet · 2022-01-19T17:14:34Z

> You tell us

I made two PRs to remove the Markdown version.

And yes, I would also be in favour of removing the "interactive one" (I first need to check where it is accessible from).

jalopezg-git

LGTM; just some minor comments.

primer/index.md

jalopezg-git · 2022-01-20T11:49:10Z

+RDataFrame reads collections as the special type
+[ROOT::RVec](https://root.cern/doc/master/classROOT_1_1VecOps_1_1RVec.html):
+for example, a column where each element is an array of floating point numbers
+can be read as a ROOT::RVecF. C-style arrays (with variable or static size),

Given that we talked about the RVec type (and not RVecF), I would prefer to stick to that, i.e.

Suggested change

-can be read as a ROOT::RVecF. C-style arrays (with variable or static size),
+can be read as a `ROOT::VecOps::RVec<float>`. C-style arrays (with variable or static size),

eguiraud · 2022-01-20

We should advertise the short aliases, though.

vepadulano · 2022-01-21

Maybe I can write a short sentence describing the existence of the aliases.

vepadulano · 2022-01-21

I've added a small sentence there. If it looks OK, this can be merged.

eguiraud · 2022-01-21

I'd like to figure out what to do with the other versions of the primer before merging.

Axel-Naumann · 2022-01-20T17:04:58Z

Do we have a chance to keep a notebook version of the primer, maybe even an automatically generated one? Or do you think that it's not worth it?

couet · 2022-01-21T07:34:20Z

I think it will be much simpler to maintain only one version of the Primer. We have plenty of tutorials in notebook format. The Primer is really a manual. I am not sure it makes sense to have it as a notebook. Moreover, this PR highlights the fact that the Primer is something important to have: it evolves, it is not frozen. Several versions would be a nightmare to maintain. I would go toward simplification.

Axel-Naumann · 2022-01-21T07:42:48Z

> The Primer is really a manual.

How so? It seems to have a lot of code.

My question isn't whether we can re-introduce duplication but whether we should really drop the notebook. I.e. can we either generate the notebook from the single source, or should the notebook be the single source?

couet · 2022-01-21T08:00:50Z

> How so? It seems to have a lot of code.

Yes, but in the "Manual" and in the "Reference Guide" there is a lot of code too.

I must admit I do not really know how the notebook is generated. I need to check. I do not even know where it can be seen. I tried to find it online yesterday, without success.

couet · 2022-03-07T15:04:39Z

Should this go in: https://github.com/root-project/NotebookPrimer ?

vepadulano self-assigned this Jan 18, 2022

eguiraud requested changes Jan 19, 2022

eguiraud requested review from Axel-Naumann and couet January 19, 2022 08:34

couet requested a review from eguiraud January 19, 2022 08:52

couet approved these changes Jan 19, 2022

couet self-assigned this Jan 19, 2022

couet mentioned this pull request Jan 19, 2022

Remove the Primer in Markdown format root-project/root#9619

Closed

vepadulano force-pushed the rdf-in-primer branch 2 times, most recently from 4f2b280 to 576a767 Compare January 19, 2022 19:20

jalopezg-git reviewed Jan 20, 2022

vepadulano force-pushed the rdf-in-primer branch from 576a767 to 518fe39 Compare January 21, 2022 08:30

couet merged commit 0ae075b into root-project:main Sep 2, 2022
