[io] Extract tkey walk logic from TFile::Map() #17575
base: master
I like the idea! In this approach, all keys are loaded in memory before printing the information. Do we need a piecewise / cursor based API?
This is possible, although a file containing 1 million keys would only occupy 68 MiB of memory for the TKeyMapNodes (for reference, the 3.8 GB ttjet_13tev benchmark dataset has 278470 keys, about 38 MiB of memory). This does not count the classname/keyname/key title strings; with them the figure is likely about double.
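The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the 68 bytes per node is an assumption taken from the 1-million-key figure in the comment, not a measured sizeof, and the names kAssumedNodeBytes, EstimateNodeBytes and ToMiB are hypothetical helpers, not part of ROOT.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// ASSUMPTION: per-node size in bytes, inferred from the figures quoted in
// this thread. It excludes the class name / key name / title strings.
constexpr std::uint64_t kAssumedNodeBytes = 68;

// Bytes needed to hold nKeys nodes in memory at the same time.
inline std::uint64_t EstimateNodeBytes(std::uint64_t nKeys,
                                       std::uint64_t bytesPerNode = kAssumedNodeBytes)
{
   // Guard against overflow of the 64-bit product.
   assert(bytesPerNode == 0 ||
          nKeys <= std::numeric_limits<std::uint64_t>::max() / bytesPerNode);
   return nKeys * bytesPerNode;
}

// Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes).
inline double ToMiB(std::uint64_t bytes)
{
   return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

Under that assumption, ToMiB(EstimateNodeBytes(1000000)) gives roughly 65 MiB (68 MB), and the cost grows linearly with the number of keys.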
The current observed maximum number of baskets in a TTree is 50 million, and only because it reaches the 1 GB limit for the TTree object. It can and will grow larger once we lift the 1 GB limit, and it can already reach a larger size with RNTuple (probably not quite as easily, since pages are larger than baskets). Nonetheless, that is 3.1 GiB of memory for the TKeyMapNodes, so I would indeed recommend some sort of iterator mechanism; otherwise the code simply crashes or runs out of memory for large files.
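To make the iterator suggestion concrete, here is a minimal sketch of a cursor that yields one node per call and keeps only constant state between calls. It is not the API proposed in this PR: KeyNodeSketch and KeyCursorSketch are hypothetical names, and the vector of record lengths merely stands in for reading each record header from the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of the key map (not ROOT's TKeyMapNode).
struct KeyNodeSketch {
   std::uint64_t fAddr; // byte offset of the record in the file
   std::uint32_t fLen;  // length of the record in bytes
};

// Hypothetical cursor: walks consecutive records one at a time instead of
// collecting all of them into an array first.
class KeyCursorSketch {
public:
   // recordLengths models the file contents and must outlive the cursor;
   // a real implementation would read each length from disk on demand.
   KeyCursorSketch(const std::vector<std::uint32_t> &recordLengths, std::uint64_t firstAddr)
      : fLengths(recordLengths), fNextAddr(firstAddr)
   {
   }

   // Fills `node` with the next record and returns true, or returns false
   // (leaving `node` untouched) once the end of the file is reached.
   bool Next(KeyNodeSketch &node)
   {
      if (fIndex >= fLengths.size())
         return false;
      const std::uint32_t len = fLengths[fIndex];
      assert(fNextAddr + len >= fNextAddr); // offset must not wrap around
      node = KeyNodeSketch{fNextAddr, len};
      fNextAddr += len; // the next record starts right after this one
      ++fIndex;
      return true;
   }

private:
   const std::vector<std::uint32_t> &fLengths;
   std::size_t fIndex = 0;
   std::uint64_t fNextAddr;
};
```

A caller would loop with `while (cursor.Next(node)) { ... }` and print or inspect each node as it arrives, so peak memory no longer depends on the number of keys. A caller that still wants the full array can simply push each node into a vector.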
Test Results: 16 files, 16 suites, 4d 8h 50m 19s. For more details on these failures, see this check. Results for commit 45adcda.
This Pull request:
refactors TFile::Map into 2 methods: Map and WalkTKeys. The latter contains the logic of traversing the TKeys in the file and returns an array with information about keys, gaps and errors. Map now simply calls that method and prints out the relevant information, in the same format as before.
The main advantage of splitting WalkTKeys is that it can be used by other places (like unit tests or client code) that are interested in the internal TKey structure.
Checklist: