[ntuple] RMiniFile: write StreamerInfo with the correct compression #17565

Open
wants to merge 1 commit into
base: master

Conversation

silverweed
Contributor

Currently, RMiniFile doesn't honor the compression settings in RNTupleWriteOptions when writing out the StreamerInfo; instead, it always ZLIB-compresses it.
With this change, it uses the compression specified in the options (or 505 if unspecified).
This makes the behavior more consistent with TFile, where the file's compression setting also determines the StreamerInfo's compression (whereas now RNTuple may export a compressed StreamerInfo even if the TFile's compression is 0).
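
For readers unfamiliar with values such as 505: as far as I understand ROOT's convention, a compression setting is a single integer encoded as `algorithm * 100 + level`, so 505 would mean ZSTD at level 5 and 0 would mean no compression. The sketch below decodes such a value. The names `CompressionSetting`, `DecodeCompression` and `AlgorithmName` are hypothetical helpers written for this note, not ROOT API, and the algorithm codes are an assumption based on ROOT's `RCompressionSetting` enum.

```cpp
#include <string>

// Hypothetical helper type, not part of ROOT.
struct CompressionSetting {
   int algorithm; // assumed codes: 1 = ZLIB, 2 = LZMA, 4 = LZ4, 5 = ZSTD
   int level;     // 0 is taken to mean "do not compress"
};

// Split a combined setting (algorithm * 100 + level) into its two parts.
constexpr CompressionSetting DecodeCompression(int setting)
{
   return {setting / 100, setting % 100};
}

// Map an algorithm code to a readable name (assumed mapping, see lead-in).
inline std::string AlgorithmName(int algorithm)
{
   switch (algorithm) {
   case 1: return "ZLIB";
   case 2: return "LZMA";
   case 4: return "LZ4";
   case 5: return "ZSTD";
   default: return "unknown";
   }
}
```

Under these assumptions, `DecodeCompression(505)` yields algorithm 5 (ZSTD) and level 5, while `DecodeCompression(101)` yields ZLIB at level 1.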

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

@silverweed silverweed self-assigned this Jan 29, 2025
@silverweed silverweed marked this pull request as ready for review January 29, 2025 14:33
@silverweed silverweed requested a review from jblomer as a code owner January 29, 2025 14:33
@pcanal
Member

pcanal commented Jan 29, 2025

For context, the same change (in spirit) was applied to TFile in commit b9a212a: TFile used to always compress the StreamerInfo record with zlib, but this was removed.

@@ -1220,7 +1220,7 @@ void ROOT::Experimental::Internal::RNTupleFileWriter::Commit()

    WriteTFileNTupleKey();
    WriteTFileKeysList();
-   WriteTFileStreamerInfo();
+   WriteTFileStreamerInfo(compression);
Member

It seems odd that the compression argument is only needed here. What compression setting is used for the list of keys, free list and for the RNTuple key?

Contributor Author

@silverweed silverweed Jan 30, 2025

I am under the impression that they are always uncompressed (the first two are uncompressed even in regular TFiles, as far as I saw).

Contributor Author

@pcanal let me know if you have other questions/comments or if this thread is resolved

Member

> I am under the impression that they are always uncompressed,

What led you to this conclusion?

> the first two are uncompressed even in regular TFiles as far as I saw

Indeed, the list of keys is not compressed (not sure why, though), but I don't see a mechanism by which the RNTuple anchor would not be compressed (besides maybe the compression failing to save space).
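
The "compression failing to save space" case mentioned above refers to a common pattern in which a writer keeps the compressed form only when it is actually smaller than the input. The following is a minimal sketch of that pattern, not ROOT's implementation: `ToyRleCompress` is a toy run-length encoder standing in for a real compressor, and `StoredBlob` and `StoreWithFallback` are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <string>

// Toy run-length encoder standing in for a real compressor (hypothetical).
// Each run of up to 255 identical bytes becomes the pair (count, byte).
inline std::string ToyRleCompress(const std::string &in)
{
   std::string out;
   for (std::size_t i = 0; i < in.size();) {
      std::size_t run = 1;
      while (i + run < in.size() && in[i + run] == in[i] && run < 255)
         ++run;
      out.push_back(static_cast<char>(run));
      out.push_back(in[i]);
      i += run;
   }
   return out;
}

// Hypothetical record of what ends up on disk.
struct StoredBlob {
   std::string payload;
   bool isCompressed;
};

// Keep the compressed form only if it is strictly smaller than the input;
// a compression setting of 0 means "store as-is".
inline StoredBlob StoreWithFallback(const std::string &raw, int compression)
{
   if (compression == 0)
      return {raw, false};
   std::string packed = ToyRleCompress(raw);
   if (packed.size() >= raw.size())
      return {raw, false};
   return {packed, true};
}
```

With this sketch, a highly repetitive buffer is stored compressed, whereas a short buffer without repeats is stored uncompressed even though compression was requested. A small record could therefore appear uncompressed in a file for this reason alone.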

Contributor Author

@silverweed silverweed Jan 30, 2025

> What led you to this conclusion?

Reading the code in RMiniFile, the only time we call Zip is in the WriteStreamerInfo function. Also, all RNTuple files I've inspected so far had both sections uncompressed, regardless of whether they were written through RMiniFile or TFile.

> but I don't see a mechanism why the RNTuple anchor would not be compressed

I suppose it may be compressed if we manually wrote an RNTuple anchor through the TFile API, but if we pass through RMiniFile reader it will always be uncompressed (which is a behavior we may or may not want to change, though not in this PR, I'd argue).

Member

> Also, all RNTuple files I've inspected so far had both sections uncompressed, regardless of whether they have been written through RMiniFile or TFile.

That was really the question at hand. Is it compressed in TFile? It should have been, but if not, why and how is it not compressed?

> (which is a behavior we may or may not want to change - though not in this PR I'd argue)

As we are touching that code, I would include the change in this PR as well. The new code is:

   WriteTFileNTupleKey();
   WriteTFileKeysList();
   WriteTFileStreamerInfo(compression);

with `compression` being a new variable. This leaves it ambiguous whether the other two are 'not compressed' or 'compressed with an arbitrary value ignoring the user-specified compression for undisclosed reasons'.
At the very least we would have:

   WriteTFileNTupleKey(); // Not compressed yet
   WriteTFileKeysList();     // Intentionally never compressed.
   WriteTFileStreamerInfo(compression);

or better

   WriteTFileNTupleKey(compression /* or 0 if one wants to keep the current erroneous behavior */);
   WriteTFileKeysList();     // Intentionally never compressed.
   WriteTFileStreamerInfo(compression);
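
One way to picture the "or better" variant is as a plan in which each section written at commit time states its compression explicitly, so a reader of the code never has to guess. The sketch below is only an illustration of that idea under assumed names: `SectionWrite` and `PlanCommit` are hypothetical and are not part of RMiniFile or RNTupleFileWriter.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one section written when committing the file.
struct SectionWrite {
   std::string section;
   int compression; // 0 means the section is stored uncompressed
};

// Every section names its compression explicitly: the anchor and the
// StreamerInfo follow the user's setting, the keys list is never compressed.
inline std::vector<SectionWrite> PlanCommit(int userCompression)
{
   return {
      {"RNTuple anchor key", userCompression},
      {"keys list", 0}, // intentionally never compressed
      {"StreamerInfo", userCompression},
   };
}
```

For example, `PlanCommit(505)` assigns 505 to the anchor and the StreamerInfo and 0 to the keys list, while `PlanCommit(0)` leaves all three sections uncompressed.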


Test Results

    18 files      18 suites   5d 1h 40m 45s ⏱️
 2 683 tests  2 681 ✅ 0 💤 2 ❌
46 598 runs  46 596 ✅ 0 💤 2 ❌

For more details on these failures, see this check.

Results for commit 4aeae98.

@hahnjo
Member

hahnjo commented Jan 30, 2025

So our first RNTuple.root is not fully uncompressed 😢
