Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ntuple] RMiniFile: write StreamerInfo with the correct compression #17565

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions tree/ntuple/v7/inc/ROOT/RMiniFile.hxx
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleSerialize.hxx>
#include <ROOT/RSpan.hxx>
#include <Compression.h>
#include <string_view>

#include <cstdint>
Expand Down Expand Up @@ -202,7 +203,7 @@ private:
/// Write the TList with the RNTuple key
void WriteTFileKeysList();
/// Write the compressed streamer info record with the description of the RNTuple class
void WriteTFileStreamerInfo();
void WriteTFileStreamerInfo(int compression);
/// Last record in the file
void WriteTFileFreeList();
/// For a bare file, which is necessarily written by a C file stream, write file header
Expand Down Expand Up @@ -245,7 +246,7 @@ public:
/// Ensures that the streamer info records passed as argument are written to the file
void UpdateStreamerInfos(const RNTupleSerializer::StreamerInfoMap_t &streamerInfos);
/// Writes the RNTuple key to the file so that the header and footer keys can be found
void Commit();
void Commit(int compression = RCompressionSetting::EDefaults::kUseGeneralPurpose);
};

} // namespace Internal
Expand Down
8 changes: 4 additions & 4 deletions tree/ntuple/v7/src/RMiniFile.cxx
Original file line number Diff line number Diff line change
Expand Up @@ -1188,7 +1188,7 @@ void ROOT::Experimental::Internal::RNTupleFileWriter::UpdateStreamerInfos(
fStreamerInfoMap.insert(streamerInfos.cbegin(), streamerInfos.cend());
}

void ROOT::Experimental::Internal::RNTupleFileWriter::Commit()
void ROOT::Experimental::Internal::RNTupleFileWriter::Commit(int compression)
{
if (fFileProper) {
// Easy case, the ROOT file header and the RNTuple streaming is taken care of by TFile
Expand Down Expand Up @@ -1220,7 +1220,7 @@ void ROOT::Experimental::Internal::RNTupleFileWriter::Commit()

WriteTFileNTupleKey();
WriteTFileKeysList();
WriteTFileStreamerInfo();
WriteTFileStreamerInfo(compression);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd that the compression argument is only needed here. What compression setting is used for the list of keys, free list and for the RNTuple key?

Copy link
Contributor Author

@silverweed silverweed Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am under the impression that they are always uncompressed, (the first two are uncompressed even in regular TFiles as far as I saw).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcanal let me know if you have other questions/comments or if this thread is resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am under the impression that they are always uncompressed,

What lead you to this conclusion?

the first two are uncompressed even in regular TFiles as far as I saw

Indeed the list of keys is not compressed (not sure why though) but I don't see a mechanism why the RNTuple anchor would not be compressed (besides maybe the compression failing to save space)

Copy link
Contributor Author

@silverweed silverweed Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What lead you to this conclusion?

Reading the code in RMiniFile the only time we call Zip is in the WriteStreamerInfo function. Also, all RNTuple files I've inspected so far had both sections uncompressed, regardless of whether they have been written through RMiniFile or TFile.

but I don't see a mechanism why the RNTuple anchor would not be compressed

I suppose it may be compressed if we manually wrote a RNTuple anchor through the TFile API, but if we pass through RMiniFile reader it will always be uncompressed (which is a behavior we may or may not want to change - though not in this PR I'd argue)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, all RNTuple files I've inspected so far had both sections uncompressed, regardless of whether they have been written through RMiniFile or TFile.

That was really the question at hand. Is it compressed in TFile? It should have been but if not, why/how is it not compressed.

(which is a behavior we may or may not want to change - though not in this PR I'd argue)

As we are touching that code, I would include the change also in this PR. The new code is:

   WriteTFileNTupleKey();
   WriteTFileKeysList();
   WriteTFileStreamerInfo(compression);

with `compression being a new variable. This leaves 'ambiguous' whether the other two are 'not compressed' or 'compressed with an arbitrary value ignoring the user specified compression for undisclosed reasons'.
At the very least we would have:

   WriteTFileNTupleKey(); // Not compressed yet
   WriteTFileKeysList();     // Intentionally never compressed.
   WriteTFileStreamerInfo(compression);

or better

   WriteTFileNTupleKey(compress /* or 0 if one wants to keep the current erroneous behavior */ );
   WriteTFileKeysList();     // Intentionally never compressed.
   WriteTFileStreamerInfo(compression);

WriteTFileFreeList();

// Update header and TFile record
Expand Down Expand Up @@ -1365,7 +1365,7 @@ void ROOT::Experimental::Internal::RNTupleFileWriter::WriteBareFileSkeleton(int
fFileSimple.fKeyOffset = fFileSimple.fFilePos;
}

void ROOT::Experimental::Internal::RNTupleFileWriter::WriteTFileStreamerInfo()
void ROOT::Experimental::Internal::RNTupleFileWriter::WriteTFileStreamerInfo(int compression)
{
// The streamer info record is a TList of TStreamerInfo object. We cannot use
// RNTupleSerializer::SerializeStreamerInfos because that uses TBufferIO::WriteObject.
Expand Down Expand Up @@ -1400,7 +1400,7 @@ void ROOT::Experimental::Internal::RNTupleFileWriter::WriteTFileStreamerInfo()

RNTupleCompressor compressor;
auto zipStreamerInfos = MakeUninitArray<unsigned char>(lenPayload);
auto szZipStreamerInfos = compressor.Zip(bufPayload, lenPayload, 1, zipStreamerInfos.get());
auto szZipStreamerInfos = compressor.Zip(bufPayload, lenPayload, compression, zipStreamerInfos.get());

fFileSimple.WriteKey(zipStreamerInfos.get(), szZipStreamerInfos, lenPayload,
fFileSimple.fControlBlock->fHeader.GetSeekInfo(), RTFHeader::kBEGIN, "TList", "StreamerInfo",
Expand Down
2 changes: 1 addition & 1 deletion tree/ntuple/v7/src/RPageStorageFile.cxx
Original file line number Diff line number Diff line change
Expand Up @@ -245,7 +245,7 @@ void ROOT::Experimental::Internal::RPageSinkFile::CommitDatasetImpl(unsigned cha
auto szFooterZip = fCompressor->Zip(serializedFooter, length, GetWriteOptions().GetCompression(),
RNTupleCompressor::MakeMemCopyWriter(bufFooterZip.get()));
fWriter->WriteNTupleFooter(bufFooterZip.get(), szFooterZip, length);
fWriter->Commit();
fWriter->Commit(GetWriteOptions().GetCompression());
}

////////////////////////////////////////////////////////////////////////////////
Expand Down
Loading