[df] Fix interaction between RSampleInfo and redirected EOS paths
When the input files are paths to FUSE-mounted EOS files, TFile::Open redirects
these paths to the corresponding xroot EOS URLs during the event loop.

This interacted badly with the sample metadata and its subsequent use in the
event loop, e.g. through DefinePerSample. Specifically, we fill a map with the
metadata at construction time, and the map keys include the input file paths
(without redirection). The metadata then becomes irretrievable during the event
loop, since the redirected path does not correspond to any map key. Fix this by
also adding the redirected path as a key when the map is filled in ChangeSpec.
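
To illustrate the key mismatch described above, here is a minimal sketch that does not use ROOT at all. `SampleMetadata`, `SampleMap`, `MakeSampleId` and `RegisterSample` are hypothetical stand-ins rather than the actual RLoopManager members, and the EOS paths in the usage note are made up. It only assumes, as the commit does, that the key has the form `<file path>/<tree name>`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-sample metadata exposed via RSampleInfo.
struct SampleMetadata {
   std::string fLabel;
};

// Hypothetical stand-in for the map filled at construction time.
using SampleMap = std::unordered_map<std::string, const SampleMetadata *>;

// Builds the map key as "<file path>/<tree name>".
std::string MakeSampleId(const std::string &file, const std::string &tree)
{
   return file + '/' + tree;
}

// Registers the sample under the original path and, if a redirected path is
// known, under that path too. Both keys point to the same metadata object.
void RegisterSample(SampleMap &map, const SampleMetadata &sample, const std::string &file,
                    const std::string &tree, const std::optional<std::string> &redirectedFile)
{
   assert(!file.empty() && !tree.empty());
   map.insert({MakeSampleId(file, tree), &sample});
   if (redirectedFile)
      map.insert({MakeSampleId(*redirectedFile, tree), &sample});
}
```

With only the original key registered, a lookup with a redirected ID such as `root://eos.example//eos/user/a/data.root/Events` finds nothing. Registering both keys lets either form of the path reach the same metadata.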
vepadulano committed Jan 28, 2025
1 parent 4a19ce0 commit b3bfd23
Showing 1 changed file with 48 additions and 0 deletions.
48 changes: 48 additions & 0 deletions tree/dataframe/src/RLoopManager.cxx
@@ -42,6 +42,17 @@
#include "ROOT/RNTupleDS.hxx"
#endif

// Functions needed to perform EOS XRootD redirection in ChangeSpec
#ifdef R__UNIX
#include "TEnv.h"
#include "TSystem.h"
#endif
#ifndef R__FBSD
#include <sys/xattr.h>
#else
#include <sys/extattr.h>
#endif

#include <algorithm>
#include <atomic>
#include <cassert>
@@ -403,6 +414,38 @@ RLoopManager::RLoopManager(ROOT::RDF::Experimental::RDatasetSpec &&spec)
ChangeSpec(std::move(spec));
}

#ifdef R__UNIX
namespace {
std::optional<std::string> GetRedirectedSampleId(std::string_view path, std::string_view datasetName)
{
// Mimic the redirection done in TFile::Open to see if the path points to a FUSE-mounted EOS path.
// If so, we create a redirected sample ID with the full xroot URL.
TString expandedUrl(path.data());
gSystem->ExpandPathName(expandedUrl);
if (gEnv->GetValue("TFile.CrossProtocolRedirects", 1) == 1) {
TUrl fileurl(expandedUrl, /* default is file */ kTRUE);
if (strcmp(fileurl.GetProtocol(), "file") == 0) {
ssize_t len = getxattr(fileurl.GetFile(), "eos.url.xroot", nullptr, 0);

Check failure on line 428 in tree/dataframe/src/RLoopManager.cxx (GitHub Actions: mac13 ARM64 LLVM_ENABLE_ASSERTIONS=On, builtin_zlib=ON; mac15 ARM64 LLVM_ENABLE_ASSERTIONS=On, CMAKE_CXX_STANDARD=20): no matching function for call to 'getxattr'

if (len > 0) {
std::string xurl(len, 0);
std::string fileNameFromUrl{fileurl.GetFile()};
if (getxattr(fileNameFromUrl.c_str(), "eos.url.xroot", &xurl[0], len) == len) {

Check failure on line 432 in tree/dataframe/src/RLoopManager.cxx (GitHub Actions: mac13 ARM64 LLVM_ENABLE_ASSERTIONS=On, builtin_zlib=ON; mac15 ARM64 LLVM_ENABLE_ASSERTIONS=On, CMAKE_CXX_STANDARD=20): no matching function for call to 'getxattr'

// Sometimes the `getxattr` call may return an invalid URL due
// to the POSIX attribute not being yet completely filled by EOS.
if (auto baseName = fileNameFromUrl.substr(fileNameFromUrl.find_last_of("/") + 1);
std::equal(baseName.crbegin(), baseName.crend(), xurl.crbegin())) {
return xurl + '/' + datasetName.data();
}
}
}
}
}

return std::nullopt;
}
} // namespace
#endif
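
The helper above reads the `eos.url.xroot` attribute in two steps (query the size, then read the value) and validates the result against the file's base name. The sketch below isolates those two steps. `ReadXattr` and `UrlMatchesBaseName` are hypothetical names, not part of ROOT. The `__APPLE__` branch reflects the fact that `getxattr` on macOS takes two extra arguments (`position` and `options`), which may be what the CI failures on the mac runners point to; that reading is an assumption and not the fix adopted upstream. FreeBSD, which uses `<sys/extattr.h>`, is not covered.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <sys/types.h>
#include <sys/xattr.h>

// Reads an extended attribute: first ask for its size, then fetch the value.
// Returns std::nullopt if the attribute is missing, empty, or changes size
// between the two calls.
std::optional<std::string> ReadXattr(const std::string &path, const std::string &name)
{
#ifdef __APPLE__
   // macOS signature: getxattr(path, name, value, size, position, options)
   auto get = [&](void *buf, std::size_t size) { return getxattr(path.c_str(), name.c_str(), buf, size, 0, 0); };
#else
   // Linux signature: getxattr(path, name, value, size)
   auto get = [&](void *buf, std::size_t size) { return getxattr(path.c_str(), name.c_str(), buf, size); };
#endif
   const ssize_t len = get(nullptr, 0);
   if (len <= 0)
      return std::nullopt;
   std::string value(static_cast<std::size_t>(len), '\0');
   if (get(&value[0], value.size()) != len)
      return std::nullopt;
   return value;
}

// Returns true if `url` ends with the base name of `path`. The explicit length
// check avoids reading past the start of `url` when it is shorter than the
// base name.
bool UrlMatchesBaseName(std::string_view path, std::string_view url)
{
   const auto slash = path.find_last_of('/');
   const auto baseName = (slash == std::string_view::npos) ? path : path.substr(slash + 1);
   if (baseName.size() > url.size())
      return false;
   return url.substr(url.size() - baseName.size()) == baseName;
}
```

For a path without the attribute, or one that does not exist, `ReadXattr` returns `std::nullopt`, which corresponds to the case where no redirected sample ID is added. Comparing suffixes with `substr` instead of the three-iterator `std::equal` is a design choice of this sketch: it keeps the comparison well defined when the attribute value is shorter than the base name.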

/**
* @brief Changes the internal TTree held by the RLoopManager.
*
@@ -441,6 +484,11 @@ void RLoopManager::ChangeSpec(ROOT::RDF::Experimental::RDatasetSpec &&spec)
// is exposed to users via RSampleInfo and DefinePerSample).
const auto sampleId = files[i] + '/' + trees[i];
fSampleMap.insert({sampleId, &sample});
#ifdef R__UNIX
// Also add redirected EOS xroot URL when available
if (auto redirectedSampleId = GetRedirectedSampleId(files[i], trees[i]))
fSampleMap.insert({redirectedSampleId.value(), &sample});
#endif
}
}
SetTree(std::move(chain));