-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add buffered reader #4018
base: main
Are you sure you want to change the base?
feat: Add buffered reader #4018
Conversation
WalkthroughA new Changes
Sequence DiagramsequenceDiagram
participant DR as Downstream Reader
participant BR as BufferedReader
participant ES as Event Store
DR->>BR: Provide Events
BR->>BR: Buffer Events
BR->>BR: Sample Random Event
BR->>ES: Share Event Data
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp (1)
32-34
: Document the impact of selectionSeed, we should.The purpose and impact of selectionSeed on event sampling, explain we must. Help future maintainers understand the randomization process, this will.
Examples/Framework/src/Framework/WhiteBoard.cpp (1)
88-97
: Exception safety improve, we must.If exception thrown during sharing, partial state possible it is. Consider:
- Transaction-like approach for atomic operations
- Document sharing semantics (deep vs shallow)
- Roll back on failure
void ActsExamples::WhiteBoard::shareDataWith(WhiteBoard &other) const { + // Create temporary map for atomic operation + std::map<std::string, std::any> temp; for (auto &[key, val] : m_store) { - auto [it, success] = other.m_store.insert({key, val}); + auto [it, success] = temp.insert({key, val}); if (!success) { throw std::runtime_error("Cannot share key '" + key + "', is already present"); } } + // Atomic operation + other.m_store.insert(temp.begin(), temp.end()); }Examples/Python/tests/test_reader.py (1)
440-483
: Good start with testing, but more coverage needed, I sense!Well structured, the basic test is. But additional test cases, beneficial they would be:
- Different buffer sizes, test you should
- Error handling for invalid configurations, verify you must
- Behavior with empty files, understand we need
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
Examples/Framework/CMakeLists.txt
(1 hunks)Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp
(1 hunks)Examples/Framework/include/ActsExamples/Framework/SequenceElement.hpp
(1 hunks)Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp
(1 hunks)Examples/Framework/src/Framework/BufferedReader.cpp
(1 hunks)Examples/Framework/src/Framework/WhiteBoard.cpp
(1 hunks)Examples/Python/src/Input.cpp
(2 hunks)Examples/Python/tests/test_reader.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: macos
- GitHub Check: missing_includes
- GitHub Check: linux_ubuntu_extra (ubuntu2204_clang, 20)
- GitHub Check: linux_ubuntu_extra (ubuntu2204, 20)
- GitHub Check: linux_ubuntu
- GitHub Check: build_debug
🔇 Additional comments (7)
Examples/Framework/include/ActsExamples/Framework/SequenceElement.hpp (1)
54-54
: Careful with friend declarations, we must be! Hmmmm.Access to private members through friendship, this grants. A powerful tool it is, but use it wisely we must. Consider documenting why this friendship necessary it is.
Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp (1)
59-63
: Empty initialize and finalize methods, concerning they are.Consider if initialization or cleanup needed it is. At minimum, document why these methods empty they are.
Examples/Framework/src/Framework/BufferedReader.cpp (1)
43-51
: Memory management concerns, I sense.Reserve buffer size upfront, good practice it is. But consider:
- Large buffer sizes, out of memory errors cause they might
- Progress logging for large buffers, helpful it would be
- Error handling for failed reads, improve we should
Examples/Python/src/Input.cpp (1)
11-11
: Approve the registration of BufferedReader, I do!Properly declared, the new BufferedReader is. Follow the established pattern for reader declarations, it does. Clear and well-structured, the parameters are.
Also applies to: 43-46
Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp (2)
44-45
: Wise addition of move semantics, this is!Efficient transfer of WhiteBoard resources, it enables. Default implementations, sufficient they are, given the member types we have.
49-54
: Elegant solution for data sharing, you have created!Share data between WhiteBoard instances with minimal overhead, this method does. Well documented, the behavior is. Through shared pointers, efficiency achieved it has.
Examples/Framework/CMakeLists.txt (1)
18-18
: Correctly integrated into build system, the new source file is!Follow project conventions for source organization, it does. Simple yet effective, the change is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
/// shared pointers. | ||
/// Throws an exception if the other whiteboard already contains one of | ||
/// the keys in this whiteboard. | ||
void shareDataWith(WhiteBoard& other) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a matter of taste but I would personally implement this the other way round, where the method is called on the receiver of the contents and gets the source as an argument.
public: | ||
struct Config { | ||
/// The downstream reader that should be used | ||
std::shared_ptr<IReader> downstreamReader; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, but in my mind, the source reader is upstream of the buffered reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah right, now where I think of it this would make more sense
|
||
/// Buffer size. The reader will throw and exception if the downstream | ||
/// reader does not provide enough events | ||
std::size_t bufferSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not guaranteed to be initialized. Should we default this to some (low-ish) number?
Also maybe add a comment that this is cached in memory and therefore has memory footprint implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. I think it should be defaulted to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
Examples/Python/tests/test_reader.py (2)
447-460
: Hmm. Verify the event count for data generation, we must.Only
eventsInBuffer
events are written, buteventsToProcess
events we expect to read. Document this intentional design choice in comments, we should, to prevent confusion for future padawans.# need to write out some particles first +# Note: We intentionally write fewer events than we process +# to demonstrate the BufferedReader's resampling capability eventsInBuffer = 5 eventsToProcess = 10
479-486
: Incomplete, the test coverage is. Additional assertions, we need.Only verify the number of events seen, we do. Add assertions to verify the content of resampled events and proper buffer management, we should.
s2.run() assert alg.events_seen == eventsToProcess + # Verify that events are properly resampled from buffer + events = [ev.particles for ev in s2.events()] + assert len(set(str(ev) for ev in events)) <= eventsInBuffer
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
Examples/Python/tests/test_reader.py
(1 hunks)
🧰 Additional context used
📓 Learnings (1)
📓 Common learnings
Learnt from: benjaminhuth
PR: acts-project/acts#4018
File: Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp:51-54
Timestamp: 2025-01-10T10:35:16.967Z
Learning: The BufferedReader in ActsExamples is designed to serve an unlimited number of events by reusing/resampling from its fixed-size buffer, which helps reduce I/O overhead in timing measurements. Therefore, its availableEvents() correctly returns std::numeric_limits<std::size_t>::max() as the upper bound.
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: linux_physmon
- GitHub Check: linux_examples_test
- GitHub Check: missing_includes
- GitHub Check: linux_ubuntu_extra (ubuntu2204_clang, 20)
- GitHub Check: build_debug
🔇 Additional comments (2)
Examples/Python/tests/test_reader.py (2)
440-446
: Clear and descriptive test setup, it is!Well-defined variables for buffer size and total events to process, I see. Good practice for test readability and maintainability, this is.
471-477
: Strong with the Force, this configuration is!Correctly configured, the BufferedReader is. Match the retrieved learning about unlimited event serving through resampling, it does.
@pytest.mark.root | ||
def test_buffered_reader(tmp_path, conf_const, ptcl_gun): | ||
# Test the buffered reader with the ROOT particle reader | ||
# need to write out some particles first | ||
eventsInBuffer = 5 | ||
eventsToProcess = 10 | ||
|
||
s = Sequencer(numThreads=1, events=eventsInBuffer, logLevel=acts.logging.WARNING) | ||
evGen = ptcl_gun(s) | ||
|
||
file = tmp_path / "particles.root" | ||
s.addWriter( | ||
conf_const( | ||
RootParticleWriter, | ||
acts.logging.WARNING, | ||
inputParticles=evGen.config.outputParticles, | ||
filePath=str(file), | ||
) | ||
) | ||
|
||
s.run() | ||
|
||
# reset sequencer for reading | ||
s2 = Sequencer(events=eventsToProcess, numThreads=1, logLevel=acts.logging.WARNING) | ||
|
||
reader = acts.examples.RootParticleReader( | ||
level=acts.logging.WARNING, | ||
outputParticles="particles_input", | ||
filePath=str(file), | ||
) | ||
|
||
s2.addReader( | ||
acts.examples.BufferedReader( | ||
level=acts.logging.WARNING, | ||
downstreamReader=reader, | ||
bufferSize=eventsInBuffer, | ||
) | ||
) | ||
|
||
alg = AssertCollectionExistsAlg( | ||
"particles_input", "check_alg", acts.logging.WARNING | ||
) | ||
s2.addAlgorithm(alg) | ||
|
||
s2.run() | ||
|
||
assert alg.events_seen == eventsToProcess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Missing edge cases, I sense in this test.
Test additional scenarios, we must:
- Buffer size of 1
- Buffer size equal to events to process
- Error cases for invalid buffer sizes
Help you write these additional test cases, shall I?
Quality Gate passedIssues Measures |
This reader can wrap an exisiting reader, preload some events in a buffer, and randomly picks events upon execution in the sequencer. Should help to mitigate I/O bottlenecks in throughput measurements.
--- END COMMIT MESSAGE ---
Any further description goes here, @-mentions are ok here!
feat
,fix
,refactor
,docs
,chore
andbuild
types.Summary by CodeRabbit
New Features
BufferedReader
for enhanced event data handling with buffering capabilities.WhiteBoard
.Improvements
WhiteBoard
.BufferedReader
.Tests
BufferedReader
functionality.