feat: Add buffered reader #4018

benjaminhuth · 2025-01-10T10:21:21Z

This reader can wrap an existing reader, preload some events into a buffer, and randomly pick events from that buffer when executed in the sequencer. This should help mitigate I/O bottlenecks in throughput measurements.

--- END COMMIT MESSAGE ---
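
The idea in the commit message can be illustrated with a minimal sketch. This is not the ACTS implementation: `ToyBufferedReader`, `slow_source`, and the `seed` parameter are hypothetical names chosen for illustration, and the real class is written in C++ against the `IReader` interface.

```python
import random
from typing import Any, Callable, List


class ToyBufferedReader:
    """Hypothetical stand-in for the buffered-reader idea, not the ACTS class."""

    def __init__(self, read_event: Callable[[int], Any], buffer_size: int, seed: int = 0):
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        # Pay the I/O cost once, up front: preload `buffer_size` events.
        self._buffer: List[Any] = [read_event(i) for i in range(buffer_size)]
        # A fixed seed keeps the sampling sequence reproducible across runs.
        self._rng = random.Random(seed)

    def read(self) -> Any:
        # Serve a randomly chosen buffered event; no I/O happens here.
        return self._rng.choice(self._buffer)


def slow_source(i: int) -> dict:
    # Placeholder for an expensive read from disk.
    return {"event_id": i}


reader = ToyBufferedReader(slow_source, buffer_size=5, seed=42)
# Ten events are served although only five were ever read from the source.
served = [reader.read()["event_id"] for _ in range(10)]
```

Because every call to `read()` only touches memory, a timing measurement around it reflects the downstream algorithms rather than the file access.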

Summary by CodeRabbit

New Features
- Introduced BufferedReader for enhanced event data handling with buffering capabilities.
- Added data sharing functionality for WhiteBoard.
Improvements
- Implemented move semantics for WhiteBoard.
- Added Python interface support for BufferedReader.
Tests
- Added new test case for BufferedReader functionality.

coderabbitai · 2025-01-10T10:21:29Z

Walkthrough

A new BufferedReader component introduced to the ACTS Examples Framework, it is. Designed to enhance event data handling through buffered reading, the implementation spans multiple files. A new class implementing the IReader interface, with capabilities for configuring event sampling, buffer management, and downstream reader integration, it provides.

Changes

| File | Change Summary |
|---|---|
| `Examples/Framework/CMakeLists.txt` | Added `BufferedReader.cpp` to build configuration |
| `Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp` | New header defining `BufferedReader` class with configuration and reading methods |
| `Examples/Framework/include/ActsExamples/Framework/SequenceElement.hpp` | Added `BufferedReader` as a friend class |
| `Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp` | Added move semantics and `shareDataWith` method |
| `Examples/Framework/src/Framework/BufferedReader.cpp` | Implementation of `BufferedReader` class methods |
| `Examples/Framework/src/Framework/WhiteBoard.cpp` | Added `shareDataWith` method implementation |
| `Examples/Python/src/Input.cpp` | Integrated `BufferedReader` into Python interface |
| `Examples/Python/tests/test_reader.py` | Added test for `BufferedReader` functionality |

Sequence Diagram

sequenceDiagram
    participant DR as Downstream Reader
    participant BR as BufferedReader
    participant ES as Event Store
    
    DR->>BR: Provide Events
    BR->>BR: Buffer Events
    BR->>BR: Sample Random Event
    BR->>ES: Share Event Data

Possibly related PRs

chore: Clean event generator includes in Examples #3964: Potential connection to event processing pipeline modifications.

Suggested labels

automerge

Suggested reviewers

paulgessinger

Poem

Buffered bytes flow like the Force, 🌟
Events dance in memory's course, 🎲
Random seeds of wisdom sown, 🌱
Through readers, knowledge has grown! 📚
Wisdom in data, calm and bright ✨

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp (1)

32-34: Document the impact of selectionSeed, we should.

The purpose and impact of selectionSeed on event sampling, explain we must. Help future maintainers understand the randomization process, this will.

Examples/Framework/src/Framework/WhiteBoard.cpp (1)

88-97: Exception safety improve, we must.

If exception thrown during sharing, partial state possible it is. Consider:

- Transaction-like approach for atomic operations
- Document sharing semantics (deep vs shallow)
- Roll back on failure

 void ActsExamples::WhiteBoard::shareDataWith(WhiteBoard &other) const {
+    // First pass: check for conflicts before modifying `other`
+    for (auto &[key, val] : m_store) {
+        if (other.m_store.count(key) != 0) {
+            throw std::runtime_error("Cannot share key '" + key +
+                                     "', is already present");
+        }
+    }
+    // Second pass: no conflicts remain, so every insert succeeds
     for (auto &[key, val] : m_store) {
-        auto [it, success] = other.m_store.insert({key, val});
-        if (!success) {
-            throw std::runtime_error("Cannot share key '" + key +
-                                   "', is already present");
-        }
+        other.m_store.insert({key, val});
     }
 }

Examples/Python/tests/test_reader.py (1)

440-483: Good start with testing, but more coverage needed, I sense!

Well structured, the basic test is. But additional test cases, beneficial they would be:

- Different buffer sizes, test you should
- Error handling for invalid configurations, verify you must
- Behavior with empty files, understand we need

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6eb834 and ffcc912.

📒 Files selected for processing (8)

Examples/Framework/CMakeLists.txt (1 hunks)
Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp (1 hunks)
Examples/Framework/include/ActsExamples/Framework/SequenceElement.hpp (1 hunks)
Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp (1 hunks)
Examples/Framework/src/Framework/BufferedReader.cpp (1 hunks)
Examples/Framework/src/Framework/WhiteBoard.cpp (1 hunks)
Examples/Python/src/Input.cpp (2 hunks)
Examples/Python/tests/test_reader.py (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (6)

GitHub Check: macos
GitHub Check: missing_includes
GitHub Check: linux_ubuntu_extra (ubuntu2204_clang, 20)
GitHub Check: linux_ubuntu_extra (ubuntu2204, 20)
GitHub Check: linux_ubuntu
GitHub Check: build_debug

🔇 Additional comments (7)

Examples/Framework/include/ActsExamples/Framework/SequenceElement.hpp (1)

54-54: Careful with friend declarations, we must be! Hmmmm.

Access to private members through friendship, this grants. A powerful tool it is, but use it wisely we must. Consider documenting why this friendship necessary it is.

Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp (1)

59-63: Empty initialize and finalize methods, concerning they are.

Consider if initialization or cleanup needed it is. At minimum, document why these methods empty they are.

Examples/Framework/src/Framework/BufferedReader.cpp (1)

43-51: Memory management concerns, I sense.

Reserve buffer size upfront, good practice it is. But consider:

- Large buffer sizes, out of memory errors cause they might
- Progress logging for large buffers, helpful it would be
- Error handling for failed reads, improve we should

Examples/Python/src/Input.cpp (1)

11-11: Approve the registration of BufferedReader, I do!

Properly declared, the new BufferedReader is. Follow the established pattern for reader declarations, it does. Clear and well-structured, the parameters are.

Also applies to: 43-46

Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp (2)

44-45: Wise addition of move semantics, this is!

Efficient transfer of WhiteBoard resources, it enables. Default implementations, sufficient they are, given the member types we have.

49-54: Elegant solution for data sharing, you have created!

Share data between WhiteBoard instances with minimal overhead, this method does. Well documented, the behavior is. Through shared pointers, efficiency achieved it has.

Examples/Framework/CMakeLists.txt (1)

18-18: Correctly integrated into build system, the new source file is!

Follow project conventions for source organization, it does. Simple yet effective, the change is.

github-actions · 2025-01-10T11:16:45Z

📊: Physics performance monitoring for `dec9e86`

Full contents

physmon summary

paulgessinger

👍

paulgessinger · 2025-01-10T11:56:52Z

Examples/Framework/include/ActsExamples/Framework/WhiteBoard.hpp

+  /// shared pointers.
+  /// Throws an exception if the other whiteboard already contains one of
+  /// the keys in this whiteboard.
+  void shareDataWith(WhiteBoard& other) const;


This is a matter of taste but I would personally implement this the other way round, where the method is called on the receiver of the contents and gets the source as an argument.
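
The receiver-side direction suggested here could look roughly like the sketch below. It is a toy model only: `ToyWhiteBoard`, `add`, and `share_data_from` are hypothetical names, and a plain dictionary stands in for the C++ store of shared pointers.

```python
class ToyWhiteBoard:
    """Hypothetical dict-backed stand-in, not the ACTS WhiteBoard."""

    def __init__(self) -> None:
        self._store = {}

    def add(self, key: str, value: object) -> None:
        if key in self._store:
            raise KeyError(f"key '{key}' is already present")
        self._store[key] = value

    def share_data_from(self, source: "ToyWhiteBoard") -> None:
        # Called on the receiver; `source` is only read, never modified.
        conflicts = self._store.keys() & source._store.keys()
        if conflicts:
            # Fail before touching the receiver, so no partial state is left.
            raise RuntimeError(f"Cannot share keys {sorted(conflicts)}: already present")
        # Values are shared by reference, mirroring the shared-pointer idea.
        self._store.update(source._store)


buffer_board = ToyWhiteBoard()
buffer_board.add("particles", [1, 2, 3])

event_store = ToyWhiteBoard()
event_store.share_data_from(buffer_board)
```

With this orientation the call site reads as "the event store takes data from the buffer", and the non-mutated object is the argument.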

paulgessinger · 2025-01-10T11:58:31Z

Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp

+ public:
+  struct Config {
+    /// The downstream reader that should be used
+    std::shared_ptr<IReader> downstreamReader;


Nitpick, but in my mind, the source reader is upstream of the buffered reader.

Yeah, right. Now that I think of it, this would make more sense.

paulgessinger · 2025-01-10T12:00:51Z

Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp

+
+    /// Buffer size. The reader will throw an exception if the downstream
+    /// reader does not provide enough events
+    std::size_t bufferSize;


This is not guaranteed to be initialized. Should we default this to some (low-ish) number?

Also maybe add a comment that this is cached in memory and therefore has memory footprint implications.

Good point. I think it should default to 1.
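
As an illustration of the two points raised in this thread (an explicit default and a note about memory), here is a hedged Python sketch. `ToyBufferedReaderConfig` and its field names are hypothetical; the actual `Config` is a C++ struct, where the concern is an uninitialized `std::size_t`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyBufferedReaderConfig:
    """Hypothetical Python mirror of the Config struct under review."""

    # The wrapped source reader (named "upstream" here, following the review).
    upstream_reader: Optional[Any] = None
    # Events held in memory at once; memory use grows linearly with this value.
    buffer_size: int = 1

    def __post_init__(self) -> None:
        # Reject configurations that could never produce a usable buffer.
        if self.upstream_reader is None:
            raise ValueError("upstream_reader must be set")
        if self.buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")


# Omitting buffer_size falls back to the small, safe default.
cfg = ToyBufferedReaderConfig(upstream_reader=object())
```

A default of 1 keeps the memory footprint minimal unless the user opts into a larger buffer.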

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

Examples/Python/tests/test_reader.py (2)

447-460: Hmm. Verify the event count for data generation, we must.

Only eventsInBuffer events are written, but eventsToProcess events we expect to read. Document this intentional design choice in comments, we should, to prevent confusion for future padawans.

 # need to write out some particles first
+# Note: We intentionally write fewer events than we process
+# to demonstrate the BufferedReader's resampling capability
 eventsInBuffer = 5
 eventsToProcess = 10

479-486: Incomplete, the test coverage is. Additional assertions, we need.

Only verify the number of events seen, we do. Add assertions to verify the content of resampled events and proper buffer management, we should.

     s2.run()
 
     assert alg.events_seen == eventsToProcess
+    # Verify that events are properly resampled from buffer
+    events = [ev.particles for ev in s2.events()]
+    assert len(set(str(ev) for ev in events)) <= eventsInBuffer

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffcc912 and a83827e.

📒 Files selected for processing (1)

Examples/Python/tests/test_reader.py (1 hunks)

🧰 Additional context used

📓 Learnings (1)

📓 Common learnings

Learnt from: benjaminhuth
PR: acts-project/acts#4018
File: Examples/Framework/include/ActsExamples/Framework/BufferedReader.hpp:51-54
Timestamp: 2025-01-10T10:35:16.967Z
Learning: The BufferedReader in ActsExamples is designed to serve an unlimited number of events by reusing/resampling from its fixed-size buffer, which helps reduce I/O overhead in timing measurements. Therefore, its availableEvents() correctly returns std::numeric_limits<std::size_t>::max() as the upper bound.

⏰ Context from checks skipped due to timeout of 90000ms (5)

GitHub Check: linux_physmon
GitHub Check: linux_examples_test
GitHub Check: missing_includes
GitHub Check: linux_ubuntu_extra (ubuntu2204_clang, 20)
GitHub Check: build_debug

🔇 Additional comments (2)

Examples/Python/tests/test_reader.py (2)

440-446: Clear and descriptive test setup, it is!

Well-defined variables for buffer size and total events to process, I see. Good practice for test readability and maintainability, this is.

471-477: Strong with the Force, this configuration is!

Correctly configured, the BufferedReader is. Match the retrieved learning about unlimited event serving through resampling, it does.

coderabbitai · 2025-01-10T12:13:48Z

Examples/Python/tests/test_reader.py

+@pytest.mark.root
+def test_buffered_reader(tmp_path, conf_const, ptcl_gun):
+    # Test the buffered reader with the ROOT particle reader
+    # need to write out some particles first
+    eventsInBuffer = 5
+    eventsToProcess = 10
+
+    s = Sequencer(numThreads=1, events=eventsInBuffer, logLevel=acts.logging.WARNING)
+    evGen = ptcl_gun(s)
+
+    file = tmp_path / "particles.root"
+    s.addWriter(
+        conf_const(
+            RootParticleWriter,
+            acts.logging.WARNING,
+            inputParticles=evGen.config.outputParticles,
+            filePath=str(file),
+        )
+    )
+
+    s.run()
+
+    # reset sequencer for reading
+    s2 = Sequencer(events=eventsToProcess, numThreads=1, logLevel=acts.logging.WARNING)
+
+    reader = acts.examples.RootParticleReader(
+        level=acts.logging.WARNING,
+        outputParticles="particles_input",
+        filePath=str(file),
+    )
+
+    s2.addReader(
+        acts.examples.BufferedReader(
+            level=acts.logging.WARNING,
+            downstreamReader=reader,
+            bufferSize=eventsInBuffer,
+        )
+    )
+
+    alg = AssertCollectionExistsAlg(
+        "particles_input", "check_alg", acts.logging.WARNING
+    )
+    s2.addAlgorithm(alg)
+
+    s2.run()
+
+    assert alg.events_seen == eventsToProcess


🛠️ Refactor suggestion

Missing edge cases, I sense in this test.

Test additional scenarios, we must:

- Buffer size of 1
- Buffer size equal to events to process
- Error cases for invalid buffer sizes

Help you write these additional test cases, shall I?
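
The edge cases listed above can be exercised on a toy model without ROOT or the ACTS bindings. The sketch below assumes a hypothetical helper, `sample_buffer_indices`, that models which buffer slot serves each processed event; it does not call the real `BufferedReader`, whose behavior for invalid sizes is not stated in this thread.

```python
import random
from typing import List


def sample_buffer_indices(buffer_size: int, events_to_process: int, seed: int = 0) -> List[int]:
    """Hypothetical model of which buffer slot serves each processed event."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    return [rng.randrange(buffer_size) for _ in range(events_to_process)]


def check_edge_cases() -> None:
    # Buffer size of 1: every processed event is the single buffered one.
    assert sample_buffer_indices(1, 10) == [0] * 10

    # Buffer size equal to the number of events to process.
    indices = sample_buffer_indices(10, 10)
    assert len(indices) == 10
    assert all(0 <= i < 10 for i in indices)

    # An invalid buffer size is rejected up front (an assumption of this model).
    try:
        sample_buffer_indices(0, 10)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for buffer_size=0")


check_edge_cases()
```

In the real test suite the same three cases would more naturally be expressed as parametrized pytest cases around the existing `test_buffered_reader` setup.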

sonarqubecloud · 2025-01-10T15:03:33Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

benjaminhuth and others added 2 commits January 10, 2025 11:14

add buffered reader wrapper

f88815a

add test

ffcc912

benjaminhuth requested a review from paulgessinger January 10, 2025 10:21

github-actions bot added this to the next milestone Jan 10, 2025

github-actions bot added the Component - Examples Affects the Examples module label Jan 10, 2025

coderabbitai bot reviewed Jan 10, 2025

update

a83827e

paulgessinger reviewed Jan 10, 2025

coderabbitai bot reviewed Jan 10, 2025

Merge branch 'main' into feature/buffered-reader

dec9e86
