ReadComputeWrite (RCW)
C++ Guide Series
Architecture | Knowledge Base | Networking | Containers | RCW | Threads | Optimizations | KaRL | Encryption | Checkpointing | Knowledge Performance | Logging
RCW is a powerful paradigm for writing multithreaded code that is fast, efficient, and race-condition free when it comes to working with the thread-safe knowledge base. In this wiki, we'll explain why RCW is necessary, what it provides developers, and how to effectively use RCW tools like Staged containers in MADARA.
The RCW concept separates access to a thread-safe context into distinct read, compute, and write phases. Working on local copies between the read and the write keeps other threads from changing values in the middle of a computation, which is vital in multithreaded and multiprocess applications. It is especially helpful in any situation where many readers are looking at a variable that may be updated by one or more other threads.
The three phases are:
- Read values from the thread-safe context into local variables (e.g., int64, double, or string)
- Compute on the local variables only, so the data stays consistent for the duration of the computation
- Write updated values from the local variables back to the thread-safe context
What this provides:
- Computation that is as fast as possible
- Data integrity throughout the compute phase
- Controlled execution and fewer mutex calls
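The three phases can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `Context`, `lock_count()`, and `sum_to_n` are hypothetical stand-ins written for this example and are not part of the MADARA API. The sketch counts mutex acquisitions to show that one RCW pass locks twice (one read, one write) no matter how long the compute phase runs.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for a thread-safe context (not the MADARA API).
// Every get/set takes the mutex, and lock_count() reports how often.
class Context {
public:
  std::int64_t get(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    ++lock_count_;
    auto found = values_.find(key);
    if (found == values_.end()) {
      return 0;  // unset keys read as zero in this sketch
    }
    return found->second;
  }

  void set(const std::string& key, std::int64_t value) {
    std::lock_guard<std::mutex> guard(mutex_);
    ++lock_count_;
    values_[key] = value;
  }

  int lock_count() const { return lock_count_.load(); }

private:
  mutable std::mutex mutex_;
  mutable std::atomic<int> lock_count_{0};
  std::map<std::string, std::int64_t> values_;
};

// One RCW pass: sums 1..n, where n lives in the shared context.
std::int64_t sum_to_n(Context& context) {
  // Read phase: a single locked access copies the input into a local.
  const std::int64_t n = context.get("n");

  // Compute phase: touches only locals, so no locks are taken here.
  std::int64_t total = 0;
  for (std::int64_t i = 1; i <= n; ++i) {
    total += i;
  }

  // Write phase: a single locked access publishes the data product.
  context.set("total", total);
  return total;
}
```

Had the loop read or written the context on every iteration, the number of mutex calls would grow with `n` and another thread could change `n` partway through the sum.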
- Always execute a read phase for every variable used in the computation. You only need to write the data products of the computation.
- Never write data to variables you do not own. If you do, you are likely overwriting an update from the thread or process that owns the data. Consider the following:
Agent 0 has two threads. The first thread updates Agent 0's GPS position. The second reads Agent 0's position and Agent 1's position and determines whether the two agents are on a potential collision course. Agent 1 does the same thing.
Now, consider the situation where you always read positions into local variables, and you also always call write on all containers when you are cleaning up a compute phase. Here's how the execution would look if you do that.
Incorrect RCW
agent_0.gps.read()
agent_1.gps.read()
agent_0.can_collide = compute_collision(agent_0, agent_1)
agent_1.can_collide = *agent_0.can_collide // note another process owns this variable
agent_0.gps.write() // note another thread owns this variable
agent_1.gps.write() // note another process owns this variable
agent_0.can_collide.write() // literally the only proper RCW write
agent_1.can_collide.write() // note another process owns this variable
The result of the above is that you would overwrite not only your own GPS thread's position updates, but also agent 1's position updates and its collision logic.
Correct RCW
agent_0.gps.read()
agent_1.gps.read()
agent_0.can_collide = compute_collision(agent_0, agent_1)
agent_0.can_collide.write() // literally the only proper RCW write
The above is how you ensure that the GPS values are local to your current computation and cannot be changed by any external entity while it runs. If your thread is responsible for tracking collision likelihood for agent_0, then the only data product that makes sense to write is the can_collide boolean above.
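The correct pattern above could look roughly like this in C++. It is a minimal sketch, not MADARA code: `SharedStore`, `Position`, the key names, and the distance-based `update_collision_flag` check are all assumptions made for illustration, since the original does not say how `compute_collision` works. The store counts writes per key so that ownership violations are easy to spot.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <mutex>
#include <string>

struct Position {
  double x;
  double y;
};

// Hypothetical thread-safe store (not the MADARA API) that records
// how many times each key has been written.
class SharedStore {
public:
  Position position(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto found = positions_.find(key);
    if (found == positions_.end()) {
      return Position{0.0, 0.0};
    }
    return found->second;
  }

  void set_position(const std::string& key, Position value) {
    std::lock_guard<std::mutex> guard(mutex_);
    positions_[key] = value;
    ++writes_[key];
  }

  bool flag(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto found = flags_.find(key);
    return found != flags_.end() && found->second;
  }

  void set_flag(const std::string& key, bool value) {
    std::lock_guard<std::mutex> guard(mutex_);
    flags_[key] = value;
    ++writes_[key];
  }

  int writes(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto found = writes_.find(key);
    if (found == writes_.end()) {
      return 0;
    }
    return found->second;
  }

private:
  mutable std::mutex mutex_;
  std::map<std::string, Position> positions_;
  std::map<std::string, bool> flags_;
  std::map<std::string, int> writes_;
};

// Agent 0's collision thread: reads both positions, writes only its own flag.
bool update_collision_flag(SharedStore& store, double min_separation) {
  // Read phase: local copies of data owned by other threads/processes.
  const Position self = store.position("agent.0.gps");
  const Position other = store.position("agent.1.gps");

  // Compute phase: an assumed distance check stands in for compute_collision.
  const double distance = std::hypot(self.x - other.x, self.y - other.y);
  const bool can_collide = distance < min_separation;

  // Write phase: the only variable this thread owns.
  store.set_flag("agent.0.can_collide", can_collide);
  return can_collide;
}
```

After a call to `update_collision_flag`, the write counts for both GPS keys and for `agent.1.can_collide` are unchanged, which is the property the incorrect version violates.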
We currently support a few base RCW classes that simplify RCW mechanics.
- IntegerStaged
- DoubleStaged
- StringStaged
Each of these provides read() and write() methods and holds an internal local variable that is extremely fast to use.
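To make the idea of a staged container concrete, here is a simplified sketch of how such a class might be structured. `IntegerStore` and `StagedInteger` are hypothetical names invented for this example. They are not the MADARA `IntegerStaged` implementation, and the real class may differ in constructor arguments and operators. The point is only that `read()` and `write()` are the sole places the shared store is touched.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical thread-safe integer store (not the MADARA knowledge base).
class IntegerStore {
public:
  std::int64_t get(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto found = values_.find(key);
    if (found == values_.end()) {
      return 0;
    }
    return found->second;
  }

  void set(const std::string& key, std::int64_t value) {
    std::lock_guard<std::mutex> guard(mutex_);
    values_[key] = value;
  }

private:
  mutable std::mutex mutex_;
  std::map<std::string, std::int64_t> values_;
};

// Hypothetical staged wrapper: a local value plus explicit read()/write().
class StagedInteger {
public:
  StagedInteger(std::string key, IntegerStore& store)
      : key_(std::move(key)), store_(store) {}

  // Read phase: copy the shared value into the local variable.
  void read() { local_ = store_.get(key_); }

  // Write phase: publish the local variable back to the store.
  void write() { store_.set(key_, local_); }

  // Compute phase access: plain local memory, no locking.
  std::int64_t& operator*() { return local_; }

private:
  std::string key_;
  IntegerStore& store_;
  std::int64_t local_ = 0;
};

// Increments a shared counter many times with only two store accesses.
std::int64_t increment_counter(IntegerStore& store, int times) {
  StagedInteger counter("counter", store);
  counter.read();
  for (int i = 0; i < times; ++i) {
    ++(*counter);
  }
  counter.write();
  return *counter;
}
```

The dereference in the compute loop never takes the mutex, which is why the local variable is so cheap to use compared with going through the store on every iteration.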
Any container or variable in MADARA can be used in RCW phases. For example, with NativeDoubleVector:
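// knowledge is assumed to be an existing madara::knowledge::KnowledgeBase
// read phase: copy the shared vector into a local std::vector
madara::knowledge::containers::NativeDoubleVector gps("agent.0.position", knowledge);
std::vector<double> agent_0_position = gps.to_doubles();
// compute phase: work only on the local copy
agent_0_position[0] = 55.5555;
agent_0_position[1] = 11.1111;
// write phase: publish the local copy back to the knowledge base
gps.set(agent_0_position);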
The double vector is slightly more complicated than other classes like Integer, Double, and String because it can have multiple elements. There are faster ways to do the above if you only update one of many elements in a vector (e.g., an image feature embedding). Only update the elements you need to update, whenever possible.
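The advice to update only the elements you need can be illustrated with a small sketch. `VectorStore` and `scale_element` are hypothetical and written only for this example. They are not MADARA classes, and the per-element `set` here is an assumption about what a suitable container interface would offer. The store counts element writes so the cost of the write phase is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical thread-safe vector (not a MADARA container) that counts
// how many elements have been written.
class VectorStore {
public:
  explicit VectorStore(std::vector<double> initial)
      : values_(std::move(initial)) {}

  double get(std::size_t index) const {
    std::lock_guard<std::mutex> guard(mutex_);
    return values_.at(index);
  }

  void set(std::size_t index, double value) {
    std::lock_guard<std::mutex> guard(mutex_);
    values_.at(index) = value;
    ++elements_written_;
  }

  std::size_t elements_written() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return elements_written_;
  }

private:
  mutable std::mutex mutex_;
  std::vector<double> values_;
  std::size_t elements_written_ = 0;
};

// RCW on a single element: read one value, compute, write one value.
double scale_element(VectorStore& store, std::size_t index, double factor) {
  const double current = store.get(index);   // read phase
  const double updated = current * factor;   // compute phase
  store.set(index, updated);                 // write phase
  return updated;
}
```

For a 512-element embedding, this touches one element in the write phase instead of copying all 512 back, and it leaves every other element free to be updated by whichever thread owns it.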