docs: explain design choice
Co-authored-by: Andrew Lamb <[email protected]>
crepererum and alamb committed Apr 9, 2024
1 parent 01a4468 commit df7a034
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion datafusion/physical-plan/src/repartition/mod.rs
@@ -161,7 +161,19 @@ impl RepartitionExecState {
}
}

-type LazyState = Arc<OnceCell<Mutex<RepartitionExecState>>>;
+/// Lazily initialized state
+///
+/// Note that the state is initialized ONCE for all partitions by a single task (thread).
+/// This may take a short while. It is also likely that multiple threads
+/// call execute at the same time, because we have just started "target partitions" tasks,
+/// which is commonly set to the number of CPU cores, and all call execute at the same time.
+///
+/// Thus, use a **tokio** `OnceCell` for this initialization so as not to waste CPU cycles
+/// in a futex lock but instead allow other threads to do something useful.
+///
+/// Uses a parking_lot `Mutex` to control other accesses, as they are of very short duration
+/// (e.g. removing channels on completion), where the overhead of `await` is not warranted.
+type LazyState = Arc<tokio::sync::OnceCell<Mutex<RepartitionExecState>>>;

/// A utility that can be used to partition batches based on [`Partitioning`]
pub struct BatchPartitioner {
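To make the shape of `LazyState` concrete, here is a minimal, dependency-free sketch of the same "initialize once, then guard short accesses with a mutex" pattern. It is not the DataFusion implementation: `State`, `run_partitions`, and the string "channels" are hypothetical, and `std::sync::OnceLock` and `std::sync::Mutex` are swapped in for tokio's `OnceCell` and parking_lot's `Mutex` so the example builds with the standard library alone.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

/// Hypothetical stand-in for `RepartitionExecState`: one entry per output partition.
struct State {
    channels: HashMap<usize, String>,
}

/// Same nesting as `LazyState`, but built from std types (an assumption of this sketch).
type LazyState = Arc<OnceLock<Mutex<State>>>;

/// Spawns one thread per partition. Each thread "executes" its partition and then
/// removes its own channel. Returns (times the initializer ran, channels left over).
fn run_partitions(partitions: usize) -> (usize, usize) {
    let state: LazyState = Arc::new(OnceLock::new());
    let init_runs = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..partitions)
        .map(|partition| {
            let state = Arc::clone(&state);
            let init_runs = Arc::clone(&init_runs);
            thread::spawn(move || {
                // Only one caller runs the closure; concurrent callers block until it is done.
                let shared = state.get_or_init(|| {
                    init_runs.fetch_add(1, Ordering::SeqCst);
                    let channels = (0..partitions)
                        .map(|p| (p, format!("channel-{}", p)))
                        .collect();
                    Mutex::new(State { channels })
                });
                // Short critical section: drop this partition's channel on completion.
                shared.lock().unwrap().channels.remove(&partition);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let remaining = state
        .get()
        .map_or(0, |m| m.lock().unwrap().channels.len());
    (init_runs.load(Ordering::SeqCst), remaining)
}
```

Calling `run_partitions(4)` runs the initializer once even though four threads race to call it. The difference from the commit is in how the losers of that race wait: `OnceLock::get_or_init` blocks the calling thread, whereas tokio's `OnceCell::get_or_init` is an `async fn`, so a waiting task yields at `.await` and its worker thread can run other tasks in the meantime. The mutex, by contrast, is held only for a single map removal, which is the kind of access the doc comment argues does not justify an `await`.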
