comphead commented on code in PR #16289:
URL: https://github.com/apache/datafusion/pull/16289#discussion_r2132492980
##########
datafusion/execution/src/memory_pool/mod.rs:
##########
@@ -98,6 +98,61 @@ pub use pool::*;
 /// operator will spill the intermediate buffers to disk, and release memory
 /// from the memory pool, and continue to retry memory reservation.
 ///
+/// # Related Structs
+///
+/// To better understand memory management in DataFusion, here are the key structs
+/// and their relationships:
+///
+/// - [`MemoryConsumer`]: A named allocation traced by a particular operator. If an
+///   execution is parallelized, and there are multiple partitions of the same
+///   operator, each partition will have a separate `MemoryConsumer`.
+/// - `SharedRegistration`: A registration of a `MemoryConsumer` with a `MemoryPool`.
+///   `SharedRegistration` and `MemoryPool` have a many-to-one relationship. `MemoryPool`
+///   implementation can decide how to allocate memory based on the registered consumers.
+///   (e.g. `FairSpillPool` will try to share available memory evenly among all registered
+///   consumers)
+/// - [`MemoryReservation`]: Each `MemoryConsumer`/operator can have multiple
+///   `MemoryReservation`s for different internal data structures. The relationship
+///   between `MemoryConsumer` and `MemoryReservation` is one-to-many. This design
+///   enables cleaner operator implementations:
+///   - Different `MemoryReservation`s can be used for different purposes
+///   - `MemoryReservation` follows RAII principles - to release a reservation,
+///     simply drop the corresponding `MemoryReservation` object
+///
+/// ## Relationship Diagram
+///
+/// ```text
+/// ┌──────────────────┐     ┌──────────────────┐
+/// │MemoryReservation │     │MemoryReservation │
+/// └───┬──────────────┘     └──────────────────┘ ......
+///     │belongs to                    │
+///     │      ┌───────────────────────┘           │  │
+///     │      │                                   │  │
+///     ▼      ▼                                   ▼  ▼
+/// ┌────────────────────────┐       ┌────────────────────────┐
+/// │   SharedRegistration   │       │   SharedRegistration   │
+/// │   ┌────────────────┐   │       │   ┌────────────────┐   │
+/// │   │                │   │       │   │                │   │
+/// │   │ MemoryConsumer │   │       │   │ MemoryConsumer │   │
+/// │   │                │   │       │   │                │   │
+/// │   └────────────────┘   │       │   └────────────────┘   │
+/// └────────────┬───────────┘       └────────────┬───────────┘
+///              │                                │
+///              │                        register│into
+///              │                                │
+///              └─────────────┐   ┌──────────────┘
+///                            │   │
+///                            ▼   ▼
+///    ╔═══════════════════════════════════════════════════╗
+///    ║                                                   ║
+///    ║                    MemoryPool                     ║
+///    ║                                                   ║
+///    ╚═══════════════════════════════════════════════════╝
+///
+/// For example, there are two parallel partitions of an operator X: each partition
+/// corresponds to a `MemoryConsumer` in the above diagram. Inside operator X there are

Review Comment:
   Thanks @2010YOUY01. Can we clarify slightly more on "Inside operator X there are several `MemoryReservation`s for different internal data structures"? I read it twice and am still not sure what it means 🤔 Is it a `MemoryReservation` per data structure?
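
One possible reading of the sentence in question, sketched as a toy model rather than with DataFusion's real types: each partition of operator X registers once with the pool, and then holds one reservation per internal data structure (for example a hash table and an output buffer). `ToyPool`, `ToyRegistration`, and `ToyReservation` are hypothetical names invented for this sketch; they only mimic the relationships in the diagram (many reservations per registration, many registrations per pool, release on drop) and are not the crate's API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory pool: one byte budget shared by everyone.
struct ToyPool {
    limit: usize,
    used: AtomicUsize,
}

impl ToyPool {
    fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

/// Hypothetical stand-in for a consumer registered into a pool
/// (one per partition of an operator).
struct ToyRegistration {
    name: String,
    pool: Arc<ToyPool>,
}

impl ToyRegistration {
    fn register(name: &str, pool: &Arc<ToyPool>) -> Arc<Self> {
        Arc::new(Self { name: name.to_string(), pool: Arc::clone(pool) })
    }
}

/// Hypothetical stand-in for a reservation: tracks the bytes held by one
/// internal data structure and gives them back to the pool when dropped.
struct ToyReservation {
    registration: Arc<ToyRegistration>,
    size: usize,
}

impl ToyReservation {
    fn new(registration: &Arc<ToyRegistration>) -> Self {
        Self { registration: Arc::clone(registration), size: 0 }
    }

    /// Try to reserve `bytes` more; fails if the pool budget would be exceeded.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        let pool = &self.registration.pool;
        pool.used
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
                let new_used = used.checked_add(bytes)?;
                (new_used <= pool.limit).then_some(new_used)
            })
            .map_err(|used| {
                format!(
                    "{}: cannot reserve {} bytes ({} of {} already used)",
                    self.registration.name, bytes, used, pool.limit
                )
            })?;
        self.size += bytes;
        Ok(())
    }
}

impl Drop for ToyReservation {
    /// RAII: dropping the reservation releases its bytes back to the pool.
    fn drop(&mut self) {
        self.registration.pool.used.fetch_sub(self.size, Ordering::SeqCst);
    }
}

fn main() {
    let pool = ToyPool::new(1024);

    // Two parallel partitions of operator X: one registration each.
    let partition0 = ToyRegistration::register("X[partition 0]", &pool);
    let partition1 = ToyRegistration::register("X[partition 1]", &pool);

    // Partition 0 tracks two internal data structures separately.
    let mut hash_table = ToyReservation::new(&partition0);
    let mut output_buffer = ToyReservation::new(&partition0);
    let mut other_hash_table = ToyReservation::new(&partition1);

    hash_table.try_grow(400).unwrap();
    output_buffer.try_grow(100).unwrap();
    other_hash_table.try_grow(400).unwrap();
    println!("bytes in use: {}", pool.used());

    // The budget is shared, so this request from partition 1 is rejected.
    if let Err(e) = other_hash_table.try_grow(200) {
        println!("rejected: {e}");
    }

    // Dropping one reservation frees only that data structure's memory.
    drop(output_buffer);
    println!("bytes in use after drop: {}", pool.used());
}
```

Under this reading the answer would be "yes, roughly one reservation per data structure, if the operator author chooses to track them separately", which is a guess at the intent and worth confirming with the PR author.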