You are mixing up storage and execution memory.

The sequence of storage retention and eviction is as follows.

- Execution and storage share a unified region (M).
- When no Spark execution is underway, storage can take up the whole of M. 
Conversely, when nothing is cached, execution can take up the whole of M.
- When both Spark execution and storage are underway, a priority order 
determines which of them can claim regions of M.
- When Spark execution starts, if a portion of M is already occupied by storage 
but is now needed for execution, execution starts evicting storage to reclaim 
the space.
- But this eviction can't reclaim the whole of M for execution. There is a 
reserved threshold R (a subregion of M): execution can evict cached blocks 
only until total storage usage falls to R. Once storage is at or below R, 
further eviction by execution is refused.
- In short, R is the subregion of M where storage always has priority over 
execution: cached blocks within R are never evicted to make room for 
execution.
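
The rules above can be sketched as a toy model. This is only an illustration, 
not Spark's actual memory manager: the class name, method names, and the sizes 
(M = 100 units, R = 50 units) are made up for the example, and it ignores LRU 
ordering, block granularity, and per-task sharing. In a real deployment M and R 
are derived from spark.memory.fraction and spark.memory.storageFraction.

```python
from dataclasses import dataclass


@dataclass
class UnifiedMemory:
    """Toy model of the unified region M with a protected storage region R."""
    m: int              # total size of the unified region M (hypothetical units)
    r: int              # storage threshold R, with R <= M
    storage: int = 0    # units currently held by cached blocks
    execution: int = 0  # units currently held by execution

    def free(self) -> int:
        return self.m - self.storage - self.execution

    def cache(self, n: int) -> int:
        """Storage may use any free space in M but never evicts execution."""
        granted = min(n, self.free())
        self.storage += granted
        return granted

    def acquire_execution(self, n: int) -> int:
        """Execution uses free space first, then evicts storage down to R."""
        shortfall = max(0, n - self.free())
        evictable = max(0, self.storage - self.r)  # only storage above R
        self.storage -= min(shortfall, evictable)
        granted = min(n, self.free())
        self.execution += granted
        return granted


mem = UnifiedMemory(m=100, r=50)
print(mem.cache(100))             # 100: with no execution, storage fills M
print(mem.acquire_execution(30))  # 30: execution evicts 30 units of storage
print(mem.acquire_execution(40))  # 20: eviction stops once storage is down to R
print(mem.storage, mem.execution) # 50 50
```

The third call asks for 40 units but gets only 20, because the remaining 50 
units of cached blocks sit inside R and are not evicted.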



Regards,
Subhasis Mukherjee
________________________________
From: Sreyan Chakravarty <sreya...@gmail.com>
Sent: Wednesday, August 14, 2024 9:00:45 PM
To: user@spark.apache.org <user@spark.apache.org>
Subject: Need help understanding tuning docs

https://spark.apache.org/docs/latest/tuning.html#memory-management-overview

What is the meaning of:
"Execution may evict storage if necessary, but only until total storage memory 
usage falls under a certain threshold (R). In other words, R describes a 
subregion within M where cached blocks are never evicted. "

This seems contradictory. In simple terms, I read it as: once total memory 
usage crosses a threshold (R), Spark will start evicting storage in an LRU 
fashion.

But the line:

"In other words, R describes a subregion within M where cached blocks are never 
evicted."

seems to contradict that. What is going on?

--
Regards,
Sreyan Chakravarty
