FWIW, I think having cached RDD partitions prevents an executor from being
removed under dynamic allocation by default; see SPARK-8958
<https://issues.apache.org/jira/browse/SPARK-8958>. The
"spark.dynamicAllocation.cachedExecutorIdleTimeout" config
<http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
(which defaults to infinity) controls how long such an executor may sit idle
before it becomes eligible for removal.
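
To illustrate, here is a minimal sketch of wiring that up in a SparkConf.
The timeout values are just placeholders, not recommendations, and the
app name is made up; only the config keys themselves come from the
dynamic-allocation docs:

    // Sketch only: illustrative values, assumes a cluster manager that
    // supports dynamic allocation with the external shuffle service.
    val conf = new org.apache.spark.SparkConf()
      .setAppName("dynamic-allocation-example")
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")  // required for dynamic allocation
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
      // Executors holding cached blocks are exempt from the normal idle timeout;
      // this setting (default: infinity) controls when they may be removed.
      .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "600s")
    val sc = new org.apache.spark.SparkContext(conf)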

On Fri, Oct 30, 2015 at 12:14 PM Justin Uang <justin.u...@gmail.com> wrote:

> Hey guys,
>
> According to the docs for 1.5.1, when an executor is removed under dynamic
> allocation, its cached data is gone. If I use off-heap storage like
> Tachyon, conceptually this issue goes away, but is the cached data still
> available in practice? That would be great, because then we could set
> spark.dynamicAllocation.cachedExecutorIdleTimeout to be quite small.
>
> ==================
> In addition to writing shuffle files, executors also cache data either on
> disk or in memory. When an executor is removed, however, all cached data
> will no longer be accessible. There is currently not yet a solution for
> this in Spark 1.2. In future releases, the cached data may be preserved
> through an off-heap storage similar in spirit to how shuffle files are
> preserved through the external shuffle service.
> ==================
>
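
Regarding the off-heap question above, here is a rough sketch of what
persisting through Tachyon looks like in the 1.5 API. The Tachyon URL,
base dir, and input path are assumptions about a local setup, and whether
the blocks remain usable after the owning executor is removed is exactly
the open question in this thread:

    import org.apache.spark.storage.StorageLevel

    // Sketch only: assumes a Tachyon master at this URL
    // (spark.externalBlockStore.url defaults to tachyon://localhost:19998 in 1.5)
    // and that these configs are set before the SparkContext is created.
    val conf = new org.apache.spark.SparkConf()
      .setAppName("off-heap-cache-example")
      .set("spark.externalBlockStore.url", "tachyon://localhost:19998")
      .set("spark.externalBlockStore.baseDir", "/tmp/spark")
    val sc = new org.apache.spark.SparkContext(conf)

    // OFF_HEAP stores serialized blocks in the external block store (Tachyon)
    // rather than in executor heap memory.
    val cached = sc.textFile("hdfs:///some/input").persist(StorageLevel.OFF_HEAP)
    cached.count()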
