Hey all,

I have a conceptual question that I'm having a hard time finding an answer to.

Is the JVM where the Spark driver runs also used to run computations over
RDD partitions and to persist them? The answer is obviously yes in local
mode. But when Spark runs on YARN/Mesos/standalone with many executors, is
the answer no?
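For concreteness, here is a minimal sketch of the pattern I have in mind
(the input path and app name are just placeholders):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    object DriverVsExecutors {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("driver-vs-executors") // placeholder name
          .getOrCreate()
        val sc = spark.sparkContext

        // Transformations run on the executors, and persist() keeps
        // the partitions in executor storage memory...
        val lengths = sc.textFile("hdfs:///data/input") // placeholder path
          .map(_.length)
          .persist(StorageLevel.MEMORY_ONLY)

        // ...whereas collect() materializes the entire result in the
        // driver JVM's heap. This is where my question applies.
        val collected = lengths.collect()
        println(s"collected ${collected.length} records on the driver")

        spark.stop()
      }
    }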

*My motivation is the following*
In the "Executors" tab of the Spark UI, the "Storage Memory" column of the
driver's row shows, for example, "0.0 B / 14.2 GB". This suggests that
those 14 GB of RAM are not available to computations done in the driver
but are instead reserved for RDD caching.

But I have plenty of memory on the executors to cache RDDs there, and I
would like to use the driver's memory to be able to collect medium-sized
data. Since I assume collected data is stored outside the memory reserved
for caching, those 14 GB would not be available for holding collected
results.

It looks like Spark 2.0.0 does this cache vs. non-cache memory management
automatically somehow, but I do not understand it yet.
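From skimming the docs, my guess is that the relevant knobs are
spark.driver.memory, spark.memory.fraction, and
spark.memory.storageFraction. A sketch of how I would set them (the
values are just examples, and I may well be misreading what they
control):

    import org.apache.spark.SparkConf

    // Example values only; my (possibly wrong) reading of the 2.0.0 docs:
    //  - spark.memory.fraction:        share of the heap used for the
    //                                  unified execution + storage region
    //  - spark.memory.storageFraction: portion of that region shielded
    //                                  from eviction by execution
    val conf = new SparkConf()
      .set("spark.memory.fraction", "0.6")
      .set("spark.memory.storageFraction", "0.5")
    // spark.driver.memory has to be set before the driver JVM starts,
    // e.g. via `spark-submit --driver-memory 20g`, not from SparkConf
    // inside the driver itself.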

Thanks for any insight on this.

Jakub D.
