Alternatively, watch these Spark Summit talks on memory management to get insight from a developer's perspective:
https://spark-summit.org/2016/events/deep-dive-apache-spark-memory-management/
https://spark-summit.org/2017/events/a-developers-view-into-sparks-memory-model/

Cheers
Jules

Sent from my iPhone
Pardon the dumb thumb typos :)

> On Sep 12, 2017, at 4:07 AM, Vikash Pareek <vikaspareek1...@gmail.com> wrote:
>
> Obviously, you can't store 900GB of data in 80GB of memory.
> Spark has a concept called disk spill: when your data grows beyond what
> fits in memory, the excess is spilled out to disk.
>
> Also, Spark doesn't use all of its memory for storing data; a fraction
> is reserved for execution (processing and shuffling) and for internal
> data structures.
> For more detail, have a look at
> https://0x0fff.com/spark-memory-management/
>
> Hope this helps.
>
> -----
> Vikash Pareek
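To make the quoted points concrete, here is a minimal Scala sketch of caching a dataset that may not fit in memory, using a storage level that permits spilling to disk. The config values shown are Spark's defaults under the unified memory manager (Spark 1.6+) and are set explicitly only to illustrate the storage/execution split; the app name and dataset size are made up for the example.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    object SpillSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("memory-spill-sketch") // hypothetical app name
          // Fraction of (heap - 300MB reserved) shared by execution and
          // storage under the unified memory manager; 0.6 is the default.
          .config("spark.memory.fraction", "0.6")
          // Portion of that region whose cached blocks are immune to
          // eviction by execution; 0.5 is the default.
          .config("spark.memory.storageFraction", "0.5")
          .getOrCreate()

        // ~8GB of longs: illustrative size that can exceed an
        // executor's storage pool.
        val big = spark.range(0L, 1000000000L).toDF("id")

        // MEMORY_AND_DISK: partitions that don't fit in memory are
        // spilled to local disk instead of being dropped.
        big.persist(StorageLevel.MEMORY_AND_DISK)
        println(s"rows: ${big.count()}")

        spark.stop()
      }
    }

With MEMORY_ONLY, partitions that don't fit are simply not cached and get recomputed on the next access; MEMORY_AND_DISK instead writes them to local disk, trading recomputation for disk I/O, which is the spill behavior described above. (Shuffle spill during execution is a separate, automatic mechanism and needs no storage-level setting.)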