Sure - feel free to totally disagree.
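For anyone who wants to try the ramdisk workaround mentioned below, here is a
minimal sketch. The /mnt/ramdisk path and app name are just examples; the
tmpfs mount has to be created on each node first, e.g. with
`mount -t tmpfs -o size=200g tmpfs /mnt/ramdisk`:

    import org.apache.spark.{SparkConf, SparkContext}

    // Point shuffle and spill files at the tmpfs mount so they never touch
    // the disk subsystem. Size the ramdisk for your largest shuffle, since
    // writes will fail once it fills up.
    val conf = new SparkConf()
      .setAppName("shuffle-on-ramdisk")         // example app name
      .set("spark.local.dir", "/mnt/ramdisk")   // example mount point
    val sc = new SparkContext(conf)

Note that spark.local.dir is overridden by SPARK_LOCAL_DIRS (standalone,
Mesos) or LOCAL_DIRS (YARN) when the cluster manager sets those, so on YARN
the ramdisk path has to be configured through the environment instead.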
On Fri, Apr 1, 2016 at 2:10 PM, Michael Slavitch <slavi...@gmail.com> wrote:

> I totally disagree that it's not a problem.
>
> - Network fetch throughput on 40G Ethernet exceeds the throughput of NVMe
>   drives.
> - What Spark is depending on is Linux's IO cache as an effective buffer
>   pool. This is fine for small jobs but not for jobs with datasets in the
>   TB/node range.
> - On larger jobs, flushing the cache causes Linux to block.
> - On a modern 56-hyperthread 2-socket host, the latency caused by multiple
>   executors writing out to disk increases greatly.
>
> I thought the whole point of Spark was in-memory computing? It is in fact
> in-memory for some things but uses spark.local.dir as a buffer pool for
> others.
>
> *Hence, the performance of Spark is gated by the performance of
> spark.local.dir, even on large memory systems.*
>
> "Currently it is not possible to not write shuffle files to disk."
>
> What changes *would* make it possible?
>
> The only one that seems possible is to clone the shuffle service and make
> it in-memory.
>
> On Apr 1, 2016, at 4:57 PM, Reynold Xin <r...@databricks.com> wrote:
>
> spark.shuffle.spill actually has nothing to do with whether we write
> shuffle files to disk. Currently it is not possible to not write shuffle
> files to disk, and typically it is not a problem because the network fetch
> throughput is lower than what disks can sustain. In most cases, especially
> with SSDs, there is little difference between putting all of those in
> memory and on disk.
>
> However, it is becoming more common to run Spark on a small number of
> beefy nodes (e.g. 2 nodes each with 1TB of RAM). We do want to look into
> improving performance for those. In the meantime, you can set up local
> ramdisks on each node for shuffle writes.
>
> On Fri, Apr 1, 2016 at 11:32 AM, Michael Slavitch <slavi...@gmail.com>
> wrote:
>
>> Hello;
>>
>> I'm working on Spark with very large memory systems (2TB+) and notice
>> that Spark spills to disk in shuffle. Is there a way to force Spark to
>> stay in memory when doing shuffle operations? The goal is to keep the
>> shuffle data either in the heap or in off-heap memory (in 1.6.x) and
>> never touch the IO subsystem. I am willing to have the job fail if it
>> runs out of RAM.
>>
>> Setting spark.shuffle.spill to false is deprecated in 1.6 and does not
>> work with the tungsten-sort shuffle manager in 1.5.x:
>>
>> "WARN UnsafeShuffleManager: spark.shuffle.spill was set to false, but
>> this is ignored by the tungsten-sort shuffle manager; its optimized
>> shuffles will continue to spill to disk when necessary."
>>
>> If this is impossible via configuration changes, what code changes would
>> be needed to accomplish this?