Re: Custom Hadoop InputSplit, Spark partitions, spark executors/task and Yarn containers

2015-09-24 Thread Sabarish Sasidharan
September 24, 2015 at 2:43 AM, To: Anfernee Xu, Cc: user@spark.apache.org
> Hi Anfernee,
> That's correct that each InputSplit will map to exactly …

Re: Custom Hadoop InputSplit, Spark partitions, spark executors/task and Yarn containers

2015-09-24 Thread Adrian Tanase
To: Anfernee Xu, Cc: user@spark.apache.org
> Hi Anfernee,
> That's correct that each InputSplit will map to exactly a Spark partition. On YARN, each Spa…

Re: Custom Hadoop InputSplit, Spark partitions, spark executors/task and Yarn containers

2015-09-23 Thread Sandy Ryza
Hi Anfernee,

That's correct that each InputSplit will map to exactly a Spark partition. On YARN, each Spark executor maps to a single YARN container. Each executor can run multiple tasks over its lifetime, both in parallel and sequentially. If you enable dynamic allocation, after the stage includi…
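The mapping Sandy describes can be sketched as a toy model (plain Java, not Spark's API; the class and method names here are illustrative only): one task per partition, one partition per InputSplit, and each executor (one YARN container) offering a fixed number of parallel task slots, so that when there are more tasks than slots, an executor runs tasks sequentially in "waves".

```java
// Toy model of the split/partition/task/executor relationship.
// This is NOT Spark code; it only illustrates the counting.
public class SplitTaskModel {
    // One partition per InputSplit, one task per partition.
    static int tasksFor(int numInputSplits) {
        return numInputSplits;
    }

    // Sequential "waves" needed when `executors` containers each
    // offer `coresPerExecutor` parallel task slots.
    static int waves(int tasks, int executors, int coresPerExecutor) {
        int slots = executors * coresPerExecutor;
        return (tasks + slots - 1) / slots; // ceiling division
    }

    public static void main(String[] args) {
        int splits = 10;              // produced by a custom InputFormat
        int tasks = tasksFor(splits); // 10 partitions -> 10 tasks
        System.out.println(tasks);              // 10
        System.out.println(waves(tasks, 2, 2)); // 4 slots -> 3 waves
    }
}
```

With 2 executors of 2 cores each, the 10 tasks run in 3 waves, which is why a single executor can run multiple tasks over its lifetime.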

Custom Hadoop InputSplit, Spark partitions, spark executors/task and Yarn containers

2015-09-23 Thread Anfernee Xu
Hi Spark experts,

I've come across these terms and have some confusion; could you please help me understand them better? For instance, I have implemented a Hadoop InputFormat to load my external data in Spark; in turn, my custom InputFormat will create a bunch of InputSplits. My questi…
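The part of the question that is recoverable is how a custom InputFormat controls partitioning: the splits returned by its getSplits() determine the Spark partition count. A minimal sketch of that split computation (plain Java with no Hadoop dependency; the class name, the (offset, length) representation, and the sizes are assumptions for illustration, loosely mirroring what FileInputFormat does):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch: chopping a byte range into fixed-size splits,
// the way a file-based InputFormat's getSplits() typically does.
// Each split becomes exactly one Spark partition.
public class SplitSketch {
    // Returns (offset, length) pairs, analogous to FileSplit boundaries.
    static List<long[]> computeSplits(long fileLen, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        for (long off = 0; off < fileLen; off += splitSize) {
            splits.add(new long[]{off, Math.min(splitSize, fileLen - off)});
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 1 GiB input with 128 MiB splits -> 8 splits -> 8 partitions.
        List<long[]> s = computeSplits(1L << 30, 128L << 20);
        System.out.println(s.size()); // 8
    }
}
```

Because the RDD built from this InputFormat gets one partition per split, tuning the split size here is equivalent to tuning the job's parallelism.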