Re: Local spark context on an executor

2017-03-22 Thread Shashank Mandil
...multiple mysql tables to be accessed in a single spark job, you can create a list of tables and run a map on that list. Something like: def getTable(tablename: String): DataFrame; def saveTable(d: DataFrame): Unit; val tables = sc.parallelize(...); tables.map(getTabl...
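A fleshed-out, driver-side sketch of the getTable/saveTable pattern suggested above. The JDBC URL, credentials, table names, and output paths are placeholders, the MySQL JDBC driver is assumed to be on the classpath, and saveTable is given an extra name parameter (an adjustment over the signatures quoted in the reply) so each table lands in its own path; note that spark.read has to be issued from the driver, so the table list here is a plain Scala collection rather than an RDD:

    import java.util.Properties
    import org.apache.spark.sql.{DataFrame, SparkSession}

    val spark = SparkSession.builder().appName("MultiTableJob").getOrCreate()

    // Hypothetical connection details
    val jdbcUrl = "jdbc:mysql://dbhost:3306/mydb"
    val props = new Properties()
    props.setProperty("user", "app_user")
    props.setProperty("password", "secret")

    // Read one MySQL table into a DataFrame (issued on the driver; the actual
    // scan is distributed across executors by the JDBC data source)
    def getTable(tablename: String): DataFrame =
      spark.read.jdbc(jdbcUrl, tablename, props)

    // Placeholder sink: write each table out as Parquet under its own path
    def saveTable(name: String, d: DataFrame): Unit =
      d.write.mode("overwrite").parquet(s"/tmp/out/$name")

    val tables = Seq("orders", "customers", "items")   // hypothetical table names
    tables.foreach(t => saveTable(t, getTable(t)))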

Re: Local spark context on an executor

2017-03-21 Thread Shashank Mandil
...it. On Wed, Mar 22, 2017 at 9:34 AM, Shashank Mandil <mandil.shash...@gmail.com> wrote: Hi All, I am using Spark in YARN cluster mode. When I run a YARN application it creates multiple executors on the Hadoop datanodes fo...

Local spark context on an executor

2017-03-21 Thread Shashank Mandil
Hi All, I am using Spark in YARN cluster mode. When I run a YARN application it creates multiple executors on the Hadoop datanodes for processing. Is it possible for me to create a local Spark context (master=local) on these executors to be able to get a Spark context? Theoretically, since eac...
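For reference, this is how a local-mode session is ordinarily constructed in a standalone JVM; whether the same call succeeds inside a task already running on a YARN executor is exactly what this question asks, so treat it as a sketch of the construct under discussion rather than a recommendation:

    import org.apache.spark.sql.SparkSession

    // A plain local-mode session, as built in a standalone driver process
    val localSpark = SparkSession.builder()
      .master("local[*]")
      .appName("local-context-experiment")
      .getOrCreate()

    println(localSpark.range(10).count())   // prints 10
    localSpark.stop()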

Spark data frame map problem

2017-03-21 Thread Shashank Mandil
Hi All, I have a Spark data frame which has 992 rows. When I run a map on this data frame I expect the map to be applied to all 992 rows. Since the mapper runs on executors across the cluster, I did a distributed count of the number of rows the mapper is actually run on. dataframe.map(r =>...
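One way to do the distributed count described here is with a LongAccumulator that the map body bumps on the executors; the sketch below builds a stand-in 992-row frame, since the original DataFrame isn't shown, and the names are hypothetical:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("MapRowCount").getOrCreate()
    import spark.implicits._

    val df = spark.range(992).toDF("id")              // stand-in for the 992-row frame
    val rowsProcessed = spark.sparkContext.longAccumulator("rowsProcessed")

    // The map body runs on the executors; the accumulator is merged back on the driver
    val mapped = df.map { r => rowsProcessed.add(1); r.getLong(0) }
    mapped.count()                                    // an action must run before the value is meaningful

    println(s"rows seen by the mapper: ${rowsProcessed.value}")   // expected: 992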

Re: Spark submit on yarn does not return with exit code 1 on exception

2017-02-03 Thread Shashank Mandil
...ned exit code 1, the use case touches Spark very little. What version is that? Do you see "There is an exception in the script exiting with status 1" printed out to stdout? Pozdrawiam, Jacek Laskowski https://medium.com...

Spark submit on yarn does not return with exit code 1 on exception

2017-02-03 Thread Shashank Mandil
Hi All, I wrote a test script which always throws an exception, as below: object Test { def main(args: Array[String]) { try { val conf = new SparkConf() .setAppName("Test") throw new RuntimeException("Some Exception") println("all done!") } catch...
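A minimal completion of that script; the catch block is an assumption about the original code's intent, guided by the message quoted in the reply ("There is an exception in the script exiting with status 1"), with System.exit(1) used to force a non-zero driver exit status:

    import org.apache.spark.{SparkConf, SparkContext}

    object Test {
      def main(args: Array[String]): Unit = {
        try {
          val conf = new SparkConf().setAppName("Test")
          val sc = new SparkContext(conf)
          throw new RuntimeException("Some Exception")
          println("all done!")                    // never reached
        } catch {
          case e: Exception =>
            println("There is an exception in the script exiting with status 1")
            // Exiting the driver JVM with a non-zero code is what lets
            // spark-submit (and the YARN client) surface the failure
            System.exit(1)
        }
      }
    }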

Re: Fwd: Need some help

2016-09-01 Thread Shashank Mandil
Hi Aakash, I think what it generally means is that you have to use the general Spark DataFrame APIs to bring in the data and crunch the numbers; however, you cannot use the KMeans clustering algorithm that already ships in the Spark MLlib library. I think a good place to start would be unders...
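A small sketch of the first half of that advice, bringing the data in with the plain DataFrame API; the input path and column names are hypothetical:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("ManualKMeansPrep").getOrCreate()

    // Load the raw points with the general DataFrame reader (no MLlib involved)
    val points = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/path/to/points.csv")                 // hypothetical input
      .selectExpr("cast(x as double) as x", "cast(y as double) as y")

    points.show(5)
    // The clustering itself (centroid initialisation, assignment, update steps)
    // would then be written by hand on top of DataFrame/RDD operations
    // instead of calling MLlib's KMeans.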