Re: multi-threaded Spark jobs

2016-01-26 Thread Elango Cheran
Igor Berman wrote: > IMHO, you are making a mistake. > Spark manages tasks and cores internally. When you open new threads inside > an executor, you are "over-provisioning" the executor (e.g. tasks on other > cores will be preempted) > > > > On 26 January 2016 at 07:59, Ela

multi-threaded Spark jobs

2016-01-25 Thread Elango Cheran
Hi everyone, I've gone through the effort of figuring out how to modify a Spark job to have an operation become multi-threaded inside an executor. I've written up an explanation of what worked, what didn't work, and why: http://www.elangocheran.com/blog/2016/01/using-clojure-to-create-multi-threa

Re: how to handle OOMError from groupByKey

2015-09-28 Thread Elango Cheran
>> memory. >> >> 2015-09-28 9:35 GMT+02:00 Akhil Das : >> >>> You can try to increase the number of partitions to get rid of the OOM >>> errors. Also try to use reduceByKey instead of groupByKey. >>> >>> Thanks >>> Best Regards >

how to handle OOMError from groupByKey

2015-09-25 Thread Elango Cheran
Hi everyone, I have an RDD of the format (user: String, timestamp: Long, state: Boolean). My task involves converting the states, where on/off is represented as true/false, into intervals of 'on' of the format (beginTs: Long, endTs: Long). So this task requires me, per user, to line up all of the
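
For illustration only, here is a minimal sketch in plain Python (not Spark) of the per-user conversion described in the message above. It is not the poster's code. The function names `on_intervals` and `intervals_by_user` are hypothetical, and the sketch assumes that repeated 'on' events extend the current interval and that an 'on' with no later 'off' is dropped; the original message does not say how those cases should be handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int, bool]  # (user, timestamp, state)
Interval = Tuple[int, int]      # (beginTs, endTs)


def on_intervals(events: Iterable[Tuple[int, bool]]) -> List[Interval]:
    """Turn one user's (timestamp, state) events into 'on' intervals."""
    intervals: List[Interval] = []
    begin = None  # timestamp at which the current 'on' interval started
    # Line the events up in time order before scanning them.
    for ts, state in sorted(events, key=lambda e: e[0]):
        if state and begin is None:
            begin = ts  # off -> on: open an interval
        elif not state and begin is not None:
            intervals.append((begin, ts))  # on -> off: close the interval
            begin = None
        # Assumption: repeated 'on' or repeated 'off' events are ignored.
    # Assumption: an interval still open at the end is dropped.
    return intervals


def intervals_by_user(records: Iterable[Record]) -> Dict[str, List[Interval]]:
    """Group records by user, then convert each user's events."""
    grouped = defaultdict(list)
    for user, ts, state in records:
        grouped[user].append((ts, state))
    return {user: on_intervals(evts) for user, evts in grouped.items()}


records = [
    ("a", 3, False), ("a", 1, True), ("a", 2, True),
    ("b", 5, True), ("a", 7, True), ("a", 9, False),
]
result = intervals_by_user(records)
```

Because the scan needs all of one user's events in time order, the in-memory grouping step here corresponds to the part of the job that collects every event for a user in one place.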