Re: counters in spark

2015-04-14 Thread Imran Rashid
Hi Robert, a lot of task metrics are already available for individual tasks. You can get these programmatically by registering a SparkListener, and you can also view them in the UI. E.g., for each task, you can see runtime, serialization time, amount of shuffle data read, etc. I'm working on als…
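A minimal sketch of the programmatic route described above, assuming an existing SparkContext named sc (TaskMetrics field names have shifted somewhat across Spark versions):

  import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

  // Print a few per-task metrics as each task completes.
  class TaskMetricsListener extends SparkListener {
    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
      val metrics = taskEnd.taskMetrics
      if (metrics != null) {  // metrics may be absent for failed tasks
        println(s"task ${taskEnd.taskInfo.taskId}: " +
          s"runTime=${metrics.executorRunTime}ms, " +
          s"resultSerialization=${metrics.resultSerializationTime}ms")
      }
    }
  }

  sc.addSparkListener(new TaskMetricsListener)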

Re: counters in spark

2015-04-13 Thread Grandl Robert
Guys, do you have any thoughts on this? Thanks, Robert On Sunday, April 12, 2015 5:35 PM, Grandl Robert wrote: Hi guys, I was trying to figure out some counters in Spark related to the amount of CPU or memory (in some metric) used by a task/stage/job, but I could not find…

Re: Counters in Spark

2015-02-13 Thread Imran Rashid
This is more or less the best you can do right now, but as has been pointed out, accumulators don't quite fit the bill for counters. There is an open issue to do something better, but no progress on it so far: https://issues.apache.org/jira/browse/SPARK-603 On Fri, Feb 13, 2015 at 11:12 AM, Mark Hamstra wrote:…
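For reference, the accumulator-based workaround the thread is discussing looks roughly like this; inputRDD is a stand-in for your data, and sc.accumulator is the Spark 1.x API:

  // Accumulator updates made inside an action (here, foreach) are applied
  // only once per task, even if a task is retried.
  val counter = sc.accumulator(0L, "records seen")
  inputRDD.foreach { _ => counter += 1L }
  println(counter.value)  // read on the driver once the action completes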

Re: Counters in Spark

2015-02-13 Thread Mark Hamstra
Except that transformations don't have an exactly-once guarantee, so this way of doing counters may produce different answers across various forms of failures and speculative execution. On Fri, Feb 13, 2015 at 8:56 AM, Sean McNamara wrote: > .map is just a transformation, so no work will actually be performed…
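A hypothetical sketch of that failure mode; the overcount only shows up when a task fails and is retried, or is speculatively re-executed:

  val counter = sc.accumulator(0L, "mapped records")
  val data = sc.parallelize(1 to 100, 4)
  data.map { x =>
    counter += 1L  // runs once per task *attempt*, not once per element overall
    x * 2
  }.count()
  // counter.value equals 100 only if no partition was recomputed; a retried
  // or speculative task adds its partition's elements to the total again.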

Re: Counters in Spark

2015-02-13 Thread Sean McNamara
.map is just a transformation, so no work will actually be performed until something takes an action against it. Try adding a .count(), like so: inputRDD.map { x => { counter += 1 } }.count() In case it is helpful, here are the docs on exactly what the transformations and actions are: htt…
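Filling in the surrounding setup, a self-contained version of that snippet might look like the following; note that counter has to be an accumulator, since a plain local var would be serialized into each task's closure and the driver's copy would never change:

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(
    new SparkConf().setAppName("counter-demo").setMaster("local[*]"))
  val inputRDD = sc.parallelize(Seq("a", "b", "c"))

  val counter = sc.accumulator(0L, "counter")
  inputRDD.map { x => counter += 1L; x }.count()  // count() forces the lazy map to run
  println(counter.value)  // 3 here, but see the caveat about retries above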