Re: process_local vs node_local

2014-04-14 Thread dachuan
n Developer
> Oculus Info Inc
> 2 Berkeley Street, Suite 600,
> Toronto, Ontario M5A 4J5
> Phone: +1-416-203-3003 x 238
> Email: nkronenf...@oculusinfo.com
>
--
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210

Re: Measure the Total Network I/O, Cpu and Memory Consumed by Spark Job

2014-04-14 Thread dachuan

Re: Use combineByKey and StatCount

2014-04-14 Thread dachuan
, Apr 1, 2014 at 10:55 AM, Jaonary Rabarisoa wrote:
> Hi all;
>
> Can someone give me some tips to compute mean of RDD by key, maybe with
> combineByKey and StatCount.
>
> Cheers,
>
> Jaonary
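
For the mean-by-key question quoted above, a minimal sketch of the usual combineByKey pattern may help: carry a (sum, count) pair per key, then divide at the end. This is plain Python that only imitates the three-function contract (create a combiner, merge a value into it, merge two combiners); it is not the Spark API. The helpers `combine_by_key` and `mean_by_key` and the sample data are hypothetical names made up for illustration.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Plain-Python stand-in for the combineByKey contract (not the Spark API)."""
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            if key in acc:
                # Key already seen in this partition: fold the value in.
                acc[key] = merge_value(acc[key], value)
            else:
                # First value for this key in this partition.
                acc[key] = create_combiner(value)
        per_partition.append(acc)

    # Merge the per-partition combiners, as would happen after a shuffle.
    merged = {}
    for acc in per_partition:
        for key, comb in acc.items():
            merged[key] = merge_combiners(merged[key], comb) if key in merged else comb
    return merged


def mean_by_key(partitions):
    """Mean per key, carrying a (sum, count) pair as the combiner."""
    sums = combine_by_key(
        partitions,
        lambda v: (v, 1),                         # create_combiner
        lambda c, v: (c[0] + v, c[1] + 1),        # merge_value
        lambda a, b: (a[0] + b[0], a[1] + b[1]),  # merge_combiners
    )
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical sample data: two "partitions" of (key, value) pairs.
partitions = [
    [("a", 1.0), ("b", 4.0)],
    [("a", 3.0), ("b", 6.0), ("a", 5.0)],
]
means = mean_by_key(partitions)
```

On a real pair RDD the same three functions would presumably be handed to `combineByKey`, and a statistics accumulator such as Spark's `StatCounter` could take the place of the (sum, count) tuple; that substitution is an assumption here, not something the thread confirms.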

Re: Block

2014-03-11 Thread dachuan
In my opinion, BlockManager manages many types of block; an RDD partition, a.k.a. RDDBlock, is just one of them. Other types of block are ShuffleBlock, IndirectBlock (used if the task's return status is too large), etc. So BlockManager is a layer that is independent of the RDD concept. On Mar 11, 2014 2:0
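
The layering described in this reply (one block store, several kinds of block, RDD partitions being only one kind) can be pictured with a toy model. This is a rough sketch in plain Python, not Spark's implementation: the class names reuse the terms from the message, while `ToyBlockStore`, its methods, and every field are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


# Hypothetical block identifiers, one per kind of block named in the message.
@dataclass(frozen=True)
class RDDBlock:
    rdd_id: int
    partition: int


@dataclass(frozen=True)
class ShuffleBlock:
    shuffle_id: int
    map_id: int
    reduce_id: int


@dataclass(frozen=True)
class IndirectBlock:
    task_id: int  # stands for a task result too large to return directly


class ToyBlockStore:
    """Stores opaque data by block id; it knows nothing about RDDs."""

    def __init__(self):
        self._blocks = {}

    def put(self, block_id, data):
        self._blocks[block_id] = data

    def get(self, block_id):
        # Returns None when the block is not present.
        return self._blocks.get(block_id)

    def ids_of_type(self, id_type):
        return [b for b in self._blocks if isinstance(b, id_type)]


store = ToyBlockStore()
store.put(RDDBlock(rdd_id=0, partition=1), [1, 2, 3])
store.put(ShuffleBlock(shuffle_id=0, map_id=2, reduce_id=5), b"shuffle bytes")
store.put(IndirectBlock(task_id=42), b"large task result")
```

The store only ever sees ids and data, so adding another kind of block means adding another id type, with no change to the store itself.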