RDD.cacheDataSet() not working intermittently

2017-05-08 Thread jasbir.sing
Hi, I have a scenario in which I am caching my RDDs for future use. However, I have observed that when I use a cached RDD, the complete DAG is re-executed and the RDD is created again. How can I avoid this and make sure that RDD.cacheDataSet() caches the RDD every time? Regards, Jasbir Singh

RE: Equally split a RDD partition into two partition at the same node

2017-01-15 Thread jasbir.sing
Hi, coalesce is used to decrease the number of partitions. If you pass a numPartitions value greater than the current number of partitions, I don't think the RDD's number of partitions will be increased. Thanks, Jasbir From: Fei Hu [mailto:hufe...@gmail.com] Sent: Sunday, January 15, 2017 10:10 PM To: zou