Re: can't union two rdds

2015-03-31 Thread ankurjain.nitrr
case. If the amount of data is small, you can use rdd.collect, then iterate over both lists and produce the desired result -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/can-t-union-two-rdds-tp22320p22323.html Sent from the Apache Spark User List mailing

Re: can't union two rdds

2015-03-31 Thread roy
use zip -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/can-t-union-two-rdds-tp22320p22321.html
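
For readers of the archive, a minimal sketch of what the zip suggestion might look like, assuming the goal is to pair elements of two RDDs by position (the original question is not shown here, so that is a guess). The object name, RDD names, sample data, and the local master are all illustrative. Note that RDD.zip requires both RDDs to have the same number of partitions and the same number of elements in each partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ZipSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("zip-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data; one partition each so the layouts match.
      val names  = sc.parallelize(Seq("a", "b", "c"), numSlices = 1)
      val scores = sc.parallelize(Seq(1, 2, 3), numSlices = 1)

      // zip pairs the i-th element of one RDD with the i-th element of the other.
      val paired = names.zip(scores)
      paired.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the two RDDs do not line up partition by partition, zip fails at runtime, so this only fits cases where both RDDs come from the same source layout.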

Re: UNION two RDDs

2014-12-22 Thread Jerry Lam
Hi Sean and Madhu, Thank you for the explanation. I really appreciate it. Best Regards, Jerry On Fri, Dec 19, 2014 at 4:50 AM, Sean Owen wrote: > coalesce actually changes the number of partitions. Unless the > original RDD had just 1 partition, coalesce(1) will make an RDD with 1 > partitio

Re: UNION two RDDs

2014-12-19 Thread Sean Owen
coalesce actually changes the number of partitions. Unless the original RDD had just 1 partition, coalesce(1) will make an RDD with 1 partition that is larger than the original partitions, of course. I don't think the question is about ordering of things within an element of the RDD? If the origi
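
A small sketch of the point about partition counts, assuming a local SparkContext and made-up sample data (the object and value names are illustrative, not from the thread). It prints the number of partitions before and after coalesce(1), then uses glom to show how many elements end up in each partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CoalesceSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("coalesce-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical data spread over 4 partitions.
      val rdd = sc.parallelize(1 to 8, numSlices = 4)
      val single = rdd.coalesce(1)

      println(rdd.partitions.length)    // 4
      println(single.partitions.length) // 1

      // glom turns each partition into an array, so this prints per-partition sizes.
      println(rdd.glom().collect().map(_.length).mkString(", "))
      println(single.glom().collect().map(_.length).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```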

Re: UNION two RDDs

2014-12-18 Thread madhu phatak
Hi, coalesce is an operation which changes the number of records in a partition. It will not touch ordering within a row AFAIK. On Fri, Dec 19, 2014 at 2:22 AM, Jerry Lam wrote: > > Hi Spark users, > > I wonder if val resultRDD = RDDA.union(RDDB) will always have records in > RDDA before records in RDDB

UNION two RDDs

2014-12-18 Thread Jerry Lam
Hi Spark users, I wonder if val resultRDD = RDDA.union(RDDB) will always have records in RDDA before records in RDDB. Also, will resultRDD.coalesce(1) change this ordering? Best Regards, Jerry
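
One way to inspect this locally is sketched below, assuming a local SparkContext and made-up sample data (the object name and values are illustrative). For two RDDs without a shared partitioner, union produces an RDD whose partitions are those of the first RDD followed by those of the second, and collect returns elements in partition order. This is a way to observe the behavior, not a statement of a documented ordering guarantee.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnionOrderSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("union-order-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical stand-ins for RDDA and RDDB from the question.
      val rddA = sc.parallelize(Seq(1, 2, 3), numSlices = 2)
      val rddB = sc.parallelize(Seq(10, 20, 30), numSlices = 2)

      val resultRDD = rddA.union(rddB)

      // The union keeps the partitions of both inputs (2 + 2 here).
      println(resultRDD.partitions.length)

      // Print the elements in the order collect returns them.
      println(resultRDD.collect().mkString(", "))

      // Repeat after coalescing to a single partition to compare the order.
      println(resultRDD.coalesce(1).collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```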