Re: UNION two RDDs

2014-12-22 Thread Jerry Lam
Hi Sean and Madhu, Thank you for the explanation. I really appreciate it. Best Regards, Jerry On Fri, Dec 19, 2014 at 4:50 AM, Sean Owen wrote: > coalesce actually changes the number of partitions. Unless the > original RDD had just 1 partition, coalesce(1) will make an RDD with 1 > partitio

Re: UNION two RDDs

2014-12-19 Thread Sean Owen
coalesce actually changes the number of partitions. Unless the original RDD had just 1 partition, coalesce(1) will make an RDD with 1 partition that is larger than the original partitions, of course. I don't think the question is about ordering of things within an element of the RDD? If the origi
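
A minimal sketch of the point above, assuming a local Spark installation; the object name, app name, and sample data are made up for illustration and are not from the thread. It prints the partition count before and after coalesce(1), then dumps the contents of each resulting partition with glom().

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; names and data are illustrative only.
object CoalesceSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("coalesce-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Eight records spread over four partitions.
      val rdd = sc.parallelize(1 to 8, 4)
      println(s"partitions before: ${rdd.partitions.length}")

      // coalesce(1) merges the existing partitions into a single, larger one.
      val merged = rdd.coalesce(1)
      println(s"partitions after:  ${merged.partitions.length}")

      // glom() turns each partition into an array so its contents can be inspected.
      merged.glom().collect().foreach(p => println(p.mkString("[", ", ", "]")))
    } finally {
      sc.stop()
    }
  }
}
```

The record count of the RDD as a whole is unchanged; only the number of partitions, and therefore the number of records per partition, differs.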

Re: UNION two RDDs

2014-12-18 Thread madhu phatak
Hi, coalesce is an operation that changes the number of records in a partition. It will not touch ordering within a row, AFAIK. On Fri, Dec 19, 2014 at 2:22 AM, Jerry Lam wrote: > > Hi Spark users, > > I wonder if val resultRDD = RDDA.union(RDDB) will always have records in > RDDA before records in RDDB
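
One way to probe the question empirically is a small local run such as the sketch below. The object name and sample records are hypothetical, and the result only shows what a given Spark build does; it should not be read as a documented ordering guarantee, and a later shuffle (for example repartition) may reorder records.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; names and data are illustrative only.
object UnionOrderSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("union-order-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rddA = sc.parallelize(Seq("a1", "a2", "a3"), 2)
      val rddB = sc.parallelize(Seq("b1", "b2"), 2)

      // union does not shuffle; the result carries the partitions of both inputs.
      val resultRDD = rddA.union(rddB)
      println(s"partitions in union: ${resultRDD.partitions.length}")

      // collect() returns records partition by partition, so this shows
      // where the records of rddA land relative to those of rddB.
      println(resultRDD.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```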