Re: cartesian in the loop, runtime grows

2016-01-26 Thread efa
Problem solved:

    for i in range(1,6):
        L=L.cartesian(D)
        L.unpersist()
        L=L.reduceByKey(min).coalesce(6).map(lambda (l,n):l).cache()
        L.collect()

Number of partitions should be constant.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/cartesian-in
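
A rough sketch of why holding the partition count constant matters: the cartesian product of two RDDs has as many partitions as the product of the two inputs' partition counts, so without the coalesce(6) step the count multiplies on every pass through the loop. The plain-Python model below only tracks partition counts and does not call Spark. The function names and the starting value of 6 partitions for both L and D are assumptions made for illustration, and the model assumes reduceByKey keeps the partition count of its input (i.e. spark.default.parallelism is not set).

```python
def partitions_after_cartesian(left_parts, right_parts):
    """A cartesian RDD has left_parts * right_parts partitions."""
    return left_parts * right_parts


def simulate(iterations, left_parts, right_parts, coalesce_to=None):
    """Return the partition count of L at the end of each loop iteration.

    Hypothetical helper: models only partition counts, not the data.
    """
    counts = []
    parts = left_parts
    for _ in range(iterations):
        parts = partitions_after_cartesian(parts, right_parts)
        if coalesce_to is not None:
            # coalesce without a shuffle can only lower the count
            parts = min(parts, coalesce_to)
        counts.append(parts)
    return counts


if __name__ == "__main__":
    # Assumed starting point: L and D both have 6 partitions.
    print("no coalesce:  ", simulate(5, 6, 6))
    print("coalesce to 6:", simulate(5, 6, 6, coalesce_to=6))
```

Under these assumptions the uncoalesced loop reaches tens of thousands of partitions by the fifth iteration, which would be consistent with the growing runtime reported in the original question, while the coalesced loop stays at 6.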

cartesian in the loop, runtime grows

2015-11-06 Thread efa
Hi All, I have a problem with the cartesian product. I build the cartesian product of two RDDs in a loop, and the result is squeezed back to the original size of one of the participating variables. At the end of each iteration this result is assigned to the original variable. I expect the same running time for each iteration, b