Re: Unique Partition Id per partition

2017-01-31 Thread Michael Allman
Hi Sumit,

Can you use http://spark.apache.org/docs/latest/api/python/pyspark.html?highlight=rdd#pyspark.RDD.mapPartitionsWithIndex to solve your problem?

Michael

> On Jan 31, 2017,
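A minimal sketch of the suggestion above, assuming the goal is only a key that is distinct for each partition of one RDD. `RDD.mapPartitionsWithIndex` passes the partition index together with the partition's iterator, so the index itself can serve as the key. The names `tag_partition`, `tag_rdd` and the `part_` prefix are hypothetical and chosen for illustration; they do not come from the thread.

```python
def tag_partition(index, iterator):
    """Pair every record of one partition with a key built from its index."""
    # `index` is the partition number that mapPartitionsWithIndex supplies.
    # It is distinct for each partition of a given RDD.
    key = "part_%05d" % index
    for record in iterator:
        yield (key, record)


def tag_rdd(rdd):
    """Apply tag_partition to every partition of an existing RDD."""
    # `rdd` is assumed to be a pyspark RDD created elsewhere,
    # e.g. sc.parallelize(range(10), 3).
    return rdd.mapPartitionsWithIndex(tag_partition)
```

The index is only unique within a single RDD. If keys must also differ across jobs or runs, the index could be combined with a per-job identifier generated once on the driver; that is an extension beyond what the reply proposes.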

Unique Partition Id per partition

2017-01-31 Thread Chawla,Sumit
Hi All, I have an RDD, which I partition based on some key, and then call sc.runJob for each partition. Inside this function, I assign each partition a unique key using the following: "%s_%s" % (id(part), int(round(time.time()))) This is to make sure that each partition produces separate bookkeeping st