Re: When do map how to get the line number?

2015-03-31 Thread jitesh129
You can use zipWithIndex() to attach a 0-based index to each record, and then add 1 to each index to get a line number: val tf = sc.textFile("test").zipWithIndex(); tf.map(s => (s._2 + 1, s._1)). zipWithIndex() returns (record, index) pairs, so the map yields (lineNumber, line). That should serve your purpose. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/When
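For PySpark users, the same idea could be sketched roughly as follows. This is only an illustration, not code from the original thread: the helper name number_lines, the demo function, and the local[1] master setting are assumptions, and "test" is the input path used in the post.

```python
def number_lines(rdd):
    """Pair each record with a 1-based line number.

    zipWithIndex() yields (record, index) pairs with a 0-based index,
    so the index is shifted by one and moved to the front.
    """
    return rdd.zipWithIndex().map(lambda pair: (pair[1] + 1, pair[0]))


def demo():
    # Assumes a local PySpark installation; not called automatically.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "line-numbers")
    try:
        # "test" is the path from the post; take(5) just peeks at the result.
        for line_no, text in number_lines(sc.textFile("test")).take(5):
            print(line_no, text)
    finally:
        sc.stop()
```

Calling demo() prints the first few (line number, text) pairs. Keeping the transformation in its own function means it can be applied to any RDD, not only one read with textFile.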

Broadcasting a parquet file using spark and python

2015-03-31 Thread jitesh129
How can we implement a BroadcastHashJoin in Spark with Python? My Spark SQL inner joins are taking a long time because they are executed as ShuffledHashJoin. The tables being joined are stored as Parquet files. Please help. Thanks and regards, Jitesh -- View this message in context: h