Hi Davis,
Thank you for your answer. This is my code. I think it is very similar to
the word count example in Spark:
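import sys

# sc is the existing SparkContext
lines = sc.textFile(sys.argv[2])
# Count how often each value of the fifth comma-separated field occurs
sie = lines.map(lambda l: (l.strip().split(',')[4], 1)).reduceByKey(lambda a, b: a + b)
# Sort the (key, count) pairs by key, in descending order
sort_sie = sie.sortByKey(False)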
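
To make clear what I expect this pipeline to compute, here is a rough plain-Python sketch of the same logic, without Spark. The sample lines and the helper names count_fifth_field and sort_by_key_desc are made up for illustration; my real input comes from the file passed in sys.argv[2], and I am assuming every line has at least five comma-separated fields.

```python
from collections import Counter

# Hypothetical sample input; the real job reads lines from sys.argv[2].
sample_lines = [
    "1,a,b,c,x\n",
    "2,a,b,c,y\n",
    "3,a,b,c,x\n",
]


def count_fifth_field(lines):
    """Count occurrences of the fifth field, mirroring map + reduceByKey."""
    return Counter(l.strip().split(',')[4] for l in lines)


def sort_by_key_desc(counts):
    """Order (key, count) pairs by key, descending, mirroring sortByKey(False)."""
    return sorted(counts.items(), key=lambda kv: kv[0], reverse=True)


sie = count_fifth_field(sample_lines)
sort_sie = sort_by_key_desc(sie)
print(sort_sie)  # [('y', 1), ('x', 2)]
```

Note that the result is ordered by the field value, not by the count, since sortByKey sorts on the key of each pair.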
Thanks again.