No, the number of partitions is determined by the numPartitions argument you
pass to groupByKey; if you leave it out, Spark picks a default (based on
spark.default.parallelism or the parent RDD's partitioning), not the number of
distinct keys. See
http://spark.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.PairRDDFunctions
for details. I'd suggest reading the docs before asking questions.
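For example, something like this (a rough sketch, assuming the same SparkContext
sc as in your snippet and a PySpark version that exposes RDD.getNumPartitions();
the first result depends on your local setup):

>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> x.groupByKey().getNumPartitions()    # no argument: uses the default, not the key count
>>> x.groupByKey(2).getNumPartitions()   # explicit numPartitions
2
>>> x.groupByKey(5).getNumPartitions()   # 5 partitions even though there are only 2 distinct keys
5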


Joe L wrote
> I was wondering if groupByKey returns 2 partitions in the example below?
> 
>>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>>> sorted(x.groupByKey().collect())
> [('a', [1, 1]), ('b', [1])]





