Actually, sortBy does return an ordered RDD.
The unordered output you see is most likely due to foreach, which runs
on the partitions in parallel, so their println calls interleave.

For reference, the following snippet collects the sorted RDD to the
driver and prints the integers in order: 1,1,1,2,2,3,4,5,7,8,9

val rdd = sc.
  parallelize(Array(1, 3, 2, 7, 1, 4, 2, 5, 1, 8, 9), 2).
  sortBy(x => x, true)
println(rdd.collect().mkString(","))
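
If the sorted data is too large to collect() in one go, one possible
alternative is RDD.toLocalIterator, which brings the partitions to the
driver one at a time, in partition order. Below is a minimal sketch of
that idea as a standalone program. The object name SequentialPrint and
the helper sortedOnDriver are made up for illustration; the master and
app name are copied from your example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SequentialPrint {
  // Hypothetical helper: sort in parallel, then stream the result to the
  // driver partition by partition, so elements arrive in sorted order.
  def sortedOnDriver(sc: SparkContext, data: Seq[Int]): Iterator[Int] =
    sc.parallelize(data, 2).sortBy(x => x, true).toLocalIterator

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("SequentialSuite")
    val sc = new SparkContext(conf)
    try {
      // The println calls run on the driver, one element at a time.
      sortedOnDriver(sc, Seq(1, 3, 2, 7, 1, 4, 2, 5, 1, 8, 9)).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Either way, the sequential part runs on the driver, not on the
executors; only the sort itself is distributed.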



2015-02-27 15:38 GMT+08:00 Wush Wu <w...@bridgewell.com>:

> Dear all,
>
> I want to implement some sequential algorithm on RDD.
>
> For example:
>
> val conf = new SparkConf()
>   conf.setMaster("local[2]").
>   setAppName("SequentialSuite")
> val sc = new SparkContext(conf)
> val rdd = sc.
>    parallelize(Array(1, 3, 2, 7, 1, 4, 2, 5, 1, 8, 9), 2).
>    sortBy(x => x, true)
> rdd.foreach(println)
>
> I want to see the ordered number on my screen, but it shows unordered
> integers. The two partitions execute the println simultaneously.
>
> How do I make the RDD execute a function globally sequential?
>
> Best,
> Wush
>
