Have you run the same test against an HDFS URL rather than the local
filesystem? I think order may be preserved in that case, which would make
the loss of order on the local filesystem look more like a bug.
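
To look at the local-filesystem side in isolation, here is a minimal Java sketch of the behavior in question. It is not Spark's actual code: the class name PartFileOrder, the helper sortedPartFiles, and the "part-" prefix filter are assumptions made for illustration. The only fact it relies on is that java.io.File.listFiles() makes no guarantee about the order of the entries it returns, so a caller that needs a stable order has to sort explicitly.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PartFileOrder {

    // Hypothetical helper: returns the names of the "part-" files in dir,
    // sorted by name. File.listFiles() does not promise any particular
    // order, so the explicit sort is what makes the result deterministic.
    public static List<String> sortedPartFiles(File dir) {
        File[] files = dir.listFiles((d, name) -> name.startsWith("part-"));
        if (files == null) {
            // listFiles() returns null if dir is not a directory
            // or if an I/O error occurs.
            return new ArrayList<>();
        }
        Arrays.sort(files, Comparator.comparing(File::getName));
        List<String> names = new ArrayList<>();
        for (File f : files) {
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("parts").toFile();
        // Create the files out of order on purpose.
        for (String n : new String[] {"part-00002", "part-00000", "part-00001"}) {
            new File(dir, n).createNewFile();
        }
        // The raw listing order depends on the filesystem.
        System.out.println("raw:    " + Arrays.toString(dir.list()));
        // The sorted listing is always part-00000, part-00001, part-00002.
        System.out.println("sorted: " + sortedPartFiles(dir));
    }
}
```

If the raw and sorted listings differ on a given machine, that would be consistent with the out-of-order read described in the quoted message below.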
On Apr 25, 2014 9:11 AM, "Mingyu Kim" wrote:
If the underlying file system returns files in a non-alphabetical order to
java.io.File.listFiles(), Spark reads the partitions out of order. Here's an
example.
var sc = new SparkContext("local[3]", "test");
var rdd1 = sc.parallelize([1,2,3,4,5]);
rdd1.saveAsTextFile("file://path/to/file");
var rd