I am trying to read a file from S3 in the correct order. It seems that
Flink is downloading the file out of order, or at least it's
constructing the DataSet out of order. I tried using Hadoop to download
the file, and it seemed to download it in order.
I am able to reproduce the problem with the following line:

env.readTextFileWithValue(conf.options.get(S3FileName).get)
   .writeAsText(s"${conf.output}/output", writeMode = FileSystem.WriteMode.OVERWRITE)

The output looks something like this:

line 1001
line 1002
...
line 1304
line 1

Is there a way to guarantee order?

-- 
Benjamin Kadish
(260) 441-6159
