I am trying to read a file from S3 and preserve the order of its lines. It seems that Flink is downloading the file out of order, or at least it's constructing the DataSet out of order. When I used Hadoop to download the same file, the lines came back in order. I am able to reproduce the problem with the following line:

    env.readTextFileWithValue(conf.options.get(S3FileName).get)
      .writeAsText(s"${conf.output}/output", writeMode = FileSystem.WriteMode.OVERWRITE)

The output looks something like:

    line 1001
    line 1002
    ...
    line 1304
    line 1

Is there a way to guarantee order?

--
Benjamin Kadish
(260) 441-6159