Hi Max,

thank you for your reply. Does the DataSink keep the data in order? I mean, does it contain output1, output2 ... output5 in that order, or are they mixed?

Thanks a lot,
Giacomo

On Tue, Apr 14, 2015 at 11:58 AM, Maximilian Michels <m...@apache.org> wrote:

> Hi Giacomo,
>
> If I understand you correctly, you want your Flink job to execute with a
> parallelism of 5. Just call setDegreeOfParallelism(5) on your
> ExecutionEnvironment. That way, all operations, when possible, will be
> performed using 5 parallel instances. This is also true for the DataSink
> which will produce 5 files containing the output data from the parallel
> instances.
>
> Best,
> Max
>
> On Tue, Apr 14, 2015 at 10:38 AM, Giacomo Licari <giacomo.lic...@gmail.com>
> wrote:
>
>> Hi guys,
>> I have a question about how parallelism works.
>>
>> If I have a large dataset and I would divide it into 5 blocks, can I pass
>> each block of data to a fixed parallel process (for example I set up 5
>> process) ?
>>
>> And if the results data from each process arrive to the output not in an
>> ordered way, can I order them? For example:
>>
>> data from process 1
>> data from process 2
>> and so on
>>
>> Thank you guys!
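
To illustrate the ordering question in this thread, here is a minimal sketch in plain Java. It does not use the Flink API and says nothing about how Flink's DataSink behaves; it only shows the general idea of splitting a dataset into 5 blocks, processing each block in its own worker, and reading the results back by block index so that they come out as output1 ... output5 regardless of which worker finishes first. The class name OrderedParallelBlocks and the methods split and processInOrder are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; plain Java, not part of Flink.
class OrderedParallelBlocks {

    // Split the input into `blocks` contiguous chunks of near-equal size.
    static <T> List<List<T>> split(List<T> data, int blocks) {
        List<List<T>> result = new ArrayList<>();
        int n = data.size();
        for (int i = 0; i < blocks; i++) {
            int from = (int) ((long) n * i / blocks);
            int to = (int) ((long) n * (i + 1) / blocks);
            result.add(new ArrayList<>(data.subList(from, to)));
        }
        return result;
    }

    // Process each block in its own worker and return the results
    // ordered by block index, not by completion time.
    static List<String> processInOrder(List<Integer> data, int parallelism)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<List<Integer>> blocks = split(data, parallelism);
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < blocks.size(); i++) {
                final int index = i + 1;
                final List<Integer> block = blocks.get(i);
                // The "work" here is only labelling the block; a real job
                // would transform the block's records instead.
                futures.add(pool.submit(() -> "output" + index + ": " + block));
            }
            // Futures are read in submission order, so the result list
            // follows the block order even if workers finish out of order.
            List<String> ordered = new ArrayList<>();
            for (Future<String> future : futures) {
                ordered.add(future.get());
            }
            return ordered;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            data.add(i);
        }
        for (String line : processInOrder(data, 5)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is that the order is restored by keeping track of which block each result belongs to, not by relying on the order in which the parallel workers finish.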