[ https://issues.apache.org/jira/browse/FLINK-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Hueske updated FLINK-2239:
---------------------------------
    Fix Version/s:     (was: 0.10.0)
                   1.0.0

> print() on DataSet: stream results and print incrementally
> ----------------------------------------------------------
>
>                 Key: FLINK-2239
>                 URL: https://issues.apache.org/jira/browse/FLINK-2239
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Runtime
>    Affects Versions: 0.9
>            Reporter: Maximilian Michels
>             Fix For: 1.0.0
>
>
> Users find it counter-intuitive that {{print()}} on a DataSet internally
> calls {{collect()}} and fully materializes the set. This leads to
> out-of-memory errors on the client. It also leaves users with the feeling
> that Flink cannot handle large amounts of data and that it fails frequently.
> Improving this situation requires some major architectural changes in
> Flink. The easiest solution would probably be to transfer the data from the
> job manager to the client via the {{BlobManager}}. Alternatively, the client
> could directly connect to the task managers and fetch the results.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
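
For illustration only, here is a minimal plain-Java sketch of the difference the issue describes between materializing a full result on the client and printing it incrementally. It does not use Flink's API. The class PrintSketch and the methods records, collectAll and printIncrementally are hypothetical names, and the iterator merely stands in for whatever transport (BlobManager transfer or direct task manager connections) would deliver results to the client.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

public class PrintSketch {

    // Hypothetical stand-in for a stream of job results arriving at the client.
    static Iterator<Long> records(long count) {
        return new Iterator<Long>() {
            private long i = 0;

            @Override
            public boolean hasNext() {
                return i < count;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return i++;
            }
        };
    }

    // Collect-then-print style: every record is held in client memory at once,
    // so memory use grows with the size of the result.
    static <T> List<T> collectAll(Iterator<T> results) {
        List<T> all = new ArrayList<>();
        while (results.hasNext()) {
            all.add(results.next());
        }
        return all;
    }

    // Incremental style: at most batchSize records are buffered before they
    // are handed to the sink, so memory use is bounded by the batch size.
    static <T> long printIncrementally(Iterator<T> results, int batchSize,
                                       Consumer<? super T> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        long printed = 0;
        while (results.hasNext()) {
            batch.add(results.next());
            // Flush when the batch is full or the input is exhausted.
            if (batch.size() == batchSize || !results.hasNext()) {
                for (T record : batch) {
                    sink.accept(record);
                }
                printed += batch.size();
                batch.clear();
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        // Prints five records while buffering at most two at a time.
        printIncrementally(records(5), 2, System.out::println);
    }
}
```

The batch size is the only knob in this sketch: the client's memory footprint depends on it rather than on the total number of records, which is the property the issue asks for.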