There is already an ongoing discussion and an open issue about this:

http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Gather-a-distributed-dataset-td3216.html

Unfortunately I am short on time at the moment, but if nobody else
picks this up, I expect to be able to work on it within two weeks.

Regards,
Alex

2015-01-28 10:46 GMT+01:00 John Sandiford (JIRA) <j...@apache.org>:

> John Sandiford created FLINK-1459:
> -------------------------------------
>
>              Summary: Collect DataSet to client
>                  Key: FLINK-1459
>                  URL: https://issues.apache.org/jira/browse/FLINK-1459
>              Project: Flink
>           Issue Type: Improvement
>             Reporter: John Sandiford
>
>
> Hi, I may well have missed something obvious here but I cannot find an
> easy way to extract the values in a DataSet to the client.  Spark has
> collect, collectAsMap etc...
>
> (I need to pass the values from a small aggregated DataSet back to a
> machine learning library which is controlling the iterations.)
>
> The only way I could find to do this was to implement my own in memory
> OutputFormat.  This is not ideal, but does work.
>
> Many thanks, John
>
>
>
> val env = ExecutionEnvironment.getExecutionEnvironment
>
> val data: DataSet[Double] = env.fromElements(1.0, 2.0, 3.0, 4.0)
>
> val result = data.reduce((a, b) => a)
> val valuesOnClient = result.???
>
> env.execute("Simple example")
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
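
For reference, the in-memory OutputFormat workaround described in the quoted
issue could look roughly like the sketch below. This is not John's actual
code. The names InMemorySink, InMemoryOutputFormat and CollectSketch are
made up for illustration, and the reduce sums the values (instead of the
(a, b) => a from the issue) so that the result is deterministic. It assumes
the job runs in the same JVM as the client (local execution), because the
records end up in a plain JVM-wide object; on a real cluster the format
runs on the workers and the client would see nothing.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

import scala.collection.JavaConverters._

import org.apache.flink.api.common.io.OutputFormat
import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

// Hypothetical holder for the collected records. It lives in the client JVM,
// so it is only filled when the job executes in that same JVM.
object InMemorySink {
  val values = new ConcurrentLinkedQueue[Double]()
}

// Hypothetical OutputFormat that appends every record to InMemorySink.
class InMemoryOutputFormat extends OutputFormat[Double] {
  override def configure(parameters: Configuration): Unit = ()
  override def open(taskNumber: Int, numTasks: Int): Unit = ()
  override def writeRecord(record: Double): Unit = {
    InMemorySink.values.add(record)
    ()
  }
  override def close(): Unit = ()
}

object CollectSketch {
  def run(): List[Double] = {
    InMemorySink.values.clear()

    // Assumption: local execution, so sink and client share one JVM.
    val env = ExecutionEnvironment.createLocalEnvironment()

    val data: DataSet[Double] = env.fromElements(1.0, 2.0, 3.0, 4.0)
    val result = data.reduce(_ + _)

    // Register the in-memory format as the sink, then run the job.
    result.output(new InMemoryOutputFormat)
    env.execute("In-memory collect sketch")

    // After execute() returns, the values are available on the client.
    InMemorySink.values.asScala.toList
  }
}
```

Calling CollectSketch.run() from the client then hands back the reduced
value as an ordinary Scala list. The shared mutable object is exactly what
makes this "not ideal": it is tied to a single JVM and to one job at a time.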
