Re: Should collect() and count() be treated as data sinks?

Aljoscha Krettek Thu, 02 Apr 2015 10:41:40 -0700

In my opinion it should not be handled like print(). The idea behind
count()/collect() is that they immediately return the result, which can
then be used in further Flink operations.


Right now, this is not implemented properly or efficiently, but once we
have support for intermediate results at this level, these calls will
make more sense. Also, in that case a call to execute() would not be
required after a collect()/count() if only the result of that call is
needed.
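
To make the distinction concrete, here is a minimal toy sketch in plain
Java. It is not Flink code: ToyEnvironment, ToyDataSet, output() and
ToyDemo are hypothetical names invented for illustration. It only models
the semantics discussed above: a sink is registered lazily and runs on
execute(), while count()/collect() return their result immediately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: hypothetical classes, NOT Flink's API or implementation.
final class ToyEnvironment {
    // Sinks are registered lazily and run only when execute() is called.
    private final List<Runnable> pendingSinks = new ArrayList<>();

    <T> ToyDataSet<T> fromCollection(List<T> elements) {
        return new ToyDataSet<>(this, new ArrayList<>(elements));
    }

    void registerSink(Runnable sink) {
        pendingSinks.add(sink);
    }

    // Runs all pending sinks and returns how many were run.
    // With no sink registered there is nothing to run, so it fails,
    // which models the exception in the quoted report.
    int execute() {
        if (pendingSinks.isEmpty()) {
            throw new IllegalStateException("No data sinks have been created yet.");
        }
        int ran = pendingSinks.size();
        for (Runnable sink : pendingSinks) {
            sink.run();
        }
        pendingSinks.clear();
        return ran;
    }
}

final class ToyDataSet<T> {
    private final ToyEnvironment env;
    private final List<T> elements;

    ToyDataSet(ToyEnvironment env, List<T> elements) {
        this.env = env;
        this.elements = elements;
    }

    // Lazy: only registers a sink; nothing is consumed until execute().
    void output(Consumer<T> consumer) {
        env.registerSink(() -> elements.forEach(consumer));
    }

    // Eager: the result is available immediately, no execute() needed.
    long count() {
        return elements.size();
    }

    // Eager: returns a copy of the elements immediately.
    List<T> collect() {
        return new ArrayList<>(elements);
    }
}

final class ToyDemo {
    public static void main(String[] args) {
        ToyEnvironment env = new ToyEnvironment();
        ToyDataSet<Long> data = env.fromCollection(Arrays.asList(1L));

        // Eager call: usable right away in further program logic.
        long value = data.count();
        System.out.println(value);

        // Lazy sink: runs only once execute() is called.
        data.output(System.out::println);
        env.execute();
    }
}
```

In this model, calling env.execute() right after data.count() with no
sink registered fails, which corresponds to the situation in the quoted
program. Dropping the execute() call is all that is needed when only
the count is of interest.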

On Thu, Apr 2, 2015 at 5:33 PM, Felix Neutatz <[email protected]> wrote:
> Hi,
>
> I have run the following program:
>
> final ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
>
> List l = Arrays.asList(new Tuple1<Long>(1L));
> TypeInformation t = TypeInfoParser.parse("Tuple1<Long>");
> DataSet<Tuple1<Long>> data = env.fromCollection(l, t);
>
> long value = data.count();
> System.out.println(value);
>
> env.execute("example");
>
>
> Since there is no "real" data sink, I get the following:
> Exception in thread "main" java.lang.RuntimeException: No data sinks have
> been created yet. A program needs at least one sink that consumes data.
> Examples are writing the data set or printing it.
>
> In my opinion, we should handle count() and collect() like print().
>
> What do you think?
>
> Best regards,
>
> Felix
