Re: execute() and collect()/print()/count()

2015-06-23 Thread Maximilian Michels
@Stephan I understand your concern that the user might wonder why nothing happens when executing. However, in this case a warning will hint to the user that he didn't define any sinks. In the case where he immediately calls execute() after an eager execution, the program is actually exe

Re: execute() and collect()/print()/count()

2015-06-23 Thread Stephan Ewen
That would help to get around many cases, but it still leaves some open, like forgetting to create a sink after transformations that follow a collect() call. It would probably be a good improvement over the status quo, though... On Mon, Jun 22, 2015 at 9:37 PM, Alexander Alexandrov < alexander.s.alexand...@gm

Re: execute() and collect()/print()/count()

2015-06-22 Thread Alexander Alexandrov
What about adding some state to the DataBag internals that tracks the following conditions: 1. whether the last job execution was triggered by an "enforcer" API method like print() / collect(); 2. whether a DataSource / lazy operator was created after that. If 1 is true and 2 is false, a WAR
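
The two conditions above can be sketched as a small state holder. This is only an illustration in plain Java, not Flink code: the class name `ExecuteCallTracker` and all of its methods are hypothetical, and because the message preview is cut off, the sketch assumes the outcome for "1 is true and 2 is false" is a warning rather than an exception.

```java
/**
 * Hypothetical sketch of the state described in the message above.
 * None of these names exist in Flink; they are assumptions for illustration.
 */
final class ExecuteCallTracker {
    // Condition 1: the last job run was triggered by an eager method
    // such as print() or collect().
    private boolean lastRunWasEager = false;
    // Condition 2: a data source or lazy operator was created after that run.
    private boolean lazyOperatorSinceLastRun = false;

    /** Called when print()/collect()/count() triggers a job. */
    void onEagerExecution() {
        lastRunWasEager = true;
        lazyOperatorSinceLastRun = false;
    }

    /** Called whenever a data source or lazy operator is added to the plan. */
    void onLazyOperatorCreated() {
        lazyOperatorSinceLastRun = true;
    }

    /** True if an explicit execute() call should only produce a warning (assumed outcome). */
    boolean shouldWarnOnExecute() {
        return lastRunWasEager && !lazyOperatorSinceLastRun;
    }

    /** Called after an explicit execute() has been handled; resets both flags. */
    void onExplicitExecute() {
        lastRunWasEager = false;
        lazyOperatorSinceLastRun = false;
    }

    public static void main(String[] args) {
        ExecuteCallTracker tracker = new ExecuteCallTracker();
        tracker.onEagerExecution();                        // e.g. after print()
        System.out.println(tracker.shouldWarnOnExecute()); // true
        tracker.onLazyOperatorCreated();                   // e.g. a later map()
        System.out.println(tracker.shouldWarnOnExecute()); // false
    }
}
```

In this sketch, a program that adds transformations after collect() but forgets the sink is not flagged, because condition 2 is then true. That is the case the reply from Stephan Ewen describes as still open.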

Re: execute() and collect()/print()/count()

2015-06-22 Thread Stephan Ewen
We have two situations to trade off here, and fixing one will make the other worse: 1) env.execute() after collect() - see Max's mail; 2) env.execute() on a program with no sinks. Not throwing an exception makes people wonder why nothing happens (if they write the program to just test whether it runs

Re: execute() and collect()/print()/count()

2015-06-22 Thread Maximilian Michels
+1 for cleaning up the documentation +1 for adding a link to the documentation (should be a permalink) +1 for printing a warning instead of an exception On Sun, Jun 21, 2015 at 12:25 AM, Robert Metzger wrote: > We could also add a link to the documentation into the exception that > explains the

Re: execute() and collect()/print()/count()

2015-06-20 Thread Robert Metzger
We could also add a link to the documentation into the exception that explains the behavior. On Fri, Jun 19, 2015 at 5:52 AM, Chiwan Park wrote: > +1 for ignoring execute() call with warning. > > But I'm concerned for how the user catches the error in program without > any data sinks. > > By the

Re: execute() and collect()/print()/count()

2015-06-19 Thread Chiwan Park
+1 for ignoring the execute() call with a warning. But I'm concerned about how the user catches the error in a program without any data sinks. By the way, eager execution is not well documented in the data sinks section but is in the program skeleton section. [1] This confuses users. We should clean

execute() and collect()/print()/count()

2015-06-19 Thread Maximilian Michels
Dear Flink community, I have stopped counting how many people on the user list and during Flink trainings have asked why their Flink program throws an Exception when they just want to print a DataSet. The reason for this is that print() now executes eagerly and thus executes the Flink program. Subseq