We could also include a link in the exception message that points to the documentation explaining the behavior.
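
To make the idea concrete, here is a rough sketch of the proposed behavior combined with a documentation link in the message. It is not Flink code: SketchEnvironment, addSink(), warnings() and DOCS_URL are made-up names for illustration, sinks are modeled as plain strings, and the URL is simply the data sinks link from Chiwan's mail, used here as a placeholder.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for an execution environment. This is NOT Flink's
 * ExecutionEnvironment; it only models the "ignore execute() with a warning" idea.
 */
class SketchEnvironment {

    // Assumption: which page the message should link to is still open.
    static final String DOCS_URL =
        "http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#data-sinks";

    private final List<String> pendingSinks = new ArrayList<>();
    private final List<String> warnings = new ArrayList<>();
    private boolean executedBefore = false;

    /** Registers a sink (modeled as a name) for the next execution. */
    void addSink(String name) {
        pendingSinks.add(name);
    }

    /** Models an eager action such as print(): define a sink and run immediately. */
    void print() {
        addSink("print");
        execute();
    }

    /** Returns true if a job ran, false if the call was ignored with a warning. */
    boolean execute() {
        if (pendingSinks.isEmpty()) {
            String reason = executedBefore
                ? "No new data sinks were defined since the last execution "
                    + "(print(), collect() and count() execute eagerly)."
                : "No data sinks were defined.";
            String message = reason + " Ignoring execute(). See " + DOCS_URL;
            warnings.add(message);
            System.err.println("WARN: " + message);
            return false;
        }
        pendingSinks.clear();   // the sinks are consumed by this run
        executedBefore = true;  // the flag Max mentioned in his mail
        return true;
    }

    /** Warnings emitted so far, kept so the behavior can be checked in tests. */
    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        SketchEnvironment env = new SketchEnvironment();
        env.print();    // runs eagerly
        env.execute();  // leftover call: warns instead of throwing
    }
}
```

Returning a boolean is only there to make the sketch easy to check; the real execute() would have to return something else, so that part should not be read as a proposal.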

On Fri, Jun 19, 2015 at 5:52 AM, Chiwan Park <chiwanp...@icloud.com> wrote:

> +1 for ignoring the execute() call with a warning.
>
> But I'm concerned about how the user catches the error in a program
> without any data sinks.
>
> By the way, eager execution is not well documented in the data sinks
> section, but it is in the program skeleton section. [1] This confuses
> users. We should clean up the documents.
> There are many code examples that call the execute() method after the
> print() method. [2][3]
>
> We should add a description of the count() method to the documents too.
>
> [1] http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#data-sinks
> [2] http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#parallel-execution
> [3] http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#iteration-operators
>
> Regards,
> Chiwan Park
>
> > On Jun 19, 2015, at 9:15 PM, Maximilian Michels <m...@apache.org> wrote:
> >
> > Dear Flink community,
> >
> > I have stopped counting how many people on the user list and during
> > Flink trainings have asked why their Flink program throws an Exception
> > when they just want to print a DataSet. The reason for this is that
> > print() now executes eagerly and thus executes the Flink program.
> > Subsequent calls to execute() need to define new DataSinks and throw an
> > exception otherwise.
> >
> > We have recently introduced a flag in the ExecutionEnvironment that
> > checks whether the user executed before (explicitly via execute() or
> > implicitly through collect()/print()/count()). That enabled us to print
> > a nicer exception message. However, users either do not read the
> > exception message or do not understand it. They do ask this question a
> > lot.
> >
> > That's why I propose to ignore calls to execute() entirely if no sinks
> > are defined. That will get rid of one of the core annoyances for Flink
> > users. I know that this is painful for us programmers because we
> > understand how Flink works internally, but let's step back for once and
> > see that it wouldn't be so bad if execute() didn't do anything when
> > there are no new sinks.
> >
> > What would be the downside of this change? Users might call execute()
> > and wonder why nothing happens. We would then simply print a warning
> > that their program didn't define any sinks. That is a big difference
> > from the previous behavior because users are scared of exceptions. If
> > they just get a warning, they will double-check their program and
> > investigate why nothing happens. In most cases they actually have
> > defined sinks but simply left in a call to execute() when they were
> > printing a DataSet.
> >
> > What are your opinions on this issue? I have opened a JIRA for this as
> > well:
> > https://issues.apache.org/jira/browse/FLINK-2249
> >
> > Best,
> > Max