+1 for ignoring the execute() call and printing a warning.
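
To make the two behaviors concrete, here is a minimal sketch in plain Java. It is not Flink code: ToyEnvironment, its lenient flag, addSink() and the boolean return value of execute() are all hypothetical names invented for this illustration. It only models the idea that print() runs eagerly and consumes the pending sinks, so a later execute() finds nothing to run.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point first so the file can be launched directly with `java`.
class EagerExecutionDemo {
    public static void main(String[] args) {
        // Current behavior: execute() after print() throws.
        ToyEnvironment current = new ToyEnvironment(false);
        current.print("hello");
        try {
            current.execute();
        } catch (IllegalStateException e) {
            System.out.println("current behavior: " + e.getMessage());
        }

        // Proposed behavior: execute() after print() only warns.
        ToyEnvironment proposed = new ToyEnvironment(true);
        proposed.print("hello");
        boolean ran = proposed.execute();
        System.out.println("proposed behavior ran a job: " + ran);
    }
}

// Hypothetical stand-in for an execution environment; NOT the Flink API.
class ToyEnvironment {
    private final boolean lenient;                 // true = proposed behavior
    private final List<String> pendingSinks = new ArrayList<>();

    ToyEnvironment(boolean lenient) {
        this.lenient = lenient;
    }

    // Registers a lazy sink that only runs on the next execute().
    void addSink(String name) {
        pendingSinks.add(name);
    }

    // Eager sink: registers itself and immediately runs everything pending.
    void print(String value) {
        pendingSinks.add("print");
        runPendingSinks();
        System.out.println(value);
    }

    // Returns true if a job ran. The boolean is an assumption made for this
    // sketch so that a caller can detect the no-op case programmatically.
    boolean execute() {
        if (pendingSinks.isEmpty()) {
            if (lenient) {
                System.err.println("WARN: no new sinks defined, execute() ignored");
                return false;
            }
            throw new IllegalStateException("no new data sinks defined since the last execution");
        }
        runPendingSinks();
        return true;
    }

    private void runPendingSinks() {
        pendingSinks.clear();
    }
}
```

With the lenient variant, a program that only prints keeps running after the stray execute(), and the return value is one possible way for a caller to notice that nothing ran.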

But I'm concerned about how the user would detect the error in a program
without any data sinks.

By the way, eager execution is not well documented in the data sinks section,
although it is covered in the program skeleton section. [1] This confuses
users. We should clean up the documentation. Many code examples there still
call the execute() method after the print() method. [2][3]

We should also add a description of the count() method to the documentation.

[1] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#data-sinks
[2] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#parallel-execution
[3] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#iteration-operators

Regards,
Chiwan Park

> On Jun 19, 2015, at 9:15 PM, Maximilian Michels <m...@apache.org> wrote:
> 
> Dear Flink community,
> 
> I have stopped counting how many people on the user list and during Flink
> trainings have asked why their Flink program throws an exception when they
> just want to print a DataSet. The reason for this is that print() now
> executes eagerly and thus executes the Flink program. A subsequent call to
> execute() requires new DataSinks to be defined and throws an exception
> otherwise.
> 
> We have recently introduced a flag in the ExecutionEnvironment that checks
> whether the user executed before (explicitly via execute() or implicitly
> through collect()/print()/count()). That enabled us to print a nicer
> exception message. However, users either do not read the exception message
> or do not understand it. They do ask this question a lot.
> 
> That's why I propose to ignore calls to execute() entirely if no sinks are
> defined. That will get rid of one of the core annoyances for Flink users. I
> know that this is painful for us programmers because we understand how
> Flink works internally but let's step back once and see that it wouldn't be
> so bad if execute didn't do anything in case of no new sinks.
> 
> What would be the downside of this change? Users might call execute() and
> wonder why nothing happens. We would then simply print a warning that
> their program didn't define any sinks. That is a big difference from the
> behavior before because users are scared of exceptions. If they just get a
> warning they will double-check their program and investigate why nothing
> happens. In most cases they have actually defined sinks but simply
> left in a call to execute() when they were printing a DataSet.
> 
> What are your opinions on this issue? I have opened a JIRA for this as well:
> https://issues.apache.org/jira/browse/FLINK-2249
> 
> Best,
> Max



