As I said, the common print prefix might indicate eager execution. I know that writeToTaskManagerStdOut() is quite bulky, but we should make the difference in the behavior very clear, IMO.
2015-05-28 14:29 GMT+02:00 Stephan Ewen <se...@apache.org>:

> Actually, there is a method "print(String prefix)" which still goes to the
> sysout of where the job is executed.
>
> Let's give that one the name "printOnTaskManager()" and then we should have
> it...
>
> On Thu, May 28, 2015 at 2:13 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> > I would avoid calling it printXYZ, since print()'s behavior changed to
> > eager execution.
> >
> > 2015-05-28 14:10 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
> >
> > > Okay, you are right, local is actually confusing.
> > > I'm against introducing "worker" as a term in the API. It's still called
> > > "TaskManager". Maybe "printOnTaskManager()"?
> > >
> > > On Thu, May 28, 2015 at 2:06 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> > >
> > > > +1 for both.
> > > >
> > > > printLocal() might not be the best name, because "local" is not well
> > > > defined and could also be understood as the local machine of the user.
> > > > How about naming the method completely differently
> > > > (writeToWorkerStdOut()?) to make sure users do not confuse eager and
> > > > lazy execution?
> > > >
> > > > 2015-05-28 13:44 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
> > > >
> > > > > Hi Sebastian,
> > > > >
> > > > > thank you for the feedback. I agree that both variants have a right
> > > > > to exist.
> > > > >
> > > > > I would vote for adding another method to the DataSet called
> > > > > "printLocal()" that has the old behavior.
> > > > >
> > > > > On Thu, May 28, 2015 at 1:01 PM, Kruse, Sebastian
> > > > > <sebastian.kr...@hpi.de> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I am a bit worried about the recent change of the print() method.
> > > > > > I can understand the rationale that obtaining the stdout from all
> > > > > > the taskmanagers is cumbersome (although, for local debugging the
> > > > > > old print() was fine).
> > > > > > However, a major problem I see with the new print() is that you
> > > > > > can now only have one print() per plan, as the plan is directly
> > > > > > executed as soon as print() is invoked. If you regard print() as a
> > > > > > debugging means, this is a severe restriction.
> > > > > > I see use cases for both print() implementations, but I would at
> > > > > > least provide some kind of backwards compatibility, be it a
> > > > > > parameter or a legacyPrint() method or anything else. As I assume
> > > > > > print() to be very frequently used, a lot of existing programs
> > > > > > would benefit from this and might otherwise not be directly
> > > > > > portable to newer Flink versions. What do you think?
> > > > > >
> > > > > > Cheers,
> > > > > > Sebastian
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Robert Metzger [mailto:rmetz...@apache.org]
> > > > > > Sent: Tuesday, 26 May 2015 11:12
> > > > > > To: dev@flink.apache.org
> > > > > > Subject: Re: Changed the behavior of "DataSet.print()"
> > > > > >
> > > > > > I've filed a JIRA to update the documentation:
> > > > > > https://issues.apache.org/jira/browse/FLINK-2092
> > > > > >
> > > > > > On Fri, May 22, 2015 at 11:08 AM, Stephan Ewen <se...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all!
> > > > > > >
> > > > > > > We merged a patch yesterday that changed the API behavior of the
> > > > > > > "DataSet.print()" function.
> > > > > > >
> > > > > > > "print()" now prints to stdout on the client process, rather
> > > > > > > than the TaskManager process, as before. This is much nicer for
> > > > > > > debugging and exploring data sets.
> > > > > > >
> > > > > > > One implication of this is that print() is now an eager method
> > > > > > > (like collect() or count()). That means that calling "print()"
> > > > > > > immediately triggers the execution, and no "env.execute()" is
> > > > > > > required any more.
> > > > > > >
> > > > > > > Greetings,
> > > > > > > Stephan
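
For readers following the thread, the difference between the old (lazy) and new (eager) print() can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyPlan, lazyPrint(), eagerPrint() and execute() are hypothetical names made up for illustration and are not Flink's API. The sketch only counts how often the "plan" runs, which is the restriction Sebastian describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyPlan and its methods are hypothetical, NOT Flink's API.
class ToyPlan {
    private final List<List<Integer>> pendingSinks = new ArrayList<>();
    private final List<String> printed = new ArrayList<>();
    private int executions = 0;

    // Old-style (lazy) print: only registers a sink; nothing runs yet.
    void lazyPrint(List<Integer> data) {
        pendingSinks.add(data);
    }

    // Runs the plan once and flushes every registered sink,
    // playing the role of an explicit execute() call.
    void execute() {
        executions++;
        for (List<Integer> data : pendingSinks) {
            printed.add(data.toString());
        }
        pendingSinks.clear();
    }

    // New-style (eager) print: every call triggers its own execution.
    void eagerPrint(List<Integer> data) {
        executions++;
        printed.add(data.toString());
    }

    int executions() {
        return executions;
    }

    List<String> printed() {
        return printed;
    }
}

class PrintSemanticsDemo {
    public static void main(String[] args) {
        // Lazy: two prints share a single execution.
        ToyPlan lazy = new ToyPlan();
        lazy.lazyPrint(Arrays.asList(1, 2));
        lazy.lazyPrint(Arrays.asList(3));
        lazy.execute();
        System.out.println("lazy executions: " + lazy.executions());

        // Eager: each print runs the plan on its own.
        ToyPlan eager = new ToyPlan();
        eager.eagerPrint(Arrays.asList(1, 2));
        eager.eagerPrint(Arrays.asList(3));
        System.out.println("eager executions: " + eager.executions());
    }
}
```

In this model, two lazy prints cost one run while two eager prints cost two, which is why a separately named method for the old behavior is being discussed.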