I would like to reach consensus on this before the 0.9 release. So far we have the following ideas:

writeToWorkerStdOut(prefix)
printOnTaskManager(prefix) (+1)
logOnTaskManager(prefix)

I'm against logOnTM because we are not logging the output, we are
writing or printing it.

*I would vote for deprecating "print(prefix)" and adding
"writeToWorkerStdOut(prefix)"*

On Thu, May 28, 2015 at 5:00 PM, Chiwan Park <chiwanp...@icloud.com> wrote:

> I agree that avoiding a name that starts with “print” is better.
>
> Regards,
> Chiwan Park
>
> > On May 28, 2015, at 11:35 PM, Maximilian Michels <m...@apache.org> wrote:
> >
> > +1 for printOnTaskManager()
> >
> > On Thu, May 28, 2015 at 2:53 PM, Kruse, Sebastian <sebastian.kr...@hpi.de>
> > wrote:
> >
> >> Thanks for your quick responses!
> >>
> >> I also think that renaming the old print method should do the trick.
> >> As a contribution to your brainstorming for a name, I propose
> >> logOnTaskManager() ;)
> >>
> >> Cheers,
> >> Sebastian
> >>
> >> -----Original Message-----
> >> From: Fabian Hueske [mailto:fhue...@gmail.com]
> >> Sent: Thursday, May 28, 2015 14:34
> >> To: dev@flink.apache.org
> >> Subject: Re: Changed the behavior of "DataSet.print()"
> >>
> >> As I said, the common print prefix might indicate eager execution.
> >>
> >> I know that writeToTaskManagerStdOut() is quite bulky, but we should
> >> make the difference in the behavior very clear, IMO.
> >>
> >> 2015-05-28 14:29 GMT+02:00 Stephan Ewen <se...@apache.org>:
> >>
> >>> Actually, there is a method "print(String prefix)" which still goes to
> >>> the sysout of where the job is executed.
> >>>
> >>> Let's give that one the name "printOnTaskManager()" and then we should
> >>> have it...
> >>>
> >>> On Thu, May 28, 2015 at 2:13 PM, Fabian Hueske <fhue...@gmail.com>
> >>> wrote:
> >>>
> >>>> I would avoid calling it printXYZ, since print()'s behavior changed
> >>>> to eager execution.
> >>>>
> >>>> 2015-05-28 14:10 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
> >>>>
> >>>>> Okay, you are right, local is actually confusing.
> >>>>> I'm against introducing "worker" as a term in the API. It's still
> >>>>> called "TaskManager". Maybe "printOnTaskManager()"?
> >>>>>
> >>>>> On Thu, May 28, 2015 at 2:06 PM, Fabian Hueske <fhue...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> +1 for both.
> >>>>>>
> >>>>>> printLocal() might not be the best name, because "local" is not
> >>>>>> well defined and could also be understood as the local machine
> >>>>>> of the user.
> >>>>>> How about naming the method completely differently
> >>>>>> (writeToWorkerStdOut()?) to make sure users are not confused
> >>>>>> with eager and lazy execution?
> >>>>>>
> >>>>>> 2015-05-28 13:44 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
> >>>>>>
> >>>>>>> Hi Sebastian,
> >>>>>>>
> >>>>>>> thank you for the feedback. I agree that both variants have a
> >>>>>>> right to exist.
> >>>>>>>
> >>>>>>> I would vote for adding another method to the DataSet called
> >>>>>>> "printLocal()" that has the old behavior.
> >>>>>>>
> >>>>>>> On Thu, May 28, 2015 at 1:01 PM, Kruse, Sebastian
> >>>>>>> <sebastian.kr...@hpi.de> wrote:
> >>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I am a bit worried about the recent change of the print()
> >>>>>>>> method. I can understand the rationale that obtaining the
> >>>>>>>> stdout from all the TaskManagers is cumbersome (although, for
> >>>>>>>> local debugging the old print() was fine).
> >>>>>>>> However, a major problem I see with the new print() is that
> >>>>>>>> now you can only have one print() per plan, as the plan is
> >>>>>>>> directly executed as soon as print() is invoked. If you regard
> >>>>>>>> print() as a debugging means, this is a severe restriction.
> >>>>>>>> I see use cases for both print() implementations, but I
> >>>>>>>> would at least provide some kind of backwards compatibility,
> >>>>>>>> be it a parameter or a legacyPrint() method or anything else.
> >>>>>>>> As I assume print() to be very frequently used, a lot of
> >>>>>>>> existing programs would benefit from this and might otherwise
> >>>>>>>> not be directly portable to newer Flink versions. What do you
> >>>>>>>> think?
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Sebastian
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Robert Metzger [mailto:rmetz...@apache.org]
> >>>>>>>> Sent: Tuesday, May 26, 2015 11:12
> >>>>>>>> To: dev@flink.apache.org
> >>>>>>>> Subject: Re: Changed the behavior of "DataSet.print()"
> >>>>>>>>
> >>>>>>>> I've filed a JIRA to update the documentation:
> >>>>>>>> https://issues.apache.org/jira/browse/FLINK-2092
> >>>>>>>>
> >>>>>>>> On Fri, May 22, 2015 at 11:08 AM, Stephan Ewen
> >>>>>>>> <se...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Hi all!
> >>>>>>>>>
> >>>>>>>>> We merged a patch yesterday that changed the API behavior
> >>>>>>>>> of the "DataSet.print()" function.
> >>>>>>>>>
> >>>>>>>>> "print()" now prints to stdout on the client process,
> >>>>>>>>> rather than the TaskManager process, as before. This is much
> >>>>>>>>> nicer for debugging and exploring data sets.
> >>>>>>>>>
> >>>>>>>>> One implication of this is that print() is now an eager
> >>>>>>>>> method (like collect() or count()). That means that calling
> >>>>>>>>> "print()" immediately triggers the execution, and no
> >>>>>>>>> "env.execute()" is required any more.
> >>>>>>>>>
> >>>>>>>>> Greetings,
> >>>>>>>>> Stephan
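
For reference, the eager-versus-lazy distinction debated in the thread above can be sketched with a small toy model. This is not Flink code: ToyEnv, ToyDataSet and the "prefix> item" output format are hypothetical stand-ins, and printOnTaskManager(prefix) is only one of the candidate names, not an existing method. The model assumes that an eager print() registers a client-side sink and runs a job at once, while the lazy variant only registers a sink that runs at the next execute().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyEnv and ToyDataSet are hypothetical and are not Flink classes.
public class EagerVsLazyPrint {

    static class ToyEnv {
        final List<Runnable> pendingSinks = new ArrayList<>();
        final List<String> clientStdOut = new ArrayList<>(); // stands in for the client's stdout
        final List<String> workerStdOut = new ArrayList<>(); // stands in for a TaskManager's stdout
        int executions = 0;

        ToyDataSet fromElements(String... items) {
            return new ToyDataSet(this, Arrays.asList(items));
        }

        // Runs every sink registered so far as one job, then clears the plan.
        void execute() {
            executions++;
            List<Runnable> toRun = new ArrayList<>(pendingSinks);
            pendingSinks.clear();
            for (Runnable sink : toRun) {
                sink.run();
            }
        }
    }

    static class ToyDataSet {
        private final ToyEnv env;
        private final List<String> items;

        ToyDataSet(ToyEnv env, List<String> items) {
            this.env = env;
            this.items = items;
        }

        // Eager: registers a client-side sink and runs a job immediately.
        void print() {
            env.pendingSinks.add(() -> env.clientStdOut.addAll(items));
            env.execute();
        }

        // Lazy: only registers a sink; nothing is written until execute() is called.
        void printOnTaskManager(String prefix) {
            env.pendingSinks.add(() -> {
                for (String item : items) {
                    env.workerStdOut.add(prefix + "> " + item);
                }
            });
        }
    }

    public static void main(String[] args) {
        ToyEnv env = new ToyEnv();
        ToyDataSet a = env.fromElements("x", "y");
        ToyDataSet b = env.fromElements("z");

        a.printOnTaskManager("a");
        b.printOnTaskManager("b");
        env.execute(); // one job serves both lazy sinks

        a.print();     // each eager call runs a job of its own
        b.print();

        System.out.println("jobs run: " + env.executions);
        System.out.println("worker stdout: " + env.workerStdOut);
        System.out.println("client stdout: " + env.clientStdOut);
    }
}
```

In this model two lazy sinks share a single job, whereas every eager print() costs a separate execution, which is the restriction Sebastian describes.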