I agree that avoiding a name that starts with “print” is better.

Regards,
Chiwan Park

> On May 28, 2015, at 11:35 PM, Maximilian Michels <m...@apache.org> wrote:
> 
> +1 for printOnTaskManager()
> 
> On Thu, May 28, 2015 at 2:53 PM, Kruse, Sebastian <sebastian.kr...@hpi.de>
> wrote:
> 
>> Thanks for your quick responses!
>> 
>> I also think that renaming the old print method should do the trick. As a
>> contribution to your brainstorming for a name, I propose logOnTaskManager()
>> ;)
>> 
>> Cheers,
>> Sebastian
>> 
>> -----Original Message-----
>> From: Fabian Hueske [mailto:fhue...@gmail.com]
>> Sent: Thursday, May 28, 2015 14:34
>> To: dev@flink.apache.org
>> Subject: Re: Changed the behavior of "DataSet.print()"
>> 
>> As I said, the common print prefix might indicate eager execution.
>> 
>> I know that writeToTaskManagerStdOut() is quite bulky, but we should make
>> the difference in the behavior very clear, IMO.
>> 
>> 2015-05-28 14:29 GMT+02:00 Stephan Ewen <se...@apache.org>:
>> 
>>> Actually, there is a method "print(String prefix)" which still goes to
>>> the sysout of where the job is executed.
>>> 
>>> Let's give that one the name "printOnTaskManager()" and then we should
>>> have it...
>>> 
>>> On Thu, May 28, 2015 at 2:13 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>> 
>>>> I would avoid calling it printXYZ, since print()'s behavior changed
>>>> to eager execution.
>>>> 
>>>> 2015-05-28 14:10 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
>>>> 
>>>>> Okay, you are right, local is actually confusing.
>>>>> I'm against introducing "worker" as a term in the API. It's still
>>>>> called "TaskManager". Maybe "printOnTaskManager()" ?
>>>>> 
>>>>> On Thu, May 28, 2015 at 2:06 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>>>> 
>>>>>> +1 for both.
>>>>>> 
>>>>>> printLocal() might not be the best name, because "local" is not
>>>>>> well defined and could also be understood as the local machine
>>>>>> of the user.
>>>>>> How about naming the method completely differently
>>>>>> (writeToWorkerStdOut()?) to make sure users are not confused
>>>>>> about eager and lazy execution?
>>>>>> 
>>>>>> 
>>>>>> 2015-05-28 13:44 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
>>>>>> 
>>>>>>> Hi Sebastian,
>>>>>>> 
>>>>>>> thank you for the feedback. I agree that both variants have a
>>>>>>> right to exist.
>>>>>>> 
>>>>>>> I would vote for adding another method to the DataSet called
>>>>>>> "printLocal()" that has the old behavior.
>>>>>>> 
>>>>>>> On Thu, May 28, 2015 at 1:01 PM, Kruse, Sebastian <sebastian.kr...@hpi.de> wrote:
>>>>>>> 
>>>>>>>> Hi everyone,
>>>>>>>> 
>>>>>>>> I am a bit worried about the recent change to the print() method. I
>>>>>>>> can understand the rationale that obtaining the stdout from all
>>>>>>>> the TaskManagers is cumbersome (although for local debugging the
>>>>>>>> old print() was fine).
>>>>>>>> However, a major problem I see with the new print() is that you
>>>>>>>> can now only have one print() per plan, as the plan is directly
>>>>>>>> executed as soon as print() is invoked. If you regard print() as a
>>>>>>>> debugging means, this is a severe restriction.
>>>>>>>> I see use cases for both print() implementations, but I would at
>>>>>>>> least provide some kind of backwards compatibility, be it a
>>>>>>>> parameter or a legacyPrint() method or anything else. As I assume
>>>>>>>> print() to be very frequently used, a lot of existing programs
>>>>>>>> would benefit from this and might otherwise not be directly
>>>>>>>> portable to newer Flink versions. What do you think?
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Sebastian
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Robert Metzger [mailto:rmetz...@apache.org]
>>>>>>>> Sent: Tuesday, May 26, 2015 11:12
>>>>>>>> To: dev@flink.apache.org
>>>>>>>> Subject: Re: Changed the behavior of "DataSet.print()"
>>>>>>>> 
>>>>>>>> I've filed a JIRA to update the documentation:
>>>>>>>> https://issues.apache.org/jira/browse/FLINK-2092
>>>>>>>> 
>>>>>>>> On Fri, May 22, 2015 at 11:08 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> Hi all!
>>>>>>>>> 
>>>>>>>>> We merged a patch yesterday that changed the API behavior of the
>>>>>>>>> "DataSet.print()" function.
>>>>>>>>> 
>>>>>>>>> "print()" now prints to stdout on the client process, rather than
>>>>>>>>> the TaskManager process, as before. This is much nicer for
>>>>>>>>> debugging and exploring data sets.
>>>>>>>>> 
>>>>>>>>> One implication of this is that print() is now an eager method
>>>>>>>>> (like collect() or count()). That means that calling "print()"
>>>>>>>>> immediately triggers the execution, and no "env.execute()" is
>>>>>>>>> required any more.
>>>>>>>>> 
>>>>>>>>> Greetings,
>>>>>>>>> Stephan
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 



