+1 for both.

printLocal() might not be the best name, because "local" is not well
defined and could also be understood as the local machine of the user.
How about giving the method a completely different name
(writeToWorkerStdOut()?) so that users do not mix up eager and lazy execution?
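
To make the eager/lazy distinction discussed in this thread concrete, here is
a toy model in plain Java. It is not Flink code: ToyEnv, lazyPrint() and
eagerPrint() are hypothetical names invented for this sketch, and the
"worker>" / "client>" prefixes only mark where the output would notionally
appear. It assumes nothing beyond what the thread states: a lazy print only
registers a sink and waits for execute(), while an eager print triggers an
execution on every call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyEnv, lazyPrint and eagerPrint are hypothetical names,
// not part of Flink's API.
class EagerVsLazyPrint {

    static final class ToyEnv {
        private final List<Runnable> pendingSinks = new ArrayList<>();
        private int executions = 0;

        // Lazy variant: only registers a sink; nothing runs until execute().
        void lazyPrint(List<Integer> data, List<String> out) {
            pendingSinks.add(() -> data.forEach(x -> out.add("worker> " + x)));
        }

        // Eager variant: triggers an execution immediately, one per call.
        void eagerPrint(List<Integer> data, List<String> out) {
            executions++;
            data.forEach(x -> out.add("client> " + x));
        }

        // Runs all registered lazy sinks in a single execution.
        void execute() {
            executions++;
            pendingSinks.forEach(Runnable::run);
            pendingSinks.clear();
        }

        int executions() {
            return executions;
        }
    }

    // Two lazy prints share one execution.
    static int executionsWithLazyPrints(List<String> out) {
        ToyEnv env = new ToyEnv();
        env.lazyPrint(Arrays.asList(1, 2), out);
        env.lazyPrint(Arrays.asList(3), out);
        env.execute();
        return env.executions();
    }

    // Two eager prints cost two separate executions.
    static int executionsWithEagerPrints(List<String> out) {
        ToyEnv env = new ToyEnv();
        env.eagerPrint(Arrays.asList(1, 2), out);
        env.eagerPrint(Arrays.asList(3), out);
        return env.executions();
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        System.out.println("lazy executions:  " + executionsWithLazyPrints(out));
        System.out.println("eager executions: " + executionsWithEagerPrints(out));
        out.forEach(System.out::println);
    }
}
```

In this model, two lazy prints are served by one execute() call, whereas two
eager prints each run on their own. That is the restriction Sebastian
describes below for debugging with several print() calls in one plan.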


2015-05-28 13:44 GMT+02:00 Robert Metzger <rmetz...@apache.org>:

> Hi Sebastian,
>
> thank you for the feedback. I agree that both variants have a right to
> exist.
>
> I would vote for adding another method to the DataSet called "printLocal()"
> that has the old behavior.
>
> On Thu, May 28, 2015 at 1:01 PM, Kruse, Sebastian <sebastian.kr...@hpi.de>
> wrote:
>
> > Hi everyone,
> >
> > I am a bit worried about that recent change of the print() method. I can
> > understand the rationale that obtaining the stdout from all the
> > taskmanagers is cumbersome (although, for local debugging the old print()
> > was fine).
> > However, a major problem I see with the new print() is that you can now
> > only have one print() per plan, as the plan is executed directly as soon
> > as print() is invoked. If you regard print() as a debugging means, this is
> > a severe restriction.
> > I see use cases for both print() implementations, but I would at least
> > provide some kind of backwards compatibility, be it a parameter or a
> > legacyPrint() method or anything else. As I assume print() to be very
> > frequently used, a lot of existing programs would benefit from this and
> > might otherwise not be directly portable to newer Flink versions. What do
> > you think?
> >
> > Cheers,
> > Sebastian
> >
> > -----Original Message-----
> > From: Robert Metzger [mailto:rmetz...@apache.org]
> > Sent: Tuesday, 26 May 2015 11:12
> > To: dev@flink.apache.org
> > Subject: Re: Changed the behavior of "DataSet.print()"
> >
> > I've filed a JIRA to update the documentation:
> > https://issues.apache.org/jira/browse/FLINK-2092
> >
> > On Fri, May 22, 2015 at 11:08 AM, Stephan Ewen <se...@apache.org> wrote:
> >
> > > Hi all!
> > >
> > > We merged a patch yesterday that changed the API behavior of the
> > > "DataSet.print()" function.
> > >
> > > "print()" now prints to stdout on the client process, rather than the
> > > TaskManager process, as before. This is much nicer for debugging and
> > > exploring data sets.
> > >
> > > One implication of this is that print() is now an eager method (like
> > > collect() or count()). That means that calling "print()" immediately
> > > triggers the execution, and no "env.execute()" is required any more.
> > >
> > > Greetings,
> > > Stephan
> > >
> > >
> >
>
