+1 for the second option:

It would also provide possibility to properly commit a state checkpoint
after the terminate message was triggered. In some cases this can be a
desirable behaviour.

On Wed, May 27, 2015 at 8:46 AM, Gyula Fóra <gyf...@apache.org> wrote:

> Hey,
>
> I would also strongly prefer the second option, users need to have the
> option to force cancel a program in case of something unwanted behaviour.
>
> Cheers,
> Gyula
>
> Matthias J. Sax <mj...@informatik.hu-berlin.de> ezt írta (időpont: 2015.
> máj. 27., Sze, 1:20):
>
> > Hi,
> >
> > currently, the only way to stop a streaming job is to "cancel" the job,
> > This has multiple disadvantage:
> >  1) a "clean" stopping is not possible (see
> > https://issues.apache.org/jira/browse/FLINK-1929 -- I think a clean stop
> > is a pre-requirement for FLINK-1929) and
> >  2) as a minor issue, all canceled jobs are listed as canceled in the
> > history (what is somewhat confusing for the user -- at least it was for
> > me when I started to work with Flink Streaming).
> >
> > This issue was raised a few times already, however, no final conclusion
> > was there (if I remember correctly). I could not find a JIRA for it
> either.
> >
> > From my understanding of the system, there would be two ways to
> > implement a nice way for stopping streaming jobs:
> >
> >   1) "Task"s can be distinguished between "batch" and "streaming"
> >      -> canceling a batch jobs works as always
> >      -> canceling a streaming job only send a "canceling" signal to the
> > sources, and waits until the job finishes (ie, sources stop emitting
> > data and finish regularly, triggering the finishing of all operators).
> > For this case, streaming jobs are stopped in a "clean way" (as is the
> > input would have be finite) and the job will be listed as "finished" in
> > the history regularly.
> >
> >   This approach has the advantage, that it should be simpler to
> > implement. However, the disadvantages are (1) a "hard canceling" of jobs
> > is not possible any more, and (2) Flink must be able to distinguishes
> > batch and streaming jobs (I don't think Flink runtime can distinguish
> > both right now?)
> >
> >   2) A new message "terminate" (or similar) is introduced, that can only
> > be used for streaming jobs (would be ignored for batch jobs) that stops
> > the sources and waits until the job finishes regularly.
> >
> >   This approach has the advantage, that current system behavior is
> > preserved (it only adds a few feature). The disadvantage is, that all
> > clients need to be touched and it must be clear to the user, that
> > "terminate" does not work for streaming jobs. If an error/warning should
> > be raised if a user tries to "terminate" a batch job, Flink must be able
> > to distinguish between batch and streaming jobs, too.  As an
> > alternative, "terminate" on batch jobs could be interpreted as "cancel",
> > too.
> >
> >
> > I personally think, that the second approach is better. Please give
> > feedback. If we can get to a conclusion how to implement it, I would
> > like to work on it.
> >
> >
> > -Matthias
> >
> >
>

Reply via email to