What Arvid said is correct. The only thing I have to add is that "stop" allows also exactly-once sinks to push out their buffered data to their final destination (e.g. Filesystem). In other words, it takes into account side-effects, so it guarantees exactly-once end-to-end, assuming that you are using exactly-once sources and sinks. Cancel with savepoint on the other hand did not necessarily and committing side-effects is was following a "best-effort" approach.
For more information you can check [1]. Cheers, Kostas [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212 On Mon, Jun 8, 2020 at 10:23 AM Arvid Heise <ar...@ververica.com> wrote: > It was before I joined the dev team, so the following are kind of > speculative: > > The concept of stoppable functions never really took off as it was a bit > of a clumsy approach. There is no fundamental difference between stopping > and cancelling on (sub)task level. Indeed if you look in the twitter source > of 1.6 [1], cancel() and stop() are doing the exact same thing. I'd assume > that this is probably true for all sources. > > So what is the difference between cancel and stop then? It's more the way > on how you terminate the whole DAG. On cancelling, you cancel() on all > tasks more or less simultaneously. If you want to stop, it's more a > fine-grain cancel, where you stop first the sources and then let the tasks > close themselves when all upstream tasks are done. Just before closing the > tasks, you also take a snapshot. Thus, the difference should not be visible > in user code but only in the Flink code itself (task/checkpoint coordinator) > > So for your question: > 1. No, as on task level stop() and cancel() are the same thing on UDF > level. > 2. Yes, stop will be more graceful and creates a snapshot. [2] > 3. Not that I am aware of. In the whole flink code base, there are no more > (see javadoc). You could of course check if there are some in Bahir. But it > shouldn't really matter. There is no huge difference between stopping and > cancelling if you wait for a checkpoint to finish. > 4. Okay you answered your second question ;) Yes cancel with savepoint = > stop now to make it easier for new users. > > [1] > https://github.com/apache/flink/blob/release-1.6/flink-connectors/flink-connector-twitter/src/main/java/org/apache/flink/streaming/connectors/twitter/TwitterSource.java#L180-L190 > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html > > On Sun, Jun 7, 2020 at 1:04 AM M Singh <mans2si...@yahoo.com> wrote: > >> >> Hi Arvid: >> >> Thanks for the links. >> >> A few questions: >> >> 1. Is there any particular interface in 1.9+ that identifies the source >> as stoppable ? >> 2. Is there any distinction b/w stop and cancel in 1.9+ ? >> 3. Is there any list of sources which are documented as stoppable besides >> the one listed in your SO link ? >> 4. In 1.9+ there is flink stop command and a flink cancel command. ( >> https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#stop). >> So it appears that flink stop will take a savepoint and the call cancel, >> and cancel will just cancel the job (looks like cancel with savepoint is >> deprecated in 1.10). >> >> Thanks again for your help. >> >> >> >> On Saturday, June 6, 2020, 02:18:57 PM EDT, Arvid Heise < >> ar...@ververica.com> wrote: >> >> >> Yes, it seems as if FlinkKinesisConsumer does not implement it. >> >> Here are the links to the respective javadoc [1] and code [2]. Note that >> in later releases (1.9+) this interface has been removed. Stop is now >> implemented through a cancel() on source level. >> >> In general, I don't think that in a Kinesis to Kinesis use case, stop is >> needed anyways, since there is no additional consistency expected over a >> normal cancel. >> >> [1] >> https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html >> [2] >> https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java >> >> On Sat, Jun 6, 2020 at 8:03 PM M Singh <mans2si...@yahoo.com> wrote: >> >> Hi Arvid: >> >> I check the link and it indicates that only Storm SpoutSource, >> TwitterSource and NifiSource support stop. >> >> Does this mean that FlinkKinesisConsumer is not stoppable ? >> >> Also, can you please point me to the Stoppable interface mentioned in the >> link ? I found the following but am not sure if TwitterSource implements >> it : >> >> https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/StartStoppable.java >> >> Thanks >> >> >> >> >> >> On Friday, June 5, 2020, 02:48:49 PM EDT, Arvid Heise < >> ar...@ververica.com> wrote: >> >> >> Hi, >> >> could you check if this SO thread [1] helps you already? >> >> [1] >> https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable >> >> On Thu, Jun 4, 2020 at 7:43 PM M Singh <mans2si...@yahoo.com> wrote: >> >> Hi: >> >> I am running a job which consumes data from Kinesis and send data to >> another Kinesis queue. I am using an older version of Flink (1.6), and >> when I try to stop the job I get an exception >> >> Caused by: java.util.concurrent.ExecutionException: >> org.apache.flink.runtime.rest.util.RestClientException: [Job termination >> (STOP) failed: This job is not stoppable.] >> >> >> I wanted to find out what is a stoppable job and it possible to make a >> job stoppable if is reading/writing to kinesis ? >> >> Thanks >> >> >> >> >> -- >> >> Arvid Heise | Senior Java Developer >> >> <https://www.ververica.com/> >> >> Follow us @VervericaData >> >> -- >> >> Join Flink Forward <https://flink-forward.org/> - The Apache Flink >> Conference >> >> Stream Processing | Event Driven | Real Time >> >> -- >> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany >> >> -- >> Ververica GmbH >> Registered at Amtsgericht Charlottenburg: HRB 158244 B >> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji >> (Toni) Cheng >> >> >> >> -- >> >> Arvid Heise | Senior Java Developer >> >> <https://www.ververica.com/> >> >> Follow us @VervericaData >> >> -- >> >> Join Flink Forward <https://flink-forward.org/> - The Apache Flink >> Conference >> >> Stream Processing | Event Driven | Real Time >> >> -- >> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany >> >> -- >> Ververica GmbH >> Registered at Amtsgericht Charlottenburg: HRB 158244 B >> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji >> (Toni) Cheng >> > > > -- > > Arvid Heise | Senior Java Developer > > <https://www.ververica.com/> > > Follow us @VervericaData > > -- > > Join Flink Forward <https://flink-forward.org/> - The Apache Flink > Conference > > Stream Processing | Event Driven | Real Time > > -- > > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > > -- > Ververica GmbH > Registered at Amtsgericht Charlottenburg: HRB 158244 B > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji > (Toni) Cheng >