Thanks Kostas, Arvid, and Senthil for your help. On Monday, June 8, 2020, 12:47:56 PM EDT, Senthil Kumar <senthi...@vmware.com> wrote: #yiv1043440718 #yiv1043440718 -- _filtered {} _filtered {} _filtered {} _filtered {} _filtered {}#yiv1043440718 #yiv1043440718 p.yiv1043440718MsoNormal, #yiv1043440718 li.yiv1043440718MsoNormal, #yiv1043440718 div.yiv1043440718MsoNormal {margin:0in;margin-bottom:.0001pt;font-size:11.0pt;font-family:sans-serif;}#yiv1043440718 a:link, #yiv1043440718 span.yiv1043440718MsoHyperlink {color:blue;text-decoration:underline;}#yiv1043440718 a:visited, #yiv1043440718 span.yiv1043440718MsoHyperlinkFollowed {color:purple;text-decoration:underline;}#yiv1043440718 p.yiv1043440718msonormal0, #yiv1043440718 li.yiv1043440718msonormal0, #yiv1043440718 div.yiv1043440718msonormal0 {margin-right:0in;margin-left:0in;font-size:11.0pt;font-family:sans-serif;}#yiv1043440718 span.yiv1043440718EmailStyle19 {font-family:sans-serif;color:windowtext;}#yiv1043440718 .yiv1043440718MsoChpDefault {font-size:10.0pt;} _filtered {}#yiv1043440718 div.yiv1043440718WordSection1 {}#yiv1043440718 I am just stating this for completeness. When a job is cancelled, Flink sends an Interrupt signal to the Thread running the Source.run method For some reason (unknown to me), this does not happen when a Stop command is issued. We ran into some minor issues because of said behavior. From: Kostas Kloudas <kklou...@gmail.com> Date: Monday, June 8, 2020 at 2:35 AM To: Arvid Heise <ar...@ververica.com> Cc: M Singh <mans2si...@yahoo.com>, User-Flink <user@flink.apache.org> Subject: Re: Stopping a job What Arvid said is correct. The only thing I have to add is that "stop" allows also exactly-once sinks to push out their buffered data to their final destination (e.g. Filesystem). In other words, it takes into account side-effects, so it guarantees exactly-once end-to-end, assuming that you are using exactly-once sources and sinks. Cancel with savepoint on the other hand did not necessarily and committing side-effects is was following a "best-effort" approach. For more information you can check [1]. Cheers, Kostas [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212 On Mon, Jun 8, 2020 at 10:23 AM Arvid Heise <ar...@ververica.com> wrote:
It was before I joined the dev team, so the following are kind of speculative: The concept of stoppable functions never really took off as it was a bit of a clumsy approach. There is no fundamental difference between stopping and cancelling on (sub)task level. Indeed if you look in the twitter source of 1.6 [1], cancel() and stop() are doing the exact same thing. I'd assume that this is probably true for all sources. So what is the difference between cancel and stop then? It's more the way on how you terminate the whole DAG. On cancelling, you cancel() on all tasks more or less simultaneously. If you want to stop, it's more a fine-grain cancel, where you stop first the sources and then let the tasks close themselves when all upstream tasks are done. Just before closing the tasks, you also take a snapshot. Thus, the difference should not be visible in user code but only in the Flink code itself (task/checkpoint coordinator) So for your question: 1. No, as on task level stop() and cancel() are the same thing on UDF level. 2. Yes, stop will be more graceful and creates a snapshot. [2] 3. Not that I am aware of. In the whole flink code base, there are no more (see javadoc). You could of course check if there are some in Bahir. But it shouldn't really matter. There is no huge difference between stopping and cancelling if you wait for a checkpoint to finish. 4. Okay you answered your second question ;) Yes cancel with savepoint = stop now to make it easier for new users. [1] https://github.com/apache/flink/blob/release-1.6/flink-connectors/flink-connector-twitter/src/main/java/org/apache/flink/streaming/connectors/twitter/TwitterSource.java#L180-L190 [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html On Sun, Jun 7, 2020 at 1:04 AM M Singh <mans2si...@yahoo.com> wrote: Hi Arvid: Thanks for the links. A few questions: 1. Is there any particular interface in 1.9+ that identifies the source as stoppable ? 2. Is there any distinction b/w stop and cancel in 1.9+ ? 3. Is there any list of sources which are documented as stoppable besides the one listed in your SO link ? 4. In 1.9+ there is flink stop command and a flink cancel command. (https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#stop). So it appears that flink stop will take a savepoint and the call cancel, and cancel will just cancel the job (looks like cancel with savepoint is deprecated in 1.10). Thanks again for your help. On Saturday, June 6, 2020, 02:18:57 PM EDT, Arvid Heise <ar...@ververica.com> wrote: Yes, it seems as if FlinkKinesisConsumer does not implement it. Here are the links to the respective javadoc [1] and code [2]. Note that in later releases (1.9+) this interface has been removed. Stop is now implemented through a cancel() on source level. In general, I don't think that in a Kinesis to Kinesis use case, stop is needed anyways, since there is no additional consistency expected over a normal cancel. [1]https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html [2]https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java On Sat, Jun 6, 2020 at 8:03 PM M Singh <mans2si...@yahoo.com> wrote: Hi Arvid: I check the link and it indicates that only Storm SpoutSource, TwitterSource and NifiSource support stop. Does this mean that FlinkKinesisConsumer is not stoppable ? Also, can you please point me to the Stoppable interface mentioned in the link ? I found the following but am not sure if TwitterSource implements it : https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/StartStoppable.java Thanks On Friday, June 5, 2020, 02:48:49 PM EDT, Arvid Heise <ar...@ververica.com> wrote: Hi, could you check if this SO thread [1] helps you already? [1]https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable On Thu, Jun 4, 2020 at 7:43 PM M Singh <mans2si...@yahoo.com> wrote: Hi: I am running a job which consumes data from Kinesis and send data to another Kinesis queue. I am using an older version of Flink (1.6), and when I try to stop the job I get an exception Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [Job termination (STOP) failed: This job is not stoppable.] I wanted to find out what is a stoppable job and it possible to make a job stoppable if is reading/writing to kinesis ? Thanks -- Arvid Heise| Senior Java Developer Follow us @VervericaData -- JoinFlink Forward - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- VervericaGmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng -- Arvid Heise| Senior Java Developer Follow us @VervericaData -- JoinFlink Forward - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- VervericaGmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng -- Arvid Heise| Senior Java Developer Follow us @VervericaData -- JoinFlink Forward - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- VervericaGmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng