pnowojski commented on a change in pull request #14968:
URL: https://github.com/apache/flink/pull/14968#discussion_r580080652



##########
File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/SourceStreamTaskTest.java
##########
@@ -856,4 +887,41 @@ private void output(String record) {
             output.collect(new StreamRecord<>(record));
         }
     }
+
+    /**
+     * This source sleeps a little bit before processing cancellation and records whether it was
+     * interrupted by the {@link SourceStreamTask} or not.
+     */
+    private static class WasInterruptedTestingSource implements SourceFunction<String> {
+        private static final int MAX_POST_CANCEL_ITERATIONS = 100;
+        private static final long serialVersionUID = 1L;
+        private volatile boolean running;
+
+        public static boolean wasInterrupted;
+
+        @Override
+        public void run(SourceContext<String> ctx) throws Exception {
+            wasInterrupted = false;
+            try {
+                int postCancelIterations = 0;
+                while (running || postCancelIterations < MAX_POST_CANCEL_ITERATIONS) {

Review comment:
       Yes, even without the sleep the test can produce a false negative from time
to time. Since that requires an unexpected sleep/hiccup of 100ms (or more), the
chances of it happening look pretty low (`<10%`?), so the test would still
catch a regression quite quickly. This test:
   1. uses a source that ignores cancellation for `x` milliseconds
   2. checks that the source thread was not interrupted within those `x` ms.
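
   The two points above can be sketched in plain Java. This is only an illustration, not the test from the PR: it uses no Flink classes, the names `InterruptWindowSketch`, `runOnce` and `SLEEP_MILLIS` are hypothetical, and the 1 ms sleep per iteration is an assumption made so that 100 iterations give a window of roughly 100 ms.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the testing source; it does not use any Flink classes. */
final class InterruptWindowSketch {
    // Assumed values: 100 iterations of about 1 ms form the "x" ms observation window.
    static final int MAX_POST_CANCEL_ITERATIONS = 100;
    static final long SLEEP_MILLIS = 1L;

    private volatile boolean running = true;
    private final AtomicBoolean wasInterrupted = new AtomicBoolean(false);

    /** Source loop: keeps going for a bounded number of iterations after cancel(). */
    void run() {
        int postCancelIterations = 0;
        try {
            while (running || postCancelIterations < MAX_POST_CANCEL_ITERATIONS) {
                Thread.sleep(SLEEP_MILLIS);
                if (!running) {
                    postCancelIterations++;
                }
            }
        } catch (InterruptedException e) {
            // An interrupt arrived inside the window: record it and leave the loop.
            wasInterrupted.set(true);
        }
    }

    void cancel() {
        running = false;
    }

    /**
     * Plays the role of the task: cancels the source and, if requested, also
     * interrupts the source thread (the behaviour the test wants to rule out).
     */
    static boolean runOnce(boolean interruptAfterCancel) throws InterruptedException {
        InterruptWindowSketch source = new InterruptWindowSketch();
        Thread sourceThread = new Thread(source::run, "source-thread");
        sourceThread.start();
        source.cancel();
        if (interruptAfterCancel) {
            sourceThread.interrupt();
        }
        sourceThread.join();
        return source.wasInterrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("interrupted without interrupt call: " + runOnce(false));
        System.out.println("interrupted with interrupt call: " + runOnce(true));
    }
}
```

   In this sketch the interrupt is issued right after `cancel()`, so it always lands inside the window. The false negative discussed here corresponds to the interrupting thread stalling for longer than the whole window before it calls `interrupt()`.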
   
   `notifyCheckpointCompleteAsync`? Did you mean 
`StreamTask#notifyCheckpointComplete` -> 
`CheckpointListener#notifyCheckpointComplete`? 
   
   Exiting on `CheckpointListener#notifyCheckpointComplete` would still require
a sleep to check that the interruption was not triggered after it. Also, in the
stop with savepoint case, `CheckpointListener#notifyCheckpointComplete` and
`SourceFunction#cancel` are both called from
`StreamTask#notifyCheckpointComplete`, one after another. As a matter of fact,
`notifyCheckpointComplete()` is called before `cancel()`, so it would require
an ever so slightly longer window (more `MAX_POST_CANCEL_ITERATIONS`).
   
   I don't see how we can reliably test this without some sleep-based
probability window. For that to be possible, we would need a notification hook
that is always triggered **after** the interruption, so that the source would
know that if the interruption was supposed to happen, it would have already
happened. But we don't have anything like that.



