GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/5661

     [FLINK-8896][kafka08] fix Kafka08Fetcher trying to look up topic "n/a" on 
partition "-1"

    ## What is the purpose of the change
    
    `Kafka08Fetcher` uses a `MARKER` to make sure the main thread wakes up when 
cancelling. While looking up partition leaders, only one occurrence of this 
marker is removed from the list of partitions to look up. However, there is a 
code path that leaves two markers in the list, in which case the lookup throws 
an exception:
    - `FlinkKafkaConsumerBase#cancel()` is called in one thread, which is 
suspended right after setting `running` to `false`, and then
    - `FlinkKafkaConsumerBase`'s partition discovery loop thread drops out of 
its loop before the first thread was able to call `Kafka08Fetcher#cancel()`, so 
that `Kafka08Fetcher#cancel()` ends up being called more than once.
    
    ## Brief change log
    
    - remove all markers in the list, not just one
    - make `FlinkKafkaConsumerBase`'s partition discovery loop not call 
`cancel()` if already cancelled
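
The first item in the change log can be illustrated with a minimal, self-contained sketch. It is not the actual `Kafka08Fetcher` code: the class name `MarkerRemovalSketch`, the string encoding of partitions, and both helper methods are assumptions made for illustration only. It shows why removing a single marker is not enough once `cancel()` has added two of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MarkerRemovalSketch {
    // Hypothetical stand-in for the fetcher's MARKER (topic "n/a", partition -1).
    static final String MARKER = "n/a:-1";

    /** Behaviour described as the problem: at most one marker is removed. */
    static List<String> removeOneMarker(List<String> partitions) {
        List<String> copy = new ArrayList<>(partitions);
        // List#remove(Object) removes only the first matching element.
        copy.remove(MARKER);
        return copy;
    }

    /** Behaviour described as the fix: every marker is removed. */
    static List<String> removeAllMarkers(List<String> partitions) {
        List<String> copy = new ArrayList<>(partitions);
        // removeAll drops every element contained in the given collection.
        copy.removeAll(Collections.singleton(MARKER));
        return copy;
    }

    public static void main(String[] args) {
        // Two markers, as would happen if cancel() ran twice.
        List<String> toLookUp =
                Arrays.asList("topic-a:0", MARKER, "topic-a:1", MARKER);

        // A marker survives and would be looked up as a real partition.
        System.out.println(removeOneMarker(toLookUp));
        // Only real partitions remain.
        System.out.println(removeAllMarkers(toLookUp));
    }
}
```

With two markers in the input, `removeOneMarker` still returns a list containing `n/a:-1`, which is the invalid topic/partition pair from the issue title, while `removeAllMarkers` returns only the real partitions.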
    
    ## Verifying this change
    
    This change is a trivial rework / code cleanup without any test coverage.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): **no**
      - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
      - The serializers: **no**
      - The runtime per-record code paths (performance sensitive): **no**
      - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
      - The S3 file system connector: **no**
    
    ## Documentation
    
      - Does this pull request introduce a new feature? **no**
      - If yes, how is the feature documented? **JavaDocs**


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-8896

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5661.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5661
    
----
commit e9711a61be0c260629c9f2f34c03af6e1aa9ac61
Author: Nico Kruber <nico@...>
Date:   2018-03-08T11:22:32Z

    [FLINK-8896][kafka08] remove all cancel MARKERs before trying to find 
partition leaders
    
    This guards us against #cancel() being called multiple times and then 
trying to
    look up an invalid topic/partition pair.

commit bffe96f5e3522824656ed55074ac09591a36e2ae
Author: Nico Kruber <nico@...>
Date:   2018-03-08T11:23:47Z

    [hotfix][kafka] do not run cancel() in the discovery loop if already 
cancelled

----

