[ https://issues.apache.org/jira/browse/FLINK-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855489#comment-16855489 ]
Jiangjie Qin commented on FLINK-10455:
--------------------------------------

[~sunjincheng121] Thanks for the help. I am still working on this. I'm trying to get to the bottom of it and come up with a complete solution.

> Potential Kafka producer leak in case of failures
> -------------------------------------------------
>
>                 Key: FLINK-10455
>                 URL: https://issues.apache.org/jira/browse/FLINK-10455
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.5.2
>            Reporter: Nico Kruber
>            Assignee: Jiangjie Qin
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.5.6, 1.7.0, 1.7.3, 1.9.0, 1.8.1
>
>         Attachments: image-2019-06-04-14-25-16-916.png, image-2019-06-04-14-30-55-985.png
>
>
> If the Kafka brokers' transaction timeout is too low for our checkpoint interval [1], we may get a {{ProducerFencedException}}. The documentation of {{ProducerFencedException}} explicitly states that we should close the producer after encountering it.
>
> Looking at the code, this does not seem to actually be done in {{FlinkKafkaProducer011}}. Also, if one transaction's commit in {{TwoPhaseCommitSinkFunction#notifyCheckpointComplete}} fails with an exception, we do not clean up (nor try to commit) any other transaction.
>
> -> From what I see, {{TwoPhaseCommitSinkFunction#notifyCheckpointComplete}} simply iterates over the {{pendingCommitTransactions}}, which is not touched during {{close()}}.
>
> Now if we restart the failing job on the same Flink cluster, any resources from the previous attempt will still linger around.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/connectors/kafka.html#kafka-011

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
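For reference, a minimal sketch of the close-on-fence handling that the Kafka client javadoc asks for when a {{ProducerFencedException}} is thrown. This is illustrative only and written against a plain {{KafkaProducer}}; the class and method names below are made up for the example and are not Flink's actual producer code:

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.ProducerFencedException;

public class FencedProducerSketch {

    // Illustrative only: shows the close-on-ProducerFencedException handling
    // the Kafka documentation recommends; this is not Flink's producer code.
    // 'props' is expected to carry transactional.id and serializer settings.
    public static void sendTransactionally(Properties props, String topic) {
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();
        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>(topic, "key", "value"));
            producer.commitTransaction();
        } catch (ProducerFencedException e) {
            // Another producer with the same transactional.id fenced us off.
            // The transaction cannot be continued; the only safe reaction is
            // to close this instance, otherwise its threads and buffers leak.
            producer.close();
            throw e;
        }
        producer.close();
    }
}
{code}

The point of the report is that on the failure paths in {{FlinkKafkaProducer011}} and {{TwoPhaseCommitSinkFunction#notifyCheckpointComplete}}, neither this {{close()}} nor any cleanup of the remaining {{pendingCommitTransactions}} happens, so the producers from the previous attempt keep lingering after a restart.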