Scott Kidder created FLINK-5946:
-----------------------------------

             Summary: Kinesis Producer uses KPL that orphans threads that 
consume 100% CPU
                 Key: FLINK-5946
                 URL: https://issues.apache.org/jira/browse/FLINK-5946
             Project: Flink
          Issue Type: Bug
          Components: Kinesis Connector
    Affects Versions: 1.2.0
            Reporter: Scott Kidder


It's possible for the Amazon Kinesis Producer Library (KPL) to leave orphaned 
threads running after the producer has been instructed to shut down via the 
`destroy()` method. These threads run in a very tight infinite loop that can 
push CPU usage to 100%. I've seen this happen on several occasions, though it 
does not happen every time. Once these threads are orphaned, the only way to 
bring CPU utilization back down is to restart the Flink Task Manager.
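
For anyone trying to confirm that a runaway thread is responsible for the CPU 
usage, the following is a minimal diagnostic sketch using the standard 
`java.lang.management` API. The class name `BusyThreadReport` and the method 
`topCpuThreads` are hypothetical names chosen for illustration; they are not 
part of Flink or the KPL. It only sees threads of the JVM it runs in, so it 
would have to be executed inside the Task Manager process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Flink or the KPL.
public class BusyThreadReport {

    /** One row of the report: a thread name and its accumulated CPU time. */
    static final class Entry {
        final String name;
        final long cpuNanos;

        Entry(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to `limit` live threads of this JVM, highest CPU time first. */
    static List<Entry> topCpuThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Entry> entries = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return entries; // this JVM cannot report per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread has died
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has died
            if (cpu >= 0 && info != null) {
                entries.add(new Entry(info.getThreadName(), cpu));
            }
        }
        entries.sort(Comparator.comparingLong((Entry e) -> e.cpuNanos).reversed());
        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        for (Entry e : topCpuThreads(5)) {
            System.out.printf("%-40s %d ms%n", e.name, e.cpuNanos / 1_000_000);
        }
    }
}
```

Running the report twice a few seconds apart and comparing the values should 
show which threads are still accumulating CPU time. A thread dump (for example 
with `jstack`) gives similar information without adding code.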

When a KPL producer is instantiated, it creates several threads: one to execute 
and monitor the native sender process, and two to monitor the process's stdout 
and stderr output. It's possible for the process-monitor thread to stop in such 
a way that the two output-monitor threads are left orphaned.
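
To illustrate the general pattern, here is a minimal sketch of an 
output-monitor thread that cannot be orphaned in a busy loop. It is not taken 
from the KPL source: the class `OutputMonitorSketch`, the nested `LineMonitor`, 
and the method `monitorToCompletion` are hypothetical, and the failure mode it 
guards against (a reader loop that keeps polling after the stream has ended) is 
an assumption about how such a thread could end up spinning. The monitor exits 
when the stream reaches end-of-file, when the stream is closed, or when a 
shared shutdown flag is set.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the KPL implementation.
public class OutputMonitorSketch {

    /** Drains a process output stream until EOF or until asked to stop. */
    static final class LineMonitor implements Runnable {
        private final InputStream in;
        private final AtomicBoolean shutdown;
        final List<String> lines = new ArrayList<>();

        LineMonitor(InputStream in, AtomicBoolean shutdown) {
            this.in = in;
            this.shutdown = shutdown;
        }

        @Override
        public void run() {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null at end of stream. Treating null as
                // "nothing yet, try again" would turn this into a tight loop,
                // so null ends the loop here.
                while (!shutdown.get() && (line = reader.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                // The stream was closed underneath us: exit instead of retrying.
            }
        }
    }

    /** Runs a monitor on its own thread and waits for it to finish. */
    static List<String> monitorToCompletion(InputStream in) throws InterruptedException {
        AtomicBoolean shutdown = new AtomicBoolean(false);
        LineMonitor monitor = new LineMonitor(in, shutdown);
        Thread thread = new Thread(monitor, "stdout-monitor");
        thread.setDaemon(true);
        thread.start();
        thread.join(5000);
        if (thread.isAlive()) {
            shutdown.set(true);
            throw new IllegalStateException("monitor thread did not exit");
        }
        return monitor.lines;
    }

    public static void main(String[] args) throws Exception {
        // A byte array stands in for the native process's stdout.
        InputStream fakeStdout = new ByteArrayInputStream(
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(monitorToCompletion(fakeStdout));
    }
}
```

In a real producer, the process-monitor thread would also set the shutdown flag 
and close both streams in a `finally` block, so that the output monitors are 
released no matter how the process-monitor thread terminates.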

I've submitted a GitHub issue and pull request against the KPL project:
https://github.com/awslabs/amazon-kinesis-producer/issues/93
https://github.com/awslabs/amazon-kinesis-producer/pull/94

This issue is rooted in the Amazon Kinesis Producer Library (KPL) that the 
Flink Kinesis streaming connector depends upon. It ought to be fixed in the 
KPL, but I want to document it on the Flink project. The Flink KPL dependency 
should be updated once the KPL has been fixed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
