Scott Kidder created FLINK-5946:
-----------------------------------

             Summary: Kinesis Producer uses KPL that orphans threads that consume 100% CPU
                 Key: FLINK-5946
                 URL: https://issues.apache.org/jira/browse/FLINK-5946
             Project: Flink
          Issue Type: Bug
          Components: Kinesis Connector
    Affects Versions: 1.2.0
            Reporter: Scott Kidder

The Amazon Kinesis Producer Library (KPL) can leave orphaned threads running after the producer has been instructed to shut down via the `destroy()` method. These threads run in a very tight infinite loop that can push CPU usage to 100%. I've seen this happen on several occasions, though not every time. Once these threads are orphaned, the only way to bring CPU utilization back down is to restart the Flink Task Manager.

When a KPL producer is instantiated, it creates several threads: one to execute and monitor the native sender process, and two to monitor the process's stdout and stderr output. The process-monitor thread can stop in a way that leaves the two output-monitor threads orphaned.

I've submitted a GitHub issue and pull request against the KPL project:
https://github.com/awslabs/amazon-kinesis-producer/issues/93
https://github.com/awslabs/amazon-kinesis-producer/pull/94

This issue is rooted in the KPL, which the Flink Kinesis streaming connector depends on. It ought to be fixed in the KPL, but I want to document it on the Flink project. The Flink KPL dependency should be updated once the KPL has been fixed.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
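
For readers unfamiliar with the thread layout described in this issue (one process-monitor thread plus two output-monitor threads), here is a minimal Java sketch of one way to keep output monitors from outliving their supervisor. It is not the KPL's code and not the fix proposed in the pull request. Every name in it (`OutputMonitorSketch`, `LineCollector`, `superviseAndCountSurvivors`) is hypothetical, and in-memory streams stand in for the native process's stdout and stderr. The sketch assumes two things: each reader returns as soon as it sees end-of-stream rather than polling again, and the supervisor stops and joins the readers in a `finally` block so that cleanup runs even when the monitoring step fails.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only: none of these names come from the KPL or Flink.
final class OutputMonitorSketch {

    /** Collects lines from one output stream and exits at end-of-stream. */
    static final class LineCollector implements Runnable {
        private final InputStream in;
        final List<String> lines = new CopyOnWriteArrayList<>();
        private volatile boolean stopRequested = false;

        LineCollector(InputStream in) {
            this.in = in;
        }

        void requestStop() {
            stopRequested = true;
        }

        @Override
        public void run() {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                while (!stopRequested) {
                    String line = reader.readLine();
                    if (line == null) {
                        return; // end-of-stream: leave instead of polling again
                    }
                    lines.add(line);
                }
            } catch (IOException e) {
                // A stream that fails or is closed during shutdown ends this thread too.
            }
        }
    }

    /** Waits briefly for a thread to finish, preserving the interrupt flag. */
    private static void joinQuietly(Thread t) {
        try {
            t.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Runs processMonitor while two collector threads are alive and returns how
     * many of them are still running afterwards, even if processMonitor throws.
     */
    static int superviseAndCountSurvivors(InputStream stdout, InputStream stderr,
                                          Runnable processMonitor) {
        LineCollector out = new LineCollector(stdout);
        LineCollector err = new LineCollector(stderr);
        Thread outThread = new Thread(out, "stdout-monitor");
        Thread errThread = new Thread(err, "stderr-monitor");
        outThread.start();
        errThread.start();
        try {
            processMonitor.run(); // stands in for waiting on the native process
        } catch (RuntimeException e) {
            // The monitor died unexpectedly; the cleanup below still runs.
        } finally {
            out.requestStop();
            err.requestStop();
            joinQuietly(outThread);
            joinQuietly(errThread);
        }
        int survivors = 0;
        if (outThread.isAlive()) survivors++;
        if (errThread.isAlive()) survivors++;
        return survivors;
    }

    public static void main(String[] args) {
        InputStream stdout =
                new ByteArrayInputStream("started\n".getBytes(StandardCharsets.UTF_8));
        InputStream stderr = new ByteArrayInputStream(new byte[0]);
        int survivors = superviseAndCountSurvivors(stdout, stderr, () -> {
            throw new IllegalStateException("simulated monitor failure");
        });
        System.out.println("monitor threads still alive: " + survivors);
    }
}
```

With a real child process, a blocked `readLine()` would typically only return once the process has exited or been destroyed and its streams have closed, so a supervisor along these lines would also need to terminate the process before joining the readers.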