[jira] [Commented] (FLINK-36939) High CPU Utilization with Flink Kinesis EFO Consumer

Keith Lee (Jira) Sun, 13 Apr 2025 11:39:28 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-36939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17943940#comment-17943940
 ]


Keith Lee commented on FLINK-36939:
-----------------------------------

Refactored the changes for https://issues.apache.org/jira/browse/FLINK-36947,
modifying KinesisShardSplitReaderBase so that both issues are addressed: the
high CPU utilisation in EFO mode reported here, and the GetRecords throttling
in polling mode.

See PR: https://github.com/apache/flink-connector-aws/pull/195

> High CPU Utilization with Flink Kinesis EFO Consumer
> ----------------------------------------------------
>
>                 Key: FLINK-36939
>                 URL: https://issues.apache.org/jira/browse/FLINK-36939
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kinesis
>    Affects Versions: 1.20.0, aws-connector-5.0.0
>            Reporter: Keith Lee
>            Priority: Major
>         Attachments: Main.kt, Screenshot 1734584639640.png, Screenshot 
> 1734584781285.png, image-2025-01-10-12-43-29-262.png, 
> image-2025-01-10-12-44-48-869.png, image-2025-01-10-12-51-04-104.png, 
> image-2025-01-10-12-51-36-141.png, image.png
>
>
> Observation: When EFO is enabled, the CPU usage spikes and stays elevated, 
> regardless of record volume. If we switch back to the standard polling 
> consumer (disabling EFO), CPU utilization returns to normal levels.
>
> Profiling Results: Local profiling and flamegraphs suggest the connector may 
> be engaged in a busy-wait loop, continuously parking and un-parking threads 
> even when no data is available. This behavior consumes CPU cycles 
> unnecessarily.
>
> Performance Impact: While the job still processes records correctly when they 
> arrive, the high baseline CPU consumption is concerning. It wastes resources 
> and triggers unnecessary scaling, which doesn’t resolve the issue since new 
> instances also experience the same CPU pattern.
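
As a rough illustration of the busy-wait pattern described in the quoted
report, the sketch below contrasts a tight non-blocking poll loop with a
bounded blocking wait on an empty queue. It is not the connector's code and
does not reflect what the PR changes: IdleWaitSketch, countSpinWakeups and
countBlockingWakeups are hypothetical names, and a plain java.util.concurrent
queue stands in for the stream of EFO events.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not taken from flink-connector-aws.
final class IdleWaitSketch {

    // Tight loop around the non-blocking poll(): the thread never sleeps,
    // so an empty queue is re-checked as fast as the CPU allows.
    static long countSpinWakeups(BlockingQueue<String> queue, long windowMillis) {
        long start = System.nanoTime();
        long windowNanos = TimeUnit.MILLISECONDS.toNanos(windowMillis);
        long emptyWakeups = 0;
        while (System.nanoTime() - start < windowNanos) {
            if (queue.poll() == null) {
                emptyWakeups++;
            }
        }
        return emptyWakeups;
    }

    // Same loop, but poll(timeout, unit) parks the thread until an element
    // arrives or waitMillis elapses, so an idle queue costs few wakeups.
    static long countBlockingWakeups(BlockingQueue<String> queue, long windowMillis, long waitMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        long windowNanos = TimeUnit.MILLISECONDS.toNanos(windowMillis);
        long emptyWakeups = 0;
        while (System.nanoTime() - start < windowNanos) {
            if (queue.poll(waitMillis, TimeUnit.MILLISECONDS) == null) {
                emptyWakeups++;
            }
        }
        return emptyWakeups;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> idle = new LinkedBlockingQueue<>();
        System.out.println("spin wakeups:     " + countSpinWakeups(idle, 200));
        System.out.println("blocking wakeups: " + countBlockingWakeups(idle, 200, 50));
    }
}
```

On an idle queue the first counter typically grows far faster than the
second, since the blocking variant parks the thread for up to waitMillis per
iteration instead of spinning.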



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
