[
https://issues.apache.org/jira/browse/FLINK-31557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LiuZeshan updated FLINK-31557:
------------------------------
Description:
Currently, the metric viewUpdater and the reporterTask share the same
SingleThreadScheduledExecutor. Custom reporters may contain unpredictable
logic, such as unreasonable network timeout settings, which can delay the
viewUpdater and distort its calculation of PerSecond-related metrics. For
example, in a real production problem we encountered, the reporter's network
timeout was set to 10 seconds and the reporting interval to 15 seconds. When
the server was unavailable, the thread was blocked for 10s, resulting in
PerSecond-related metrics that were 66.7% (5/3x) higher.
Could this be improved, for example by switching to a ScheduledThreadPool
executor?
was:
Currently, the metric viewUpdater and the reporterTask share the same
SingleThreadScheduledExecutor. Custom reporters may contain unpredictable
logic, such as unreasonable network timeout settings, which can delay the
viewUpdater and distort its calculation of PerSecond-related metrics. For
example, in a real production problem we encountered, the reporter's network
timeout was set to 10 seconds and the reporting interval to 15 seconds. When
the server was unavailable, the thread was blocked for 10s, resulting in
PerSecond-related metrics that were 40% higher.
Could this be improved, for example by switching to a ScheduledThreadPool
executor?
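
One way to arrive at the revised 66.7% (5/3x) figure is sketched below as a small virtual-clock simulation. It rests on assumptions the issue does not state: that both tasks are scheduled with a fixed delay, that the viewUpdater's nominal interval is 5 seconds, and that a PerSecond metric divides the count accumulated between two view updates by that nominal interval. The class and method names are hypothetical and are not Flink code.

```java
final class PerSecondInflationSketch {

    /**
     * Simulates ONE scheduler thread running two fixed-delay tasks on a virtual clock.
     * Returns (average real gap between view updates) / (nominal gap), i.e. the factor
     * by which a rate computed against the nominal gap would be overstated.
     * All parameters are illustrative assumptions, in milliseconds.
     */
    static double inflationFactor(long blockMs, long reportDelayMs, long viewDelayMs, long horizonMs) {
        long now = 0;                    // time at which the single thread becomes free
        long nextReport = reportDelayMs; // next due time of the reporter task
        long nextView = viewDelayMs;     // next due time of the view updater task
        long firstView = -1;
        long lastView = -1;
        int views = 0;
        while (true) {
            // Modelling choice: on a tie the reporter runs first.
            boolean reporterRuns = nextReport <= nextView;
            // A task starts when it is due AND the thread is free.
            long start = Math.max(now, Math.min(nextReport, nextView));
            if (start > horizonMs) {
                break;
            }
            if (reporterRuns) {
                now = start + blockMs;            // thread is blocked for the network timeout
                nextReport = now + reportDelayMs; // fixed delay counts from the END of the run
            } else {
                if (views == 0) {
                    firstView = start;
                }
                lastView = start;
                views++;
                now = start;                      // the view update itself is modelled as instantaneous
                nextView = now + viewDelayMs;
            }
        }
        // Assumes the horizon is long enough for at least two view updates.
        double averageGapMs = (double) (lastView - firstView) / (views - 1);
        return averageGapMs / viewDelayMs;
    }

    public static void main(String[] args) {
        // 10 s timeout, 15 s reporting delay, assumed 5 s view interval, 10,000 s horizon.
        double shared = inflationFactor(10_000, 15_000, 5_000, 10_000_000);
        // Same schedule, but the reporter returns immediately.
        double unblocked = inflationFactor(0, 15_000, 5_000, 10_000_000);
        System.out.printf("reporter blocks 10 s:         %.2fx%n", shared);
        System.out.printf("reporter returns immediately: %.2fx%n", unblocked);
    }
}
```

Under these assumptions each reporter cycle lasts 10 s + 15 s = 25 s and leaves room for three view updates, so the average gap is 25/3 s against a nominal 5 s, which is a factor of 5/3. A different tie-break or scheduling mode would give a different number, so this is one plausible reading of the figure rather than a confirmed derivation.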
> Metric viewUpdater and reporter task in a SingleThreadScheduledExecutor lead
> to inaccurate PerSecond related metrics
> --------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-31557
> URL: https://issues.apache.org/jira/browse/FLINK-31557
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Metrics
> Reporter: LiuZeshan
> Priority: Minor
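
The effect of the suggested change can be illustrated with plain `java.util.concurrent` executors. The following is a minimal sketch, not the Flink implementation: the two tasks are hypothetical stand-ins, and the timings are the issue's values scaled down by a factor of 100 (50 ms view interval, 150 ms reporting delay, 100 ms blocking) so that it runs quickly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SeparateThreadsSketch {

    /** Sleeps without propagating the checked exception; restores the interrupt flag. */
    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Runs a stand-in view updater and a blocking stand-in reporter on one executor
     * with the given number of threads, and returns how many view updates ran.
     */
    static int countViewUpdates(int threads, long runMillis) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(threads);
        AtomicInteger viewUpdates = new AtomicInteger();
        try {
            // Stand-in for the viewUpdater: a cheap task every 50 ms.
            executor.scheduleWithFixedDelay(
                    viewUpdates::incrementAndGet, 50, 50, TimeUnit.MILLISECONDS);
            // Stand-in for a reporter whose network call blocks for 100 ms.
            executor.scheduleWithFixedDelay(
                    () -> sleepQuietly(100), 150, 150, TimeUnit.MILLISECONDS);
            sleepQuietly(runMillis);
        } finally {
            executor.shutdownNow();
        }
        return viewUpdates.get();
    }

    public static void main(String[] args) {
        int single = countViewUpdates(1, 1500);
        int pooled = countViewUpdates(2, 1500);
        System.out.println("view updates with 1 thread:  " + single);
        System.out.println("view updates with 2 threads: " + pooled);
    }
}
```

With one thread the view updater should run noticeably less often than its nominal cadence, while with two threads it should stay close to it. A pool of two only helps while at most one task blocks at a time; giving the viewUpdater its own dedicated executor would be another option.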