Zeyu Chen created SPARK-51358:
---------------------------------

             Summary: Introduce snapshot upload lag detection through StateStoreCoordinator
                 Key: SPARK-51358
                 URL: https://issues.apache.org/jira/browse/SPARK-51358
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0, 4.1
            Reporter: Zeyu Chen


As a first step toward increasing visibility into snapshot upload lag, we 
want to add a snapshot lag alerting system. Using the state store coordinator, 
we want to publish driver logs that warn when specific state store instances 
are falling behind.
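
A rough sketch of the idea, written in Python for brevity. It is not the 
proposed implementation and not Spark's API. Every name here (StoreId, 
SnapshotLagTracker, report_upload, max_version_lag) is hypothetical, and 
measuring lag as a version gap against a fixed threshold is an assumption; the 
actual criteria are not specified in this ticket.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("snapshot_lag_sketch")


@dataclass(frozen=True)
class StoreId:
    """Hypothetical identifier for one state store instance."""
    operator_id: int
    partition_id: int
    store_name: str = "default"


@dataclass
class SnapshotLagTracker:
    """Coordinator-side bookkeeping, held on the driver in this sketch."""
    # Assumed threshold: a store is lagging when its latest uploaded
    # snapshot is more than this many versions behind the current version.
    max_version_lag: int = 5
    latest_uploaded: dict = field(default_factory=dict)

    def report_upload(self, store: StoreId, version: int) -> None:
        # Keep only the newest version seen; late or duplicate reports are ignored.
        previous = self.latest_uploaded.get(store, -1)
        self.latest_uploaded[store] = max(previous, version)

    def lagging_stores(self, current_version: int) -> list:
        # Only stores that have reported at least once are considered here.
        lagging = [
            store
            for store, version in self.latest_uploaded.items()
            if current_version - version > self.max_version_lag
        ]
        return sorted(
            lagging,
            key=lambda s: (s.operator_id, s.partition_id, s.store_name),
        )

    def log_lagging(self, current_version: int) -> list:
        # Emit one warning per lagging instance so it can be alerted on.
        lagging = self.lagging_stores(current_version)
        for store in lagging:
            logger.warning(
                "State store %s is lagging: last snapshot at version %d, "
                "current version %d",
                store,
                self.latest_uploaded[store],
                current_version,
            )
        return lagging
```

In this sketch each state store instance would call report_upload after a 
snapshot upload completes, and the driver would call log_lagging at whatever 
point it knows the current version, for example after a batch commits.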

These logs can feed dashboards and alerts, helping us understand how often, 
and in what patterns, lag occurs in production. The collected data will also 
inform future remediation strategies.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

