Zeyu Chen created SPARK-51358:
---------------------------------

             Summary: Introduce snapshot upload lag detection through StateStoreCoordinator
                 Key: SPARK-51358
                 URL: https://issues.apache.org/jira/browse/SPARK-51358
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0, 4.1
            Reporter: Zeyu Chen

As a first step toward better visibility into snapshot upload lag, we want to add a snapshot lag alerting system. Using the state store coordinator, we want to emit driver log warnings for specific state store instances that fall behind on snapshot uploads. These logs can feed dashboards and alerts, helping us understand the patterns and frequency of lag in production. The collected data will also inform future remediation strategies.
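
To make the idea concrete, here is a minimal sketch of the kind of check a coordinator could run. It is not Spark's StateStoreCoordinator API: the class `SnapshotLagTracker`, its methods, and the `max_version_lag` threshold are all hypothetical names chosen for illustration, and lag is assumed to be measured in state store versions. Stores that have never reported an upload are not covered by this sketch.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("snapshot_lag_sketch")


@dataclass
class SnapshotLagTracker:
    """Hypothetical driver-side tracker; not Spark's StateStoreCoordinator."""

    # Assumed threshold: how many versions a store may trail before we warn.
    max_version_lag: int
    # Newest snapshot version reported by each state store instance.
    latest_uploaded: dict = field(default_factory=dict)

    def report_upload(self, store_id: str, version: int) -> None:
        # Keep only the newest version per store; ignore stale reports.
        current = self.latest_uploaded.get(store_id, -1)
        self.latest_uploaded[store_id] = max(current, version)

    def lagging_stores(self, latest_committed_version: int) -> list:
        # A store lags when its newest snapshot trails the latest committed
        # version by more than max_version_lag.
        lagging = sorted(
            store_id
            for store_id, version in self.latest_uploaded.items()
            if latest_committed_version - version > self.max_version_lag
        )
        for store_id in lagging:
            logger.warning(
                "State store %s is lagging: latest snapshot version %d, "
                "latest committed version %d",
                store_id,
                self.latest_uploaded[store_id],
                latest_committed_version,
            )
        return lagging
```

In this sketch each state store instance would call `report_upload` after a snapshot upload, and the driver would call `lagging_stores` periodically (for example, once per batch) so that the warnings land in the driver log where dashboards and alerts can pick them up.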