Hello! This is my first time using a mailing list, so I apologize if I’m not following any conventions (please correct me where I go wrong). I’d like to ask about Spark’s `StreamingQueryListener`.
I currently have a Spark Structured Streaming job running in a cluster with a trigger interval of 10 minutes. I want to periodically run maintenance jobs (OPTIMIZE, DELETE, VACUUM) in the same cluster to save on compute resources. Ideally, these jobs should not run concurrently with each other or while the streaming job is processing data. My plan is to implement a `StreamingQueryListener` that detects when any streaming query is actively processing and delays the maintenance jobs until it is not. From testing, I see that `onQueryIdle` does not fire while a query is waiting for its next trigger interval. Before diving into the Apache Spark code, I wanted to get thoughts on whether it’s worth adding a new `StreamingQueryListener` callback (something like `onQueryWait`) that reports when a streaming query is waiting for its next trigger. Thoughts? Is this too naive?
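To make the idea concrete, here is a minimal sketch of the gating logic I have in mind, using only the callbacks that already exist. It is plain Python with no Spark imports. `MaintenanceGate` and its method names are hypothetical, and the assumption that the next micro-batch starts roughly one trigger interval after the previous one started is mine, not something I have confirmed in the Spark source. The methods would be called from a `StreamingQueryListener` subclass: `query_started` from `onQueryStarted`, `batch_finished` from `onQueryProgress`, and `query_terminated` from `onQueryTerminated`.

```python
import threading
import time


class MaintenanceGate:
    """Hypothetical helper: estimates how long until the next micro-batch starts."""

    def __init__(self, trigger_interval_s, safety_margin_s=60.0):
        self._interval = trigger_interval_s
        self._margin = safety_margin_s
        self._lock = threading.Lock()
        # query id -> estimated start time of its next batch (epoch seconds).
        # 0.0 means "assume busy until a batch is reported as finished".
        self._next_trigger = {}

    def query_started(self, query_id):
        # A newly started query is treated as busy until its first batch ends.
        with self._lock:
            self._next_trigger[query_id] = 0.0

    def batch_finished(self, query_id, batch_start_s):
        # ASSUMPTION: the next batch starts one trigger interval after this
        # one started. batch_start_s would be derived from the progress
        # event's timestamp, converted to epoch seconds by the caller.
        with self._lock:
            self._next_trigger[query_id] = batch_start_s + self._interval

    def query_terminated(self, query_id):
        with self._lock:
            self._next_trigger.pop(query_id, None)

    def seconds_available(self, now_s=None):
        """Seconds of quiet time left before any tracked query is expected to run."""
        now_s = time.time() if now_s is None else now_s
        with self._lock:
            if not self._next_trigger:
                return float("inf")  # no streaming queries are being tracked
            earliest = min(self._next_trigger.values())
        # Once the estimated trigger time has passed this returns 0.0, and it
        # stays there until the next batch_finished call reopens the window.
        return max(0.0, earliest - self._margin - now_s)
```

A maintenance runner would then start OPTIMIZE, DELETE, or VACUUM only when `gate.seconds_available()` exceeds the expected runtime of that job. This only avoids starting maintenance too close to a trigger; it cannot stop a batch from starting while maintenance is still running, and in this sketch a trigger with no new data never reopens the window. Those gaps are why a dedicated callback still seems worth discussing.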