Use a StreamingQueryListener that tracks repeated progress events for the same batch id. If a configured amount of time elapses while progress events keep reporting the same batch id, the source is not providing new offsets and stream execution is not scheduling new micro-batches. See also: spark.sql.streaming.pollingDelay. Alternative approaches may produce less than desirable results due to the specific characteristics of a given source / sink / workflow. It may also be preferable to express the threshold as a number of repeated progress events rather than as elapsed time, to be more forgiving of implementation details (e.g., the Kafka source internally retries to determine the latest offsets and sleeps between attempts when no new data is available).
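For illustration, here is a minimal Scala sketch of such a listener. The class name IdleStreamStopper, the maxIdleEvents threshold, and the choice to call stop() directly from the callback are assumptions of this sketch, not a prescribed implementation; a production version might instead signal a separate monitor thread.

import java.util.UUID
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Stops a query after `maxIdleEvents` consecutive progress events
// that report the same batch id (i.e., no new micro-batches scheduled).
class IdleStreamStopper(spark: SparkSession, maxIdleEvents: Int)
    extends StreamingQueryListener {

  // queryId -> (last seen batchId, consecutive repeats)
  private val idle = scala.collection.mutable.Map.empty[UUID, (Long, Int)]

  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    val (lastBatch, repeats) = idle.getOrElse(p.id, (-1L, 0))
    val newRepeats = if (p.batchId == lastBatch) repeats + 1 else 0
    idle(p.id) = (p.batchId, newRepeats)
    if (newRepeats >= maxIdleEvents) {
      // No new offsets for a while: stop the query (a sketch; stopping
      // from the listener callback briefly blocks the listener bus).
      Option(spark.streams.get(p.id)).foreach(_.stop())
    }
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
    idle.remove(event.id)
}

// Registration, before starting the query:
// spark.streams.addListener(new IdleStreamStopper(spark, maxIdleEvents = 5))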
-Chris

________________________________
From: Aakash Basu <aakash.spark....@gmail.com>
Sent: Thursday, March 22, 2018 10:45:38 PM
To: user
Subject: Structured Streaming Spark 2.3 Query

Hi,

What is the way to stop a Spark Streaming job if there is no data inflow for an arbitrary amount of time (e.g., 2 mins)?

Thanks,
Aakash.