In my Spark stateful Structured Streaming job, I am setting
'maxOffsetsPerTrigger' to 10 million. I've noticed that messages are
processed faster when I use a large value for this property.
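
For reference, here is a minimal sketch of how the option is passed to the
Kafka source. The broker address, topic name, and helper function names are
placeholders I made up for illustration, not the actual job's values.

```python
# Sketch only: broker address and topic name below are hypothetical.

def kafka_source_options(bootstrap_servers, topic, max_offsets_per_trigger):
    """Build the option map for a Structured Streaming Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Upper bound on the total number of offsets read per micro-batch,
        # split across the topic's partitions.
        "maxOffsetsPerTrigger": str(max_offsets_per_trigger),
    }


def build_stream(spark, options):
    """Create the streaming DataFrame from an existing SparkSession.

    Needs pyspark and the spark-sql-kafka connector available at runtime.
    """
    return spark.readStream.format("kafka").options(**options).load()


opts = kafka_source_options("localhost:9092", "input-topic", 10_000_000)
```

With a value this large, a single micro-batch can cover up to 10 million
offsets, so each batch may take a long time to complete.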

I am also noticing that no messages are written to the output Kafka topic
until the batch is completely processed. The state timeout is set to 10
minutes, so I expect to see at least some of the messages after 10 minutes
or so, but nothing is written until processing of the next batch starts.

Is there any property I can use to 'flush' the messages that are ready to
be written? Please let me know. Thanks.
