Hi All,

I am using Structured Streaming in Databricks with the foreachBatch sink to run my transformations and actions, and finally write the data into a Delta table. My data source is either Event Hubs, a Delta table, or the Azure Cosmos DB change feed.
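
To give context, here is a minimal sketch of the kind of pipeline described above (reading a stream, transforming it inside foreachBatch, appending to a Delta table). The table names, checkpoint path, and transformation are placeholders, and "spark" is the session Databricks provides (recreated here only so the snippet is self-contained):

from pyspark.sql import SparkSession, DataFrame

spark = SparkSession.builder.getOrCreate()  # already defined in a Databricks notebook

def process_batch(batch_df: DataFrame, batch_id: int) -> None:
    # placeholder for the real transformations / business logic
    transformed = batch_df
    # write the micro-batch into the target Delta table (hypothetical name)
    (transformed.write
        .format("delta")
        .mode("append")
        .saveAsTable("target_delta_table"))

# reading change data from a Delta source; Event Hubs or the Cosmos DB change feed
# would use their own connectors and options instead
stream_df = (spark.readStream
    .format("delta")
    .option("readChangeFeed", "true")   # only applies if CDF is enabled on the source table
    .table("source_delta_table"))       # hypothetical source table name

(stream_df.writeStream
    .foreachBatch(process_batch)
    .option("checkpointLocation", "/mnt/checkpoints/example")  # placeholder path
    .start())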

Whenever there is a large volume of changes in the source (Delta table, Azure SQL, Cosmos DB), all of that data lands in a single micro-batch, and with our existing cluster we cannot process that much data at once.

So we need to chunk the change data into smaller micro-batches when reading from the source (Delta table, Azure SQL, Azure Cosmos DB), i.e. set a limit on how much data each foreachBatch micro-batch receives. A sketch of the kind of knob we have in mind is below.
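
For the Delta-source case, the Delta streaming reader documents maxFilesPerTrigger / maxBytesPerTrigger for capping micro-batch size, which is the sort of limit we mean; we would need the equivalent for the Event Hubs and Cosmos DB connectors as well (option names there are a guess on our part, not confirmed). A rough sketch:

# Capping micro-batch size when the source is a Delta table.
# maxFilesPerTrigger / maxBytesPerTrigger are Delta streaming source options;
# the Event Hubs and Cosmos connectors presumably have their own per-trigger
# limits, which is what we are trying to confirm.
stream_df = (spark.readStream
    .format("delta")
    .option("maxFilesPerTrigger", 50)     # cap on new files picked up per micro-batch
    .option("maxBytesPerTrigger", "1g")   # soft cap on bytes per micro-batch
    .table("source_delta_table"))         # hypothetical source table name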

Could you please suggest a way to achieve this?

Thanks,
