Re: Decompress Gzip files from EventHub with Structured Streaming

2022-03-08 Thread ayan guha
Hi, IMHO this is not the best use of Spark. I would suggest using a simple Azure Function to unzip. Is there any specific reason to use gzip over Event Hub? If you can wait 10-20 sec to process, you can use Event Hubs Capture to write the data to storage and then process it. It all depends on compute

Decompress Gzip files from EventHub with Structured Streaming

2022-03-08 Thread Data Guy
Hi everyone, Context: I have events coming into Databricks from an Azure Event Hub in Gzip-compressed format. Currently, I extract the files with a UDF and write the unzipped data to the silver layer of my Delta Lake with .write. Note that even though data comes in continuously, I do not use
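
For readers following the thread, here is a minimal sketch of what the decompression step inside such a UDF might look like. It is not the poster's actual code: the function name `decompress_event_body` is hypothetical, and it assumes each event body is a single gzip stream containing UTF-8 text. Only the Python standard library is used, so the same function could sit inside a Spark UDF or an Azure Function.

```python
import gzip
import zlib
from typing import Optional


def decompress_event_body(body: Optional[bytes]) -> Optional[str]:
    """Decompress one gzip-compressed event payload into text.

    Hypothetical helper: assumes the payload is a single gzip stream
    holding UTF-8 text. Returns None for missing or undecodable input
    so that one bad event does not fail the whole batch.
    """
    if body is None:
        return None
    try:
        return gzip.decompress(body).decode("utf-8")
    except (OSError, EOFError, zlib.error, UnicodeDecodeError):
        # OSError covers "not a gzip file"; EOFError covers truncated streams.
        return None


# Local round trip with a made-up payload (no Spark needed).
sample = gzip.compress('{"id": 1, "status": "ok"}'.encode("utf-8"))
decoded = decompress_event_body(sample)

# In PySpark the function could be wrapped roughly like this
# (assumption: the binary payload column is named "body"):
#   from pyspark.sql.functions import udf, col
#   from pyspark.sql.types import StringType
#   decompress_udf = udf(decompress_event_body, StringType())
#   df = df.withColumn("json", decompress_udf(col("body")))
```

Returning None instead of raising is a design choice for this sketch; whether to drop, log, or quarantine undecodable events depends on the pipeline.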