Re: DataStreamReader cleanSource option

2022-02-03 Thread Jungtaek Lim
Hi, could you please set the config "spark.sql.streaming.fileSource.cleaner.numThreads" to 0 and see whether it works? (NOTE: this will slow down your process since the cleaning phase will happen in the foreground. The default is background with 1 thread. You can try out more than 1 thread.) If it does …
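A minimal sketch of what that suggestion looks like in PySpark (only the config key and the value 0 come from the message; the app name is a placeholder):

# Sketch: set the file-source cleaner thread count to 0 so that the cleanup of
# processed source files runs in the foreground of each micro-batch instead of
# on a background thread pool (app name below is illustrative).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("clean-source-foreground")
    .config("spark.sql.streaming.fileSource.cleaner.numThreads", "0")
    .getOrCreate()
)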

Re: DataStreamReader cleanSource option

2022-01-27 Thread Mich Talebzadeh
Hi Gabriela, I don't know about the data lake, but this is about Spark Structured Streaming. Are both readStream and writeStream working OK? For example, can you do df.printSchema() after the read? It is advisable to wrap the logic inside try. This is an example of wrapping it: data_path = "file:// …
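The example above is cut off; a minimal sketch of the kind of try wrapping and printSchema sanity check Mich describes, with the path, format, and schema as placeholders rather than his actual code:

# Sketch: build the streaming DataFrame inside try/except and print its schema
# to confirm the read side is wired up before adding a writeStream.
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-stream-check").getOrCreate()

data_path = "file:///tmp/landing"          # placeholder source directory

try:
    df = (
        spark.readStream
        .format("csv")
        .option("header", "true")
        .schema("id INT, value STRING")    # streaming file sources need an explicit schema
        .load(data_path)
    )
    df.printSchema()                       # sanity check after the read
except Exception as e:
    print(f"Failed to create streaming DataFrame: {e}", file=sys.stderr)
    sys.exit(1)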

DataStreamReader cleanSource option

2022-01-27 Thread Gabriela Dvořáková
Hi, I am writing to ask for advice regarding the cleanSource option of the DataStreamReader. I am using PySpark with Spark 3.1 via Azure Synapse. To my knowledge, the cleanSource option was introduced in Spark version 3. I've spent a significant amount of time trying to configure this option with both …
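For context, a hedged sketch of how cleanSource is typically wired up on a file-based streaming source in PySpark (option names per the Spark Structured Streaming documentation; the format, schema, and paths are placeholders, not Gabriela's configuration):

# Sketch: cleanSource accepts "archive", "delete", or "off" (the default);
# "archive" additionally requires sourceArchiveDir to say where processed
# source files should be moved.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("clean-source-example").getOrCreate()

df = (
    spark.readStream
    .format("json")
    .schema("id INT, payload STRING")
    .option("cleanSource", "archive")                    # or "delete" / "off"
    .option("sourceArchiveDir", "file:///tmp/archived")  # required when cleanSource=archive
    .load("file:///tmp/landing")
)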