That 1073.3 MiB isn't much bigger than spark.driver.maxResultSize;
can't you just raise that config to a larger value?
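For context, the default spark.driver.maxResultSize is 1g (1024 MiB), so a 1073.3 MiB result only just exceeds it. A back-of-the-envelope check in plain Python; the 2g replacement value is only an illustration, not a recommendation from this thread:

```python
# Rough check of why a 1073.3 MiB result trips the default
# spark.driver.maxResultSize of 1g (1024 MiB). Values are illustrative.

UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

def size_to_bytes(s: str) -> int:
    """Parse a Spark-style size string like '1g' or '2048m' into bytes."""
    s = s.strip().lower()
    return int(float(s[:-1]) * UNITS[s[-1]])

default_limit = size_to_bytes("1g")        # 1073741824 bytes
result_size = int(1073.3 * 1024**2)        # the 1073.3 MiB from the error

print(result_size > default_limit)         # the result just exceeds 1g
print(result_size <= size_to_bytes("2g"))  # a 2g limit would accommodate it

# In PySpark the limit would be raised when building the session, e.g.:
#   SparkSession.builder.config("spark.driver.maxResultSize", "2g")
```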
/ Wei
Anastasiia Sokhova wrote on Wed, Apr 16, 2025 at 03:37:
> Dear Spark Community,
>
>
>
> I run a Structured Streaming query to read JSON files from S3 into an
> Iceberg table
Hi Anastasiia,
Thanks for the email. I think you can tweak the Spark config
spark.connect.session.manager.defaultSessionTimeout, which is defined here:
https://github.com/apache/spark/blob/343471dac4b96b43a09763d759b6c30760fb626e/sql/connect/server/src/main/scala/org/apache/spark/sql/connect/
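For completeness, a sketch of how such a config could be passed when starting the Spark Connect server. The 2h value is only an example; whether and how this particular config takes effect should be verified against the linked source file:

```shell
# Example only: pass a longer Connect session timeout when starting the
# server. The 2h value is illustrative, not a recommendation.
./sbin/start-connect-server.sh \
  --conf spark.connect.session.manager.defaultSessionTimeout=2h
```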
Sorry, this is not a bug but essentially a user error. Spark throws a really
confusing error, and it confused me too. Please see the reply in the ticket
for how to make things correct.
https://issues.apache.org/jira/browse/SPARK-47718
刘唯 wrote on Sat, Apr 6, 2024 at 11:41:
> This indeed looks like a bug
This indeed looks like a bug. I will take some time to look into it.
Mich Talebzadeh wrote on Wed, Apr 3, 2024 at 01:55:
>
> Hmm, you are getting the below:
>
> AnalysisException: Append output mode not supported when there are
> streaming aggregations on streaming DataFrames/DataSets without watermark;
>
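For readers hitting the same AnalysisException: in append output mode, a streaming aggregation needs a watermark so Spark knows when a window is final and can be emitted. A minimal sketch, assuming an event-time column; the source, column names, threshold, and window size are all illustrative, not taken from the original thread:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import window, count

spark = SparkSession.builder.appName("watermark-sketch").getOrCreate()

# Stand-in streaming source; the built-in "rate" source emits a
# "timestamp" column we can use as event time.
events = spark.readStream.format("rate").load()

# Without withWatermark(), this aggregation in append mode raises the
# AnalysisException quoted above. The watermark bounds how late data may
# arrive, so Spark can eventually finalize and append each window.
counts = (
    events
    .withWatermark("timestamp", "10 minutes")
    .groupBy(window("timestamp", "5 minutes"))
    .agg(count("*").alias("n"))
)

query = counts.writeStream.outputMode("append").format("console").start()
```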
Correction to my earlier email: *now -> not ("Camel case, not snake case").
刘唯 wrote on Sun, Mar 10, 2024 at 22:04:
> Have you tried using microbatch_data.get("processedRowsPerSecond")?
> Camel case now snake case
>
> Mich Talebzadeh wrote on Sun, Mar 10, 2024 at 11:46:
>
>>
>> There is a paper from Databricks on this subject
>>
>>
Have you tried using microbatch_data.get("processedRowsPerSecond")?
Camel case now snake case
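To illustrate the naming point: the keys in a streaming-query progress payload are camelCase, so a snake_case lookup with dict.get silently returns None instead of failing. A plain-Python illustration using a made-up progress dict; the numeric value is invented:

```python
# Illustrative progress payload; the key mirrors the camelCase names that
# StreamingQueryProgress exposes. The numeric value here is made up.
microbatch_data = {"processedRowsPerSecond": 1234.5}

# A snake_case lookup silently misses the key and returns None.
print(microbatch_data.get("processed_rows_per_second"))

# The camelCase key is the one that actually exists.
print(microbatch_data.get("processedRowsPerSecond"))
```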
Mich Talebzadeh wrote on Sun, Mar 10, 2024 at 11:46:
>
> There is a paper from Databricks on this subject
>
>
> https://www.databricks.com/blog/2022/05/27/how-to-monitor-streaming-queries-in-pyspark.html
>
> But havi