date:20210704

Re: Spark AQE Post-Shuffle partitions coalesce don't work as expected, and even make data skew in some partitions. Need help to debug issue.

2021-07-04 Thread Mich Talebzadeh

Hi Nick, I looked at both this thread and your SO question. I am trying to understand: 1. You are reading from Kafka via Spark Structured Streaming. 2. Your messages from Kafka are not uniform, meaning you may get a variable record size in each window. 3. How are you processing these mes

Spark AQE Post-Shuffle partitions coalesce don't work as expected, and even make data skew in some partitions. Need help to debug issue.

2021-07-04 Thread Nick Grigoriev

I have asked this question on Stack Overflow, but it looks too complex for a Q/A resource. https://stackoverflow.com/questions/68236323/spark-aqe-post-shuffle-partitions-coalesce-dont-work-as-expected-and-even-make So I want to ask for help here. I use a global sort on my Spark DF, and when I enable AQE and