ayushi-agarwal opened a new issue, #10214:
URL: https://github.com/apache/incubator-gluten/issues/10214

   ### Description
   
   Hi @marin-ma!
   In one of our workloads we are seeing high deserialization time in the 
columnar exchange node when shuffle data is read. This is hash-based shuffle. 
The number of shuffle partitions was initially set to 1920; we tried decreasing 
it to 1000 and then to 500, which reduces this time, but it is still 
significant. What I observed is that the number of batches becomes very large 
after shuffle read and the number of rows per batch is very low. It improves as 
we decrease shuffle partitions, but if we decrease them too much, we cannot 
utilize the parallelism for other operators in the stage after shuffle read, 
such as sort, window, join, and project with UDF. I would like to get your 
input on this issue.
   
   This is the screenshot with 1000 partitions.
   <img width="456" height="984" alt="Image" 
src="https://github.com/user-attachments/assets/f9ec4eab-9007-4dab-8a25-5be2c9efb2ef"
 />
   
   This is the screenshot with 500 partitions.
   
   <img width="456" height="984" alt="Image" 
src="https://github.com/user-attachments/assets/1a5634ae-e43d-42c4-8177-87ba1488156f"
 />
   
   The number of batches dropped from 148 million to 74 million, but it is 
still much higher than the number of input batches, which is 1.6 million.
   I also tried sort-based shuffle but didn't see any improvement.
   Could you give some pointers that would help improve its performance?
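
To put the numbers above in perspective, here is a rough sketch of the arithmetic in Python. The batch counts are the approximate figures quoted above; the variable and function names are only illustrative and do not correspond to any Gluten or Spark API.

```python
# Approximate batch counts quoted in this issue.
INPUT_BATCHES = 1_600_000

# shuffle partitions -> number of batches observed after shuffle read
OUTPUT_BATCHES = {1000: 148_000_000, 500: 74_000_000}


def amplification(output_batches: int, input_batches: int) -> float:
    """How many output batches are produced per input batch."""
    return output_batches / input_batches


def batches_per_partition(output_batches: int, partitions: int) -> float:
    """Average number of batches each shuffle partition has to deserialize."""
    return output_batches / partitions


for partitions, batches in OUTPUT_BATCHES.items():
    print(
        f"partitions={partitions} "
        f"amplification={amplification(batches, INPUT_BATCHES):.2f}x "
        f"batches_per_partition={batches_per_partition(batches, partitions):.0f}"
    )
```

The number of batches per partition comes out the same in both runs, so the total batch count appears to scale linearly with the number of shuffle partitions. That would be consistent with the input being split into one small batch per target partition, but this is only my reading of the metrics, not something I have confirmed in the code.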
   
   ### Gluten version
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]