Re: [I] CometHashJoin always selects BuildRight which causes potential performance regression [datafusion-comet]

via GitHub Fri, 21 Feb 2025 06:45:28 -0800


andygrove commented on issue #1382:
URL: https://github.com/apache/datafusion-comet/issues/1382#issuecomment-2674736651


   > [@kazuyukitanimura](https://github.com/kazuyukitanimura) I am not sure, but I think the slowness comes from the CometExchange that is executed after the join with BuildLeft.
   > 
   > this is for original Comet
   > 
   > CometExchange
   > 
   > shuffle records written: 65,254,713
   > number of spills: 2,619
   > shuffle write time total (min, med, max )
   > 29.7 s (2 ms, 43 ms, 138 ms )
   > number of input batches: 100,000
   > records read: 65,254,713
   > memory pool time total (min, med, max )
   > 1.2 m (25 ms, 113 ms, 183 ms )
   > local bytes read total (min, med, max )
   > 513.1 MiB (4.3 MiB, 7.7 MiB, 9.5 MiB )
   > fetch wait time total (min, med, max )
   > 0 ms (0 ms, 0 ms, 0 ms )
   > remote bytes read total (min, med, max )
   > 3.4 GiB (37.6 MiB, 52.2 MiB, 54.1 MiB )
   > repartition time total (min, med, max )
   > 50.8 s (21 ms, 53 ms, 315 ms )
   > decoding and decompression time total (min, med, max )
   > 3.3 m (1.6 s, 2.4 s, 8.6 s )
   > local blocks read: 171,154
   > spilled bytes: 2,250,148,478,976
   > remote blocks read: 1,160,828
   > data size total (min, med, max )
   > 4.4 GiB (4.5 MiB, 6.8 MiB, 7.2 MiB )
   > native shuffle writer time total (min, med, max )
   > 4.8 m (100 ms, 341 ms, 1.5 s )
   > number of partitions: 2,000
   > encoding and compression time total (min, med, max )
   > 1.9 m (34 ms, 102 ms, 744 ms )
   > remote reqs duration total (min, med, max )
   > 1.2 m (339 ms, 639 ms, 2.2 s )
   > shuffle bytes written total (min, med, max )
   > 3.9 GiB (2.6 MiB, 6.2 MiB, 6.8 MiB )
   > 
   > and here is the metric with my change
   > 
   > ```
   > CometExchange
   > 
   > shuffle records written: 65,254,713
   > number of spills: 17,160
   > shuffle write time total (min, med, max )
   > 3.0 m (2 ms, 243 ms, 878 ms )
   > number of input batches: 11,485,874
   > records read: 65,254,713
   > memory pool time total (min, med, max )
   > 8.9 m (36 ms, 782 ms, 1.7 s )
   > local bytes read total (min, med, max )
   > 748.1 MiB (1611.3 KiB, 7.9 MiB, 9.8 MiB )
   > fetch wait time total (min, med, max )
   > 0 ms (0 ms, 0 ms, 0 ms )
   > remote bytes read total (min, med, max )
   > 4.8 GiB (12.7 MiB, 52.0 MiB, 55.9 MiB )
   > repartition time total (min, med, max )
   > 4.2 m (62 ms, 294 ms, 2.0 s )
   > decoding and decompression time total (min, med, max )
   > 18.8 m (7.4 s, 9.9 s, 36.4 s )
   > local blocks read: 174,515
   > spilled bytes: 16,134,291,652,608
   > remote blocks read: 1,157,467
   > data size total (min, med, max )
   > 5.9 GiB (6.0 MiB, 9.1 MiB, 9.6 MiB )
   > native shuffle writer time total (min, med, max )
   > 22.1 m (152 ms, 1.7 s, 6.7 s )
   > number of partitions: 2,000
   > encoding and compression time total (min, med, max )
   > 3.7 m (21 ms, 244 ms, 2.0 s )
   > remote reqs duration total (min, med, max )
   > 57.0 s (71 ms, 337 ms, 1.6 s )
   > shuffle bytes written total (min, med, max )
   > 5.6 GiB (2.1 MiB, 8.7 MiB, 9.2 MiB )
   > ```
   > 
   > It is weird that most of the metrics, including spill size and execution time, get 7-8x higher. I don't know why it happens, but I am trying to figure it out.
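
To put numbers on the comparison quoted above, here is a minimal sketch that computes the per-metric ratio between the two CometExchange runs. The values are copied from the two metric dumps; the dictionary keys and the `ratios` helper are names made up for this illustration, and only a handful of the metrics are included.

```python
# Metrics copied from the two CometExchange dumps quoted above.
# Times given in minutes in the dumps are converted to seconds here.
baseline = {
    "number of spills": 2_619,
    "spilled bytes": 2_250_148_478_976,
    "number of input batches": 100_000,
    "native shuffle writer time (s)": 4.8 * 60,
    "decoding and decompression time (s)": 3.3 * 60,
}
changed = {
    "number of spills": 17_160,
    "spilled bytes": 16_134_291_652_608,
    "number of input batches": 11_485_874,
    "native shuffle writer time (s)": 22.1 * 60,
    "decoding and decompression time (s)": 18.8 * 60,
}


def ratios(before, after):
    """Return after/before for every metric present in `before`."""
    return {name: after[name] / before[name] for name in before}


metric_ratios = ratios(baseline, changed)
for name, ratio in metric_ratios.items():
    print(f"{name}: {ratio:.1f}x")

# Both runs read the same number of records, so the average batch size
# follows directly from the input batch counts.
records_read = 65_254_713
rows_per_batch_before = records_read / baseline["number of input batches"]
rows_per_batch_after = records_read / changed["number of input batches"]
print(f"rows per input batch: {rows_per_batch_before:.0f} -> {rows_per_batch_after:.1f}")
```

On these numbers the input batch count grows far more than anything else (about 115x), which works out to roughly 653 rows per batch before and fewer than 6 after. That may be worth checking, though the metrics alone do not show that it is the cause.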
   
   Spilling 16 TB (`16,134,291,652,608` bytes) is definitely not desirable. I suspect 
that our spilling is currently too aggressive. Could you try allocating more 
off-heap memory to avoid spilling and see how the performance looks?
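
As a rough sketch of that experiment, the helper below builds `spark-submit` arguments for a few off-heap sizes using Spark's standard `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size` settings. The function name and the candidate sizes are assumptions for illustration only; pick sizes that fit the executors actually in use, and add the rest of the job's arguments where the `...` placeholder is printed.

```python
from typing import List


def offheap_confs(size: str) -> List[str]:
    """Build --conf arguments enabling Spark off-heap memory at `size`.

    `size` uses Spark's size-string format, e.g. "32g".
    """
    confs = {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": size,
    }
    args: List[str] = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical sizes to try in turn; not taken from the issue.
candidate_sizes = ["16g", "32g", "64g"]
for size in candidate_sizes:
    print("spark-submit", " ".join(offheap_confs(size)), "...")
```

Rerunning the same query at each size and comparing `number of spills` and `spilled bytes` in the CometExchange metrics should show whether the slowdown tracks the spilling.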


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.