jonathanc-n commented on PR #16450:
URL: https://github.com/apache/datafusion/pull/16450#issuecomment-2988126572

   @xudong963 These were tests run with one of the sides having 5 rows:
   <details>
     <summary>Click to expand</summary>
     ```
   joins/HashJoin/l=16_r=5 time:   [9.5541 µs 9.6068 µs 9.6640 µs]
                           change: [-1.2426% -0.2737% +0.6768%] (p = 0.57 > 0.05)
                           No change in performance detected.
   Found 15 outliers among 100 measurements (15.00%)
     5 (5.00%) low mild
     8 (8.00%) high mild
     2 (2.00%) high severe
   joins/NestedLoopJoin/l=16_r=5
                           time:   [8.3347 µs 8.4427 µs 8.5472 µs]
                           change: [+16.951% +17.961% +19.019%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   joins/HashJoin/l=64_r=5 time:   [9.7575 µs 9.8982 µs 10.029 µs]
                           change: [-6.8033% -2.8109% -0.1691%] (p = 0.11 > 0.05)
                           No change in performance detected.
   Found 30 outliers among 100 measurements (30.00%)
     8 (8.00%) low severe
     5 (5.00%) low mild
     2 (2.00%) high mild
     15 (15.00%) high severe
   joins/NestedLoopJoin/l=64_r=5
                           time:   [10.104 µs 10.157 µs 10.228 µs]
                           change: [+12.067% +12.951% +13.830%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   joins/HashJoin/l=256_r=5
                           time:   [10.351 µs 10.460 µs 10.576 µs]
                           change: [-0.9628% +0.0307% +1.0802%] (p = 0.95 > 0.05)
                           No change in performance detected.
   joins/NestedLoopJoin/l=256_r=5
                           time:   [20.469 µs 20.519 µs 20.577 µs]
                           change: [+8.3494% +9.3946% +10.285%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) low mild
     3 (3.00%) high severe
   joins/HashJoin/l=1024_r=5
                           time:   [24.460 µs 24.713 µs 24.901 µs]
                           change: [+21.363% +31.250% +41.778%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 23 outliers among 100 measurements (23.00%)
     16 (16.00%) low severe
     3 (3.00%) low mild
     4 (4.00%) high mild
   joins/NestedLoopJoin/l=1024_r=5
                           time:   [67.322 µs 67.751 µs 68.268 µs]
                           change: [+6.0403% +8.0079% +9.5998%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) high mild
     2 (2.00%) high severe
   joins/HashJoin/l=4096_r=5
                           time:   [22.298 µs 22.814 µs 23.411 µs]
                           change: [-4.7502% -0.0368% +4.6566%] (p = 0.99 > 0.05)
                           No change in performance detected.
   Found 6 outliers among 100 measurements (6.00%)
     5 (5.00%) high mild
     1 (1.00%) high severe
   joins/NestedLoopJoin/l=4096_r=5
                           time:   [258.64 µs 259.56 µs 260.40 µs]
                           change: [+5.1991% +8.4282% +10.949%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 22 outliers among 100 measurements (22.00%)
     13 (13.00%) low severe
     3 (3.00%) low mild
     5 (5.00%) high mild
     1 (1.00%) high severe
   joins/HashJoin/l=32768_r=5
                           time:   [118.18 µs 120.61 µs 124.07 µs]
                           change: [-0.5396% +0.8988% +2.6146%] (p = 0.26 > 0.05)
                           No change in performance detected.
   Found 4 outliers among 100 measurements (4.00%)
     3 (3.00%) high mild
     1 (1.00%) high severe
   joins/NestedLoopJoin/l=32768_r=5
                           time:   [2.3731 ms 2.4175 ms 2.4715 ms]
                           change: [-2.4822% +2.7216% +7.0548%] (p = 0.29 > 0.05)
                           No change in performance detected.
   Found 11 outliers among 100 measurements (11.00%)
     6 (6.00%) high mild
     5 (5.00%) high severe
   joins/HashJoin/l=5_r=16 time:   [9.3953 µs 9.4891 µs 9.5818 µs]
                           change: [+0.2660% +2.3151% +3.8663%] (p = 0.01 < 0.05)
                           Change within noise threshold.
   joins/NestedLoopJoin/l=5_r=16
                           time:   [8.4094 µs 8.4620 µs 8.5189 µs]
                           change: [+7.3597% +13.271% +16.947%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) low mild
   joins/HashJoin/l=5_r=64 time:   [10.281 µs 10.304 µs 10.322 µs]
                           change: [+5.5270% +7.6174% +9.6222%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   joins/NestedLoopJoin/l=5_r=64
                           time:   [10.511 µs 10.644 µs 10.845 µs]
                           change: [+7.7402% +11.531% +14.867%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 4 outliers among 100 measurements (4.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
     1 (1.00%) high severe
   joins/HashJoin/l=5_r=256
                           time:   [9.8597 µs 9.9384 µs 10.007 µs]
                           change: [-1.8462% -0.6659% +0.4999%] (p = 0.29 > 0.05)
                           No change in performance detected.
   joins/NestedLoopJoin/l=5_r=256
                           time:   [21.108 µs 21.195 µs 21.280 µs]
                           change: [+1.5413% +6.7273% +10.204%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   joins/HashJoin/l=5_r=1024
                           time:   [11.088 µs 11.182 µs 11.279 µs]
                           change: [-0.8211% +0.3052% +1.3850%] (p = 0.59 > 0.05)
                           No change in performance detected.
   joins/NestedLoopJoin/l=5_r=1024
                           time:   [69.937 µs 76.254 µs 83.638 µs]
                           change: [+5.7682% +9.4840% +14.122%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high severe
   joins/HashJoin/l=5_r=4096
                           time:   [16.550 µs 16.633 µs 16.713 µs]
                           change: [-1.4548% -0.7755% -0.0668%] (p = 0.03 < 0.05)
                           Change within noise threshold.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   joins/NestedLoopJoin/l=5_r=4096
                           time:   [271.97 µs 272.96 µs 273.65 µs]
                           change: [+13.243% +14.024% +14.795%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 15 outliers among 100 measurements (15.00%)
     7 (7.00%) low severe
     1 (1.00%) low mild
     5 (5.00%) high mild
     2 (2.00%) high severe
   joins/HashJoin/l=5_r=32768
                           time:   [79.925 µs 81.041 µs 82.978 µs]
                           change: [+2.3887% +3.6331% +5.0589%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   joins/NestedLoopJoin/l=5_r=32768
                           time:   [2.6531 ms 2.7439 ms 2.8609 ms]
                           change: [+9.0545% +13.275% +18.591%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 8 outliers among 100 measurements (8.00%)
     4 (4.00%) high mild
     4 (4.00%) high severe
     ```
    </details> 
     
   Interesting. I noticed I wasn't using a filter for the `NestedLoopJoin` in 
the earlier runs, and it was hardly performing any faster. The benchmarks I 
just sent use the filter for NLJ, which is why there are some regressions. 
I'll try to take some time to see how we can speed up `NestedLoopJoin`.
   
   `NestedLoopJoin` is slow because it computes the Cartesian product and then 
runs the filter over all of the resulting indices (which is also 
memory-expensive). It would probably be faster to keep one side in memory, if 
possible, and have the other side run a block nested loop join against it. In 
that case, equijoins tend to be much faster on a smaller table.
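
To make the contrast concrete, here is a minimal sketch of the two strategies on plain `i64` slices. It is not the DataFusion implementation: the function names `cartesian_then_filter` and `block_nested_loop`, the `block_size` parameter, and the use of slices instead of Arrow record batches are all assumptions made for illustration.

```rust
/// Naive strategy (hypothetical helper): materialize every (left, right)
/// index pair first, then run the filter over all of them. The intermediate
/// vector holds left.len() * right.len() entries.
fn cartesian_then_filter<F>(left: &[i64], right: &[i64], pred: F) -> Vec<(usize, usize)>
where
    F: Fn(i64, i64) -> bool,
{
    let mut pairs = Vec::with_capacity(left.len() * right.len());
    for l in 0..left.len() {
        for r in 0..right.len() {
            pairs.push((l, r));
        }
    }
    pairs
        .into_iter()
        .filter(|&(l, r)| pred(left[l], right[r]))
        .collect()
}

/// Block nested loop (hypothetical helper): keep the build side in memory and
/// walk the probe side in fixed-size blocks. Only matching index pairs are
/// stored, so no full Cartesian product is ever materialized.
fn block_nested_loop<F>(
    build: &[i64],
    probe: &[i64],
    block_size: usize,
    pred: F,
) -> Vec<(usize, usize)>
where
    F: Fn(i64, i64) -> bool,
{
    assert!(block_size > 0, "block_size must be non-zero");
    let mut out = Vec::new();
    for (block_idx, block) in probe.chunks(block_size).enumerate() {
        // Offset of this block's first row within the full probe side.
        let base = block_idx * block_size;
        for (b, &build_val) in build.iter().enumerate() {
            for (off, &probe_val) in block.iter().enumerate() {
                if pred(build_val, probe_val) {
                    out.push((b, base + off));
                }
            }
        }
    }
    out
}
```

Both functions return the same set of `(build_index, probe_index)` pairs, although in a different order. The difference is that the first one allocates space for every pair before filtering, while the second only stores the pairs that pass the predicate.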


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

