Re: [PR] Speed up prefix sums when decoding doc IDs. [lucene]

via GitHub Wed, 21 Aug 2024 13:13:37 -0700


jpountz commented on PR #13658:
URL: https://github.com/apache/lucene/pull/13658#issuecomment-2302929875


   I ran wikibigall on an M3, which is interesting because it does inline the 
splitLongs call both before and after the change (presumably because the 
generated native code is smaller and thus still a candidate for inlining). The 
benchmark shows no slowdown, but at best a very small speedup:
   
   ```
                    Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff  p-value
               OrHighLow       1387.75    (4.7%)                  1358.50    (6.4%)   -2.1% ( -12% -    9%)    0.236
            AndStopWords         90.68    (5.1%)                    88.85    (6.6%)   -2.0% ( -13% -   10%)    0.278
               And3Terms        438.98    (4.6%)                   433.21    (5.7%)   -1.3% ( -11% -    9%)    0.425
       HighTermTitleSort        167.18    (4.6%)                   165.66    (4.3%)   -0.9% (  -9% -    8%)    0.519
    HighTermTitleBDVSort         82.36    (4.3%)                    81.75    (5.1%)   -0.7% (  -9% -    9%)    0.618
           OrHighNotHigh        624.72    (4.5%)                   620.27    (5.1%)   -0.7% (  -9% -    9%)    0.642
             AndHighHigh        222.77    (3.9%)                   221.34    (4.1%)   -0.6% (  -8% -    7%)    0.611
      Or2Terms2StopWords        424.69    (4.7%)                   422.05    (5.3%)   -0.6% ( -10% -    9%)    0.695
              AndHighMed        426.20    (4.9%)                   423.75    (5.0%)   -0.6% ( -10% -    9%)    0.715
                Or3Terms        424.04    (4.8%)                   421.69    (5.0%)   -0.6% (  -9% -    9%)    0.721
              OrHighRare        486.01    (5.4%)                   483.99    (5.8%)   -0.4% ( -10% -   11%)    0.813
               OrHighMed        513.21    (4.9%)                   511.77    (4.8%)   -0.3% (  -9% -    9%)    0.854
                PKLookup        409.03    (6.5%)                   408.23    (7.2%)   -0.2% ( -12% -   14%)    0.928
       HighTermMonthSort       1781.87    (4.0%)                  1780.09    (6.2%)   -0.1% (  -9% -   10%)    0.952
           OrNotHighHigh        570.80    (5.0%)                   570.42    (7.1%)   -0.1% ( -11% -   12%)    0.973
   HighTermDayOfYearSort       1509.01    (5.9%)                  1512.74    (6.1%)    0.2% ( -11% -   12%)    0.896
            OrNotHighMed        686.50    (5.2%)                   690.48    (7.7%)    0.6% ( -11% -   14%)    0.781
            OrHighNotMed        784.76    (5.2%)                   789.88    (4.8%)    0.7% (  -8% -   11%)    0.683
         CountAndHighMed        507.69    (4.4%)                   511.31    (5.8%)    0.7% (  -9% -   11%)    0.664
              OrHighHigh        193.31    (4.8%)                   195.18    (4.3%)    1.0% (  -7% -   10%)    0.506
              TermDTSort        526.16    (3.1%)                   531.24    (5.7%)    1.0% (  -7% -   10%)    0.508
     And2Terms2StopWords        449.28    (4.5%)                   454.04    (5.5%)    1.1% (  -8% -   11%)    0.506
              AndHighLow       2221.82    (6.3%)                  2255.93    (7.5%)    1.5% ( -11% -   16%)    0.484
        CountAndHighHigh        188.97    (4.4%)                   191.99    (4.9%)    1.6% (  -7% -   11%)    0.278
            OrHighNotLow       1001.86    (6.4%)                  1021.97    (6.2%)    2.0% (  -9% -   15%)    0.314
             OrStopWords        100.73    (4.0%)                   102.78    (6.4%)    2.0% (  -8% -   12%)    0.229
                HighTerm        914.24    (5.6%)                   933.01    (6.2%)    2.1% (  -9% -   14%)    0.271
                 MedTerm       1122.89    (5.2%)                  1146.58    (7.2%)    2.1% (  -9% -   15%)    0.287
                 LowTerm       1617.38    (7.5%)                  1661.67    (6.7%)    2.7% ( -10% -   18%)    0.221
            OrNotHighLow       2220.24    (5.5%)                  2294.40    (8.9%)    3.3% ( -10% -   18%)    0.152
          CountOrHighMed        461.03    (5.7%)                   477.71    (6.4%)    3.6% (  -7% -   16%)    0.058
         CountOrHighHigh        265.96    (6.2%)                   279.21    (7.7%)    5.0% (  -8% -   20%)    0.024
   ```
   
   I'm debating whether to revert this change or keep it: the microbenchmark 
suggests it's a speedup (on both my Ryzen and my M3), and we might find ways to 
take advantage of it later on. I'm curious whether you have an opinion, 
@gsmiller.
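
For readers without the PR open: the "prefix sum" in the title is the step that turns delta-encoded doc IDs back into absolute doc IDs. Below is a minimal scalar sketch of that idea only. `PrefixSumSketch`, `prefixSum`, and the `base` parameter are hypothetical names chosen for illustration; this is not the code from the PR, which operates on packed longs.

```java
import java.util.Arrays;

// Illustrative only: a plain scalar prefix sum, not Lucene's implementation.
final class PrefixSumSketch {
  private PrefixSumSketch() {}

  /**
   * Replaces each delta with the running total, in place, so that
   * deltas[i] becomes base + deltas[0] + ... + deltas[i].
   */
  static void prefixSum(long[] deltas, long base) {
    long sum = base;
    for (int i = 0; i < deltas.length; i++) {
      sum += deltas[i];
      deltas[i] = sum;
    }
  }

  public static void main(String[] args) {
    // Gaps between consecutive doc IDs, with the previous block ending at doc 10.
    long[] docs = {3, 1, 4, 1, 5};
    prefixSum(docs, 10);
    System.out.println(Arrays.toString(docs)); // prints [13, 14, 18, 19, 24]
  }
}
```

Each element depends on the one before it, so the loop is a serial dependency chain. That is why the shape of the generated native code, including whether helper calls get inlined, can matter for this step.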


