uschindler commented on PR #13133:
URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964672015

   > I'm surprised by how slow this is with AVX off given that this can be 
implemented with SSE2 :(.
   
   This is why we try to avoid the incubating vector API as much as possible. 
The code needs to be tested on all platforms and vector bit sizes with extensive 
benchmarking. The problem is:
   - If Hotspot does not have an optimization for the actual CPU -> slow
   - If you use a JDK with Graal -> slow
   - If you use non-Hotspot (e.g., OpenJ9) -> slow
   - If you disable tiered compilation / use the client compiler -> slow
   
   The code is slow here because, without hardware support, it executes exactly 
as written in Java, allocating hundreds of vector instances instead of being 
compiled down to SIMD instructions.
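   To illustrate the failure mode: a minimal Panama Vector API dot product looks 
like the sketch below (class and method names are mine, not from this PR). When 
HotSpot intrinsifies it, each `fromArray`/`fma` maps to a single SIMD 
instruction; when it does not, every one of those calls really allocates and 
computes a `FloatVector` object per loop iteration, which is why the fallback is 
slower than plain scalar code:

   ```java
   import jdk.incubator.vector.FloatVector;
   import jdk.incubator.vector.VectorOperators;
   import jdk.incubator.vector.VectorSpecies;

   public class DotProduct {
       // Widest species the current CPU supports (e.g. 256-bit on AVX2).
       private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

       static float dotProduct(float[] a, float[] b) {
           FloatVector acc = FloatVector.zero(SPECIES);
           int i = 0;
           int upper = SPECIES.loopBound(a.length);
           for (; i < upper; i += SPECIES.length()) {
               FloatVector va = FloatVector.fromArray(SPECIES, a, i);
               FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
               // Without intrinsics, each of these calls creates a new object.
               acc = va.fma(vb, acc);
           }
           float sum = acc.reduceLanes(VectorOperators.ADD);
           for (; i < a.length; i++) {
               sum += a[i] * b[i];  // scalar tail for leftover elements
           }
           return sum;
       }
   }
   ```

   (Compile and run with `--add-modules jdk.incubator.vector`; the incubating 
module also prints a warning at startup.)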
   
   One more thing: once you have fixed the code, make sure to show a benchmark 
on real queries. Just because the group vint decoding is 30% faster does not 
mean you would see any difference in production. Normally we only accept 
incubating vector optimizations if the results are at least 4 times faster than 
scalar code (e.g., the float dot product is 12 to 16 times faster, yet the 
effect on query/merging performance is not 16 times, just about 15%).
   
   So if the effect on query performance is <5%, I would disagree with merging 
this. Sorry!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

