Re: Performance Difference between files getting opened with IoContext.RANDOM vs IoContext.READ

2024-10-01 Thread Navneet Verma
Hi Uwe, thanks for sharing the link and the useful information. I will definitely go ahead and create a GitHub issue. In the meantime, I did some testing by changing the IOContext from RANDOM to READ for FlatVectors

Re: Performance Difference between files getting opened with IoContext.RANDOM vs IoContext.READ

2024-10-01 Thread Uwe Schindler
Hi, thinking about it a bit more: in 10.x we already have some ways to preload data with WILL_NEED (or similar). Maybe this can also be used on merging when we reuse an already open IndexInput. Maybe it is possible to change the madvise on an already open IndexInput and change it before merg

Re: Performance Difference between files getting opened with IoContext.RANDOM vs IoContext.READ

2024-10-01 Thread Uwe Schindler
Hi, great. I still think the difference between RANDOM and READ is huge in your case. Are you sure that you have not misconfigured your system? The most important thing for Lucene is to make sure that the heap space of the Java VM is limited as much as possible (just above the OOM boundary) and

Re: Performance Difference between files getting opened with IoContext.RANDOM vs IoContext.READ

2024-10-01 Thread Navneet Verma
Hi Uwe, to answer your question about the RAM and heap size, here are some details: RAM: 128GB, Heap: 32GB, CPU: 16. This is where I will put some reproducible benchmarks using Lucene alone. I have currently used OpenSearch 2.17 to run these benchmarks. *In general, the correct fix for this is

Re: Performance Difference between files getting opened with IoContext.RANDOM vs IoContext.READ

2024-10-01 Thread Uwe Schindler
Hi, this seems to be a special case in FlatVectors, because normally there's a separate method to open an IndexInput for checksumming: https://github.com/apache/lucene/blob/524ea208c870861a719f21b1ea48943c8b7520da/lucene/core/src/java/org/apache/lucene/store/Directory.java#L155-L157 Could you o