Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Baris Kazar
So, just cat will do this. Thanks From: Robert Muir Sent: Tuesday, February 23, 2021 4:45 PM To: Baris Kazar Cc: java-user Subject: Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory) The preload isn't magical. It only "reads in

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Robert Muir
The preload isn't magical. It only "reads in the whole file" to get it cached, same as if you did that yourself with 'cat' or 'dd'. It "warms" the file. It just does this in an efficient way at the low level to make the warming itself efficient. It madvise()s kernel to announce some read-ahead and
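
A minimal sketch of the "do it yourself" warming mentioned above, i.e. the Java equivalent of `cat file > /dev/null`. It only reads each file sequentially and throws the bytes away, so the kernel page cache ends up holding them if there is enough free RAM. The class name `PageCacheWarmer` and the method `warm` are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a Lucene class): warms the OS page cache by reading files.
class PageCacheWarmer {

    /** Reads the file from start to end, discards the data, and returns the byte count. */
    static long warm(Path file) throws IOException {
        byte[] buffer = new byte[1 << 16]; // 64 KiB scratch buffer, contents are ignored
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: pass the index files to warm as command-line arguments.
        for (String arg : args) {
            Path file = Paths.get(arg);
            System.out.println(file + ": " + warm(file) + " bytes read");
        }
    }
}
```

Nothing here pins the data in memory. The kernel is still free to evict the pages later, which matches the "not magical" point above.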

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks again, Robert. Could you please explain "preload"? Which functionality is that? We discussed a preload earlier in this thread. Is there a Lucene URL / site that I can look at for preload? Thanks for the explanations. This thread will be useful for many folks, I believe. Best regar

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Robert Muir
On Tue, Feb 23, 2021 at 4:07 PM wrote: > What i want to achieve: Problem statement: > > base case is disk based Lucene index with FSDirectory > > speedup case was supposed to be in memory Lucene index with MMapDirectory > On 64-bit systems, FSDirectory just invokes MMapDirectory already. So you d

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
(edited previous response) Thanks, but on the first run of each different query I see some slowdown (not much, though) with MMapDirectory and FSDirectory compared with the second and third runs (due to cold start). The cold-start slowdown is a little larger with FSDirectory. So, MMapDirectory is slightly

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks, but for each different query I see some slowdown (not much, though) with MMapDirectory and FSDirectory. It is a little larger with FSDirectory. So, MMapDirectory is slightly better in that, too: i.e., cold start. What I want to achieve: Problem statement: base case is disk based

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Robert Muir
Speedup over what? You are probably already using MMapDirectory (it is the default). So I don't know what you are trying to achieve, but giving lots of memory to your java process is not going to help. If you just want to prevent the first few queries to a fresh cold machine instance from being sl

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Thanks, but then how will MMapDirectory help gain speedup? I will try tmpfs and see what happens. I was expecting to get an order of magnitude of speedup over the already very fast on-disk Lucene indexes. So I was expecting a really, really fast response with MMapDirectory. Thanks On 2/23/21

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Robert Muir
Don't give gobs of memory to your java process, you will just make things slower. The kernel will cache your index files. On Tue, Feb 23, 2021 at 1:45 PM wrote: > Ok, but how is this MMapDirectory used then? > > Best regards > > > On 2/23/21 7:03 AM, Robert Muir wrote: > > > > > > On Tue, Feb 23

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
As Uwe suggested some time ago, using a tmpfs file system with MMapDirectory is the only way to get a high speedup with respect to an on-disk Lucene index, right? Best regards On 2/23/21 1:44 PM, baris.ka...@oracle.com wrote: Ok, but how is this MMapDirectory used then? Best regards On 2/23/21 7:03 AM, Rob

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread baris . kazar
Ok, but how is this MMapDirectory used then? Best regards On 2/23/21 7:03 AM, Robert Muir wrote: On Tue, Feb 23, 2021 at 2:30 AM > wrote: Hi,-   I tried MMapDirectory and i allocated as big as index size on my J2EE Container but Don't alloc

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-23 Thread Robert Muir
On Tue, Feb 23, 2021 at 2:30 AM wrote: > Hi,- > > I tried MMapDirectory and i allocated as big as index size on my J2EE > Container but > > Don't allocate java heap memory for the index, MMapDirectory does not use java heap memory!
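
To illustrate why a larger Java heap does not help here, the following is a small sketch using only the JDK's own memory-mapping API (not Lucene itself). A mapped file is a direct, off-heap buffer backed by the OS page cache, so its size is independent of `-Xmx`. The class name `OffHeapMapping` and the method `mapReadOnly` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class (not part of Lucene): maps a file and inspects the buffer.
class OffHeapMapping {

    /** Maps the whole file read-only. A single mapping is limited to 2 GiB in this API. */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel has been closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: the first argument is the path of any existing, non-empty file.
        MappedByteBuffer buf = mapReadOnly(Paths.get(args[0]));
        System.out.println("direct (off-heap) buffer: " + buf.isDirect());
        System.out.println("mapped bytes:             " + buf.capacity());
        System.out.println("max Java heap (bytes):    " + Runtime.getRuntime().maxMemory());
    }
}
```

The mapped size can be far larger than the maximum heap reported on the last line, because the pages live in the kernel's cache rather than in the Java heap.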

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2021-02-22 Thread baris . kazar
Hi,- I tried MMapDirectory and allocated as much memory as the index size on my J2EE container, but it only gives me at most a 25% speedup and sometimes even a small slowdown. How can I effectively use Lucene indexes in memory? Best regards On 12/14/20 6:35 PM, baris.ka...@oracle.com wrote

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Robert. I think these valuable comments need to be placed in the javadocs for future reference. I think I am getting enough info to make a decision: I will use MMapDirectory without setPreload and I hope my index will fit into RAM. I plan to post a blog about the findings. Best regar

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Robert Muir
On Mon, Dec 14, 2020 at 1:59 PM Uwe Schindler wrote: > > Hi, > > as writer of the original blog post, here my comments: > > Yes, MMapDirectory.setPreload() is the feature mentioned in my blog post > to load everything into memory - but that does not guarantee anything! > Still, I would not recom

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
I see. I think I will use the first way, the constructor with MMap, and I will not use the setPreload API, to avoid slowdowns. Yes, I was expecting a warning from Eclipse in the second usage but nothing came up. Thanks for the clarifications. Best regards On 12/14/20 2:55 PM, Uwe Schindler wrote: H

RE: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Uwe Schindler
Hi, > Thanks Uwe, i am not insisting on to load everything into memory > > but loading into memory might speed up and i would like to see how much > speedup. > > > but i have one more question and that is still not clear to me: > > "it is much better to open index, with MMAP directory" > >

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
This also brings me another question: does using MMap over FSDirectory bring any advantage with or without tmpfs? Best regards On 12/14/20 2:17 PM, Jigar Shah wrote: Thanks, Uwe Yes, recommended, tmpfs/ramfs worked like a charm in our use-case with a read-only index, giving us very high-thro

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Uwe, I am not insisting on loading everything into memory, but loading into memory might speed things up and I would like to see how much speedup. But I have one more question, and that is still not clear to me: "it is much better to open index, with MMAP directory" does this mean I should n

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Jigar Shah
Thanks, Uwe. Yes, recommended: tmpfs/ramfs worked like a charm in our use case with a read-only index, giving us very high throughput and consistent response times on queries. We had to build some redundancy around that service to make it highly available, so we can do a rolling update on the r

RE: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Uwe Schindler
Hi, as writer of the original blog post, here my comments: Yes, MMapDirectory.setPreload() is the feature mentioned in my blog post to load everything into memory - but that does not guarantee anything! Still, I would not recommend to use that function, because all it does is to just touch ever

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Jigar, these are great notes, observations, experiments to know about and they are very very valuable, i also plan to write a blog on this topic to help Lucene advance. Best regards On 12/14/20 12:44 PM, Jigar Shah wrote: I used one of the Linux feature (ramfs, basically mounting ram

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Jigar Shah
I used a Linux feature (ramfs, basically mounting RAM as a partition) to guarantee that the index is always in RAM (no accidental paging cost either ;) ). https://www.jamescoyle.net/how-to/943-create-a-ram-disk-in-linux WARN: Only use if it's a read-only index and can fit in ram and have a back-up c

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread baris . kazar
Thanks Mike, appreciate the reply and the suggestions very much. And Your article link to concurrent search is amazing. Together with in memory and concurrent index (especially in read only mode) these will speed up Lucene queries very much. Happy Holidays Best regards On 12/14/20 10:12 AM,

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Michael McCandless
Hello, Yes, that is exactly what MMapDirectory.setPreload is trying to do, but it makes no promises (it is best effort). I think it asks the OS to touch all pages in the mapped region so they are cached in RAM, if you have enough RAM. Make your JVM heap as low as possible to let the OS have more RAM to
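
The best-effort "touch all pages" behaviour described above can be sketched with the JDK's `MappedByteBuffer.load()`. Whether `MMapDirectory.setPreload` uses exactly this call internally is an assumption here and not something this thread confirms. The class name `PreloadSketch` and the method `mapAndPreload` are hypothetical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class (not part of Lucene): maps a file and asks for it to be paged in.
class PreloadSketch {

    /** Maps the file read-only and requests, best effort, that its pages be loaded into RAM. */
    static MappedByteBuffer mapAndPreload(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.load(); // best effort only: the OS may still evict pages afterwards
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: the first argument is the path of any existing, non-empty file.
        MappedByteBuffer buf = mapAndPreload(Paths.get(args[0]));
        // isLoaded() is documented as a hint, not a guarantee, so treat the answer loosely.
        System.out.println("probably resident in RAM: " + buf.isLoaded());
    }
}
```

Because the data sits in the OS page cache, the call only helps when the machine has free RAM outside the JVM heap, which is why a small heap is advised above.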