Re: [EXTERNAL] Re: Slow HNSW creation times.

2024-05-01 Thread Krishnamurthy, Kannan
is considered bad. Will soon share insights from the profiler.

Kannan

From: Uwe Schindler
Date: Monday, April 29, 2024 at 8:08 AM
To: java-user@lucene.apache.org
Subject: [EXTERNAL] Re: Slow HNSW creation times.

Hi, how much physical RAM has the machine, because 30 GiB heap sounds a lot to me

Re: Slow HNSW creation times.

2024-04-29 Thread Uwe Schindler
Hi, how much physical RAM does the machine have? A 30 GiB heap sounds like a lot to me. If you use that much heap and the physical RAM left over after the heap allocation cannot fit the rest of the index into the page cache, then the OS will start to read from disk. This is a common problem I have seen
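
The heap-versus-page-cache arithmetic described above can be sketched roughly as follows. This is only an illustration, not something from the thread: the function names, the 4-bytes-per-dimension (float32) assumption, and the example vector count, dimension, and RAM sizes are all hypothetical. Only the 30 GiB heap figure comes from the discussion. It also ignores the HNSW graph itself and every other index file, so the real footprint would be larger.

```python
GIB = 1024 ** 3


def vector_data_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of the stored vectors, assuming float32 values (4 bytes each).

    Ignores the HNSW graph and all other index files.
    """
    return num_vectors * dims * bytes_per_value


def page_cache_budget_bytes(physical_ram_bytes: int, heap_bytes: int) -> int:
    """Upper bound on RAM left for the OS page cache once the JVM heap is reserved."""
    return max(physical_ram_bytes - heap_bytes, 0)


def fits_in_page_cache(num_vectors: int, dims: int,
                       physical_ram_bytes: int, heap_bytes: int) -> bool:
    """True if the raw vector data alone could stay resident in the page cache."""
    return vector_data_bytes(num_vectors, dims) <= page_cache_budget_bytes(
        physical_ram_bytes, heap_bytes
    )


if __name__ == "__main__":
    heap = 30 * GIB  # heap size mentioned in the thread
    # Hypothetical workload: 10 million vectors of 768 dimensions.
    for ram_gib in (32, 64):  # hypothetical machine sizes
        ok = fits_in_page_cache(10_000_000, 768, ram_gib * GIB, heap)
        print(f"{ram_gib} GiB RAM, 30 GiB heap: vectors fit in page cache = {ok}")
```

If the check comes out false for the real numbers, shrinking the heap or adding RAM would be the first things to try, since vectors that do not fit in the page cache have to be re-read from disk during graph construction.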

Re: Slow HNSW creation times.

2024-04-28 Thread Adrien Grand
Hello Kannan, The fact that adding 10k docs to an empty HNSW graph is faster than adding 10k docs to a large HNSW graph sounds expected to me, but the 120x factor that you are reporting sounds high. Maybe your dataset is larger than the size of your page cache, forcing your OS to read vectors from