Re: Temporary vector file during merging

2025-06-27 Thread Viliam Ďurina
I can confirm the temp file isn't renamed, but it's copied a second time. I'm on vacation next week. On Fri, Jun 27, 2025 at 21:24, Michael Sokolov wrote: > Right! Thanks for the pointer. It does seem like there is room for > improvement then, maybe Viliam wants to tackle it? > > On Fri, Jun 27,

Re: Temporary vector file during merging

2025-06-27 Thread Michael Sokolov
Without this temp file we would need to load the entire set of vectors for the new merged segment into RAM in order to support building an HNSW graph from it. This way we can read the vectors off the disk in the same way we would do during normal searches. I'm not sure, but I think the temp file s
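
To illustrate the point above about reading vectors off disk rather than holding them all in RAM, here is a minimal Python sketch. It is not Lucene's code: the fixed-width float32 layout, the names `write_vectors`, `OnDiskVectors`, and `merged_vectors`, and the dimension of 4 are all assumptions made for illustration. It only shows the general idea that once the merged vectors sit in a flat temporary file, a graph builder can fetch any vector by ordinal with a seek.

```python
import os
import struct
import tempfile

DIM = 4                 # hypothetical vector dimension
VECTOR_BYTES = DIM * 4  # each component stored as a 4-byte float32


def merged_vectors(count):
    """Stand-in for vectors streamed one at a time from the segments being merged."""
    for i in range(count):
        yield (float(i), float(i) + 0.5, 0.0, 1.0)


def write_vectors(path, vectors):
    """Append each vector to a flat file; only one vector is in memory at a time."""
    with open(path, "wb") as out:
        for vec in vectors:
            out.write(struct.pack(f"<{DIM}f", *vec))


class OnDiskVectors:
    """Random access to vectors by ordinal, reading from disk on demand."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self.size = os.path.getsize(path) // VECTOR_BYTES

    def vector(self, ordinal):
        # Fixed-width records make the byte offset a simple multiplication.
        self._file.seek(ordinal * VECTOR_BYTES)
        return struct.unpack(f"<{DIM}f", self._file.read(VECTOR_BYTES))

    def close(self):
        self._file.close()


def demo():
    fd, tmp_path = tempfile.mkstemp(suffix=".vec.tmp")
    os.close(fd)
    try:
        write_vectors(tmp_path, merged_vectors(1000))
        reader = OnDiskVectors(tmp_path)
        try:
            # A graph builder would call reader.vector(...) for arbitrary
            # ordinals while comparing candidate neighbours.
            return reader.size, reader.vector(42)
        finally:
            reader.close()
    finally:
        os.remove(tmp_path)
```

Calling `demo()` writes 1000 vectors to a temporary file, reads back the one at ordinal 42, and removes the file afterwards. The trade-off the sketch makes visible is extra disk I/O for the temporary copy in exchange for a bounded memory footprint.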

Re: Temporary vector file during merging

2025-06-27 Thread Michael Sokolov
Right! Thanks for the pointer. It does seem like there is room for improvement then, maybe Viliam wants to tackle it? On Fri, Jun 27, 2025 at 12:57 PM Adrien Grand wrote: > > Mike, I believe that the answer to your question is in this PR review > comment: https://github.com/apache/lucene/pull/601

Re: Temporary vector file during merging

2025-06-27 Thread Adrien Grand
Mike, I believe that the answer to your question is in this PR review comment: https://github.com/apache/lucene/pull/601#discussion_r783711025. Merging is currently implemented by looping over fields once, and merging them. Writing the vec file first would require merging flat vectors for all fiel