On 11/06/2024 07:53, Thomas Munro wrote:
Someone involved in that project mentioned that it's probably not a great topic to research in practice, because real world users of HNSW use fully cached ie prewarmed indexes, because the performance is so bad otherwise. (Though maybe that argument is a little circular...).
I think that's true in practice for *building* an HNSW index, but faster *searching* when the index is not in memory seems quite useful. And of course, faster is always better, even if it's only in a non-optimal scenario.
So although this patch clearly speeds up cold HSNW searches to a degree controlled by effective_io_concurrency, I'll probably look for something else. Suggestions for interesting index types to look at streamifying are very welcome!
GiST and GIN?
Hmm. If that's really true about HNSW though, then there may still be an opportunity to do automatic memory prefetching[1]. But then in the case of index building, "stream" is NULL in this patch anyway. It surely must also be possible to find some good places to put profitable explicit pg_mem_prefetch() calls given the predictability and the need to get only ~60ns ahead for that usage. I didn't look into that because I was trying to prove things about read_stream.c, not get involved in another project :-D Here ends my science experiment report, which I'm dropping here just in case others see useful ideas here. The main thing I learned about the read stream API is that it'd be nice to be able to reset the stream but preserve the distance (something that came up on the streaming sequential scan thread for a different reason), to deal with cases where look-ahead opportunities come in bursts but you want a longer lived stream than I used here. That is the reason the patch creates and destroys temporary streams in a loop; doh. It also provides an interesting case study for what speculative random look-ahead support might need to look like.
This reminds me of a prototype I wrote earlier, see https://github.com/pgvector/pgvector/pull/386, 1st commit. It reorganizes HnswSearchLayer() so that it in iteration, it first collects all the neighbors to visit, and then visits them, somewhat similar to your patch.
-- Heikki Linnakangas Neon (https://neon.tech)