Re: Trying out read streams in pgvector (an extension)

2024-09-06 Thread Thomas Munro
There was a mistake in my query, so the macOS speedup column was wrong (was accidentally comparing Linux number with macOS master, sorry for the noise). I also forgot to mention that you don't actually get the speedup on PostgreSQL 17 on a Mac, because Peter only recently implemented the needed re

Re: Trying out read streams in pgvector (an extension)

2024-09-05 Thread Thomas Munro
On Fri, Sep 6, 2024 at 4:28 PM Thomas Munro wrote:
> Without this patch for PostgreSQL, it reads 1, 2, 4, 7 blocks (= 16 in total) before it has to take a break to hop to a new page, and then it starts again at 1.

Oops. Erm, correction: 1, 2, 4, 8, 1 (because it runs out due to m == 16 and
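
The corrected sequence quoted above (1, 2, 4, 8, 1 for m == 16) is what you would get if each read were twice the size of the previous one until the 16 blocks run out. Here is a minimal sketch of that arithmetic only, assuming a plain doubling rule. The function name `ramp_up_reads` is hypothetical, and the real read stream heuristics in PostgreSQL are more involved than this.

```python
def ramp_up_reads(total_blocks, start=1):
    """Split total_blocks into read sizes that double each step.

    Assumption: a plain doubling rule, used here only to illustrate
    the 1, 2, 4, 8, 1 pattern mentioned in the message.
    """
    sizes = []
    remaining = total_blocks
    size = start
    while remaining > 0:
        # The final read is cut short when fewer blocks remain.
        n = min(size, remaining)
        sizes.append(n)
        remaining -= n
        size *= 2
    return sizes


print(ramp_up_reads(16))
```

With 16 blocks the first four reads cover 15 of them, so the fifth read has only one block left to fetch.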

Re: Trying out read streams in pgvector (an extension)

2024-09-05 Thread Thomas Munro
On Wed, Jun 12, 2024 at 3:37 AM Jonathan S. Katz wrote:
> If you're curious, I can fire up some of my more serious benchmarks on this to do a before/after to see if there's anything interesting. I have a few large datasets (10s of millions) of larger vectors (1536dim => 6KB payloads) that co

Re: Trying out read streams in pgvector (an extension)

2024-06-11 Thread Jonathan S. Katz
On 6/11/24 12:53 AM, Thomas Munro wrote: Hi, I was looking around for an exotic index type to try the experience of streamifying an extension, i.e. out-of-core code. I am totally new to pgvector, but since everyone keeps talking about it, I could not avoid picking up some basic facts in the pgcon

Re: Trying out read streams in pgvector (an extension)

2024-06-11 Thread Heikki Linnakangas
On 11/06/2024 07:53, Thomas Munro wrote: Someone involved in that project mentioned that it's probably not a great topic to research in practice, because real world users of HNSW use fully cached, i.e. prewarmed, indexes, because the performance is so bad otherwise. (Though maybe that argument is a