Hi folks,
Has anyone thought about this problem before? In a search system, if
some clients can tolerate high latency, is there a way to deliberately
increase their latency to improve overall system throughput, while
other clients still demand reasonable latency and high throughput?
Hi folks. I've been experimenting with our new scalar quantization
support - yay, thanks for adding it! I'm finding that when I index a
large number of large vectors, enabling quantization (vs. simply
indexing the full-width floats) requires more heap - I keep getting
OOMs and have to increase the heap.
Heya Michael,
> the first one I traced was referenced by vector writers involved in a merge
> (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this expected?
Yes, that is the buffer holding the raw floats before flush. You
should see nearly the exact same overhead there as you would when
indexing raw vectors.
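For a rough sense of how big that pre-flush buffer gets, here is a back-of-envelope sketch. The class and method names are hypothetical (not Lucene APIs), and the document count and dimension are illustrative assumptions, not measurements:

```java
// Sketch: estimate heap held by raw float vectors buffered before flush.
// Assumes each vector is kept as float[] (4 bytes per dimension).
public class FlushHeapEstimate {
    public static long rawVectorBytes(long numVectors, int dims) {
        return numVectors * dims * 4L; // float32 = 4 bytes per dimension
    }

    public static void main(String[] args) {
        long docs = 1_000_000L; // hypothetical number of buffered docs
        int dims = 768;         // hypothetical vector dimension
        long bytes = rawVectorBytes(docs, dims);
        // ~2.9 GiB of raw floats held until the segment flushes
        System.out.println(bytes / (1024 * 1024) + " MiB");
    }
}
```

The point is just that the raw-float buffer dominates either way, so the non-quantized and quantized cases should look similar at this spot.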
Empirically I thought I saw the need to increase the JVM heap with
quantization enabled, but let me do some more testing to narrow down
what is going on. It's possible the same heap requirements exist for
the non-quantized case and I am just seeing some random vagary of the
merge process happening to tip over a limit.
Michael,
Empirically, I am not surprised there is an increase in heap usage. We
do have extra overhead with scalar quantization on flush, and there
may also be some additional heap usage on merge.
I just don't think it comes via Lucene99FlatVectorsWriter.
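To put a rough number on that extra flush-time overhead: if int8 scalar quantization writes a one-byte-per-dimension copy while the float buffer is still live, the quantized copy adds about 25% on top of the raw floats. This is a sketch under that assumption, with illustrative sizes, not a traced measurement of Lucene's allocations:

```java
// Sketch: extra heap if an int8 quantized copy coexists with the
// float32 buffer at flush time. All numbers are hypothetical.
public class QuantizationOverhead {
    public static long rawBytes(long numVectors, int dims) {
        return numVectors * dims * 4L; // float32 copy
    }

    public static long quantizedBytes(long numVectors, int dims) {
        return numVectors * dims;      // int8: 1 byte per dimension
    }

    public static void main(String[] args) {
        long docs = 1_000_000L; // hypothetical segment size
        int dims = 768;         // hypothetical vector dimension
        long raw = rawBytes(docs, dims);
        long extra = quantizedBytes(docs, dims);
        System.out.println("raw=" + (raw >> 20) + " MiB, extra quantized="
                + (extra >> 20) + " MiB");
    }
}
```

If that assumption holds, the quantized case peaks noticeably higher than plain float indexing even though the quantized copy itself is small, which would match the OOMs showing up only with quantization enabled.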
On Wed, Jun 12, 2024 at 11:55 AM Michael So