On 10/17/23 13:20, Walter Underwood wrote:
Gzipping the JSON can be a big win, especially if there are lots of repeated
keys, like in state.json. Gzip has the advantage that some editors can natively
unpack it.
It may save you some transfer time, provided the transport subsystem
doesn't com
Hello.
I posted the message below to this list back on 9 September, but it
didn't seem to elicit a response. Trying again in the hopes someone can
lend some assistance, for which I would be most grateful.
Thanks
Apache Avro is a JSON-equivalent binary format. That would be smaller. Looking
around the web, it might be 2X to 4X smaller.
Gzipping the JSON can be a big win, especially if there are lots of repeated
keys, like in state.json. Gzip has the advantage that some editors can natively
unpack it.
T
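The effect of repeated keys on gzip can be checked with a small sketch like the one below. It uses only the Python standard library; the document structure is invented for illustration (it is not the real state.json layout), so the ratio on actual data will differ.

```python
import gzip
import json


def gzip_sizes(obj):
    """Return (raw_bytes, gzipped_bytes) for the JSON encoding of obj."""
    raw = json.dumps(obj).encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw), len(packed)


# Hypothetical data with many repeated keys; NOT the real state.json schema.
doc = {
    "shards": [
        {
            "name": f"shard{i}",
            "state": "active",
            "replicas": [
                {"core": f"core_{i}_{j}", "state": "active", "type": "NRT"}
                for j in range(3)
            ],
        }
        for i in range(200)
    ]
}

raw_size, packed_size = gzip_sizes(doc)
print(f"raw={raw_size} bytes, gzipped={packed_size} bytes")

# Round trip: the compressed payload decodes back to the same document.
restored = json.loads(gzip.decompress(gzip.compress(json.dumps(doc).encode("utf-8"))))
print(restored == doc)
```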
Hi Florin and Matthias,
Thanks for sharing about this!
Looking into where the JSON indentation in storage comes from -- from code
reading only -- I think this is the code trail:
*
https://github.com/apache/solr/blob/releases/solr/9.4.0/solr/modules/ltr/src/java/org/apache/solr/ltr/store/rest/M
b) The knn query results are the approximate nearest neighbors, but they
might not be the best. We'd like to define some kind of cut-off value
for knn document scores. Is this possible, and what would be a good way
to do so? Implement a post-processing filter query with an frange on the
score field
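One way the frange idea from the question might be expressed is sketched below. It only builds request parameters and sends nothing. The field name, vector, and threshold are placeholders, and whether an frange filter over the knn score gives the desired cut-off is an assumption that would need to be verified against a real collection.

```python
from urllib.parse import urlencode


def knn_params_with_cutoff(field, vector, top_k, min_score):
    """Build Solr request parameters for a knn query plus a score cut-off.

    Assumption: the main query's score can be filtered with
    {!frange}query($q), run as a post filter (cache=false, cost>=100).
    """
    vec = "[" + ", ".join(str(float(x)) for x in vector) + "]"
    return {
        "q": f"{{!knn f={field} topK={top_k}}}{vec}",
        "fq": f"{{!frange l={min_score} cache=false cost=100}}query($q)",
        "fl": "id,score",
    }


# Placeholder values for illustration only.
params = knn_params_with_cutoff("VECTOR_FIELD", [0.1, 0.2, 0.3], 10, 0.8)
print(urlencode(params))
```

A simpler fallback is to request the scores with fl=id,score and drop low-scoring documents on the client side.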
Hi Mirko,
the topK is per shard.
Then shards * k results are aggregated.
Does it make sense?
Regarding the debugging, that looks like a bug: all results should have a
score and be within the top-k
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache
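The per-shard behaviour described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the scores are random numbers, each shard returns its exact top-k (real HNSW search is approximate), and no Solr code is involved.

```python
import heapq
import random


def shard_top_k(scores, k):
    """Each shard returns its own k best scores."""
    return heapq.nlargest(k, scores)


def merged_results(shards, k):
    """Aggregate the per-shard top-k lists, best score first."""
    merged = []
    for shard_scores in shards:
        merged.extend(shard_top_k(shard_scores, k))
    return sorted(merged, reverse=True)


random.seed(0)
# Two shards with made-up similarity scores.
shards = [[random.random() for _ in range(100)] for _ in range(2)]
k = 10

merged = merged_results(shards, k)
global_top_k = heapq.nlargest(k, [s for shard in shards for s in shard])

print(len(merged))                 # 20, i.e. shards * topK
print(merged[:k] == global_top_k)  # True when each shard's top-k is exact
```

With 2 shards and topK=10 the merged list has 20 entries, of which only the first 10 are the overall best. Adding rows=10 to the request should trim the response to those.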
Hey!
Thank you for your help!
We are running in cloud mode on GKE. Our index has 2 shards, and every
shard has 2 replicas. The leader is a TLOG, the other replica is a PULL.
Our main query is basically {!knn f=VECTOR_FIELD topK=10}[VECTOR DATA].
That's it.
I am really unsure how to debug th
What's your full Solr query?
Are you on SolrCloud or single Solr node?
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*
e-mail: a.benede...@sease.io
*Sease* - Information Retrieval Applied
Consulting | Training | Open
To correct myself: there was a typo. I meant:
If I specify topK=6, I get numFound=12, but only some of them match the
top 6
On 17.10.2023 at 09:31, Mirko Sertic wrote:
Hi!
To keep you updated, here are some observations regarding the
numFound/resultset size and DenseVectorQueries:
If I specify
Hi!
To keep you updated, here are some observations regarding the
numFound/resultset size and DenseVectorQueries:
If I specify topK=10, I get numFound=20, but only some of them match the
top 10
If I specify topK=8, I get numFound=16, but only some of them match the
top 8
If I specify topK=6, I