Cool! I missed that!
I'll make sure to align with my digital marketing manager to make her add
all our Solr-related external posts!
Good to see this live!
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*
e-mail: a.benede...@sease.io

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd>

On Wed, 29 May 2024 at 20:13, David Smiley <dsmi...@apache.org> wrote:

> There *is* a Solr blog site that just launched:
> https://solr.apache.org/blog.html
>
> On Thu, Mar 28, 2024 at 3:49 PM rajani m <rajinima...@gmail.com> wrote:
> >
> > @Alessandro,
> > Is there a solr blog site where we can submit work/articles or are you
> > suggesting to post on my own site and share a link here? I prefer the
> > former if there is one because there were times when I had my own,
> > it hardly had any views and on top of that google blogging made me migrate
> > from blogs to sites and sites got deprecated. Is there or can we have a
> > solr specific wiki/blog site where solr users can submit common features
> > configs/modules configs/examples/performance metrics and so on....and maybe
> > have a voting/likes to confirm it works. We will have one common place to
> > submit and look for.
> >
> > On Thu, Mar 28, 2024 at 3:33 PM rajani m <rajinima...@gmail.com> wrote:
> >
> > > Run the same knn queries at a slow throughput for 30-60 minutes, this
> > > should warm up disk caches with hnsw index files, and then you should see a
> > > significant drop in the query time. Also make use of "fq" and reduce the
> > > document space as much as you can.
> > >
> > > On Thu, Mar 28, 2024 at 12:50 PM Iram Tariq
> > > <iram.ta...@northbaysolutions.net.invalid> wrote:
> > >
> > >> Hi Alessandro,
> > >>
> > >> Thank you for the feedback.
> > >> Kindly see my comments below,
> > >>
> > >> *Ale*:
> > >> https://www.elastic.co/blog/accelerating-vector-search-simd-instructions,
> > >> I suggest to experiment with simD vector improvements (unless you are
> > >> already doing it)
> > >>
> > >> * We will try this soon. *
> > >>
> > >> *Ale*: What about the machine memory?
> > >>
> > >> Following is the system specification: Linux ( CPU:64, RAM:488 GB,
> > >> OS:Ubuntu 20.04.6 )
> > >>
> > >> *Ale*: you can fine-tune the hyper-parameter to compromise a bit on recall
> > >> in favour of performance (hnswBeamWidth, hnswMaxConnections)
> > >>
> > >> I am trying this as a first step. But I am sure it will impact recall.
> > >>
> > >> Regards,
> > >>
> > >> Iram Tariq | Software Architect
> > >> NorthBay
> > >> Direct: +1 (902) 329-7329
> > >> iram.ta...@northbaysolutions.net
> > >> www.northbaysolutions.com
> > >>
> > >> On Thu, Mar 28, 2024 at 5:42 AM Alessandro Benedetti <a.benede...@sease.io> wrote:
> > >>
> > >> > That's interesting.
> > >> > I think it's vital to get back some performance tests from the community.
> > >> > Since my contribution to support Vector-search in Apache Solr was merged,
> > >> > we got little or null feedback to understand its performance, in real-world
> > >> > use cases.
> > >> > Blogs, open benchmarks or even just this sort of mail message are welcome.
> > >> > Let me reply in line:
> > >> > --------------------------
> > >> > *Alessandro Benedetti*
> > >> > Director @ Sease Ltd.
> > >> > *Apache Lucene/Solr Committer*
> > >> > *Apache Solr PMC Member*
> > >> >
> > >> > e-mail: a.benede...@sease.io
> > >> >
> > >> > *Sease* - Information Retrieval Applied
> > >> > Consulting | Training | Open Source
> > >> >
> > >> > Website: Sease.io <http://sease.io/>
> > >> > LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd>
> > >> >
> > >> > On Wed, 27 Mar 2024 at 21:06, Kent Fitch <kent.fi...@gmail.com> wrote:
> > >> >
> > >> > > Hi Iram,
> > >> > >
> > >> > > Is the machine doing lots of IO? If the hnsw graphs are not entirely in
> > >> > > memory, performance will be poor. What JVM? You may get some benefit from
> > >> > > simd support in java 21. Can you use the latest quantisation changes in
> > >> > > Lucene to reduce memory footprint of the hnsw graphs? That's a large topk,
> > >> > > but I guess you need it?
> > >> > >
> > >> > > Best regards
> > >> > >
> > >> > > Kent Fitch
> > >> > >
> > >> > > On Thu, 28 Mar 2024, 5:12 am Iram Tariq,
> > >> > > <iram.ta...@northbaysolutions.net.invalid> wrote:
> > >> > >
> > >> > > > Hi All,
> > >> > > >
> > >> > > > I am using Dense vectors in SOLR and facing slowness in it. Each search is
> > >> > > > taking 10-25 seconds. I want to reduce the time to 5 seconds (or less
> > >> > > > ideally).
> > >> > > >
> > >> > > > Following configurations are being used.
> > >> > > >
> > >> > > > 1. *SOLR Version:* 9.3.0
> > >> > > > 2. *Lucene Version:* 9.7.0
> > >> >
> > >> > *Ale*:
> > >> > https://www.elastic.co/blog/accelerating-vector-search-simd-instructions,
> > >> > I suggest to experiment with simD vector improvements (unless you are
> > >> > already doing it)
> > >> >
> > >> > > > 3. *Vector Dimensions*: 384
> > >> > > > 4. *Total Shards:* 5
> > >> > > > 5. *Number of Vectors (Per shard*): 43209158
> > >> > > > 6. *JVM for each Instance:* 35GB
> > >> >
> > >> > *Ale*: What about the machine memory?
> > >> >
> > >> > > > 7. *TopK: *1000 (Getting 1000 from each shard)
> > >> > > > 8. *Rows: *1000
> > >> > > > 9. *Vector Field Schema: *<fieldType name="knn_vector_384"
> > >> > > > class="solr.DenseVectorField" hnswMaxConnections="20"
> > >> > > > knnAlgorithm="hnsw"
> > >> > > > vectorDimension="384" similarityFunction="cosine"
> > >> > > > hnswBeamWidth="40"/>
> > >> >
> > >> > *Ale*: you can fine-tune the hyper-parameter to compromise a bit on recall
> > >> > in favour of performance (hnswBeamWidth, hnswMaxConnections)
> > >> >
> > >> > > > 10. *Stored*: False
> > >> > > > 11. *WebServer:* Apache Tomcat
> > >> > > > 12. *System Specs*: Linux ( CPU:64, RAM:488 GB, OS:Ubuntu 20.04.6 )
> > >> > > >
> > >> > > > Any sort of help/clue will be appreciated.
> > >> > > >
> > >> > > > Regards,
> > >> > > >
> > >> > > > Iram Tariq | Software Architect
> > >> > > > NorthBay
> > >> > > > Direct: +1 (902) 329-7329
> > >> > > > iram.ta...@northbaysolutions.net
> > >> > > > www.northbaysolutions.com
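
Kent's question about whether the hnsw data fits in memory, and Alessandro's about machine memory, can be checked with a rough back-of-envelope calculation from the figures Iram posted (43209158 vectors per shard, 384 dimensions, 5 shards). The sketch below is only an estimate under stated assumptions: it assumes 4-byte float values, assumes an int8-quantised copy would need 1 byte per dimension, and ignores the HNSW graph links and all other index files. The function and constant names are made up for illustration and are not part of Solr or Lucene.

```python
"""Rough memory estimate for the raw vector data described in the thread."""

GIB = 1024 ** 3


def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold every vector value (HNSW graph links not included)."""
    return num_vectors * dims * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


# Figures quoted in the thread.
VECTORS_PER_SHARD = 43_209_158
DIMENSIONS = 384
SHARDS = 5

# Assumption: vectors are stored as float32 (4 bytes per value).
float32_per_shard = raw_vector_bytes(VECTORS_PER_SHARD, DIMENSIONS, 4)
# Assumption: an int8-quantised copy would use 1 byte per dimension.
int8_per_shard = raw_vector_bytes(VECTORS_PER_SHARD, DIMENSIONS, 1)

if __name__ == "__main__":
    for label, per_shard in (("float32", float32_per_shard), ("int8", int8_per_shard)):
        print(
            f"{label}: {to_gib(per_shard):.1f} GiB per shard, "
            f"{to_gib(per_shard * SHARDS):.1f} GiB across {SHARDS} shards"
        )
```

The thread does not say whether all five shards run on the single 488 GB machine. If they do, the float32 vectors together with five 35 GB JVM heaps would add up to roughly the whole of that RAM, leaving little room for the OS page cache, which would fit the suggestions above to check IO, warm the disk caches, and look at quantisation.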