Would it make sense to create a separate Lucene module for ANN search? We could then experiment with the different approaches and compare them on the same benchmarks.

On Thu, 16 Jul 2020 at 23:14, Ali Akhtar <ali@ali.actor> wrote:

> I’m a bit of a layman in this area, but if we are talking about formats for
> vectors, I vote for the one used by FastAI word vectors. It’s pretty easy
> to work with.
>
> If we are talking about the same / similar things; if not, just ignore me
> 😀
>
> On Thu, 16 Jul 2020 at 7:06 PM, Michael Sokolov <msoko...@gmail.com>
> wrote:
>
> > We have some prototype implementations in the issues you found. If
> > you want to try out the approaches in those issues, you could build
> > Lucene from source and patch it, but there is no release containing
> > KNN/vector support. We're still working to establish consensus on what
> > the best way forward is. I think the most fruitful thing we can do at
> > the moment is establish a format for storing and accessing vectors
> > that will support different approaches, since there is such a rich
> > variety of algorithms and approaches in this area. The last issue you
> > pointed to is focused on the format.
> >
> > On Wed, Jul 15, 2020 at 11:20 AM Alex K <aklib...@gmail.com> wrote:
> > >
> > > Hi Mikhail,
> > >
> > > I'm not sure about the state of ANN in Lucene proper. Very interested
> > > to see the response from others.
> > > I've been doing some work on ANN for an Elasticsearch plugin:
> > > http://elastiknn.klibisz.com/
> > > I think it's possible to extract my custom queries and modeling code so
> > > that it's Elasticsearch-agnostic and can be used directly in Lucene
> > > apps. However, I'm much more familiar with Elasticsearch's APIs and
> > > usage/testing patterns than I am with raw Lucene, so I'd likely need to
> > > get some help from the Lucene community.
> > > Please LMK if that sounds interesting to anyone.
> > >
> > > - Alex
> > >
> > > On Wed, Jul 15, 2020 at 11:11 AM Mikhail <wmas...@mail.ru.invalid>
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I want to incorporate semantic search in my project, which uses
> > > > Lucene. I want to use sentence embeddings and ANN (approximate
> > > > nearest neighbor) search. I found the related Lucene issues:
> > > > https://issues.apache.org/jira/browse/LUCENE-9004 ,
> > > > https://issues.apache.org/jira/browse/LUCENE-9136 ,
> > > > https://issues.apache.org/jira/browse/LUCENE-9322 . I see that there
> > > > is some related work and there are related PRs. What is the current
> > > > state of this functionality?
> > > >
> > > > --
> > > > Thanks,
> > > > Mikhail