Hi Julie,

Thank you for working on LUCENE-9322 (I also love the issue). I think it would be great if we could try some preliminary aknn implementations (both clustering-based and graph-based approaches) on LUCENE-9322, to explore a workable unified API and Codec/Format for vectors; for now, I still don't have a clear picture of the abstraction we should have. Sorry for my inactivity on the issues - I wish I had more time and expertise to put into it.
Tomoko

On Thu, Jul 23, 2020 at 3:10, Julie Tibshirani <juliet...@gmail.com> wrote:

> Tomoko -- this fits with my experience as well. I really like the idea of
> treating LUCENE-9322 as setting up a framework for experimentation +
> benchmarking (but not requiring us to commit to a particular ANN
> implementation quite yet).
>
> Julie
>
> On 2020/07/17 12:16:18, Tomoko Uchida <t...@gmail.com> wrote:
> > > would it make sense to create a separate Lucene module for ANN search?
> >
> > From a bit of my experience with LUCENE-9004, it is currently impossible
> > to plug in or opt in to custom codecs and an indexing chain for aknn
> > search without touching the lucene-core module (plz correct that if it's
> > wrong).
> > I think LUCENE-9322 (unified low-level Codec/Format for dense vectors)
> > would open the possibility for us to experiment with different aknn
> > algorithms in some sandbox modules or even in jars separate from Lucene
> > itself.
> >
> > Tomoko
> >
> > On Fri, Jul 17, 2020 at 17:00, Tommaso Teofili <to...@gmail.com> wrote:
> >
> > > would it make sense to create a separate Lucene module for ANN search?
> > > we could then experiment with the different approaches and compare them
> > > across the same benchmarks.
> > >
> > > On Thu, 16 Jul 2020 at 23:14, Ali Akhtar <al...@ali.actor> wrote:
> > >
> > > > I'm a bit of a layman in this area, but if we are talking about
> > > > formats for vectors, I vote for the one used by FastAI word vectors.
> > > > It's pretty easy to work with.
> > > >
> > > > If we are talking about the same / similar things, if not just
> > > > ignore me 😀
> > > >
> > > > On Thu, 16 Jul 2020 at 7:06 PM, Michael Sokolov <ms...@gmail.com>
> > > > wrote:
> > > >
> > > > > We have some prototype implementations in the issues you found. If
> > > > > you want to try out the approaches in those issues, you could build
> > > > > Lucene from source and patch it, but there is no release containing
> > > > > KNN/vector support. We're still working to establish consensus on
> > > > > what the best way forward is. I think the most fruitful thing we
> > > > > can do at the moment is establish a format for storing and
> > > > > accessing vectors that will support different approaches since
> > > > > there is such a rich variety of algorithms and approaches in this
> > > > > area. The last issue you pointed to is focused on the format.
> > > > >
> > > > > On Wed, Jul 15, 2020 at 11:20 AM Alex K <ak...@gmail.com> wrote:
> > > > >
> > > > > > Hi Mikhail,
> > > > > >
> > > > > > I'm not sure about the state of ANN in lucene proper. Very
> > > > > > interested to see the response from others.
> > > > > > I've been doing some work on ANN for an Elasticsearch plugin:
> > > > > > http://elastiknn.klibisz.com/
> > > > > > I think it's possible to extract my custom queries and modeling
> > > > > > code so that it's elasticsearch-agnostic and can be used directly
> > > > > > in Lucene apps.
> > > > > > However I'm much more familiar with Elasticsearch's APIs and
> > > > > > usage/testing patterns than I am with raw Lucene, so I'd likely
> > > > > > need to get some help from the Lucene community.
> > > > > > Please LMK if that sounds interesting to anyone.
> > > > > >
> > > > > > - Alex
> > > > > >
> > > > > > On Wed, Jul 15, 2020 at 11:11 AM Mikhail <wm...@mail.ru.invalid>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I want to incorporate semantic search in my project, which
> > > > > > > uses Lucene. I want to use sentence embeddings and ANN
> > > > > > > (approximate nearest neighbor) search. I found the related
> > > > > > > Lucene issues:
> > > > > > > https://issues.apache.org/jira/browse/LUCENE-9004 ,
> > > > > > > https://issues.apache.org/jira/browse/LUCENE-9136 ,
> > > > > > > https://issues.apache.org/jira/browse/LUCENE-9322 . I see that
> > > > > > > there is some related work and related PRs. What is the
> > > > > > > current state of this functionality?
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > > Mikhail
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-h...@lucene.apache.org