Re: ANN search current state

2020-07-23 Thread Tomoko Uchida
Hi Julie, thank you for working on LUCENE-9322 (I also love the issue). I think it would be great if we could try some preliminary aknn implementations (both clustering-based and graph-based approaches) on LUCENE-9322, to explore a working unified API and Codec/Format for vectors; for now, I still hav

Re: ANN search current state

2020-07-22 Thread Julie Tibshirani
Tomoko -- this fits with my experience as well. I really like the idea of treating LUCENE-9322 as setting up a framework for experimentation + benchmarking (but not requiring us to commit to a particular ANN implementation quite yet). Julie On 2020/07/17 12:16:18, Tomoko Uchida wrote: > > would

Re: ANN search current state

2020-07-17 Thread Tomoko Uchida
> would it make sense to create a separate Lucene module for ANN search? From a bit of my experience with LUCENE-9004, it is currently impossible to plug in or opt in to custom codecs and an indexing chain for aknn search without touching the lucene-core module (please correct me if that's wrong). I think LU

Re: ANN search current state

2020-07-17 Thread Tommaso Teofili
Would it make sense to create a separate Lucene module for ANN search? We could then experiment with the different approaches and compare them across the same benchmarks. On Thu, 16 Jul 2020 at 23:14, Ali Akhtar wrote: > I’m a bit of a layman in this area, but if we are talking about formats fo

Re: ANN search current state

2020-07-16 Thread Ali Akhtar
I’m a bit of a layman in this area, but if we are talking about formats for vectors, I vote for the one used by FastAI word vectors. It’s pretty easy to work with. That is, if we are talking about the same or similar things; if not, just ignore me 😀 On Thu, 16 Jul 2020 at 7:06 PM, Michael Sokolov wrote:

Re: ANN search current state

2020-07-16 Thread Michael Sokolov
We have some prototype implementations in the issues you found. If you want to try out the approaches in those issues, you could build Lucene from source and patch it, but there is no release containing KNN/vector support. We're still working to establish consensus on what the best way forward is.

Re: ANN search current state

2020-07-15 Thread Alex K
Hi Mikhail, I'm not sure about the state of ANN in Lucene proper. I'm very interested to see the response from others. I've been doing some work on ANN for an Elasticsearch plugin: http://elastiknn.klibisz.com/ I think it's possible to extract my custom queries and modeling code so that it's elasticse

ANN search current state

2020-07-15 Thread Mikhail
Hi, I want to incorporate semantic search in my project, which uses Lucene. I want to use sentence embeddings and ANN (approximate nearest neighbor) search. I found the related Lucene issues: https://issues.apache.org/jira/browse/LUCENE-9004 , https://issues.apache.org/jira/brow