Re: Lucene/Solr and BERT

2021-05-23 Thread Michael Wechner
Hi Michael, thank you for your explanations! I am currently trying to implement it, learning from the code at https://github.com/jtibshirani/lucene/blob/hnsw-bench/lucene/core/src/java/org/apache/lucene/search/PythonEntryPoint.java, though Julie told me that the code is a bit ou

Re: Lucene/Solr and BERT

2021-05-23 Thread Russell Jurney
For practical search using BERT on any reasonably sized dataset, they're going to need ANN, which Lucene recently added. This won't work in practice if the query and document are of a different size, which is where sentence transformers see a lot of use for documents up to 500 words. https://issue

Re: Lucene/Solr and BERT

2021-05-23 Thread Michael Sokolov
Hi Michael, that is fully functional in the sense that Lucene will build an HNSW graph for a vector-valued field and you can then use the VectorReader.search method to do KNN-based search. Next steps may include some integration with lexical, inverted-index type search so that you can retrieve N-cl
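
For readers new to the thread, here is a minimal sketch of what a KNN search over vectors computes. It is an exact, brute-force baseline in plain Java, not Lucene's API and not HNSW: an HNSW graph approximates this ranking without scoring every document. The class name `BruteForceKnn`, the method names, and the choice of cosine similarity are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative only: exact k-nearest-neighbour search by cosine similarity.
// This is NOT Lucene code; all names here are hypothetical.
public class BruteForceKnn {

    /** Cosine similarity between two vectors of equal dimension. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // treat a zero vector as dissimilar to everything
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k documents most similar to the query, best first. */
    static int[] search(float[][] docs, float[] query, int k) {
        Integer[] ids = new Integer[docs.length];
        double[] scores = new double[docs.length];
        for (int i = 0; i < docs.length; i++) {
            ids[i] = i;
            scores[i] = cosine(docs[i], query);
        }
        // Sort document ids by descending score (every document is scored: O(n) per query).
        Arrays.sort(ids, Comparator.comparingDouble((Integer i) -> -scores[i]));
        int n = Math.max(0, Math.min(k, ids.length));
        int[] top = new int[n];
        for (int i = 0; i < n; i++) {
            top[i] = ids[i];
        }
        return top;
    }

    public static void main(String[] args) {
        float[][] docs = { {1f, 0f}, {0f, 1f}, {0.9f, 0.1f} };
        float[] query = {1f, 0f};
        System.out.println(Arrays.toString(search(docs, query, 2))); // prints [0, 2]
    }
}
```

Note that the query and every document vector must have the same dimension, which is why a fixed-size embedding is needed on both sides. Scoring every document is linear in the collection size, which is the cost an ANN index is meant to avoid on larger datasets.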