Re: Computing multiple different aggregations over a match-set in one pass

2023-02-10 Thread Greg Miller
Hi Stefan- Can you clarify your example a little bit? It sounds like you want to facet over three different match sets (one constrained by "Mark Twain" as the author, one constrained by "American authors" and one constrained by the "sci-fi" genre). Is that correct? Cheers, -Greg On Fri, Feb 10,

Computing multiple different aggregations over a match-set in one pass

2023-02-10 Thread Stefan Vodita
Hi all, Let’s say I have an index of books, similar to the example in the facet demo [1] with a hierarchical facet field encapsulating `Genre / Author’s nationality / Author’s name`. I might like to find the latest publish date of a book written by Mark Twain, the sum of the prices of books writt

Re: Re-ranking using cross-encoder after vector search (bi-encoder)

2023-02-10 Thread Robert Muir
I think it would be good to provide something like a VectorRerankField (sorry for the bad name, maybe FastVectorField would be amusing too), that just stores vectors as docvalues (no HNSW) and has a newRescorer() method that implements org.apache.lucene.search.Rescorer. Then it's easy to do as that

Re-ranking using cross-encoder after vector search (bi-encoder)

2023-02-10 Thread Michael Wechner
Hi, I use Lucene's vector search with embeddings that I get from, for example, SentenceBERT. According to https://www.sbert.net/examples/applications/retrieve_rerank/README.html, re-ranking with a cross-encoder after the vector search (bi-encoding) can improve the ranking. Would it
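
The two-stage idea in the message above (retrieve candidates by vector similarity, then re-score only those candidates with a more expensive pairwise model) can be sketched roughly as follows. This is a minimal illustration in plain Python, not Lucene's or SentenceBERT's API: `retrieve`, `rerank`, and `toy_pair_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real cross-encoder.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as the bi-encoder similarity."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: Sequence[float],
             doc_vecs: Dict[str, Sequence[float]],
             k: int) -> List[str]:
    """Stage 1: rank every document by vector similarity, keep the top k ids.

    A real system would use an approximate index here; this sketch scans
    all documents for simplicity.
    """
    scored = [(doc_id, dot(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]


def rerank(query: str,
           candidates: List[str],
           texts: Dict[str, str],
           pair_score: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Stage 2: re-score only the candidates with a (query, text) scorer."""
    rescored = [(doc_id, pair_score(query, texts[doc_id])) for doc_id in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored


def toy_pair_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a cross-encoder: counts shared words."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


# Made-up example data, for illustration only.
doc_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
texts = {
    "a": "river bank fishing",
    "b": "bank account interest rates",
    "c": "mountain hiking trails",
}

candidates = retrieve([1.0, 0.0], doc_vecs, k=2)
final = rerank("bank interest", candidates, texts, toy_pair_score)
```

In this toy data the vector stage puts "a" ahead of "b", while the pairwise stage reverses them, which is the kind of reordering a cross-encoder is meant to provide. The second stage only ever sees the k candidates, so its cost is bounded by k rather than by the index size.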

Re: Need help for conversion code from Lucene 2.4.0 to 8.11.2

2023-02-10 Thread Uwe Schindler
Hi, the reason for this is that files in Lucene are always write-once. We never ever change a file after it was written and committed in the 2-phase-commit. If you write your own index files, e.g. as part of an index codec, you must adhere to this rule. See Docvalues or Livedocs implementation fo