Re: unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Rob Audenaerde
Ah. That makes sense. Thanks! (I might re-run on a larger index just to learn how it works in more detail.)

On Tue, Oct 13, 2020 at 1:24 PM Adrien Grand wrote:
> 100,000+ requests per core per second is a lot. :) My initial reaction is that the query is likely so fast on that index that the bo

Re: unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Adrien Grand
100,000+ requests per core per second is a lot. :) My initial reaction is that the query is likely so fast on that index that the bottleneck might be rewriting or the initialization of weights/scorers (which don't get more costly as the index gets larger) rather than actual query execution, which m

Re: unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Rob Audenaerde
I reduced the benchmark as far as I could and now get these results, with TermsInSet a lot slower than Terms/SHOULD:

BenchmarkOrQuery.benchmarkTerms       thrpt  5  190820.510 ± 16667.411  ops/s
BenchmarkOrQuery.benchmarkTermsInSet  thrpt  5  110548.345 ±  7490.169  ops/s

@Fork
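To put the two throughput scores in perspective, they can be converted into mean time per operation. The snippet below is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the only inputs are the two ops/s scores quoted in this message.

```java
public class ThroughputToLatency {

    // Convert a throughput score in operations per second
    // into the mean time per operation in microseconds.
    static double microsPerOp(double opsPerSecond) {
        return 1_000_000.0 / opsPerSecond;
    }

    public static void main(String[] args) {
        double termsOps = 190820.510;      // benchmarkTerms score from the message
        double termsInSetOps = 110548.345; // benchmarkTermsInSet score from the message

        double termsMicros = microsPerOp(termsOps);
        double termsInSetMicros = microsPerOp(termsInSetOps);

        System.out.printf("Terms/SHOULD : %.2f us/op%n", termsMicros);
        System.out.printf("TermsInSet   : %.2f us/op%n", termsInSetMicros);
        System.out.printf("Difference   : %.2f us/op%n", termsInSetMicros - termsMicros);
    }
}
```

Both variants come out at single-digit microseconds per query, and the gap between them is a few microseconds. At that scale, fixed per-query costs can plausibly dominate the measurement.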

Re: unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Rob Audenaerde
Hello Adrien, Thanks for the swift reply. I'll add the details: Lucene version: 8.6.2. The restrictionQuery is indeed a conjunction; it also allows a document to be a hit if the 'roles' field is empty. It's used within a bigger query builder, so maybe I did something else wrong. I'll rewr

Re: unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Adrien Grand
Can you give us a few more details:
- What version of Lucene are you testing?
- Are you benchmarking "restrictionQuery" on its own, or its conjunction with another query?

You mentioned that you combine your "restrictionQuery" and the user query with Occur.MUST; Occur.FILTER feels more appropriat

Re: Deduplication of search result with custom sort

2020-10-13 Thread Dmitry Emets
I studied the Las Vegas patch and got one simple thought. FirstPassGroupingCollector collects CollectedSearchGroup inside itself. CollectedSearchGroup contains docId and sortValues. This is exactly what I need. Thanks for the help!

On Mon, Oct 12, 2020 at 17:38, Diego Ceccarelli (BLOOMBERG/ LONDON)

unexpected performance TermsQuery Occur.SHOULD vs TermsInSetQuery?

2020-10-13 Thread Rob Audenaerde
Hello, I'm benchmarking an application which implements security on Lucene by adding a multi-valued field "roles". If the user has one of these roles, they can find the document. I implemented this as a Boolean AND query: I added the original query and the restriction, each with Occur.MUST. I'm having some