Re: Lucene 8 early termination

2020-01-23 Thread Uwe Schindler
Hi, There is no support when calculating facets, because the counts can't be optimized with WAND or block-max. The general recommendation is to execute facets/aggregations in separate Elasticsearch or Solr requests (e.g. using AJAX on your website). The display of search results would be instan

Lucene 8 early termination

2020-01-23 Thread Wei
Hi, I am excited to see that Lucene 8 introduced BlockMax WAND as a major speed improvement (https://issues.apache.org/jira/browse/LUCENE-8135). My question is: how does it integrate with facet requests, when numFound won't be exact? I did some searching but haven't found any documentation on this. Any

RE: early termination with query time sorting (but without index-time SortingMergePolicy)

2019-04-19 Thread Ayse Onalan
Formatting got messed up - fixed to make it more readable. -Original Message- From: Ayse Onalan Sent: Friday, April 19, 2019 1:29 PM To: java-user@lucene.apache.org Subject: early termination with query time sorting (but without index-time SortingMergePolicy) Hi Lucene users, I

early termination with query time sorting (but without index-time SortingMergePolicy)

2019-04-19 Thread Ayse Onalan
ty to sort efficiently using more than one sort spec. I want to understand what it would take to allow early termination with query time sorting but without index-time sorting. Or what issues would prevent us from doing so. At a high level, it appears theoretically possible to me with the followi

Re: prorated early termination

2019-02-05 Thread Michael Sokolov
> > readed or not), we want a way to share that up-front cost without needing to pay it over again for each work unit. I think a similar problem also occurs with some other query types (MultiTerm can produce a bitset I believe?). > > As far as the spec

Re: prorated early termination

2019-02-05 Thread Robert Muir
> p-front cost without needing to pay it over again for each work unit. I think a similar problem also occurs with some other query types (MultiTerm can produce a bitset I believe?). > As far as the specific (prorated early termination) proposal here .. this is something ve

Re: prorated early termination

2019-02-05 Thread Michael Sokolov
pes (MultiTerm can produce a bitset I believe?). As far as the specific (prorated early termination) proposal here .. this is something very specific and localized within TopFieldCollector that doesn't require any public-facing API change or refactoring at all. It just terminates a little e

Re: prorated early termination

2019-02-04 Thread Robert Muir
Regarding adding a threshold to TopFieldCollector, do you have ideas on what it would take to fix the relevant collector/IndexSearcher APIs to make this kind of thing easier? (I know this is a doozy, but we should at least try to think about it, maybe make some progress) I can see where things be

Re: prorated early termination

2019-02-03 Thread Michael McCandless
On Sun, Feb 3, 2019 at 10:41 AM Michael Sokolov wrote: > > In single-threaded mode we can check against minCompetitiveScore and > terminate collection for each segment appropriately, > > > Does Lucene do this today by default? That should be a nice > optimization, > and it'd be safe/correct. >

Re: prorated early termination

2019-02-03 Thread Michael Sokolov
> > On Fri, Feb 1, 2019 at 11:28 AM Adrien Grand wrote: > > > Something makes me curious: queries that can leverage sorted indices should be _very_ fast, for instance in your case they only need to look at 500 documents per segment at most

Re: prorated early termination

2019-02-03 Thread Michael McCandless
> > or instance in your case they only need to look at 500 documents per segment at most (less in practice since we stop collecting as soon as a non-competitive hit is found), so why do you need to parallelize query execution? > > On Fri, Feb 1, 2019 at 3:18 PM

Re: prorated early termination

2019-02-03 Thread Michael McCandless
ss http://blog.mikemccandless.com On Fri, Feb 1, 2019 at 9:18 AM Michael Sokolov wrote: > I want to propose an optimization to early termination that gives nice speedups for large result sets when searching with multiple threads at the cost of a small (controllable) probability of collecting documents o

Re: prorated early termination

2019-02-01 Thread Michael Sokolov
el Sokolov wrote: > > I want to propose an optimization to early termination that gives nice speedups for large result sets when searching with multiple threads at the cost of a small (controllable) probability of collecting documents out of order: i

Re: prorated early termination

2019-02-01 Thread Adrien Grand
query execution? On Fri, Feb 1, 2019 at 3:18 PM Michael Sokolov wrote: > I want to propose an optimization to early termination that gives nice speedups for large result sets when searching with multiple threads at the cost of a small (controllable) probability of collecting d

prorated early termination

2019-02-01 Thread Michael Sokolov
I want to propose an optimization to early termination that gives nice speedups for large result sets when searching with multiple threads at the cost of a small (controllable) probability of collecting documents out of order: in benchmarks I see +60-70% QPS for tasks like HighTermDayOfYearSort

Re: Early Termination of Queries

2017-04-18 Thread Michael McCandless
Each segment in Lucene is its own little index, and you can get the SegmentReader for it (use IndexReader.leaves() API from the full reader you opened), pass that to IndexSearcher, and search it. But be careful: the "last" segment is an unpredictable thing, because the default merge policy merges

Early Termination of Queries

2017-04-18 Thread aravinth thangasami
Hi all, *EarlyTerminatingSortingCollector* in Lucene takes N documents from each segment. I have a case where the results from the latest segment alone will be enough. On finding N results in the latest segment I will stop searching. What is your opinion on this? wi

Re: Early Termination

2011-03-16 Thread mark harwood
See https://issues.apache.org/jira/browse/LUCENE-1720 - Original Message From: Alex vB To: java-user@lucene.apache.org Sent: Wed, 16 March, 2011 0:12:41 Subject: Early Termination Hi, is Lucene capable of any early termination techniques during query processing? On the forum I only

Early Termination

2011-03-15 Thread Alex vB
Hi, is Lucene capable of any early termination techniques during query processing? On the forum I only found some information about TimeLimitedCollector. Are there more implementations? Regards Alex