RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-03-08 Thread saisantoshi
Could someone please comment on the above? Thanks, Sai -- View this message in context: http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4045855.html Sent from the Lucene - Java Users mailing list archive at

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-03-06 Thread saisantoshi
below as well? IndexReader indexReader = DirectoryReader.open(directory); // Current Should it be changed to: AtomicReader indexReader = DirectoryReader.open(directory); Thanks, Sai

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-03-01 Thread Michael Sokolov
On 03/01/2013 07:56 AM, Uwe Schindler wrote: The slowdown happens not on making the doc ids absolute (it is just an addition), the slowdown appears when you retrieve the stored fields on the top-level reader (because the composite top-level reader has to do a binary search in the reader tree t
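
The cost asymmetry described in this message can be illustrated with a minimal, self-contained sketch. It is not Lucene code: the class `SegmentLookupSketch` and its methods are hypothetical stand-ins that model only the doc-ID arithmetic, assuming every segment holds at least one document.

```java
import java.util.Arrays;

// Hypothetical stand-in for a composite reader; models doc-ID arithmetic only.
final class SegmentLookupSketch {
    // docBases[i] is the global ID of the first document in segment i (ascending).
    private final int[] docBases;

    // Assumes every segment size is >= 1, so all bases are distinct.
    SegmentLookupSketch(int[] segmentSizes) {
        docBases = new int[segmentSizes.length];
        int base = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            docBases[i] = base;
            base += segmentSizes[i];
        }
    }

    // Cheap direction: segment-relative ID to global ID is a single addition.
    int toGlobal(int segment, int localDoc) {
        return docBases[segment] + localDoc;
    }

    // Costly direction: finding the owning segment needs a binary search.
    int segmentOf(int globalDoc) {
        int pos = Arrays.binarySearch(docBases, globalDoc);
        // On a miss, binarySearch returns -(insertionPoint) - 1;
        // the owner is the segment just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    // Global ID back to a segment-relative ID.
    int toLocal(int globalDoc) {
        return globalDoc - docBases[segmentOf(globalDoc)];
    }
}
```

With segment sizes 5, 3 and 4, the bases are 0, 5 and 8, so global document 7 belongs to segment 1 as local document 2.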

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-03-01 Thread Uwe Schindler
> On 2/28/2013 5:05 PM, Uwe Schindler wrote: > > ... Collector instead of HitCollector (like your ancient Lucene from 2.4), > >

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-03-01 Thread Michael Sokolov
On 2/28/2013 5:05 PM, Uwe Schindler wrote: ... Collector instead of HitCollector (like your ancient Lucene from 2.4), you have to respect the new semantics that are *different* to old HitCollector. Collector works with low-level atomic readers (also in Lucene 3.x), the calls to the "collect(in
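
To make the per-segment semantics concrete, here is a small sketch of the calling pattern. It deliberately avoids the Lucene classes: `RebasingCollectorSketch`, `setNextSegment` and `globalDocs` are hypothetical names. In Lucene 4.x the corresponding hook is `Collector.setNextReader(AtomicReaderContext)`, and the offset is the context's `docBase` field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the Lucene API; only the calling pattern is modelled.
final class RebasingCollectorSketch {
    // Start of the current segment in the global doc-ID space.
    private int docBase;
    private final List<Integer> globalDocs = new ArrayList<>();

    // Called once per segment, before any collect() call for that segment.
    void setNextSegment(int segmentDocBase) {
        this.docBase = segmentDocBase;
    }

    // Receives a segment-relative ID and rebases it to a global ID.
    void collect(int segmentDoc) {
        globalDocs.add(docBase + segmentDoc);
    }

    List<Integer> globalDocs() {
        return globalDocs;
    }
}
```

The same segment-relative ID (say 2) maps to different global IDs depending on which segment is current. Passing the raw ID to a top-level reader therefore fetches the wrong document for every segment after the first.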

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-28 Thread Uwe Schindler
> Thanks a lot. Really a

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-28 Thread saisantoshi
This seems to be a bug in the IndexReader in 4.0 // indexReader.document(doc) is giving incorrect result in 4.0 // atomicReader.document(doc) is giving the correct result.

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-28 Thread Uwe Schindler

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-28 Thread saisantoshi

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-27 Thread saisantoshi

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-27 Thread Uwe Schindler
> Thanks. Is there any issu

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-27 Thread saisantoshi
LOGIC HERE * How do I get an AtomicReader context here? * delegate.collect(doc); } Thanks and appreciate your help here.

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-27 Thread Uwe Schindler

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-02-27 Thread saisantoshi
indexReader is fetching an incorrect document. Do you think that there are any concurrency issues here? Thanks, Sai.

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-25 Thread saisantoshi
I am not looking for negative scores and want to skip them. Thanks, Sai

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-25 Thread Simon Willnauer
do you get neg. scores? > > Thanks, > Sai

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-25 Thread saisantoshi
Thanks a lot. Can we wrap TopScoreDocCollector in PositiveScoresOnlyCollector? I need only positive scores, and I don't think TopScoreDocCollector can handle that by itself, right? Thanks, Sai
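
The wrapping asked about here is a plain decorator: the outer collector drops non-positive scores and forwards everything else to the inner one. The sketch below shows that idea with hypothetical stand-in types (`ScoredHitSink`, `PositiveOnlySink`, `RecordingSink`). It is not the Lucene `PositiveScoresOnlyCollector` source, only the pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a collector that is handed a doc ID and its score.
interface ScoredHitSink {
    void collect(int doc, float score);
}

// Decorator: forwards a hit to the delegate only when its score is positive.
final class PositiveOnlySink implements ScoredHitSink {
    private final ScoredHitSink delegate;

    PositiveOnlySink(ScoredHitSink delegate) {
        this.delegate = delegate;
    }

    @Override
    public void collect(int doc, float score) {
        if (score > 0.0f) {
            delegate.collect(doc, score);
        }
    }
}

// Simple inner sink that records the doc IDs it was given, in arrival order.
final class RecordingSink implements ScoredHitSink {
    final List<Integer> docs = new ArrayList<>();

    @Override
    public void collect(int doc, float score) {
        docs.add(doc);
    }
}
```

Because the filter sits in front of the inner sink, the inner sink needs no knowledge of the score rule and can be swapped freely.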

Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-25 Thread Simon Willnauer

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-24 Thread saisantoshi
Can someone please help us here to validate the above? Thanks, Sai.

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-23 Thread saisantoshi
PositiveScoresOnlyCollector(topScore)); searcher.search(query, (Filter) null, collector); } finally { }

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-23 Thread saisantoshi
modify our existing collector. Thanks in advance and really appreciate your help here... Any example code is also fine...

RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-23 Thread Uwe Schindler
> Our current search implementation

TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)

2013-01-23 Thread saisantoshi
Our current search implementation (based on 2.4.0) uses a collector extending the TopDocCollector class public class MyHitCollector extends TopDocsCollector { private IndexReader indexReader; private CustomFilter customFilter; public MyHitCollector (IndexReader indexReader, int

Re: TopDocCollector limits

2009-09-30 Thread Mark Miller
the deprecated Hits class? > > On Tue, Sep 29, 2009 at 7:40 PM, Mark Miller wrote: > >> Max Lynch wrote: >> >>> Hi, >>> I am developing a search system that doesn't do pagination (searches are run

Re: TopDocCollector limits

2009-09-30 Thread Max Lynch
> > Hi, > > I am developing a search system that doesn't do pagination (searches are run in the background and machine analyzed). However, TopDocCollector makes me put a limit on how many results I want back. For my system, each result found is importa

Re: TopDocCollector limits

2009-09-29 Thread Mark Miller
Max Lynch wrote: > Hi, > I am developing a search system that doesn't do pagination (searches are run in the background and machine analyzed). However, TopDocCollector makes me put a limit on how many results I want back. For my system, each result found is important

TopDocCollector limits

2009-09-29 Thread Max Lynch
Hi, I am developing a search system that doesn't do pagination (searches are run in the background and machine analyzed). However, TopDocCollector makes me put a limit on how many results I want back. For my system, each result found is important. How can I make it collect every result

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-11 Thread Paul J. Lucas
On Jun 11, 2009, at 1:49 AM, Ian Lea wrote: This thread seems to be veering well away from your original straightforward question on how to convert your straightforward code. So what? It's about Lucene and hence on-topic. Why do you care? If you want or need these advanced solutions, fine,

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-11 Thread Ian Lea
This thread seems to be veering well away from your original straightforward question on how to convert your straightforward code. If you want or need these advanced solutions, fine, but if your existing code was fast enough the modified versions suggested earlier are probably fast enough too. -- I

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Paul J. Lucas
On Jun 10, 2009, at 5:02 PM, Yonik Seeley wrote: On Wed, Jun 10, 2009 at 7:58 PM, Daniel Noll wrote: It's a shame we don't have an inverted kind of HitCollector where we can say "give me the next hit", so that we can get the best of both worlds (like what StAX gives us in the XML world.) You

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Yonik Seeley
On Wed, Jun 10, 2009 at 7:58 PM, Daniel Noll wrote: > It's a shame we don't have an inverted kind of HitCollector where we can say "give me the next hit", so that we can get the best of both worlds (like what StAX gives us in the XML world.) You can get a scorer and call next() yourself. -Yo

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Daniel Noll
On Wed, Jun 10, 2009 at 20:17, Uwe Schindler wrote: > You are right, you can, but if you just want to retrieve all hits, this is ineffective. A HitCollector is the correct way to do this (especially because the order of hits is mostly not interesting when retrieving all hits). Hits and TopDoc

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Paul J. Lucas
On Jun 10, 2009, at 10:49 AM, Uwe Schindler wrote: To optimize, store the filename not as stored field, but as a non-tokenized, indexed term. How do you do that? - Paul

RE: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Uwe Schindler
> On Jun 10, 2009, at 3:17 AM, Uwe Schindler wrote: >

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Paul J. Lucas
On Jun 10, 2009, at 3:17 AM, Uwe Schindler wrote: A HitCollector is the correct way to do this (especially because the order of hits is mostly not interesting when retrieving all hits). OK, here's what I came up with: Term t = /* ... */ Collection files = new LinkedList(); FieldS

RE: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Uwe Schindler
ll be >>10 times as fast!

RE: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Wouter Heijke
>> Will this do? >> >> IndexReader

RE: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Uwe Schindler
> Will this do? > > IndexReader indexReader = searcher.getIndexReader(); > TopDocs topDocs = searcher.search(Query query, int n); > for (int i =

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Wouter Heijke
// "FILE" is the field that recorded the original file indexed > final File f = new File( hit.get( "FILE" ) ); > // ... > } > > It's not clear to me how to rewrite the code using TopDocs/ > TopDocColle

Re: Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-10 Thread Ian Lea
Hi The code below might do the job. Based on the example at http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Hits.html Completely uncompiled and untested of course. TopDocCollector collector = new TopDocCollector(hitsPerPage); final Term t = /* ... */; Query query = new

Migrating from Hit/Hits to TopDocs/TopDocCollector

2009-06-09 Thread Paul J. Lucas
e original file indexed final File f = new File( hit.get( "FILE" ) ); // ... } It's not clear to me how to rewrite the code using TopDocs/ TopDocCollector and how to iterate over the results. A litt

Re: TopDocCollector

2009-02-28 Thread Yonik Seeley
On Sat, Feb 28, 2009 at 7:51 AM, wrote: >> Solr has always allowed all scores through w/o screening out <=0 > > Why? Partially historical... due to some limitations in Lucene back when Solr was first written (like undesired score normalization), Solr interfaces with Lucene search at the hit coll

RE: TopDocCollector

2009-02-28 Thread spring
> > * How can a hit have a score of <=0? > > A function query, or a negative boost would do it. Ah ok. > Solr has always allowed all scores through w/o screening out <=0 Why?

RE: TopDocCollector

2009-02-28 Thread spring
> That works fine, because hq.size() is still less than numHits. So > no matter what, the first numHits hits will be added to the queue. > > > public void collect(int doc, float score) { > > if (score > 0.0f) { > > if (hq.size() < numHits || score >= minScore) { Oh damned... it'

Re: TopDocCollector

2009-02-27 Thread Yonik Seeley
On Fri, Feb 27, 2009 at 6:43 AM, wrote: > Looking into TopDocCollector code, I have some questions: > > * How can a hit have a score of <=0? A function query, or a negative boost would do it. Solr has always allowed all scores through w/o screening out <=

Re: TopDocCollector

2009-02-27 Thread Michael McCandless
wrote: Looking into TopDocCollector code, I have some questions: * How can a hit have a score of <=0? I'm not sure... * What happens if the first hit has the highest score of all hits? It seems that topDocs would then contain only this doc!? That works fine, because hq.s
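
The answer given here (the queue accepts every hit until it is full) can be checked with a minimal bounded min-heap. This sketch is modelled loosely on the `collect` fragment quoted in the thread and is not the Lucene source. `TopNSketch` and `Hit` are hypothetical names, and it uses a strict `>` where the quoted code uses `>= minScore`.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical top-N collector built on java.util.PriorityQueue.
final class TopNSketch {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int numHits;
    // Min-heap: the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> queue =
            new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
    private int totalHits;

    TopNSketch(int numHits) {
        if (numHits < 1) {
            throw new IllegalArgumentException("numHits must be >= 1");
        }
        this.numHits = numHits;
    }

    void collect(int doc, float score) {
        if (score <= 0.0f) {
            return; // non-positive scores are skipped, as in the quoted fragment
        }
        totalHits++;
        if (queue.size() < numHits) {
            // Queue not full yet: accept unconditionally, whatever came before.
            queue.add(new Hit(doc, score));
        } else if (score > queue.peek().score) {
            // Queue full: replace the weakest retained hit.
            queue.poll();
            queue.add(new Hit(doc, score));
        }
    }

    int totalHits() {
        return totalHits;
    }

    int retained() {
        return queue.size();
    }

    float minRetainedScore() {
        return queue.peek().score;
    }
}
```

Feeding the highest score first still leaves room for the next `numHits - 1` hits, because the score comparison is only reached once the queue is full.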

TopDocCollector

2009-02-27 Thread spring
Looking into TopDocCollector code, I have some questions: * How can a hit have a score of <=0? * What happens if the first hit has the highest score of all hits? It seems that topDocs would then contain only this doc!? public void collect(int doc, float score) { if (score > 0.0f

Re: TopDocCollector vs Hits: TopDocCollector slowing....

2009-02-18 Thread AlexElba
How many results were you getting? > > -Grant > > On Feb 3, 2009, at 8:37 PM, AlexElba wrote: > >> Hello, >> >> I was using lucene 2.3.2 with hits and switch to lucene 2.4.0 and now I am using TopDocCollector.

Re: TopDocCollector vs Hits inquiry

2009-02-05 Thread Jay Malaluan
Hi, Thanks for pointing me to the API. I found the explanation I'm looking for at: http://lucene.apache.org/java/2_4_0/api/core/index.html?org/apache/lucene/search/Hits.html There's an example on how to use the TopDocCollector instead of Hits. Regards, Jay Joel Malaluan Grant I

Re: TopDocCollector vs Hits inquiry

2009-02-05 Thread Grant Ingersoll
http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/Searcher.html#search(org.apache.lucene.search.Query,%20org.apache.lucene.search.HitCollector) The TopDocCollector is a HitCollector. On Feb 4, 2009, at 10:34 PM, Jay Malaluan wrote: Hi, As I was reading the post "

Re: TopDocCollector vs Hits inquiry

2009-02-04 Thread Jay Malaluan
Hi, As I was reading the post "Re: TopDocCollector vs Hits: TopDocCollector slowing", I just got curious on how he explained his change from Hits to TopDocCollector. I'm assuming that the Hits is returned from a call of: Searcher searcher = new Searcher(); searcher.search(x

Re: TopDocCollector vs Hits: TopDocCollector slowing....

2009-02-04 Thread Grant Ingersoll
2009, at 8:37 PM, AlexElba wrote: Hello, I was using Lucene 2.3.2 with Hits and switched to Lucene 2.4.0, and now I am using TopDocCollector. I have two queries running against the same index. One query returns 80 bytes of information; the other returns 2000 bytes. With ol

TopDocCollector vs Hits: TopDocCollector slowing....

2009-02-03 Thread AlexElba
Hello, I was using Lucene 2.3.2 with Hits and switched to Lucene 2.4.0, and now I am using TopDocCollector. I have two queries running against the same index. One query returns 80 bytes of information; the other returns 2000 bytes. With old Hits the query which was returning smaller

Re: TopDocCollector & Paging

2008-09-17 Thread Chris Hostetter
: I know in applications where we search for a words or phrases and expect : the result sorted by relevance, TopDocCollector would work like a dream. : But what about scenario where the result needs to be sorted : chronologically or by some kind of metadata. These two methods are available

Re: TopDocCollector & Paging

2008-09-17 Thread Grant Ingersoll
sorted by relevance, TopDocCollector would work like a dream. But what about scenario where the result needs to be sorted chronologically or by some kind of metadata. A very common application would be email applications. If someone is to search on the Inbox, the result will be expected to appear

Re: TopDocCollector & Paging

2008-09-17 Thread Dino Korah
Thanks Grant.. Please see my comments/response below. 2008/9/17 Grant Ingersoll <[EMAIL PROTECTED]> > > On Sep 17, 2008, at 4:39 PM, Dino Korah wrote: > >> I know in applications where we search for a words or phrases and expect the result sorted by relevan

Re: TopDocCollector & Paging

2008-09-17 Thread Grant Ingersoll
On Sep 17, 2008, at 4:39 PM, Dino Korah wrote: I know in applications where we search for a words or phrases and expect the result sorted by relevance, TopDocCollector would work like a dream. But what about scenario where the result needs to be sorted chronologically or by some kind of

Re: TopDocCollector & Paging

2008-09-17 Thread Dino Korah
I know that in applications where we search for words or phrases and expect the results sorted by relevance, TopDocCollector works like a dream. But what about scenarios where the results need to be sorted chronologically or by some kind of metadata? A very common application would be email

Re: TopDocCollector & Paging

2008-09-17 Thread Grant Ingersoll
On Sep 17, 2008, at 11:51 AM, Cam Bazz wrote: And how about queries that need starting position, like hits between 100 and 200? could we pass something to the collector that will count between 0 to 100 and then get the next 100 records? The collector uses a Priority Queue to store doc ids a

Re: TopDocCollector & Paging

2008-09-17 Thread Cam Bazz
> Doesn't TopDocCollector have a getTotalHits method? > > Remember that in order to get the top N documents, all documents must be examined. I believe that the numHits parameter passed to the constructor just limits the number of hits stored in (and thus the size) of the Top

Re: TopDocCollector & Paging

2008-09-17 Thread Erick Erickson
Doesn't TopDocCollector have a getTotalHits method? Remember that in order to get the top N documents, all documents must be examined. I believe that the numHits parameter passed to the constructor just limits the number of hits stored in (and thus the size) of the TopDocs object
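
The point made here is that a single pass can yield both the total hit count and one page of results: keep the best `start + pageSize` hits while counting every match, then slice off the page. Below is a self-contained sketch of that idea with hypothetical names (`PagingSketch`, `Page`, `search`). It treats an array of scores as the full set of matching documents and does not use the Lucene API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical single-pass paging: count all hits, retain only the best start + pageSize.
final class PagingSketch {
    static final class Page {
        final List<Integer> docs; // doc IDs on the requested page, best first
        final int totalHits;      // number of matching documents examined

        Page(List<Integer> docs, int totalHits) {
            this.docs = docs;
            this.totalHits = totalHits;
        }
    }

    // scores[doc] is the score of matching document `doc`.
    static Page search(float[] scores, int start, int pageSize) {
        int keep = start + pageSize;
        // Min-heap of doc IDs ordered by score, weakest at the head.
        PriorityQueue<Integer> queue =
                new PriorityQueue<>(Comparator.comparingDouble((Integer d) -> scores[d]));
        int total = 0;
        for (int doc = 0; doc < scores.length; doc++) {
            total++;                 // every match is counted
            queue.add(doc);
            if (queue.size() > keep) {
                queue.poll();        // but only the best `keep` are retained
            }
        }
        // Drain weakest-first, prepending so the list ends up best-first.
        List<Integer> best = new ArrayList<>();
        while (!queue.isEmpty()) {
            best.add(0, queue.poll());
        }
        List<Integer> page = start < best.size()
                ? new ArrayList<>(best.subList(start, best.size()))
                : new ArrayList<>();
        return new Page(page, total);
    }
}
```

Memory grows with `start + pageSize` rather than with the total hit count, which is why deep pages cost more than shallow ones under this scheme.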

TopDocCollector & Paging

2008-09-17 Thread Dino Korah
Hello All, Has anyone tried this? My UI has a requirement to show the total number of results and then show results in pages. How do I do that with TopDocCollector without having to run search() twice, once to get the total number of hits and then again to get the page being displayed

Re: Sorting and TopDocCollector

2007-11-26 Thread Chris Hostetter
: I am using TopDocCollector in IndexerSearher.search(...) for get the : BitSet of result, but I need of sort the result by two variable: by : any term of document and by score. Is possible do it using Collector ? : : Have any form of use the method search(..., sort) and after get the

Sorting and TopDocCollector

2007-11-23 Thread Haroldo Nascimento
Hi, I am using TopDocCollector in IndexSearcher.search(...) to get the BitSet of results, but I need to sort the results by two variables: by a term of the document and by score. Is it possible to do this using a Collector? Is there any way to use the method search(..., sort) and afterwards get the BitSet of