Could someone please comment on the above?
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4045855.html
Sent from the Lucene - Java Users mailing list archive at N
Thanks for the response; I really appreciate your help. I have read the
documentation but could not follow it on the first read, as I was new to Lucene.
I have changed it to AtomicReader and it seems to be working fine.
One last clarification: do we also need to use AtomicReader for the
following b
On 03/01/2013 07:56 AM, Uwe Schindler wrote:
The slowdown happens not on making the doc ids absolute (it is just an
addition), the slowdown appears when you retrieve the stored fields on the
top-level reader (because the composite top-level reader has to do a binary
search in the reader tree t
che.org
> Cc: Uwe Schindler
> Subject: Re: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> On 2/28/2013 5:05 PM, Uwe Schindler wrote:
> > ... Collector instead of HitCollector (like your ancient Lucene from 2.4),
> >
On 2/28/2013 5:05 PM, Uwe Schindler wrote:
... Collector instead of HitCollector (like your ancient Lucene from 2.4), you have to
respect the new semantics that are *different* to old HitCollector. Collector works with
low-level atomic readers (also in Lucene 3.x), the calls to the "collect(in
Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Thursday, February 28, 2013 10:55 PM
> To: java-user@lucene.apache.org
> Subject: RE: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> Thanks a lot. Really a
Thanks a lot. Really appreciate your help here.
I have read through the document and understand that the IndexReader uses
sub readers (to look into the index files) and AtomicReader does not. But
how does this affect things from a search standpoint? I think search
results should be consistent
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Thursday, February 28, 2013 7:26 PM
> To: java-user@lucene.apache.org
> Subject: RE
Could someone please comment on the above code snippet?
Also, one observation is that our search results are not consistent between
IndexReader and AtomicReader. Could this be a problem?
Thanks,
Sai.
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector
Here is how I am using it:
public class MyCollector extends PositiveScoresOnlyCollector {
private IndexReader indexReader;
public MyCollector(IndexReader indexReader, PositiveScoresOnlyCollector
topScore) {
super(topScore);
this.indexReader = indexReader;
age-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Wednesday, February 27, 2013 11:51 PM
> To: java-user@lucene.apache.org
> Subject: RE: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> Thanks. Is there any issu
Thanks. Is there any issue with the way we are calling
indexReader.document(doc)?
I am not sure how to get an AtomicReaderContext in the method below.
Any pointers on how to get that instance would be appreciated.
public void collect(int doc) throws IOException {
// ADD YOUR CUSTOM LOGIC
.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Wednesday, February 27, 2013 10:39 PM
> To: java-user@lucene.apache.org
> Subject: Re: TopDocCollector vs TopSc
I want to get the Document in the code below, and that's why I need
an indexReader:
public void collect(int doc) throws IOException {
    // ADD YOUR CUSTOM LOGIC HERE
    Document document = indexReader.document(doc);
    delegate.collect(doc);
}
But this seems to be the problem as the in
I am not looking for negative scores and want to skip them.
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036378.html
Sent from the Lucene - Java Users mailing li
On Fri, Jan 25, 2013 at 3:29 PM, saisantoshi wrote:
> Thanks a lot. Can we wrap TopScoreDocCollector in
> PositiveScoresOnlyCollector?
> I need only positive scores, and I don't think TopScoreDocCollector can
> handle that by itself, right?
>
I guess so! But how do you get neg. sco
Thanks a lot. Can we wrap TopScoreDocCollector in
PositiveScoresOnlyCollector?
I need only positive scores, and I don't think TopScoreDocCollector can
handle that by itself, right?
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-v
hey,
you don't need to set the IndexReader in the constructor. An
AtomicReader is passed in for each segment to
Collector#setNextReader(AtomicReaderContext).
If you want to use a given collector and extend it with some custom
code in collect(), I would likely write a delegate Collector like this:
pub
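The code in the message above is cut off in the archive, so what follows is only a minimal sketch of the delegate idea, not the author's code. To stay self-contained it does not depend on Lucene: SegmentCollector, RecordingCollector and DelegatingCollector are hypothetical stand-ins, and setNextReader(int docBase) stands in for Collector#setNextReader(AtomicReaderContext), on the assumption that the context supplies the doc base of the current segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-segment callbacks of a Lucene Collector; not the real API.
interface SegmentCollector {
    void setNextReader(int docBase);   // called once before each segment is searched
    void collect(int segmentDocId);    // doc id is relative to the current segment
}

// Stand-in for a wrapped collector: records absolute doc ids (docBase + segment-relative id).
class RecordingCollector implements SegmentCollector {
    final List<Integer> absoluteDocIds = new ArrayList<>();
    private int docBase;

    @Override
    public void setNextReader(int docBase) {
        this.docBase = docBase;
    }

    @Override
    public void collect(int segmentDocId) {
        absoluteDocIds.add(docBase + segmentDocId);
    }
}

// Delegate wrapper: runs custom logic, then forwards every call to the wrapped collector.
class DelegatingCollector implements SegmentCollector {
    private final SegmentCollector delegate;
    int seen;  // example of custom per-hit state

    DelegatingCollector(SegmentCollector delegate) {
        this.delegate = delegate;
    }

    @Override
    public void setNextReader(int docBase) {
        delegate.setNextReader(docBase);   // the wrapper keeps no reader of its own
    }

    @Override
    public void collect(int segmentDocId) {
        seen++;                            // custom logic goes here
        delegate.collect(segmentDocId);    // then hand the hit on unchanged
    }
}
```

The point of the design is that the wrapper never needs an IndexReader passed to its constructor: whatever per-segment state it needs arrives through setNextReader.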
Can someone please help us here to validate the above?
Thanks,
Sai.
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036093.html
Sent from the Lucene - Java Users mailing list
Here is the way I implemented a collector class. I would appreciate it if you
could let me know of any issues.
public class MyCollector extends PositiveScoresOnlyCollector {
private IndexReader indexReader;
public MyCollector(IndexReader indexReader, PositiveScoresOnlyCollector
topScor
I am sorry, but I am confused looking at the change logs and the enhancements
made, since we are jumping from 2.4 to 4.0. Could you please point me to any
example code that extends one of the new collectors? That would help a lot,
or it would be great if you could give some pointers on how we can mo
This was changed in Lucene 2.9; it's nothing new in Lucene 4.0. Read the
change logs of Lucene 2.9/3.0; they explain what you need to do.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From:
Way the heck better - Hits is horrible for that. It caches like 100 hits
and then keeps searching when you exhaust the cache (been a while since
I've looked at the exact numbers). It's horribly inefficient for checking
every hit.
Hits will end up using a Collector anyway - and then throw a speed tr
Thanks Mark that's exactly what I need. How does the performance of
processing each document in the collect method of HitCollector compare to
looping through the Hits in the deprecated Hits class?
On Tue, Sep 29, 2009 at 7:40 PM, Mark Miller wrote:
> Max Lynch wrote:
> > Hi,
> > I am developing
Max Lynch wrote:
> Hi,
> I am developing a search system that doesn't do pagination (searches are run
> in the background and machine analyzed). However, TopDocCollector makes me
> put a limit on how many results I want back. For my system, each result
> found is important. How can I make it col
On Sat, Feb 28, 2009 at 7:51 AM, wrote:
>> Solr has always allowed all scores through w/o screening out <=0
>
> Why?
Partially historical... due to some limitations in Lucene back when
Solr was first written (like undesired score normalization), Solr
interfaces with Lucene search at the hit coll
> > * How can a hit have a score of <=0?
>
> A function query, or a negative boost would do it.
Ah ok.
> Solr has always allowed all scores through w/o screening out <=0
Why?
-
To unsubscribe, e-mail: java-user-unsubscr...@lu
> That works fine, because hq.size() is still less than numHits. So
> no matter what, the first numHits hits will be added to the queue.
>
> > public void collect(int doc, float score) {
> >   if (score > 0.0f) {
> >     if (hq.size() < numHits || score >= minScore) {
Oh damned... it'
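The quoted condition can be tried out in isolation. The following is a minimal sketch, not the TopDocCollector source: TopScores is a hypothetical class that keeps only scores (no doc ids) in a java.util.PriorityQueue, and it assumes numHits is at least 1.

```java
import java.util.PriorityQueue;

// Hypothetical simplification of the quoted collect(int, float) logic.
class TopScores {
    private final int numHits;
    // Min-heap: the lowest score that is still kept sits on top.
    private final PriorityQueue<Float> hq = new PriorityQueue<>();
    private float minScore = 0.0f;
    int totalHits;

    TopScores(int numHits) {
        if (numHits < 1) {
            throw new IllegalArgumentException("numHits must be at least 1");
        }
        this.numHits = numHits;
    }

    void collect(float score) {
        if (score > 0.0f) {                       // non-positive scores are skipped
            totalHits++;
            if (hq.size() < numHits || score >= minScore) {
                hq.add(score);
                if (hq.size() > numHits) {
                    hq.poll();                    // evict the lowest score
                }
                minScore = hq.peek();             // lowest score still in the queue
            }
        }
    }

    int size() {
        return hq.size();
    }

    Float lowestKept() {
        return hq.peek();
    }
}
```

Feeding it the scores 0.9, 0.5, 0.3 with numHits set to 2 keeps 0.9 and 0.5: the second hit is accepted even though it scores below the first, because the queue is not yet full when it arrives.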
On Fri, Feb 27, 2009 at 6:43 AM, wrote:
> Looking into TopDocCollector code, I have some questions:
>
> * How can a hit have a score of <=0?
A function query, or a negative boost would do it.
Solr has always allowed all scores through w/o screening out <=0
-Yonik
http://www.lucidimagination.co
wrote:
Looking into TopDocCollector code, I have some questions:
* How can a hit have a score of <=0?
I'm not sure...
* What happens if the first hit has the highest score of all hits?
It seems that topDocs would then contain only this doc!?
That works fine, because hq.size() is sti
Grant Ingersoll-6 wrote:
>
> I presume they are both now slower, right? Otherwise you wouldn't
> mind the speedup on the bigger one. Hits did caching and prefetched
> things, which has its tradeoffs. Can you describe how you were
> measuring the queries? How many results were you get
ote:
>
>>
>> Hi,
>>
>> As I was reading the post "Re: TopDocCollector vs Hits:
>> TopDocCollector
>> slowing", I just got curious on how he explained his change from
>> Hits to
>> TopDocCollector. I'm assuming that the Hits
http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/Searcher.html#search(org.apache.lucene.search.Query,%20org.apache.lucene.search.HitCollector)
The TopDocCollector is a HitCollector.
On Feb 4, 2009, at 10:34 PM, Jay Malaluan wrote:
Hi,
As I was reading the post "
Hi,
As I was reading the post "Re: TopDocCollector vs Hits: TopDocCollector
slowing", I just got curious on how he explained his change from Hits to
TopDocCollector. I'm assuming that the Hits is returned from a call of:
Searcher searcher = new Searcher();
searcher.search(x
I presume they are both now slower, right? Otherwise you wouldn't
mind the speedup on the bigger one. Hits did caching and prefetched
things, which has its tradeoffs. Can you describe how you were
measuring the queries? How many results were you getting?
-Grant
On Feb 3, 2009, at 8:
: I know in applications where we search for words or phrases and expect
: the result sorted by relevance, TopDocCollector would work like a dream.
: But what about a scenario where the result needs to be sorted
: chronologically or by some kind of metadata?
These two methods are available, and
On Sep 17, 2008, at 6:53 PM, Dino Korah wrote:
Thanks Grant.. Please see my comments/response below.
2008/9/17 Grant Ingersoll <[EMAIL PROTECTED]>
On Sep 17, 2008, at 4:39 PM, Dino Korah wrote:
I know in applications where we search for words or phrases and
expect
the
result sorted by
Thanks Grant.. Please see my comments/response below.
2008/9/17 Grant Ingersoll <[EMAIL PROTECTED]>
>
> On Sep 17, 2008, at 4:39 PM, Dino Korah wrote:
>
> I know in applications where we search for words or phrases and expect
>> the
>> result sorted by relevance, TopDocCollector would work lik
On Sep 17, 2008, at 4:39 PM, Dino Korah wrote:
I know in applications where we search for words or phrases and
expect the
result sorted by relevance, TopDocCollector would work like a dream.
But what about a scenario where the result needs to be sorted
chronologically
or by some kind of me
I know in applications where we search for words or phrases and expect the
result sorted by relevance, TopDocCollector would work like a dream.
But what about a scenario where the result needs to be sorted chronologically
or by some kind of metadata?
A very common application would be email applic
On Sep 17, 2008, at 11:51 AM, Cam Bazz wrote:
And how about queries that need starting position, like hits between
100 and 200?
Could we pass something to the collector that will count from 0 to
100 and then get the next 100 records?
The collector uses a Priority Queue to store doc ids a
And how about queries that need starting position, like hits between
100 and 200?
Could we pass something to the collector that will count from 0 to
100 and then get the next 100 records?
Best.
On Wed, Sep 17, 2008 at 5:16 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> Doesn't TopDocCollecto
Doesn't TopDocCollector have a getTotalHits method?
Remember that in order to get the top N documents,
all documents must be examined. I believe that the
numHits parameter passed to the constructor just
limits the number of hits stored in (and thus the size of)
the TopDocs object
Best
Erick
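The two points above (every hit is examined, and numHits only bounds what is stored) also suggest one way to serve a window such as hits 100 to 200: keep the best start + count hits and drop the first start. Here is a minimal sketch of that idea, assuming scores only; PagingCollector is a hypothetical class, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: counts every hit but stores only the best (start + count) scores.
class PagingCollector {
    private final int start;
    private final int count;
    // Min-heap, so the weakest stored score is evicted first.
    private final PriorityQueue<Float> hq = new PriorityQueue<>();
    private int totalHits;

    PagingCollector(int start, int count) {
        if (start < 0 || count < 1) {
            throw new IllegalArgumentException("start must be >= 0 and count >= 1");
        }
        this.start = start;
        this.count = count;
    }

    void collect(float score) {
        totalHits++;                       // every hit is examined and counted
        hq.add(score);
        if (hq.size() > start + count) {
            hq.poll();                     // storage stays bounded by start + count
        }
    }

    int getTotalHits() {
        return totalHits;
    }

    // Returns the requested window, best score first (empty if there are too few hits).
    List<Float> page() {
        List<Float> best = new ArrayList<>(hq);
        best.sort(Collections.reverseOrder());
        int from = Math.min(start, best.size());
        return new ArrayList<>(best.subList(from, best.size()));
    }
}
```

The trade-off is that a deep page costs memory proportional to start + count, while the total hit count stays exact because counting happens before any hit is discarded.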