Could someone please comment on the above?
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4045855.html
Sent from the Lucene - Java Users mailing list archive at
below as well?
IndexReader indexReader = DirectoryReader.open(directory); // Current
Should it be changed to:
AtomicReader indexReader = DirectoryReader.open(directory);
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector
On 03/01/2013 07:56 AM, Uwe Schindler wrote:
The slowdown happens not on making the doc ids absolute (it is just an
addition), the slowdown appears when you retrieve the stored fields on the
top-level reader (because the composite top-level reader has to do a binary
search in the reader tree t
che.org
> Cc: Uwe Schindler
> Subject: Re: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> On 2/28/2013 5:05 PM, Uwe Schindler wrote:
> > ... Collector instead of HitCollector (like your ancient Lucene from 2.4),
> >
On 2/28/2013 5:05 PM, Uwe Schindler wrote:
... Collector instead of HitCollector (like your ancient Lucene from 2.4), you have to
respect the new semantics that are *different* to old HitCollector. Collector works with
low-level atomic readers (also in Lucene 3.x), the calls to the "collect(in
Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Thursday, February 28, 2013 10:55 PM
> To: java-user@lucene.apache.org
> Subject: RE: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> Thanks a lot. Really a
This seems to be a bug in the
IndexReader in 4.0
// indexReader.document(doc) is giving incorrect result in 4.0
// atomicReader.document(doc) is giving the correct result.
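A minimal sketch of why the two calls can disagree, assuming (as Uwe's replies in this thread indicate) that a Collector is handed segment-relative doc ids and that making them absolute is just an addition of the segment's base. This is a toy model in plain Java, not Lucene code; SegmentedIndexSketch, docBase, toGlobal and topLevelDocument are hypothetical names, and every segment is assumed to be non-empty.

```java
import java.util.Arrays;

/** Toy model of a composite reader built from segments; NOT Lucene code. */
class SegmentedIndexSketch {
    private final String[][] segments; // stored "documents" per segment (assumed non-empty)
    private final int[] docBase;       // global id of the first doc in each segment

    SegmentedIndexSketch(String[][] segments) {
        this.segments = segments;
        this.docBase = new int[segments.length];
        int base = 0;
        for (int i = 0; i < segments.length; i++) {
            docBase[i] = base;
            base += segments[i].length;
        }
    }

    /** Per-segment lookup: direct access with a segment-relative id. */
    String segmentDocument(int segment, int localDoc) {
        return segments[segment][localDoc];
    }

    /** Making a segment-relative id absolute is a single addition. */
    int toGlobal(int segment, int localDoc) {
        return docBase[segment] + localDoc;
    }

    /** Top-level lookup: binary search for the owning segment, then delegate. */
    String topLevelDocument(int globalDoc) {
        int idx = Arrays.binarySearch(docBase, globalDoc);
        int segment = idx >= 0 ? idx : -idx - 2; // insertion point minus one
        return segments[segment][globalDoc - docBase[segment]];
    }

    public static void main(String[] args) {
        SegmentedIndexSketch index = new SegmentedIndexSketch(
                new String[][] {{"a", "b"}, {"c", "d", "e"}, {"f"}});
        int segment = 1, localDoc = 1;
        System.out.println("segment lookup:      " + index.segmentDocument(segment, localDoc));
        System.out.println("top-level, local id: " + index.topLevelDocument(localDoc));
        System.out.println("top-level, absolute: "
                + index.topLevelDocument(index.toGlobal(segment, localDoc)));
    }
}
```

In this model, handing the segment-relative id to the top-level lookup returns a different document than the per-segment lookup does, which resembles the symptom reported above. Whether that is the actual cause in the poster's code is an assumption.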
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-chan
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Thursday, February 28, 2013 7:26 PM
> To: java-user@lucene.apache.org
> Subject: RE
/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4043719.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: java-user-unsubscr
://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4043502.html
age-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Wednesday, February 27, 2013 11:51 PM
> To: java-user@lucene.apache.org
> Subject: RE: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
>
> Thanks. Is there any issu
LOGIC HERE
* How do I get an AtomicReader context here? *
delegate.collect(doc);
}
Thanks and appreciate your help here.
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile
.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Wednesday, February 27, 2013 10:39 PM
> To: java-user@lucene.apache.org
> Subject: Re: TopDocCollector vs TopSc
indexReader is fetching an incorrect
document. Do you think that there are any concurrency issues here?
Thanks,
Sai.
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4043488.html
I am not looking for negative scores and want to skip it.
Thanks,
Sai
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036378.html
do you get neg. scores?
>
> Thanks,
> Sai
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036240.html
> Sent from the Lucene -
Thanks a lot. Can we wrap TopScoreDocCollector in a
PositiveScoresOnlyCollector?
I need only positive scores, and I don't think TopScoreDocCollector can handle
that by itself, right?
Thanks,
Sai
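A rough sketch of the wrapping idea asked about here, written against a hypothetical two-argument callback so that it runs without Lucene on the classpath. ScoredHitSink, RecordingSink and PositiveOnlySink are made-up names for illustration; Lucene's real Collector and PositiveScoresOnlyCollector have different signatures, so treat this only as a picture of the delegation pattern.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal collector interface; Lucene's real Collector differs. */
interface ScoredHitSink {
    void collect(int doc, float score);
}

/** Inner collector that simply records every doc id it is given. */
class RecordingSink implements ScoredHitSink {
    final List<Integer> docs = new ArrayList<>();

    @Override
    public void collect(int doc, float score) {
        docs.add(doc);
    }
}

/** Decorator that forwards only hits whose score is strictly positive. */
class PositiveOnlySink implements ScoredHitSink {
    private final ScoredHitSink delegate;

    PositiveOnlySink(ScoredHitSink delegate) {
        this.delegate = delegate;
    }

    @Override
    public void collect(int doc, float score) {
        if (score > 0.0f) {
            delegate.collect(doc, score);
        }
    }
}

class PositiveOnlyDemo {
    /** Feeds one score per doc id through the wrapper and returns what the inner sink kept. */
    static List<Integer> run(float[] scores) {
        RecordingSink inner = new RecordingSink();
        ScoredHitSink outer = new PositiveOnlySink(inner);
        for (int doc = 0; doc < scores.length; doc++) {
            outer.collect(doc, scores[doc]);
        }
        return inner.docs;
    }

    public static void main(String[] args) {
        System.out.println(run(new float[] {1.5f, 0.0f, -2.0f, 0.1f}));
    }
}
```

The inner collector stays unaware of the filtering, so in this sketch any top-N collector could sit behind the wrapper unchanged.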
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector
> http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036093.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---
Can someone please help us here to validate the above?
Thanks,
Sai.
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4036093.html
PositiveScoresOnlyCollector(topScore));
searcher.search(query, (Filter) null, collector);
} finally {
}
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not
modify
our existing collector.
Thanks in advance and really appreciate your help here... Any example code
is also fine...
--
View this message in context:
http://lucene.472066.n3.nabble.com/TopDocCollector-vs-TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-tp4035806p4035815
> From: saisantoshi [mailto:saisantosh...@gmail.com]
> Sent: Thursday, January 24, 2013 12:19 AM
> To: java-user@lucene.apache.org
> Subject: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0,
> not backward comptabile)
>
> Our current search implementation
Our current search implementation (based on 2.4.0) uses a collector extending
the TopDocCollector class
public class MyHitCollector extends TopDocsCollector {
private IndexReader indexReader;
private CustomFilter customFilter;
public MyHitCollector (IndexReader indexReader, int
the deprecated Hits class?
>
> On Tue, Sep 29, 2009 at 7:40 PM, Mark Miller wrote:
>
>
>> Max Lynch wrote:
>>
>>> Hi,
>>> I am developing a search system that doesn't do pagination (searches are
>>>
>> run
>>
>>
> Hi,
> > I am developing a search system that doesn't do pagination (searches are
> run
> > in the background and machine analyzed). However, TopDocCollector makes
> me
> > put a limit on how many results I want back. For my system, each result
> > found is importa
Max Lynch wrote:
> Hi,
> I am developing a search system that doesn't do pagination (searches are run
> in the background and machine analyzed). However, TopDocCollector makes me
> put a limit on how many results I want back. For my system, each result
> found is important
Hi,
I am developing a search system that doesn't do pagination (searches are run
in the background and machine analyzed). However, TopDocCollector makes me
put a limit on how many results I want back. For my system, each result
found is important. How can I make it collect every result
On Jun 11, 2009, at 1:49 AM, Ian Lea wrote:
This thread seems to be veering well away from your original
straightforward question on how to convert your straightforward code.
So what? It's about Lucene and hence on-topic. Why do you care?
If you want or need these advanced solutions, fine,
This thread seems to be veering well away from your original
straightforward question on how to convert your straightforward code.
If you want or need these advanced solutions, fine, but if your
existing code was fast enough the modified versions suggested earlier
are probably fast enough too.
--
I
On Jun 10, 2009, at 5:02 PM, Yonik Seeley wrote:
On Wed, Jun 10, 2009 at 7:58 PM, Daniel Noll wrote:
It's a shame we don't have an inverted kind of HitCollector where we
can say "give me the next hit", so that we can get the best of both
worlds (like what StAX gives us in the XML world.)
You
On Wed, Jun 10, 2009 at 7:58 PM, Daniel Noll wrote:
> It's a shame we don't have an inverted kind of HitCollector where we
> can say "give me the next hit", so that we can get the best of both
> worlds (like what StAX gives us in the XML world.)
You can get a scorer and call next() yourself.
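To picture the difference between the push style of a HitCollector and the pull style suggested here, the following is a small sketch in plain Java. PullHits is a hypothetical stand-in that iterates over a precomputed match array; it is not Lucene's Scorer and makes no claim about that class's methods.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical pull-style hit stream over precomputed matches; NOT a Lucene Scorer. */
class PullHits implements Iterator<Integer> {
    private final boolean[] matches; // matches[doc] is true when doc is a hit
    private int next = 0;            // next candidate doc id

    PullHits(boolean[] matches) {
        this.matches = matches;
        advance();
    }

    /** Skips forward until 'next' points at a hit or past the end. */
    private void advance() {
        while (next < matches.length && !matches[next]) {
            next++;
        }
    }

    @Override
    public boolean hasNext() {
        return next < matches.length;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int doc = next++;
        advance();
        return doc;
    }

    public static void main(String[] args) {
        PullHits hits = new PullHits(new boolean[] {false, true, true, false, true});
        // The caller decides when to stop: here, after the first two hits.
        for (int taken = 0; taken < 2 && hits.hasNext(); taken++) {
            System.out.println("hit: " + hits.next());
        }
    }
}
```

With a pull interface the caller controls the loop and can stop early, whereas a collector callback is driven by the search itself.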
-Yo
On Wed, Jun 10, 2009 at 20:17, Uwe Schindler wrote:
> You are right, you can, but if you just want to retrieve all hits, this is
> ineffective. A HitCollector is the correct way to do this (especially
> because the order of hits is mostly not interesting when retrieving all
> hits). Hits and TopDoc
On Jun 10, 2009, at 10:49 AM, Uwe Schindler wrote:
To optimize, store the filename not as a stored field, but as a
non-tokenized, indexed term.
How do you do that?
- Paul
hetaphi.de
> -Original Message-
> From: Paul J. Lucas [mailto:p...@lucasmail.org]
> Sent: Wednesday, June 10, 2009 5:26 PM
> To: java-user@lucene.apache.org
> Subject: Re: Migrating from Hit/Hits to TopDocs/TopDocCollector
>
> On Jun 10, 2009, at 3:17 AM, Uwe Schindler wrote:
>
On Jun 10, 2009, at 3:17 AM, Uwe Schindler wrote:
A HitCollector is the correct way to do this (especially because the
order of hits is mostly not interesting when retrieving all hits).
OK, here's what I came up with:
Term t = /* ... */
Collection files = new LinkedList();
FieldS
ll be >>10 times as fast!
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >> -Original Message-
> >> From: Wouter Heijke [mailto:whei...@xs4al
sage-
>> From: Wouter Heijke [mailto:whei...@xs4all.nl]
>> Sent: Wednesday, June 10, 2009 11:44 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Migrating from Hit/Hits to TopDocs/TopDocCollector
>>
>>
>> Will this do?
>>
>> IndexReader
e 10, 2009 11:44 AM
> To: java-user@lucene.apache.org
> Subject: Re: Migrating from Hit/Hits to TopDocs/TopDocCollector
>
>
> Will this do?
>
> IndexReader indexReader = searcher.getIndexReader();
> TopDocs topDocs = searcher.search(Query query, int n);
> for (int i =
// "FILE" is the field that recorded the original file indexed
> final File f = new File( hit.get( "FILE" ) );
> // ...
> }
>
> It's not clear to me how to rewrite the code using TopDocs/
> TopDocColle
Hi
The code below might do the job. Based on the example at
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Hits.html
Completely uncompiled and untested of course.
TopDocCollector collector = new TopDocCollector(hitsPerPage);
final Term t = /* ... */;
Query query = new
e original file indexed
final File f = new File( hit.get( "FILE" ) );
// ...
}
It's not clear to me how to rewrite the code using TopDocs/
TopDocCollector and how to iterate over the results.
A litt
On Sat, Feb 28, 2009 at 7:51 AM, wrote:
>> Solr has always allowed all scores through w/o screening out <=0
>
> Why?
Partially historical... due to some limitations in Lucene back when
Solr was first written (like undesired score normalization), Solr
interfaces with Lucene search at the hit coll
> > * How can a hit have a score of <=0?
>
> A function query, or a negative boost would do it.
Ah ok.
> Solr has always allowed all scores through w/o screening out <=0
Why?
> That works fine, because hq.size() is still less than numHits. So
> no matter what, the first numHits hits will be added to the queue.
>
> > public void collect(int doc, float score) {
> >   if (score > 0.0f) {
> >     if (hq.size() < numHits || score >= minScore) {
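The point made in the quoted reply can be checked with a small model. This is a sketch only: the two `if` conditions are taken from the quoted snippet, while the queue handling around them (a java.util.PriorityQueue used as a min-heap, and minScore refreshed from its head) is an assumption about how such a collector might maintain its state, not the actual TopDocCollector source. TopNSketch and weakest() are hypothetical names.

```java
import java.util.PriorityQueue;

/** Simplified stand-in for a top-N collector; only the two if-conditions come from the thread. */
class TopNSketch {
    private final int numHits;
    // Min-heap of kept scores: the weakest retained score sits at the head.
    private final PriorityQueue<Float> hq = new PriorityQueue<>();
    private float minScore = 0.0f;

    TopNSketch(int numHits) {
        this.numHits = numHits;
    }

    void collect(int doc, float score) {
        if (score > 0.0f) {
            if (hq.size() < numHits || score >= minScore) {
                hq.add(score);
                if (hq.size() > numHits) {
                    hq.poll(); // assumption: overflow evicts the weakest entry
                }
                minScore = hq.peek(); // assumption: minScore tracks the weakest kept score
            }
        }
    }

    int size() {
        return hq.size();
    }

    float weakest() {
        return hq.peek();
    }

    public static void main(String[] args) {
        TopNSketch collector = new TopNSketch(3);
        collector.collect(0, 9.0f); // best hit arrives first
        collector.collect(1, 1.0f); // still accepted: the queue is not full yet
        collector.collect(2, 2.0f);
        System.out.println("kept " + collector.size() + " hits, weakest " + collector.weakest());
    }
}
```

Because the size check comes first in the condition, a high-scoring first hit does not lock out later, lower-scoring hits until the queue has filled up.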
Oh damned... it'
On Fri, Feb 27, 2009 at 6:43 AM, wrote:
> Looking into TopDocCollector code, I have some questions:
>
> * How can a hit have a score of <=0?
A function query, or a negative boost would do it.
Solr has always allowed all scores through w/o screening out <=
wrote:
Looking into TopDocCollector code, I have some questions:
* How can a hit have a score of <=0?
I'm not sure...
* What happens if the first hit has the highest score of all hits?
It seems
that topDocs would then contain only this doc!?
That works fine, because hq.s
Looking into TopDocCollector code, I have some questions:
* How can a hit have a score of <=0?
* What happens if the first hit has the highest score of all hits? It seems
that topDocs would then contain only this doc!?
public void collect(int doc, float score) {
  if (score > 0.0f
How many results were you getting?
>
>
>
> -Grant
>
> On Feb 3, 2009, at 8:37 PM, AlexElba wrote:
>
>>
>> Hello,
>>
>> I was using lucene 2.3.2 with hits and switch to lucene 2.4.0 and
>> now I am
>> using TopDocCollector.
>>
>
Hi,
Thanks for pointing me to the API. I found the explanation I'm looking for
at:
http://lucene.apache.org/java/2_4_0/api/core/index.html?org/apache/lucene/search/Hits.html
There's an example on how to use the TopDocCollector instead of Hits.
Regards,
Jay Joel Malaluan
Grant I
http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/Searcher.html#search(org.apache.lucene.search.Query,%20org.apache.lucene.search.HitCollector)
The TopDocCollector is a HitCollector.
On Feb 4, 2009, at 10:34 PM, Jay Malaluan wrote:
Hi,
As I was reading the post "
Hi,
As I was reading the post "Re: TopDocCollector vs Hits: TopDocCollector
slowing", I just got curious on how he explained his change from Hits to
TopDocCollector. I'm assuming that the Hits is returned from a call of:
Searcher searcher = new Searcher();
searcher.search(x
2009, at 8:37 PM, AlexElba wrote:
Hello,
I was using lucene 2.3.2 with hits and switch to lucene 2.4.0 and
now I am
using TopDocCollector.
I have two queries which are running against the same index.
One query is returning 80bytes information other one is returning
2000bytes
With ol
Hello,
I was using Lucene 2.3.2 with Hits and switched to Lucene 2.4.0, and now I am
using TopDocCollector.
I have two queries that run against the same index.
One query returns 80 bytes of information; the other returns 2000 bytes.
With old Hits the query which was returning smaller
: I know in applications where we search for a words or phrases and expect
: the result sorted by relevance, TopDocCollector would work like a dream.
: But what about scenario where the result needs to be sorted
: chronologically or by some kind of metadata.
These two methods are available
sorted by relevance, TopDocCollector would work like a dream.
But what about scenario where the result needs to be sorted
chronologically
or by some kind of metadata.
A very common application would be email applications. If someone
is to
search on the Inbox, the result will be expected to appear
Thanks Grant.. Please see my comments/response below.
2008/9/17 Grant Ingersoll <[EMAIL PROTECTED]>
>
> On Sep 17, 2008, at 4:39 PM, Dino Korah wrote:
>
> I know in applications where we search for a words or phrases and expect
>> the
>> result sorted by relevan
On Sep 17, 2008, at 4:39 PM, Dino Korah wrote:
I know in applications where we search for a words or phrases and
expect the
result sorted by relevance, TopDocCollector would work like a dream.
But what about scenario where the result needs to be sorted
chronologically
or by some kind of
I know in applications where we search for a words or phrases and expect the
result sorted by relevance, TopDocCollector would work like a dream.
But what about scenario where the result needs to be sorted chronologically
or by some kind of metadata.
A very common application would be email
On Sep 17, 2008, at 11:51 AM, Cam Bazz wrote:
And how about queries that need starting position, like hits between
100 and 200?
could we pass something to the collector that will count between 0 to
100 and then get the next 100 records?
The collector uses a Priority Queue to store doc ids a
> Doesn't TopDocCollector have a getTotalHits method?
>
> Remember that in order to get the top N documents,
> all documents must be examined. I believe that the
> numHits parameter passed to the constructor just
> limits the number of hits stored in (and thus the size)
> of the Top
Doesn't TopDocCollector have a getTotalHits method?
Remember that in order to get the top N documents,
all documents must be examined. I believe that the
numHits parameter passed to the constructor just
limits the number of hits stored in (and thus the size)
of the TopDocs object
Hello All,
Has anyone tried this?
My UI has a requirement to show total number of results and then show
results in pages. How do I do that with TopDocCollector, without having to
run search() twice, one to get the total number of hits and then the next
one to get the page being displayed
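A sketch of how one pass could yield both numbers, under the assumption stated elsewhere in this thread that the hit limit only bounds what is stored while every match is still examined. PagedSearchSketch and its fields are hypothetical names, and the score array stands in for a real search in which every index is a matching doc; this is plain Java, not the TopDocCollector API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical one-pass paging helper; models the idea only, NOT Lucene's classes. */
class PagedSearchSketch {
    final int totalHits;          // every match seen, regardless of the stored limit
    final List<Integer> pageDocs; // doc ids on the requested page, best score first

    private PagedSearchSketch(int totalHits, List<Integer> pageDocs) {
        this.totalHits = totalHits;
        this.pageDocs = pageDocs;
    }

    /** scores[doc] is the score of a matching doc; page is zero-based. */
    static PagedSearchSketch search(float[] scores, int page, int pageSize) {
        int limit = (page + 1) * pageSize; // keep enough hits to reach the end of the page
        // Min-heap ordered by score, so the weakest retained hit is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                Math.max(1, limit), (a, b) -> Float.compare(scores[a], scores[b]));
        int total = 0;
        for (int doc = 0; doc < scores.length; doc++) {
            total++; // counted even if the hit is later evicted
            heap.add(doc);
            if (heap.size() > limit) {
                heap.poll();
            }
        }
        List<Integer> best = new ArrayList<>(heap);
        best.sort((a, b) -> Float.compare(scores[b], scores[a])); // best score first
        int from = Math.min(page * pageSize, best.size());
        return new PagedSearchSketch(total, new ArrayList<>(best.subList(from, best.size())));
    }

    public static void main(String[] args) {
        float[] scores = {0.3f, 0.9f, 0.1f, 0.7f, 0.5f};
        PagedSearchSketch result = search(scores, 1, 2);
        System.out.println(result.totalHits + " total hits, page docs " + result.pageDocs);
    }
}
```

The cost of this approach grows with the page index, since reaching page p means retaining (p + 1) * pageSize hits; that trade-off is a property of the sketch and is not something the thread states.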
: I am using TopDocCollector in IndexSearcher.search(...) to get the
: BitSet of results, but I need to sort the results by two criteria: by
: a term of the document and by score. Is it possible to do this using a Collector?
:
: Is there any way to use the method search(..., sort) and afterwards get the
Hi,
I am using TopDocCollector in IndexSearcher.search(...) to get the
BitSet of results, but I need to sort the results by two criteria: by
a term of the document and by score. Is it possible to do this using a Collector?
Is there any way to use the method search(..., sort) and afterwards get the
BitSet of