OK, I think I fully understand now. Thanks.
https://stackoverflow.com/questions/15589186/lucene-4-pagination
This post was really good, and I would like similar text to
appear in the Javadocs, please, as it would help everyone.
/"I agree with the solution explained by Jaimie. But I wan
May I suggest again that
the Javadocs for Lucene be enhanced?
They need more information explaining the parameters and,
more importantly, why these two approaches
(TopScoreDocCollector vs. IndexSearcher) differ in performance.
Thanks
On 6/8/21 2:07 PM, baris.ka
Yes, I sometimes see 4000+ and sometimes 3000+ hits from totalHits.
So TopScoreDocCollector is working underneath the IndexSearcher.search API,
right?
In other words, TopScoreDocCollector will save time, right?
Thanks
On 6/8/21 1:27 PM, Adrien Grand wrote:
Yes, for instance if you care about the top 10 hits only, you could call
TopScoreDocCollector.create(10, null, 10). By default, IndexSearcher is
configured to count at least 1,000 hits, and creates its top docs collector
with TopScoreDocCollector.create(10, null, 1000).
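
To make the effect of the third argument concrete, here is a toy model in plain Java. It is only a sketch and not Lucene code: the class ThresholdDemo and its methods are hypothetical names invented for this illustration, and it tests documents one at a time, whereas real Lucene skips non-competitive documents inside the scorer. It assumes numHits is at least 1. The idea it shows is that once more than totalHitsThreshold hits have been counted and the top-N queue is full, hits that cannot enter the queue are skipped without being counted, so the reported count becomes a lower bound.

```java
import java.util.PriorityQueue;

// Toy model only: this is NOT Lucene code and does not use the Lucene API.
class ThresholdDemo {

    /** Outcome of one toy collection pass. */
    static final class Result {
        final int counted;    // hits the collector actually visited
        final boolean exact;  // false once any hit was skipped (counted is then a lower bound)
        final float[] top;    // kept scores, highest first

        Result(int counted, boolean exact, float[] top) {
            this.counted = counted;
            this.exact = exact;
            this.top = top;
        }
    }

    /** Keeps the numHits best scores; stops counting precisely past totalHitsThreshold. */
    static Result collect(float[] scores, int numHits, int totalHitsThreshold) {
        // Min-heap: peek() is the weakest score currently kept.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        int counted = 0;
        boolean exact = true;
        for (float s : scores) {
            boolean full = heap.size() == numHits;
            // Past the threshold, a hit that cannot enter the top-N is skipped uncounted.
            if (full && counted > totalHitsThreshold && s <= heap.peek()) {
                exact = false;
                continue;
            }
            counted++;
            if (!full) {
                heap.add(s);
            } else if (s > heap.peek()) {
                heap.poll();
                heap.add(s);
            }
        }
        // Polling a min-heap yields ascending order, so fill the array from the end.
        float[] top = new float[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll();
        }
        return new Result(counted, exact, top);
    }

    /** Scores n, n-1, ..., 1: the best hits arrive first. */
    static float[] descending(int n) {
        float[] scores = new float[n];
        for (int i = 0; i < n; i++) {
            scores[i] = n - i;
        }
        return scores;
    }

    /** Scores 1, 2, ..., n: every new hit beats the ones already kept. */
    static float[] ascending(int n) {
        float[] scores = new float[n];
        for (int i = 0; i < n; i++) {
            scores[i] = i + 1;
        }
        return scores;
    }

    public static void main(String[] args) {
        float[] desc = descending(5000);
        Result low = collect(desc, 10, 10);
        Result high = collect(desc, 10, 1000);
        Result asc = collect(ascending(5000), 10, 10);
        System.out.println("threshold 10:   counted=" + low.counted + " exact=" + low.exact);
        System.out.println("threshold 1000: counted=" + high.counted + " exact=" + high.exact);
        System.out.println("ascending data: counted=" + asc.counted + " exact=" + asc.exact);
    }
}
```

In this model the top 10 scores are identical for both thresholds; only the amount of counting differs. The count also depends on the order in which hits arrive, which is one way to picture why the reported total can vary from query to query.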
On Tue, Jun 8, 2021 at 7:
OK, I think you meant something else here.
You are not referring to the total number of hits calculation or the
mismatch, right?
So to make Lucene do the minimum work to reach the matched docs,
TopScoreDocCollector should be used, right?
Let me check this class.
Thanks
On 6/8/21 1:16 PM, baris.ka..
Adrien, my concern is not actually the number mismatch;
as I mentioned, it is the performance.
Seeing those numbers mismatch, it seems that Lucene is still doing the same
amount of work to get results no matter how many results you ask for in the
IndexSearcher search API.
i thought i was clear on tha
If you don't need any information about the total hit count, you could
create a TopScoreDocCollector that has the same value for numHits
and totalHitsThreshold. This way Lucene will spend as little energy as
possible computing the number of matches of the query.
On Tue, Jun 8, 2021 at 6:28 PM wro
I guess you can set up an experiment like this:
search your text against each field and then look at the scores, but you
need to normalize the scores in order to compare them, and
normalization will probably include the length of the field, etc.
Maybe there is an API in Lucene for this, but I don't know.
Hope this
I am currently happy with Lucene performance, but I want to understand
it and speed it up further
by limiting the results concretely. So I still do not know why totalHits
and scoreDocs report
different numbers of hits.
Best regards
On 6/8/21 2:52 AM, Baris Kazar wrote:
my worry is actually about t
Hi,
I am creating a full text search API, and one of my requirements is to find out
which exact field the input text matched when the document has, say, more
than 10 fields.
Is there any way I can find out which field in the document is the most relevant
to the input search text?
Thanks i