eks dev,
The best way of looping through all results that I have come
across is using a HitCollector and grabbing the field values via
FieldCache. This is under two conditions: 1) The FieldCache arrays
are initialized only once, since creating these arrays creates
serious overhead,
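The "initialize only once" condition can be illustrated without Lucene itself. The following is a minimal sketch, assuming a plain String[] indexed by document id stands in for the array that FieldCache returns; FieldValueCache, loadValues and valuesFor are hypothetical names made up for the illustration and are not part of the Lucene API.

```java
// Stand-in for a per-document value array like the one FieldCache hands back.
// FieldValueCache, loadValues and valuesFor are hypothetical names, not Lucene API.
class FieldValueCache {
    private static String[] values; // one slot per document id
    private static int loads = 0;   // number of times the array was built

    // Simulates the expensive step: producing a value for every document.
    private static String[] loadValues(int maxDoc) {
        loads++;
        String[] v = new String[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            v[doc] = "value-" + doc;
        }
        return v;
    }

    // Builds the array on first use and returns the same array afterwards.
    // Assumes the index (and so maxDoc) does not change between calls.
    static synchronized String[] get(int maxDoc) {
        if (values == null) {
            values = loadValues(maxDoc);
        }
        return values;
    }

    static synchronized int loadCount() {
        return loads;
    }

    // Plays the role of the collector body: one array lookup per hit.
    static String[] valuesFor(int[] docIds, String[] cache) {
        String[] out = new String[docIds.length];
        for (int i = 0; i < docIds.length; i++) {
            out[i] = cache[docIds[i]];
        }
        return out;
    }
}
```

The point of the sketch is only that the costly array is created on the first search and every later search pays for array lookups alone.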
Have you tried collecting only doc-ids to see whether the speed problem is still
there, or fetching only the field values? If you have dense results, it can easily
be split() or addSymbolsToHash() that takes the time.
I see 3 possibilities for what could be slow: getting doc-ids, fetching field
value or do
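One way to act on this is to time each stage separately. Below is a rough sketch, assuming stand-in data instead of a real index: PhaseTimer, collectDocIds, fetchValues and splitAll are hypothetical names, and split() is used as the example post-processing step because it is named above. In a real test the first two methods would be replaced by the actual search and field access.

```java
// Times three stages separately: collecting ids, fetching values, post-processing.
// All names and data here are hypothetical stand-ins, not Lucene API.
class PhaseTimer {
    // Fake field value for a document, e.g. "a4 b4 c4" for doc 4.
    static String fieldValue(int doc) {
        return "a" + doc + " b" + doc + " c" + doc;
    }

    // Stage 1: collect only doc ids (here every even doc id "matches").
    static int[] collectDocIds(int maxDoc) {
        int[] ids = new int[(maxDoc + 1) / 2];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = 2 * i;
        }
        return ids;
    }

    // Stage 2: fetch the field value for each collected id.
    static String[] fetchValues(int[] docIds) {
        String[] values = new String[docIds.length];
        for (int i = 0; i < docIds.length; i++) {
            values[i] = fieldValue(docIds[i]);
        }
        return values;
    }

    // Stage 3: post-process with split() and count the tokens.
    static int splitAll(String[] values) {
        int tokens = 0;
        for (String v : values) {
            tokens += v.split(" ").length;
        }
        return tokens;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        int[] ids = collectDocIds(200000);
        long t1 = System.nanoTime();
        String[] values = fetchValues(ids);
        long t2 = System.nanoTime();
        int tokens = splitAll(values);
        long t3 = System.nanoTime();

        System.out.println("collect ids:  " + (t1 - t0) / 1000 + " us");
        System.out.println("fetch values: " + (t2 - t1) / 1000 + " us");
        System.out.println("split:        " + (t3 - t2) / 1000 + " us (" + tokens + " tokens)");
    }
}
```

Whichever stage dominates the total is the one worth optimizing first; single-run timings like these are noisy, so repeat the measurement a few times before drawing conclusions.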
I haven't had the chance to use this new feature yet, but have you
tried with selective field loading, so that you can load only that
1 field from your index and not all of them?
I have not tried selective field loading, but it sounds like a good
idea. What class is it in? Any more inform
Provides a new api, IndexReader.document(int doc, String[] fields). A document
containing only the specified fields is created. The other fields of the
document are not loaded, although unfortunately uncompressed strings still have
to be scanned because the length information in the index is
Perhaps I am speaking too quickly, but I would try by not grabbing
the value of the field for every document in the results set.
Will someone see that value or use it for a couple million hits?
Could be, I suppose... but if not, then axe it. Grab the first few
thousand (or MUCH less) and if th
Ryan O'Hara wrote:
My index contains approximately 5 million documents. During a
search, I need to grab the value of a field for every document in the
result set. I am currently using a HitCollector to search. Below is
my code:
searcher.search(query, new HitCollector(){
I haven't had the chance to use this new feature yet, but have you tried with
selective field loading, so that you can load only that 1 field from your index
and not all of them?
Otis
----- Original Message -----
From: Ryan O'Hara <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday,