Term Frequency in Lucene parlance = number of occurrences of the term within a
single document.
If you're looking for "how many documents have term x" where x is unknown, see
SimpleFacets in Solr:
http://lucene.apache.org/solr/api/org/apache/solr/request/SimpleFacets.html
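To make the distinction concrete, here is a minimal sketch in plain Java (no Lucene or Solr calls). The class TermStats, its methods, and the whitespace/punctuation tokenizer are all made up for illustration; a real index would use an analyzer and would not rescan the documents like this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Lucene or Solr.
final class TermStats {

    // Lowercase and split on anything that is not a letter or digit.
    // This is only a stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Term frequency: occurrences of the term within ONE document.
    static int termFrequency(String term, String document) {
        int count = 0;
        for (String token : tokenize(document)) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Document count: how many documents contain the term at least once.
    static int documentFrequency(String term, List<String> documents) {
        int count = 0;
        for (String document : documents) {
            if (termFrequency(term, document) > 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Lucene indexes text. Lucene searches text.",
                "Solr builds on Lucene.",
                "Faceting counts documents.");
        System.out.println("tf in first doc: " + termFrequency("lucene", docs.get(0)));
        System.out.println("docs containing term: " + documentFrequency("lucene", docs));
    }
}
```

The first number is per document; the second is across the collection, which is the kind of count faceting reports for each term.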
- Original Message -
Perhaps the numbers (dates?) are being indexed in a separate field? Lucene will
only search the default field with the queries you have shown. If, for instance,
the year was being stored in the "year" field, then your query should be
report AND year:2003
HTH
- Original Message
From: Xav
Chris,
Thanks for all your invaluable comments. The killer was the fact that the
timestamp for each document was unique. For a search with millions of results,
this resulted in allocation of millions of strings during the sorting step
(FieldCacheImpl.getStrings). With some loss of precision, I
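The message is cut off here, so the following is not the poster's actual change. It is only a rough sketch, in plain Java with no Lucene calls, of why coarser timestamps help: rounding to a bucket collapses many unique values into a few, so far fewer distinct sort values need to be held. The class TimestampBuckets, its methods, the 137 ms spacing, and the one-minute bucket are assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class TimestampBuckets {

    // Round a millisecond timestamp down to the start of its bucket.
    static long truncate(long timestampMillis, long bucketMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, bucketMillis);
    }

    // Count how many distinct values remain after truncation.
    static int distinctValues(long[] timestamps, long bucketMillis) {
        Set<Long> seen = new HashSet<>();
        for (long timestamp : timestamps) {
            seen.add(truncate(timestamp, bucketMillis));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Made-up data: one document every 137 ms, so every timestamp is unique.
        long[] timestamps = new long[10_000];
        long start = 1_200_000_000_000L;
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = start + i * 137L;
        }
        System.out.println("distinct at 1 ms:  " + distinctValues(timestamps, 1L));
        System.out.println("distinct at 1 min: " + distinctValues(timestamps, 60_000L));
    }
}
```

The trade-off is that documents falling in the same bucket no longer sort in strict time order relative to each other.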
Thanks for the insight, Chris. You are right: I was trying to avoid the
FieldCache hit. Because the index is updated frequently, we have to keep
discarding our IndexSearcher.
I used String because the timestamp is a Long and there wasn't any
SortField.LONG (I guess I should have used SortField.
Grant,
Is that on a single machine? If so, what kind of hardware specs does the
machine have? I guess you're using a 64-bit JVM?
A slightly unrelated question: if a query matches all the documents in the
index, does that cause the entire index to get loaded into RAM?
- Original Message