Re: Term Frequency within Hits

2007-03-08 Thread Chiradeep Vittal
Term Frequency in Lucene parlance = number of occurrences of the term within a single document. If you're looking for "how many documents have term x" where x is unknown, see SimpleFacets in Solr: http://lucene.apache.org/solr/api/org/apache/solr/request/SimpleFacets.html
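
To make the distinction concrete, here is a minimal sketch on a toy in-memory corpus. It does not use Lucene or Solr; the corpus and the function names (term_frequency, document_frequency, facet_counts) are made up for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: document id -> text.
docs = {
    1: "lucene index lucene search",
    2: "solr facets search",
    3: "lucene scoring",
}

def term_frequency(term, doc_id):
    """Number of times `term` occurs within a single document."""
    return Counter(docs[doc_id].split())[term]

def document_frequency(term):
    """Number of documents that contain `term` at least once."""
    return sum(1 for text in docs.values() if term in text.split())

def facet_counts():
    """Document count for every term, for when the term is not known up front."""
    counts = Counter()
    for text in docs.values():
        # set() so each document contributes at most 1 per term
        counts.update(set(text.split()))
    return counts
```

Here term_frequency("lucene", 1) counts repeats inside document 1, while document_frequency("lucene") counts documents. facet_counts() is roughly the kind of per-term document count a faceting component reports.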

Re: Re : Re: Re : Re: Problem with a search engine

2007-02-05 Thread Chiradeep Vittal
Perhaps the numbers (dates?) are being indexed in a separate field? Lucene will only search the default field with the queries you have shown. If, for instance, the year was being stored in the "year" field, then your query should be: report AND year:2003 HTH
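
The effect of the default field can be imitated with a toy matcher. This is not Lucene's query parser, only a sketch of the idea; the field names "contents" and "year", the sample documents, and the helper names are assumptions.

```python
DEFAULT_FIELD = "contents"  # assumed default field name

# Hypothetical documents with a separate "year" field.
docs = [
    {"contents": "annual report", "year": "2003"},
    {"contents": "annual report 2003", "year": "2004"},
]

def matches(doc, clause):
    """A clause is either 'term' (default field) or 'field:term'."""
    field, sep, term = clause.partition(":")
    if not sep:
        # No field prefix: only the default field is searched.
        field, term = DEFAULT_FIELD, clause
    return term in doc.get(field, "").split()

def search(query):
    """Return indexes of documents matching every AND-ed clause."""
    clauses = query.split(" AND ")
    return [i for i, d in enumerate(docs) if all(matches(d, c) for c in clauses)]
```

With this data, search("report AND 2003") only looks in "contents" and misses the document whose year field is 2003, whereas search("report AND year:2003") finds it.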

Re: Extending scoring to eliminate sorting on timestamp

2007-01-30 Thread Chiradeep Vittal
Chris, Thanks for all your invaluable comments. The killer was the fact that the timestamp for each document was unique. For a search with millions of results, this resulted in allocation of millions of strings during the sorting step (FieldCacheImpl.getStrings). With some loss of precision, I
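
The message is cut off here, so the exact approach is not known. One possible reading of "some loss of precision" is rounding each timestamp down to a coarser bucket before indexing, so that many documents share the same sort value. A minimal sketch under that assumption (bucket_timestamp and the one-minute bucket size are hypothetical):

```python
def bucket_timestamp(millis, bucket_ms=60_000):
    """Round a millisecond timestamp down to the start of its bucket."""
    return millis - (millis % bucket_ms)

# Hypothetical data: 10,000 documents, one per millisecond.
timestamps = [1_170_000_000_000 + i for i in range(10_000)]

# Every raw timestamp is unique, so a string-based sort cache would need
# one distinct value per document.
distinct_raw = len(set(timestamps))

# After bucketing, documents in the same minute share a single value.
distinct_bucketed = len({bucket_timestamp(t) for t in timestamps})
```

The trade-off is that documents inside the same bucket can no longer be ordered relative to each other by timestamp alone.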

Re: Extending scoring to eliminate sorting on timestamp

2007-01-26 Thread Chiradeep Vittal
Thanks for the insight Chris. You are right-- I was trying to avoid the FieldCache hit. Because the index is updated frequently, we have to keep discarding our IndexSearcher. I used String because the timestamp is a Long and there wasn't any SortField.LONG (I guess I should have used SortField.

Re: How many documents in the biggest Lucene index to date?

2007-01-26 Thread Chiradeep Vittal
Grant, Is that on a single machine? If so, what kind of hardware specs does the machine have? I guess you're using a 64-bit JVM? A slightly unrelated question: if a query matches all the documents in the index, does that cause the entire index to get loaded into RAM?