I get an out-of-memory error when retrieving a large number of hits.
I am still on Lucene 1.4.3. I have a simple term query that returned 899,810 documents. When I tried to retrieve the name of each document and nothing else, I ran out of memory. Instead of getting all the names in one pass, I then tried re-running the query after every 10,000 documents: I close the index reader, the index searcher, and the fsDir, then re-query for the next 10,000 documents. This still doesn't work.
From another entry in the forum, it appears that information about the hits I have skipped over is still kept even though I don't access them. Am I understanding correctly that if I start accessing from the 400,000th document onwards, some information about documents 0-399,999 is still cached even though I skipped over them? Is there a way to get the file name (and perhaps other information) of the remaining documents? (I tried a different term query that returned 400,000 hits, and I was able to get all of their names without re-querying.) I think I saw someone mention clearing the hit cache, though I don't know how this is done. Thank you in advance for any hints on dealing with this.