Thank you for the links. I will go through them, and hopefully solve my problem.
On 5/14/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
please review the advice in these archived messages, I think you'll find
them very applicable to your problem...

http://www.nabble.com/eliminating-scoring-for-the-sake-of-efficiency-t1603827.html#a4351614
http://www.nabble.com/Exact-date-search-doesn%27t-work-with-1.9.1--t1418643.html#a3833741

: Date: Sun, 14 May 2006 15:34:08 -0400
: From: Beady Geraghty <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Re: out-of-memory when searching, paging does not work.
:
: Here is the gist of the code:
:
:     Query query = new TermQuery(new Term("contents", q.toLowerCase()));
:
:     long start = new Date().getTime();
:     Hits hits = is.search(query);
:     long end = new Date().getTime();
:
:     System.err.println("Found " + hits.length() +
:         " document(s) (in " + (end - start) +
:         " milliseconds) that matched query '" + q + "'");
:
:     int ct = hits.length();
:     int ct2 = 400000;
:     int step = 10000;
:     int startct;
:     while (ct2 < ct) {
:         startct = ct2;
:         for (int i = startct; i < startct + step; i++) {
:             if (ct2 >= ct) {
:                 break;
:             }
:             Document doc = hits.doc(ct2);
:             doc.get("filename");
:             ct2++;
:         }
:         System.out.println("ct2 is " + ct2);
:         ir.close();
:         is.close();
:         fsDir.close();
:         ir = null;
:         is = null;
:         fsDir = null;
:         fsDir = FSDirectory.getDirectory(indexDir, false);
:         ir = IndexReader.open(fsDir);
:         is = new IndexSearcher(ir);
:         hits = is.search(query);
:     }
:
: If ct2 is set to 40,000 as opposed to 400,000, I see some output before I
: get the out-of-memory error. If not, I get the out-of-memory error almost
: instantly, without any output.
:
: Is there a method call to clear the cache?
:
: Thank you for your response.
:
: On 5/14/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
: >
: > Could you share at least some pseudo-code of what you're doing in the
: > loop of retrieving the "name" of each document? Are you storing all
: > of those names as you iterate?
: >
: > Have you profiled your application to see exactly where the memory is
: > going? It is surely being eaten by your own code and not Lucene.
: >
: >     Erik
: >
: > On May 14, 2006, at 12:07 PM, Beady Geraghty wrote:
: >
: > > I have an out-of-memory error when returning many hits.
: > >
: > > I am still on Lucene 1.4.3.
: > >
: > > I have a simple term query. It returned 899810 documents.
: > > I tried to retrieve the name of each document and nothing else,
: > > and I ran out of memory.
: > >
: > > Instead of getting the names all at once, I tried to query again after
: > > every 10,000 documents.
: > > I close the index reader, index searcher, and the fsDir, and re-query
: > > for every 10,000 documents. This still doesn't work.
: > >
: > > From another entry in the forum, it appears that the information about
: > > the hits that I have skipped over is still kept even though I don't
: > > access them. Am I understanding it correctly that if I start accessing
: > > from the 400000th document onwards, some information about documents
: > > 0-399999 is still cached even though I have skipped over them?
: > > Is there a way to get the file name (and perhaps other information)
: > > of the remaining documents?
: > >
: > > (I tried a different term query that returned a hit size of 400000,
: > > and I was able to get the names of them all without re-querying.)
: > >
: > > I think I saw someone mention clearing the hit cache,
: > > though I don't know how this is done.
: > >
: > > Thank you in advance for any hints on dealing with this.
: >
: > ---------------------------------------------------------------------
: > To unsubscribe, e-mail: [EMAIL PROTECTED]
: > For additional commands, e-mail: [EMAIL PROTECTED]
:
: -Hoss
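[Editor's note: for readers landing on this archived thread, here is the pattern the linked messages recommend. This is a hedged sketch against the Lucene 1.4.x API, not code from the thread: the Hits class caches a Hit object for every document accessed (and may re-run the search as you page deeper), whereas a HitCollector is handed raw document ids and keeps nothing, so memory stays flat regardless of hit count. The index path "indexDir" and the field names "contents"/"filename" are placeholders taken from the poster's example.]

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class CollectAllFilenames {
    public static void main(String[] args) throws IOException {
        final IndexReader reader = IndexReader.open("indexDir");
        IndexSearcher searcher = new IndexSearcher(reader);
        Query query = new TermQuery(new Term("contents", "lucene"));

        // search(Query, HitCollector) streams every matching doc id to
        // collect() without building the Hits cache that caused the OOM.
        searcher.search(query, new HitCollector() {
            public void collect(int doc, float score) {
                try {
                    // Load only the stored field we need, one doc at a time;
                    // nothing is retained between calls.
                    String name = reader.document(doc).get("filename");
                    System.out.println(name);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        });

        searcher.close();
        reader.close();
    }
}
```

(Note that scoring is still computed here; the first linked thread discusses skipping score normalization entirely when ordering by relevance is not needed.)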