There have been some earlier messages in this thread about the memory consumption of Lucene Documents on 64-bit JVMs (roughly double that of 32-bit, because of the larger pointer size). We expect the index to grow very large, and we may end up maintaining more than one index with different analyzers for the same data set. Hence we are concerned about the index size as well. If there are ways to overcome that overhead, we're game for the 64-bit version as well :-)
Any ideas?

Thanks and regards,
Sithu Sudarsan
Graduate Research Assistant, UALR &
Visiting Researcher, CDRH/OSEL
[EMAIL PROTECTED]
[EMAIL PROTECTED]

-----Original Message-----
From: Toke Eskildsen [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 24, 2008 10:43 AM
To: java-user@lucene.apache.org
Subject: RE: Multi-threaded indexing of large number of PDF documents

On Fri, 2008-10-24 at 16:01 +0200, Sudarsan, Sithu D. wrote:
> 4. We've tried using larger JVM space by defining -Xms1800m and
> -Xmx1800m, but it runs out of memory. Only -Xms1080m and -Xmx1080m
> seem stable. That is strange, as we have 32 GB of RAM and 34 GB of
> swap space. Typically no other application is running. However, the
> CentOS version is 32-bit. The Ungava project seems to be using 64-bit.

The <2GB heap limit for Java is a known problem under Windows. I don't
know about CentOS, but from your description it seems that the problem
exists on that platform too. Either way, you'll never get above 4GB for
Java when you're running 32-bit. Might I ask why you're not using
64-bit on a 32GB machine?

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
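As a quick sanity check (not from the original thread), a small Java program can report whether the running JVM is 32- or 64-bit and how large a heap it actually obtained. Note that `sun.arch.data.model` is a Sun/Oracle-specific property and may be absent on other JVMs, so this sketch falls back to `os.arch`:

```java
public class JvmCheck {
    public static void main(String[] args) {
        // Sun/Oracle JVMs report the pointer width here ("32" or "64");
        // other JVMs may not set this property, so fall back to os.arch.
        String dataModel = System.getProperty("sun.arch.data.model",
                System.getProperty("os.arch"));
        // Maximum heap the JVM will attempt to use (reflects -Xmx).
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("JVM data model: " + dataModel);
        System.out.println("Max heap (MB): " + maxHeapMb);
    }
}
```

Running it as, e.g., `java -Xmx1800m JvmCheck` also tests the limit directly: on a 32-bit JVM the launch itself fails if the requested heap cannot be reserved as one contiguous region of the process address space, which matches the ~1080m ceiling described above.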