Re: Lucene OOM

2018-01-11 Thread dawn breaks
//www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: dawn breaks [mailto:2005dawnbre...@gmail.com]
> > Sent: Thursday, January 11, 2018 10:22 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: Lucene OOM
> >
> > Hi, Uwe &

RE: Lucene OOM

2018-01-11 Thread Uwe Schindler
Original Message-
> From: dawn breaks [mailto:2005dawnbre...@gmail.com]
> Sent: Thursday, January 11, 2018 10:22 AM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene OOM
>
> Hi, Uwe
> Thanks for your timely reply. Yes, those documents are huge text. We
> have t

Re: Lucene OOM

2018-01-11 Thread dawn breaks
Hi, Uwe
Thanks for your timely reply. Yes, those documents are huge text. We have ten indices, and each of them has an index size of approximately 75G on disk. The following is the directory content of one of the indices.
Thanks & Best Regards!
lubin
total 74G
-rw-r--r-- 1 root root 100K Jan 10 16:11 _2

RE: Lucene OOM

2018-01-11 Thread Uwe Schindler
Hi,
If the index size on disk is about 750 GiB, then a memory usage of 2.3 G of heap space for the FST seems fine. It's just a bit strange that you only have 10 million documents! Are those documents huge, with lots of indexed text content, possibly OCR/scanned stuff? If this is the case, the t
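The proportion discussed above can be checked with a little arithmetic. The following is only a rough sketch using the figures quoted in the thread (ten indices of about 75G each, 2.3 G of heap for the FST); the helper name `heap_fraction` is hypothetical, and units are treated loosely as GiB.

```python
def heap_fraction(heap_gib: float, index_gib: float) -> float:
    """Return heap usage as a fraction of the on-disk index size."""
    if index_gib <= 0:
        raise ValueError("index size must be positive")
    return heap_gib / index_gib


# Figures taken from the thread; treating "G" as GiB is an assumption.
index_count = 10
per_index_gib = 75
fst_heap_gib = 2.3

total_index_gib = index_count * per_index_gib
fraction = heap_fraction(fst_heap_gib, total_index_gib)

print(f"total index size: {total_index_gib} GiB")
print(f"FST heap as a share of index size: {fraction:.2%}")
```

Under these assumptions the FST heap is well below one percent of the on-disk index size, which is in line with the remark that the usage "seems fine".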