I've had some success limiting batches by document size, and indexing the largest documents one at a time works OK with a 2G heap. I'm still hoping to understand why memory usage is so high in the first place, or whether this is simply expected.
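For what it's worth, here is a rough sketch of what I'm doing now. The paths, field names, and 50 MB "large document" threshold are made up, and it assumes Lucene 5.x-style APIs; the Reader-based body field and the small RAM buffer are things I'm experimenting with, not anything settled. Each file becomes one document, and the writer commits after any file over the threshold so buffered state can be reclaimed before the next big one.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class LargeDocIndexer {

    // Arbitrary threshold: treat anything over ~50 MB as a "large" document.
    private static final long LARGE_DOC_BYTES = 50L * 1024 * 1024;

    public static void main(String[] args) throws IOException {
        Path indexPath = Paths.get("index");                         // hypothetical index dir
        List<Path> files = Arrays.asList(Paths.get("docs/contract.txt")); // hypothetical input

        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        cfg.setRAMBufferSizeMB(64);   // keep the indexing buffer modest so it flushes sooner

        try (IndexWriter writer = new IndexWriter(FSDirectory.open(indexPath), cfg)) {
            for (Path file : files) {
                long size = Files.size(file);

                Document doc = new Document();
                doc.add(new StringField("path", file.toString(), Field.Store.YES));
                // Feed the body through a Reader so the whole text is streamed to the
                // analyzer rather than held as one giant String (not stored, of course).
                doc.add(new TextField("body", Files.newBufferedReader(file)));

                writer.addDocument(doc);

                if (size > LARGE_DOC_BYTES) {
                    // After each very large document, commit so the writer's buffered
                    // state can be reclaimed before the next one is indexed.
                    writer.commit();
                }
            }
        }
    }
}

The per-large-document commit is the "one at a time" part that seems to keep things inside 2G here; whether the Reader and the smaller RAM buffer actually help is exactly what I'm still trying to pin down.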
I agree that indexing 100+MB of text is a bit silly, but the use case is a legal context where you need to be able to find, and eventually read, every document matching a query (even if it is 100+MB).

Thanks Erick!