I've had success limiting the number of documents by size, and indexing them
one at a time works OK with a 2 GB heap. I'm still hoping to understand why
memory usage is so high in the first place, or whether this is simply expected.
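For anyone following along, here is a minimal sketch of the size-based batching
I mean -- not my actual code, and it assumes a recent Lucene; the 100 MB cap,
the "body" field name, and the indexAll helper are just illustrative:

import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class SizeLimitedIndexer {
    // Illustrative cap on how much text is in flight before a commit.
    private static final long BATCH_BYTES = 100L * 1024 * 1024;

    public static void indexAll(List<String> texts) throws IOException {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        cfg.setRAMBufferSizeMB(64); // keep the in-memory indexing buffer modest
        try (FSDirectory dir = FSDirectory.open(Paths.get("index"));
             IndexWriter writer = new IndexWriter(dir, cfg)) {
            long pending = 0;
            for (String text : texts) {
                Document doc = new Document();
                doc.add(new TextField("body", text, Field.Store.YES));
                writer.addDocument(doc);
                pending += text.length();
                // Commit once the accumulated text exceeds the cap, so only a
                // bounded amount of newly indexed data is buffered at a time.
                if (pending >= BATCH_BYTES) {
                    writer.commit();
                    pending = 0;
                }
            }
            writer.commit();
        }
    }
}

Note that setRAMBufferSizeMB is only a flush trigger checked between documents,
not a hard cap, so a single very large document can still need far more
transient heap than the configured buffer -- which may be part of why indexing
one at a time still wants a 2 GB heap.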

I agree that indexing 100+ MB of text is a bit silly, but the use case is a
legal context where you need to be able to find, and eventually open, all of
the documents matching a query (even if they are 100+ MB).

Thanks Erick!
