Yes, in 4.x IndexWriter now takes an Iterable that enumerates the fields one at a time.
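A minimal, stdlib-only sketch of that one-field-at-a-time pattern (the `SimpleField` record and `consume` method below are hypothetical stand-ins for Lucene's IndexableField and IndexWriter.addDocument, so this compiles without the Lucene jars):

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.List;

public class LazyFields {
    // Hypothetical stand-in for Lucene's IndexableField.
    record SimpleField(String name, Reader value) {}

    // Lazily yields one field per next() call; a field's content (and its
    // Reader) is created only when the consumer asks for it, so no more
    // than one field needs to be materialized at a time.
    static Iterable<SimpleField> fields(List<String> names) {
        return () -> new Iterator<>() {
            int i = 0;
            public boolean hasNext() { return i < names.size(); }
            public SimpleField next() {
                String n = names.get(i++);
                return new SimpleField(n, new StringReader("content of " + n));
            }
        };
    }

    // Stand-in consumer playing the role of IndexWriter.addDocument,
    // which in 4.x accepts an Iterable<? extends IndexableField>.
    static int consume(Iterable<SimpleField> doc) {
        int count = 0;
        for (SimpleField f : doc) count++; // each field is GC-eligible after use
        return count;
    }

    public static void main(String[] args) {
        System.out.println(consume(fields(List.of("title", "body", "footnotes"))));
    }
}
```

With the real API you would hand the Iterable straight to addDocument and back each large field with a Reader (e.g. TextField's Reader constructor), so the field's text streams through analysis instead of living in a String.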
You can also pass a Reader to a Field. That said, IndexWriter will still need substantial RAM to hold the inverted postings for that one document, likely much more than the original document's String contents. And such huge documents are rarely useful in practice: how will you "deliver" that hit to the end user at search time? Will scores even make sense for such enormous documents? It's better to break them up into more manageable sizes.

Mike McCandless
http://blog.mikemccandless.com

On Thu, Feb 20, 2014 at 3:22 PM, Igor Shalyminov <ishalymi...@yandex-team.ru> wrote:
> Hello!
>
> I've run into a problem indexing huge documents. The indexing itself goes
> fine, but once document processing becomes concurrent, OutOfMemoryErrors
> start appearing (even with a heap of about 32GB).
> The issue, as I see it, is that I have to create a Document instance to send
> to IndexWriter, and a Document is just a collection of all the fields, all
> held in RAM.
> With my huge fields, it would be much better to be able to send document
> fields for writing one by one, keeping no more than a single field in RAM.
> Is this possible in the latest Lucene?
>
> --
> Best Regards,
> Igor Shalyminov
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> ---------------------------------------------------------------------