Hi!

I'm working on an indexer that should process documents on disk of arbitrary size and type. For plain-text extraction I use Apache Tika, which can stream the parser's output through a Reader.

My problem is the following:
Is there a way to create a document field that gets its data from a Reader instance and also stores the plain text in the index (as Store.YES does)? If I can't stream the data, memory usage exceeds the limits of my machine.


Thanks for your help,
Gregor

