Hi! I'm working on an indexer that should process documents on hard disk of arbitrary size and type. I use Apache Tika for plain-text extraction, which offers the option of streaming the parser's output through a Reader.
My problem is the following: is there a way to create a document field that gets its data from a Reader instance and also stores the plain text in the index (as Store.YES denotes)? If I can't stream the data, memory usage exceeds the limits of my machine.
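To make the memory constraint concrete, here is a minimal sketch in plain Java (no Lucene or Tika calls) of the kind of bounded-memory consumption I have in mind. The class name `ChunkedReaderSketch`, the method `forEachChunk`, and the chunk size are made up for illustration. In the real indexer the Reader would come from Tika, and how each chunk could end up stored in the index is exactly the part I'm asking about.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkedReaderSketch {

    // Hypothetical helper: reads the Reader in fixed-size chunks so that only
    // one chunk (chunkSize chars) is held in the buffer at any time.
    // Returns the number of chunks handed to the sink.
    static int forEachChunk(Reader reader, int chunkSize, Consumer<String> sink)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        char[] buffer = new char[chunkSize];
        int filled = 0;
        int chunks = 0;
        int n;
        // read() may return fewer chars than requested, so keep filling
        // the buffer until it is full or the stream ends.
        while ((n = reader.read(buffer, filled, chunkSize - filled)) != -1) {
            filled += n;
            if (filled == chunkSize) {
                sink.accept(new String(buffer, 0, filled));
                chunks++;
                filled = 0;
            }
        }
        // Emit the trailing partial chunk, if any.
        if (filled > 0) {
            sink.accept(new String(buffer, 0, filled));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the Reader that the text extraction would provide.
        Reader extractedText = new StringReader("abcdefghij");
        List<String> seen = new ArrayList<>();
        int count = forEachChunk(extractedText, 4, seen::add);
        System.out.println(count + " chunks: " + seen);
    }
}
```

Collecting the chunks in a list, as `main` does, is only for demonstration; with large documents the sink would have to pass each chunk on without keeping it.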
Thanks for your help, Gregor