Index Field fed from Reader that also stores cleartext

Gregor Dorfbauer Fri, 03 Sep 2010 02:24:13 -0700

Hi!

I'm working on an indexer that processes documents on disk, which can be of arbitrary size and type. I use Apache Tika for plain-text extraction; it can stream the parser's output through a Reader.


My problem is the following:

Is it possible to create a document field that gets its data from a Reader instance and also stores the plain text in the index (as Store.YES does)? If I can't stream the data, memory usage exceeds the limits of my machine.
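To make the memory concern concrete, here is a minimal sketch in plain Java. It deliberately uses no Lucene or Tika calls, and the class and method names are hypothetical, made up for illustration only. It contrasts consuming a Reader in a streaming fashion (only one fixed-size buffer is live at a time) with materializing the whole text as a String, which is roughly what seems to be needed before a value can be stored.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration class; not part of Lucene or Tika.
public class ReaderMemorySketch {

    // Streaming consumption: memory use is bounded by the buffer size,
    // no matter how long the extracted text is.
    public static long consumeStreaming(Reader reader) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            total += n; // a real consumer would process buffer[0..n) here
        }
        return total;
    }

    // Materializing: the complete text is accumulated in memory,
    // so memory use grows with the size of the document.
    public static String readFully(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the Reader that the text extraction returns.
        long streamed = consumeStreaming(new StringReader("some extracted text"));
        String whole = readFully(new StringReader("some extracted text"));
        System.out.println(streamed + " chars streamed, "
                + whole.length() + " chars held in memory");
    }
}
```

With a Reader I get the first behaviour, but as far as I can tell the stored value requires the second, which is what breaks down for very large documents.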



Thanks for your help,
Gregor
