On Fri, Jul 15, 2011 at 4:45 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> Hi,
>
>> The crappy thing is that to actually detect if there are any tokens in the
>> field you need to make a TokenStream which can be used to read the first
>> token and then rewind again.  I'm not sure if there is such a thing in
>> Lucene at the moment.  We had to write it ourselves but we were on a
>> considerably older version at the time.
>
> CachingTokenFilter plugged over any other TokenStream.

Ah, quite right.  If you can afford the memory it will eat (or if your
documents are all relatively small), CachingTokenFilter will work, since
it caches every token in the stream.  I think in our case it caused an
OutOfMemoryError (OOME) on larger character streams, which is why we
ended up falling back to a filter of our own that cached only the first
token.
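
For anyone wanting the general shape of the "cache only the first token"
idea, here is a rough sketch.  It is not our actual code and it does not
use the Lucene API: the Supplier<String> is a hypothetical stand-in for a
TokenStream (returning null where incrementToken() would return false),
and FirstTokenBuffer is a name made up for illustration.  The point is
only that checking for emptiness needs to hold on to one token, not the
whole stream.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/**
 * Hypothetical illustration, not a Lucene class. Wraps a consuming token
 * source and buffers only its first token, so a caller can ask whether the
 * source is empty and still read every token afterwards.
 */
final class FirstTokenBuffer {
    private final Supplier<String> source; // stand-in for a TokenStream; null means end of stream
    private boolean peeked;                // true once the first token has been pulled from source
    private boolean empty;                 // true if the source produced no tokens at all
    private String pending;                // buffered first token that has not been handed out yet

    FirstTokenBuffer(Supplier<String> source) {
        this.source = source;
    }

    /** Pulls the first token from the source exactly once and remembers it. */
    private void peek() {
        if (!peeked) {
            pending = source.get();
            empty = (pending == null);
            peeked = true;
        }
    }

    /** Returns true if the underlying source had no tokens. */
    boolean isEmpty() {
        peek();
        return empty;
    }

    /** Returns the next token, replaying the buffered first token if needed, or null at the end. */
    String next() {
        peek();
        if (empty) {
            return null;
        }
        if (pending != null) {
            String token = pending;
            pending = null; // release the buffered token; memory use stays at one token
            return token;
        }
        return source.get();
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("quick", "brown", "fox").iterator();
        FirstTokenBuffer tokens = new FirstTokenBuffer(() -> it.hasNext() ? it.next() : null);

        System.out.println("empty? " + tokens.isEmpty());
        for (String t = tokens.next(); t != null; t = tokens.next()) {
            System.out.println(t);
        }
    }
}
```

In a real Lucene filter the thing to buffer would be the captured
attribute state of the first token rather than a String, but the
bookkeeping should be about the same.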

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
