On Tue, Nov 20, 2012 at 6:26 AM, Carsten Schnober wrote:
>
> Thanks, Uwe!
> I think what changed in comparison to Lucene 3.6 is that reset() is
> called upon initialization, too, instead of after processing the first
> document only, right?
There is no such change: this step was always mandator
On 20.11.2012 10:22, Uwe Schindler wrote:
Hi,
> The createComponents() method of Analyzers is only called *once* for each
> thread and the Tokenstream is *reused* for later documents. The Analyzer will
> call the final method Tokenizer#setReader() to notify the Tokenizer of a new
> Reader (t
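
To illustrate the reuse contract described above, here is a minimal sketch in plain Java. It deliberately does not use Lucene: SketchTokenizer, ReuseSketch, next() and tokenize() are hypothetical names invented for this example, and only setReader() and reset() mirror the roles of the Lucene methods mentioned in this thread. The assumption is that one tokenizer instance is created once and then handed a new Reader per document, so reset() has to clear every piece of per-document state before each document, including the first.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reusable tokenizer; this is NOT Lucene's Tokenizer class.
class SketchTokenizer {
    private Reader input;     // replaced for every new document
    private String[] tokens;  // per-document state
    private int position;     // per-document cursor

    // Plays the role of Tokenizer#setReader(): hands over the next document.
    void setReader(Reader reader) {
        this.input = reader;
    }

    // Rebuilds all per-document state from the current reader.
    // If the cursor were not rewound here, every document after the first
    // would appear to contain no further tokens.
    void reset() {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[256];
        try {
            int n;
            while ((n = input.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String trimmed = text.toString().trim();
        tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        position = 0;
    }

    // Returns the next whitespace-separated token, or null when the document is exhausted.
    String next() {
        return position < tokens.length ? tokens[position++] : null;
    }
}

class ReuseSketch {
    // Consumer side: new reader, then reset(), then iterate until exhausted.
    static List<String> tokenize(SketchTokenizer tokenizer, String document) {
        tokenizer.setReader(new StringReader(document));
        tokenizer.reset();
        List<String> out = new ArrayList<>();
        for (String tok = tokenizer.next(); tok != null; tok = tokenizer.next()) {
            out.add(tok);
        }
        return out;
    }

    public static void main(String[] args) {
        // One shared instance, analogous to the single per-thread instance.
        SketchTokenizer shared = new SketchTokenizer();
        System.out.println(tokenize(shared, "first document"));       // [first, document]
        System.out.println(tokenize(shared, "second document here")); // [second, document, here]
    }
}
```

In a real Lucene Tokenizer the same idea would apply to whatever fields the implementation keeps between calls to incrementToken(); which fields those are depends on the custom tokenizer, which is not shown in this thread.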
er
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Carsten Schnober [mailto:schno...@ids-mannheim.de]
> Sent: Tuesday, November 20, 2012 10:15 AM
> To: java-user@lucene.apache.org
> Subject: Re: TokenStr
On 19.11.2012 17:44, Carsten Schnober wrote:
Hi,
> However, after switching to Lucene 4 and TokenStreamComponents, I'm
> getting a strange behaviour: only the first document in the collection
> is tokenized properly. The others do appear in the index, but
> un-tokenized, although I have tried n
On 19.11.2012 17:44, Carsten Schnober wrote:
Hi again,
just a little update:
> However, after switching to Lucene 4 and TokenStreamComponents, I'm
> getting a strange behaviour: only the first document in the collection
> is tokenized properly. The others do appear in the index, but
> un-tokeni