Re: Customer TokenFilter

tsuraan Thu, 27 May 2010 14:38:13 -0700

> Looks correct! Wrapping by CharBuffer is very intelligent! In Lucene 3.1 the
> new Term Attribute will implement CharSequence, then its even simplier. You
> may also look at 3.1's ICU contrib that has support even for Normalizer2.


Ok, I've only been looking at 3.0.1 so far; I'll check out the 3.1
work and see what's there.

> Overriding StandardAnalyzer is the wrong way, as in 3.1 its final (its only
> accidentially non-final)! You should always extend Analyzer and build the
> chain yourself. Or wrap another analyzer like in PerFieldAnalyzerWrapper.
> The whole Tokenization API in Lucene is based on the delegator pattern, so
> never extend any class not made for it!

Ok, I've changed it to just borrowing the guts of the StandardAnalyzer
rather than inheriting.  I also think I got the reuasableTokenStream
method correct this way, so that's a bonus.  Care to have another look
at it?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Customer TokenFilter

Reply via email to