Is there any guidance on when it's appropriate to create an
Analyzer class, as opposed to just Tokenizer and TokenFilter classes?

The advantage of exposing the constituent elements is that the
consuming application can add more filters of its own. The only
disadvantage I see is that the following is a bit verbose. Is there
some advantage or use of an Analyzer class that I'm missing?

private Analyzer newAnalyzer() {
    return new Analyzer() {
        @Override
        protected TokenStreamComponents createComponents(String fieldName,
                                                         Reader reader) {
            Tokenizer source =
                tokenizerFactory.create(reader, LanguageCode.JAPANESE);
            com.basistech.rosette.bl.Analyzer rblAnalyzer;
            try {
                rblAnalyzer = analyzerFactory.create(LanguageCode.JAPANESE);
            } catch (IOException e) {
                throw new RuntimeException("Error creating RBL analyzer", e);
            }
            BaseLinguisticsTokenFilter filter =
                new BaseLinguisticsTokenFilter(source, rblAnalyzer);
            return new TokenStreamComponents(source, filter);
        }
    };
}

