The 3.x and 4.0 Solr releases have nice analyzers just for Japanese. In 4.0
they are in the "Kuromoji" package.
In 4.0, the JapaneseAnalyzer probably does what you need:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-analyzers-kuromoji/4.0.0/org/apache/lucene/analysis/ja/JapaneseAnalyzer.java?av=f
3.6 also has the Kuromoji package, but I don't know how advanced it is
compared to the 4.x version.
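A minimal sketch of how using it might look, assuming Lucene 4.0 with the
lucene-analyzers-kuromoji jar on the classpath. The class name
JapaneseTokens, the field name "keywords" and the sample text are just
placeholders I made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical helper class, not part of Lucene.
class JapaneseTokens {

    // Runs the text through JapaneseAnalyzer and collects the emitted terms.
    static List<String> tokenize(String text) throws IOException {
        Analyzer analyzer = new JapaneseAnalyzer(Version.LUCENE_40);
        List<String> terms = new ArrayList<String>();
        // "keywords" is an arbitrary field name chosen for this sketch.
        TokenStream stream = analyzer.tokenStream("keywords", new StringReader(text));
        try {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();                     // required before the first incrementToken()
            while (stream.incrementToken()) {
                terms.add(term.toString());
            }
            stream.end();
        } finally {
            stream.close();
            analyzer.close();
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        // Prints whatever terms the analyzer produces for the sample text.
        System.out.println(tokenize("関西国際空港で寿司を食べたい"));
    }
}
```

Printing the terms for a few of your own keywords is a quick way to compare
against what StandardAnalyzer gives you. On the stop words question,
JapaneseAnalyzer already applies a default stop set, which you can look at
via JapaneseAnalyzer.getDefaultStopSet().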
Cheers!
On 01/10/2013 11:19 AM, saisantoshi wrote:
We are using StandardAnalyzer for indexing some Japanese keywords. It works
fine so far, but we wanted to confirm whether StandardAnalyzer fully
supports Japanese (I read somewhere in the Lucene in Action book that
StandardAnalyzer does support CJK). Is my understanding correct, or do we
need to use a specific analyzer for processing Japanese keywords?
Alternatively, is there a stop word list for the Japanese language that we
could add as an extra filter to the StandardAnalyzer?
Any thoughts on this are much appreciated.
Thanks,
Sai.