Hello Koji,

Thanks for your kind reply.

Yes, I used QueryParser. Normally I use the Query = QueryParser.parse( ) method.

I put your sample code into the lia.analysis.i18n package in LuceneAction
and ran JapaneseDemo using both 1.4 and 1.9. The results are:

     [echo] Running lia.analysis.i18n.JapaneseDemo...
     [java] query = content:ラ?メン屋

I can't get any hits.

For Korean:

     [echo] Running lia.analysis.i18n.KoreanDemo...
     [java] phrase = 경
     [java] query = 

I can't get a parsed query at all.

Thanks,

Youngho

----- Original Message -----
From: "Koji Sekiguchi" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>; "Youngho Cho" <[EMAIL PROTECTED]>
Sent: Thursday, October 27, 2005 9:48 AM
Subject: RE: korean and lucene

> Hi Youngho,
>
> With regard to Japanese, using StandardAnalyzer,
> I can search a word/phrase.
>
> Did you use QueryParser? StandardAnalyzer tokenizes
> CJK characters into a stream of single characters.
> Use QueryParser to get a PhraseQuery and search with that query.
>
> Please see the following sample code. Replace the Japanese
> "contents" and (search target) "phrase" with Korean in the program and run it.
>
> regards,
>
> Koji
>
> =============================================
> import java.io.IOException;
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.standard.StandardAnalyzer;
> import org.apache.lucene.analysis.cjk.CJKAnalyzer;
> import org.apache.lucene.store.Directory;
> import org.apache.lucene.store.RAMDirectory;
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.Hits;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.queryParser.QueryParser;
> import org.apache.lucene.queryParser.ParseException;
>
> public class JapaneseByStandardAnalyzer {
>
>   private static final String FIELD_CONTENT = "content";
>   private static final String[] contents = {
>     "東京にはおいしいラーメン屋がたくさんあります。",
>     "北海道にもおいしいラーメン屋があります。"
>   };
>   private static final String phrase = "ラーメン屋";
>   //private static final String phrase = "屋";
>   private static Analyzer analyzer = null;
>
>   public static void main( String[] args ) throws IOException,
>       ParseException {
>     Directory directory = makeIndex();
>     search( directory );
>     directory.close();
>   }
>
>   private static Analyzer getAnalyzer(){
>     if( analyzer == null ){
>       analyzer = new StandardAnalyzer();
>       //analyzer = new CJKAnalyzer();
>     }
>     return analyzer;
>   }
>
>   private static Directory makeIndex() throws IOException {
>     Directory directory = new RAMDirectory();
>     IndexWriter writer = new IndexWriter( directory, getAnalyzer(), true );
>     for( int i = 0; i < contents.length; i++ ){
>       Document doc = new Document();
>       doc.add( new Field( FIELD_CONTENT, contents[i], Field.Store.YES,
>           Field.Index.TOKENIZED ) );
>       writer.addDocument( doc );
>     }
>     writer.close();
>     return directory;
>   }
>
>   private static void search( Directory directory ) throws IOException,
>       ParseException {
>     IndexSearcher searcher = new IndexSearcher( directory );
>     QueryParser parser = new QueryParser( FIELD_CONTENT, getAnalyzer() );
>     Query query = parser.parse( phrase );
>     System.out.println( "query = " + query );
>     Hits hits = searcher.search( query );
>     for( int i = 0; i < hits.length(); i++ )
>       System.out.println( "doc = " + hits.doc( i ).get( FIELD_CONTENT ) );
>     searcher.close();
>   }
> }
>
> > -----Original Message-----
> > From: Youngho Cho [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, October 27, 2005 8:18 AM
> > To: java-user@lucene.apache.org; Cheolgoo Kang
> > Subject: Re: korean and lucene
> >
> > Hello Cheolgoo,
> >
> > I have now updated my Lucene version to 1.9 in order to use
> > StandardAnalyzer for Korean, and tested your patch, which has
> > already been adopted in 1.9:
> >
> > http://issues.apache.org/jira/browse/LUCENE-444
> >
> > But I still don't get good results with Korean compared with CJKAnalyzer.
> >
> > A single character matches well, but a word of two or more
> > characters doesn't match at all.
> >
> > Am I missing something, or is more work still needed?
> >
> > Thanks,
> >
> > Youngho.
> >
> > ----- Original Message -----
> > From: "Cheolgoo Kang" <[EMAIL PROTECTED]>
> > To: <java-user@lucene.apache.org>; "John Wang" <[EMAIL PROTECTED]>
> > Sent: Tuesday, October 04, 2005 10:11 AM
> > Subject: Re: korean and lucene
> >
> > > StandardAnalyzer's JavaCC-based StandardTokenizer.jj cannot read
> > > the Korean part of the Unicode character blocks.
> > >
> > > You should 1) use CJKAnalyzer or 2) add the Korean character
> > > block (0xAC00~0xD7AF) to the CJK token definition in the
> > > StandardTokenizer.jj file.
> > >
> > > Hope it helps.
> > >
> > > On 10/4/05, John Wang <[EMAIL PROTECTED]> wrote:
> > > > Hi:
> > > >
> > > > We are running into problems with searching Korean documents.
> > > > We are using the StandardAnalyzer and everything works with
> > > > Chinese and Japanese.
> > > > Are there known problems with Korean in Lucene?
> > > >
> > > > Thanks
> > > >
> > > > -John
> > > >
> > >
> > > --
> > > Cheolgoo
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
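
For reference, the Korean range quoted above (0xAC00~0xD7AF) can be checked in plain Java without Lucene. This is only a rough sketch: the class HangulRangeCheck and its methods are hypothetical names made up for illustration and are not part of Lucene or of the LUCENE-444 patch. It shows only which characters fall inside the quoted range, not how StandardTokenizer.jj actually handles them.

```java
public class HangulRangeCheck {

    // Bounds of the Korean character block as quoted in the thread.
    static final char KOREAN_BLOCK_START = 0xAC00;
    static final char KOREAN_BLOCK_END = 0xD7AF;

    // True if c lies inside the quoted range (both ends inclusive).
    static boolean isInKoreanBlock(char c) {
        return c >= KOREAN_BLOCK_START && c <= KOREAN_BLOCK_END;
    }

    // Counts how many characters of text fall inside the quoted range.
    static int countKorean(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isInKoreanBlock(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // U+ACBD is the one-character Korean test phrase from the thread.
        String korean = "\uACBD";
        // The five-character Japanese test phrase from the sample code,
        // written with escapes so the source file stays pure ASCII.
        String japanese = "\u30E9\u30FC\u30E1\u30F3\u5C4B";

        System.out.println("korean chars in range:   "
                + countKorean(korean) + " of " + korean.length());
        System.out.println("japanese chars in range: "
                + countKorean(japanese) + " of " + japanese.length());

        // The JDK's own block lookup, shown for comparison.
        System.out.println("JDK block of U+ACBD: "
                + Character.UnicodeBlock.of(korean.charAt(0)));
    }
}
```

Writing the test strings as Unicode escapes keeps the check independent of the source file encoding used by the compiler.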