Paul, my frist guess would be that your source file encoding is set to something else than UTF-8. Those characters should be supported by lucene - none of them are > 16bit so I don't see why this should be caused by lucene.
I'm pretty sure thats a encoding issues. R u running on windows?! hope that helps, simon On Sat, Aug 22, 2009 at 1:17 PM, Paul Taylor<paul_t...@fastmail.fm> wrote: > public class Issue3341Test extends TestCase { > > public void testMatchHangul() throws Exception { > Analyzer analyzer = new StandardAnalyzer(); > RAMDirectory dir = new RAMDirectory(); > IndexWriter writer = new IndexWriter(dir, analyzer, true, > IndexWriter.MaxFieldLength.LIMITED); > Document doc = new Document(); > doc.add(new Field("name", "키드갱", Field.Store.YES, Field.Index.ANALYZED)); > writer.addDocument(doc); > writer.close(); > > IndexSearcher searcher = new IndexSearcher(dir,true); > Query q = new QueryParser("name", analyzer).parse("키드갱"); > System.out.println(q.toString()); > > > Hits hits = searcher.search(q); > assertEquals(1, hits.length()); > } > > } > > gives the following: > > org.apache.lucene.queryParser.ParseException: Cannot parse '???': '*' or '?' > not allowed as first character in WildcardQuery > at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:181) > at > org.musicbrainz.search.analysis.Issue3341Test.testMatchHangul(Issue3341Test.java:32) > > Why does the parser think its a wildcard. > (I'm just using the standard analyser, because the search could be performed > in any language, but the user doesnt specify the language so we don't know > what analyser to use. But thats okay I dont expect lucene to do anything > clever but I would expect a match when index and query are identical.) > > > thanks Paul > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org