Paul, my first guess would be that your source file encoding is set to
something other than UTF-8. Those characters should be supported by
Lucene - none of them are > 16 bit, so I don't see why this should be
caused by Lucene.
I'm pretty sure that's an encoding issue. Are you running on Windows?
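
To illustrate what I mean, here is a rough sketch in plain Java (no Lucene involved). The class name EncodingCheck and both helper methods are made up for this example, and ISO-8859-1 only stands in for whatever non-UTF-8 default your platform might be using, which I am guessing at. It checks that a Hangul string fits in 16-bit chars and shows how the same UTF-8 bytes turn into different characters when read with the wrong charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Lucene.
class EncodingCheck {

    // True if every code point is in the BMP, i.e. needs only one 16-bit char.
    public static boolean fitsIn16Bits(String s) {
        return s.codePointCount(0, s.length()) == s.length();
    }

    // Encode the text as UTF-8, then decode the bytes with another charset,
    // which is roughly what happens when a UTF-8 source file is compiled
    // under a different default encoding.
    public static String decodeAs(String text, Charset other) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, other);
    }

    public static void main(String[] args) {
        // Two Hangul syllables written as unicode escapes, so this file itself
        // does not depend on the source encoding.
        String hangul = "\uD55C\uAE00";

        System.out.println("fits in 16 bits: " + fitsIn16Bits(hangul));

        String garbled = decodeAs(hangul, StandardCharsets.ISO_8859_1);
        System.out.println("chars before: " + hangul.length()
                + ", after wrong decode: " + garbled.length());
        System.out.println("still equal: " + garbled.equals(hangul));
    }
}
```

If the wrongly decoded string is what ends up in your test, the query and the indexed text simply no longer contain the same characters. Compiling with javac -encoding UTF-8, or writing the literals as \u escapes as above, should rule that out.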
hope th
public class Issue3341Test extends TestCase {
    public void testMatchHangul() throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, analyzer, true,
                IndexWriter.MaxFieldLength.LIMITED);
        Document doc = new Doc