Re: writeChars method in IndexOutput

2006-03-30 Thread Yonik Seeley
[EMAIL PROTECTED]
> Sent: Thursday, March 30, 2006 11:56 AM
> To: java-user@lucene.apache.org
> Subject: Re: writeChars method in IndexOutput
>
> Lucene doesn't currently output totally valid UTF-8
> Patches to make it do so are here:
> http://www.mail-archive.com/java-dev@l

RE: writeChars method in IndexOutput

2006-03-30 Thread Dennis Kubes
Is this modified UTF-8, such as is found in the DataInput interface?

Dennis

-----Original Message-----
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 30, 2006 11:56 AM
To: java-user@lucene.apache.org
Subject: Re: writeChars method in IndexOutput

Lucene doesn't currently o

Re: writeChars method in IndexOutput

2006-03-30 Thread Yonik Seeley
Lucene doesn't currently output totally valid UTF-8
Patches to make it do so are here:
http://www.mail-archive.com/java-dev@lucene.apache.org/msg01987.html

Should this be tackled pre or post 2.0?

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server

On 3/30/06, Denni

writeChars method in IndexOutput

2006-03-30 Thread Dennis Kubes
I was reading up on conversion of characters to UTF-8 and I now understand why it is writing out UTF-8 (to be able to support most of the world's languages with minimal space?). But after reading up on the algorithms for conversion as given below, does the writeChars method not support the U+1→U
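
For readers following the thread, here is a minimal sketch of the difference between standard UTF-8 and the "modified UTF-8" that the JDK documents for DataInput/DataOutput. It is not Lucene's writeChars code: it uses java.io.DataOutputStream.writeUTF as a stand-in for a modified UTF-8 writer, and the class and helper names (Utf8Comparison, standardUtf8, modifiedUtf8, hex) are made up for illustration. Whether Lucene's output matches the JDK's modified form byte for byte is an assumption this sketch does not check.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Lucene.
final class Utf8Comparison {

    // Standard UTF-8 as produced by the JDK charset encoder.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF,
    // with the 2-byte length prefix that writeUTF emits removed.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream and short strings.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Render bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                                   // U+0041, 1 byte in both forms
            "\u00e9",                              // U+00E9, 2 bytes in both forms
            "\u20ac",                              // U+20AC, 3 bytes in both forms
            "\0",                                  // U+0000, differs between the forms
            new String(Character.toChars(0x1D11E)) // supplementary character, differs
        };
        for (String s : samples) {
            System.out.printf("U+%04X  standard: %-12s modified: %s%n",
                    s.codePointAt(0), hex(standardUtf8(s)), hex(modifiedUtf8(s)));
        }
    }
}
```

The two forms agree for U+0001 through U+FFFF (surrogates aside). They differ for U+0000, which standard UTF-8 writes as one byte and the modified form as two, and for characters above U+FFFF, which standard UTF-8 writes as one 4-byte sequence and the modified form as two 3-byte sequences, one per UTF-16 surrogate.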