The source data for my index is already in standard UTF-8 and available as a
simple byte array. I need to do some simple tokenization of the data (check
for whitespace and special characters that control position increment). What
is the most efficient way to index this data and avoid unnecessary
conversions?
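For concreteness, here is a minimal stdlib-only sketch of the decode-then-tokenize step in question (the class and method names are illustrative, not a real Lucene API; an actual indexer would feed these tokens into Lucene's Analyzer/TokenStream machinery):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ByteTokenizer {
    // Decode the UTF-8 byte array once, then scan for whitespace
    // boundaries. This is the "simple tokenization" described above;
    // special position-increment characters would be handled in the
    // same boundary check.
    static List<String> tokenize(byte[] utf8) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        List<String> tokens = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i <= text.length(); i++) {
            boolean boundary =
                i == text.length() || Character.isWhitespace(text.charAt(i));
            if (boundary) {
                if (start >= 0) {
                    tokens.add(text.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i; // token begins here
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        byte[] data = "hello  world".getBytes(StandardCharsets.UTF_8);
        System.out.println(ByteTokenizer.tokenize(data)); // [hello, world]
    }
}
```

The single decode up front is hard to avoid: as the reply below notes, Lucene works on Java Strings internally, so the byte array has to be converted to chars at some point regardless.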
Subject: Re: encoding question.
Internally Lucene deals with pure Java Strings; when writing those strings
to and reading those strings back from disk, Lucene always uses the stock
Java "modified UTF-8" format, regardless of what your file.encoding
system property may be.
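The practical difference is visible with the standard library alone: Java's "modified UTF-8" (what DataOutputStream.writeUTF emits) encodes the NUL character as the two bytes 0xC0 0x80 instead of a single 0x00, and prefixes the data with a two-byte length. A small demonstration, assuming nothing beyond java.io:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ModifiedUtf8Demo {
    public static void main(String[] args) throws IOException {
        String s = "A\u0000"; // 'A' followed by a NUL character

        // Standard UTF-8: NUL encodes as a single 0x00 byte -> 2 bytes total.
        byte[] standard = s.getBytes("UTF-8");

        // Modified UTF-8 via writeUTF: 2-byte length prefix, then 'A',
        // then NUL as the overlong pair 0xC0 0x80 -> 5 bytes total.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeUTF(s);
        byte[] modified = bos.toByteArray();

        System.out.println(standard.length); // 2
        System.out.println(modified.length); // 5
    }
}
```

So bytes produced by an external UTF-8 writer are not guaranteed to be byte-identical to what Java serializes, even for the "same" string; only the String contents are equivalent.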
Typically when people have encoding problems in their