Ugh, indeed: FieldInfos fails to properly read 2.3.x indices if a field name contains non-ASCII characters. I'll open an issue, make a test case and work out a fix. Hmm.
Thanks for raising this!

Mike

On Tue, Apr 28, 2009 at 7:53 AM, Mike Streeton <mike.stree...@connexica.com> wrote:
> I have an index that works fine on Lucene 2.3.2 but fails to open in 2.4.1,
> it always fails with an Read past EOF. The index does contain some field
> names with german umlaut characters in
>
> Any ideas?
>
> Many Thanks
>
> Mike
>
> CheckIndex v2.3.2
>
> NOTE: testing will be more thorough if you run java with
> '-ea:org.apache.lucene', so assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE
> [Lucene 2.3]
>   1 of 1: name=_3 docCount=235535
>     compound=true
>     numFiles=1
>     size (MB)=301.684
>     no deletions
>     test: open reader.........OK
>     test: fields, norms.......OK [70 fields]
>     test: terms, freq, prox...OK [1475862 terms; 25448796 terms/docs pairs;
> 28642994 tokens]
>     test: stored fields.......OK [13560464 total field count; avg 57.573
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq vector
> fields per doc]
>
> No problems were detected with this index.
>
> CheckIndex v2.4.1
>
> NOTE: testing will be more thorough if you run java with
> '-ea:org.apache.lucene...', so assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE
> [Lucene 2.3]
>   1 of 1: name=_3 docCount=235535
>     compound=true
>     hasProx=true
>     numFiles=1
>     size (MB)=301.684
>     no deletions
>     test: open reader.........FAILED
>     WARNING: fixIndex() would remove reference to this segment; full exception:
> java.io.IOException: read past EOF
>         at org.apache.lucene.store.BufferedIndexInput.refill(Unknown Source)
>         at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>         at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>         at org.apache.lucene.store.IndexInput.readString(Unknown Source)
>         at org.apache.lucene.index.FieldInfos.read(Unknown Source)
>         at org.apache.lucene.index.FieldInfos.<init>(Unknown Source)
>         at org.apache.lucene.index.SegmentReader.initialize(Unknown Source)
>         at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>         at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>         at org.apache.lucene.index.CheckIndex.checkIndex(Unknown Source)
>         at org.apache.lucene.index.CheckIndex.main(Unknown Source)
>
> WARNING: 1 broken segments (containing 235535 documents) detected
> WARNING: would write new segments file, and 235535 documents would be lost,
> if -fix were specified
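For anyone following along, here is a rough sketch of how a non-ASCII field name could end in "read past EOF" inside readString. It is only an illustration, not Lucene code. It assumes the older writer prefixed each string with its length in chars, while the newer reader treats that prefix as a length in bytes; nothing in this thread confirms that this is the actual cause. The class PrefixMismatchSketch, its encode/decode helpers and the one-byte length prefix are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
public class PrefixMismatchSketch {
    public static final String EOF_MARKER = "<read past EOF>";

    // Assumed "old" layout: prefix = number of Java chars, then the encoded bytes.
    // A single-byte prefix is a simplification that is fine for short field names.
    public static byte[] encode(String... names) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String name : names) {
            byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
            out.write(name.length());            // counts chars, not bytes
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Assumed "new" reader: interprets the prefix as a number of bytes.
    public static List<String> decode(byte[] data, int count) {
        List<String> names = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            for (int i = 0; i < count; i++) {
                byte[] buf = new byte[in.readUnsignedByte()];
                in.readFully(buf);               // EOFException if too few bytes remain
                names.add(new String(buf, StandardCharsets.UTF_8));
            }
        } catch (EOFException e) {
            names.add(EOF_MARKER);               // stand-in for "read past EOF"
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // ASCII only: char count equals byte count, so both names come back intact.
        System.out.println(decode(encode("titel", "autor"), 2));
        // Umlaut and sharp s take two bytes each: the first read stops short,
        // the next "length" is read from the middle of a character, and the
        // reader runs off the end of the data.
        System.out.println(decode(encode("gr\u00f6\u00dfe", "titel"), 2));
    }
}
```

With ASCII-only names the two interpretations of the prefix agree, which would fit the observation that such an index only breaks when a field name has umlauts in it. Once a name has a multi-byte character the reader is misaligned for everything that follows.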