Ugh, indeed FieldInfos fails to properly read 2.3.x indices if the
field name contains non-ascii characters.  I'll open an issue, make a
test case and work out a fix.  Hmm.

Thanks for raising this!

Mike

On Tue, Apr 28, 2009 at 7:53 AM, Mike Streeton
<mike.stree...@connexica.com> wrote:
> I have an index that works fine on Lucene 2.3.2 but fails to open in 2.4.1, 
> it always fails with an Read past EOF. The index does contain some field 
> names with german umlaut characters in
>
> Any ideas?
>
> Many Thanks
>
> Mike
>
> CheckIndex v2.3.2
>
>
> NOTE: testing will be more thorough if you run java with 
> '-ea:org.apache.lucene', so assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE 
> [Lucene 2.3]
>  1 of 1: name=_3 docCount=235535
>    compound=true
>    numFiles=1
>    size (MB)=301.684
>    no deletions
>    test: open reader.........OK
>    test: fields, norms.......OK [70 fields]
>    test: terms, freq, prox...OK [1475862 terms; 25448796 terms/docs pairs; 
> 28642994 tokens]
>    test: stored fields.......OK [13560464 total field count; avg 57.573 
> fields per doc]
>    test: term vectors........OK [0 total vector count; avg 0 term/freq vector 
> fields per doc]
>
> No problems were detected with this index.
>
> CheckIndex v2.4.1
>
>
> NOTE: testing will be more thorough if you run java with 
> '-ea:org.apache.lucene...', so assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE 
> [Lucene 2.3]
>  1 of 1: name=_3 docCount=235535
>    compound=true
>    hasProx=true
>    numFiles=1
>    size (MB)=301.684
>    no deletions
>    test: open reader.........FAILED
>    WARNING: fixIndex() would remove reference to this segment; full exception:
> java.io.IOException: read past EOF
>      at org.apache.lucene.store.BufferedIndexInput.refill(Unknown Source)
>      at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>      at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>      at org.apache.lucene.store.IndexInput.readString(Unknown Source)
>      at org.apache.lucene.index.FieldInfos.read(Unknown Source)
>      at org.apache.lucene.index.FieldInfos.<init>(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.initialize(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>      at org.apache.lucene.index.CheckIndex.checkIndex(Unknown Source)
>      at org.apache.lucene.index.CheckIndex.main(Unknown Source)
>
> WARNING: 1 broken segments (containing 235535 documents) detected
> WARNING: would write new segments file, and 235535 documents would be lost, 
> if -fix were specified
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to