Index corruption after 'read past EOF' under heavy update load
--------------------------------------------------------------
Key: LUCENE-2729
URL: https://issues.apache.org/jira/browse/LUCENE-2729
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 3.0.2, 3.0.1
Environment: Happens on both OS X 10.6 and Windows 2008 Server.
Integrated with zoie.
Reporter: Nico Krijnen
We have a system running lucene and zoie. We use lucene as a content store for
a CMS/DAM system. We use the hot-backup feature of zoie to make scheduled
backups of the index. This works fine for small indexes and when there are not
a lot of changes to the index when the backup is made.
On large indexes (about 5 GB to 19 GB), when a backup is made while the index
is being changed a lot (lots of document additions and/or deletions), we almost
always get a 'read past EOF' at some point, followed by lots of 'Lock obtain
timed out'.
At that point we get lots of 0 KB files in the index, data gets lost, and the
index is unusable.
When we stop our server, remove the 0 KB files and restart our server, the index
is operational again, but data has been lost.
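The manual clean-up step above can be sketched roughly as follows. This is only an illustration, not part of lucene or zoie: the class name ZeroByteIndexFiles and its two methods are hypothetical helpers, the index directory is passed on the command line, and the code assumes Java 7 or later for java.nio.file. It should only be run while the server is stopped, and it does not bring back any lost data.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: finds (and optionally deletes) 0-byte files in an index directory.
public class ZeroByteIndexFiles {

    /** Returns the regular files directly inside indexDir whose size is 0 bytes, sorted by path. */
    static List<Path> findZeroByteFiles(Path indexDir) throws IOException {
        List<Path> empty = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indexDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && Files.size(entry) == 0L) {
                    empty.add(entry);
                }
            }
        }
        Collections.sort(empty);
        return empty;
    }

    /** Deletes the given files and returns how many were actually removed. */
    static int deleteAll(List<Path> files) throws IOException {
        int deleted = 0;
        for (Path file : files) {
            if (Files.deleteIfExists(file)) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: ZeroByteIndexFiles <indexDir> [--delete]");
            return;
        }
        Path indexDir = Paths.get(args[0]);
        boolean delete = args.length > 1 && "--delete".equals(args[1]);

        List<Path> empty = findZeroByteFiles(indexDir);
        for (Path file : empty) {
            System.out.println("0-byte file: " + file);
        }
        if (delete) {
            System.out.println("deleted " + deleteAll(empty) + " file(s)");
        }
    }
}
```

Without the --delete argument the sketch only lists the files, so the list can be checked before anything is removed.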
I'm not sure if this is a zoie or a lucene issue, so I'm posting it to both.
Hopefully someone has some ideas where to look to fix this.
Some more details...
Stack trace of the read past EOF and following Lock obtain timed out:
{code}
78307 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.BaseSearchIndex - read past EOF
java.io.IOException: read past EOF
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:166)
at org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:725)
at org.apache.lucene.index.IndexReader.commit(IndexReader.java:987)
at org.apache.lucene.index.IndexReader.commit(IndexReader.java:973)
at org.apache.lucene.index.IndexReader.decRef(IndexReader.java:162)
at org.apache.lucene.index.IndexReader.close(IndexReader.java:1003)
at proj.zoie.impl.indexing.internal.BaseSearchIndex.deleteDocs(BaseSearchIndex.java:203)
at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:223)
at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
579336 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying segments: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:84)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1060)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:957)
at proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
{code}
We get exactly the same behaviour on both OS X and Windows. On both, zoie is
using a SimpleFSDirectory.
We also use a SingleInstanceLockFactory (since our process is the only one
working with the index), but we get the same behaviour with a NativeFSLock.
The snapshot backup is being made by calling:
*proj.zoie.impl.indexing.ZoieSystem.exportSnapshot(WritableByteChannel)*