[
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shai Erera updated LUCENE-4975:
-------------------------------
Attachment: LUCENE-4975.patch
Patch fixes a bug in IndexReplicationHandler (still need to fix in
IndexAndTaxonomy) and adds some nocommits which I want to take care of before I
commit it.
However, I hit a new test failure, which reproduces with the following command
{{ant test -Dtestcase=IndexReplicationClientTest
-Dtests.method=testConsistencyOnExceptions
-Dtests.seed=EAF5294292642F1:6EE70BB59A9FC3CA}}.
The error is weird. I ran the test w/ -Dtests.verbose=true and here are the
troubling parts from the log:
{noformat}
ReplicationThread-index: MockDirectoryWrapper: now throw random exception during open file=segments_a
java.lang.Throwable
    at org.apache.lucene.store.MockDirectoryWrapper.maybeThrowIOExceptionOnOpen(MockDirectoryWrapper.java:364)
    at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:522)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:281)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:340)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:668)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:515)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:682)
    at org.apache.lucene.replicator.IndexReplicationHandler.revisionReady(IndexReplicationHandler.java:208)
    at org.apache.lucene.replicator.ReplicationClient.doUpdate(ReplicationClient.java:248)
    at org.apache.lucene.replicator.ReplicationClient.access$1(ReplicationClient.java:188)
    at org.apache.lucene.replicator.ReplicationClient$ReplicationThread.run(ReplicationClient.java:76)
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: init: current
segments file is "segments_9";
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@117da39a
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: init: load
commit "segments_9"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: init: load
commit "segments_a"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: now checkpoint
"_0(5.0):C1 _1(5.0):C1 _2(5.0):c1 _3(5.0):c1 _4(5.0):c1 _5(5.0):c1 _6(5.0):c1
_7(5.0):c1 _8(5.0):c1" [9 segments ; isCommit = false]
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: 0 msec to
checkpoint
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: deleteCommits:
now decRef commit "segments_9"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: delete
"segments_9"
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: init: create=false
....
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: startCommit():
start
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: startCommit
index=_0(5.0):C1 _1(5.0):C1 _2(5.0):c1 _3(5.0):c1 _4(5.0):c1 _5(5.0):c1
_6(5.0):c1 _7(5.0):c1 _8(5.0):c1 changeCount=1
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: done all syncs:
[_2.si, _7.si, _5.cfs, _1.fnm, _4.cfs, _8.si, _4.cfe, _5.cfe, _0.si, _0.fnm,
_6.cfe, _8.cfs, _3.cfs, _4.si, _7.cfe, _2.cfs, _5.si, _6.cfs, _1.fdx, _8.cfe,
_1.fdt, _1.si, _7.cfs, _0.fdx, _3.si, _6.si, _3.cfe, _2.cfe, _0.fdt]
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: commit:
pendingCommit != null
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: commit: wrote
segments file "segments_a"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: now checkpoint
"_0(5.0):C1 _1(5.0):C1 _2(5.0):c1 _3(5.0):c1 _4(5.0):c1 _5(5.0):c1 _6(5.0):c1
_7(5.0):c1 _8(5.0):c1" [9 segments ; isCommit = true]
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: deleteCommits:
now decRef commit "segments_a"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: delete "_9.cfe"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: delete "_9.cfs"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: delete "_9.si"
IFD 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: 0 msec to
checkpoint
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: commit: done
IW 0 [Wed May 08 22:47:46 WST 2013; ReplicationThread-index]: at close:
_0(5.0):C1 _1(5.0):C1 _2(5.0):c1 _3(5.0):c1 _4(5.0):c1 _5(5.0):c1 _6(5.0):c1
_7(5.0):c1 _8(5.0):c1
IndexReplicationHandler 0 [Wed May 08 22:47:46 WST 2013;
ReplicationThread-index]: updateHandlerState(): currentVersion=a
currentRevisionFiles={index=[Lorg.apache.lucene.replicator.RevisionFile;@9bc2e26e}
IndexReplicationHandler 0 [Wed May 08 22:47:46 WST 2013;
ReplicationThread-index]: {version=9}
{noformat}
I debug traced it and here's what I think is happening:
* MDW throws FNFE for segments_a on sis.read(dir), therefore the read
SegmentInfos sees segments_9 as the current good commit. IW's
segmentInfos.commitData stores version=9, which corresponds to segments_9.
* IFD lists the files in the Directory, and finds both segments_a and
segments_9 and through a series of calls, deletes segments_9 and keeps
segments_a, since it is newer.
* IW ctor, line 719, increments changeCount, since IFD.startingCommitDeleted is
true -- which happens b/c IFD is initialized with segments_9, but finds
segments_a and therefore deletes segments_9.
* IW then makes a commit, with the commit data from segments_9 ("version=9"),
to a new commit point generation 10 (a in hex).
* The Replicator's latest version is gen=10, the handler reads gen=10 from the
index, but with the wrong commitData, and therefore the test fails.
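To make the sequence above easier to follow, here is a toy model of it in plain Java. It is only a sketch of my understanding, not Lucene code: StaleCommitDataModel, openAndClose and the "version=a" value assumed for the original segments_a are made up for illustration. The only Lucene detail it relies on is that the segments file suffix is the generation written in radix Character.MAX_RADIX.

```java
import java.util.TreeMap;

/** Toy model of the sequence above; none of these types are Lucene classes. */
class StaleCommitDataModel {
    // generation -> commit user data (the "version" entry stored with each commit)
    final TreeMap<Long, String> commits = new TreeMap<>();

    /**
     * Mimics new IndexWriter(...).close() when the newest commit cannot be read.
     * Assumes at least two commits exist when newestUnreadable is true.
     */
    long openAndClose(boolean newestUnreadable) {
        long newest = commits.lastKey();
        // The reader falls back to the previous generation if the newest one fails to open.
        long loaded = newestUnreadable ? commits.lowerKey(newest) : newest;
        String commitData = commits.get(loaded);
        if (loaded != newest) {
            // The file deleter keeps the newer segments file and drops the loaded one,
            // so the writer counts a change and commits on close.
            commits.remove(loaded);
            long next = loaded + 1;
            commits.put(next, commitData); // newest generation now carries stale commit data
            return next;
        }
        return newest;
    }

    /** Segments file name for a generation, e.g. 10 -> "segments_a". */
    static String segmentsFileName(long gen) {
        return "segments_" + Long.toString(gen, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        StaleCommitDataModel index = new StaleCommitDataModel();
        index.commits.put(9L, "version=9");
        index.commits.put(10L, "version=a"); // assumed content of the replicated commit
        long gen = index.openAndClose(true);
        System.out.println(segmentsFileName(gen) + " -> " + index.commits.get(gen));
    }
}
```

Running main prints segments_a -> version=9, i.e. the newest generation ends up paired with the older commit data, which is the mismatch the handler reports in the log.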
I still want to review all this again, to double-check my understanding, but it
looks like something bad is happening between IW and IFD. At least from the
perspective of the replicator, the index shouldn't "go forward" by new
IW().close().
If I modify the handler to do:
{code}
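// indexDir and config are placeholders for the handler's Directory and IndexWriterConfig
IndexWriter writer = new IndexWriter(indexDir, config);
writer.deleteUnusedFiles(); // let IFD remove files no commit references
writer.rollback();          // close the writer without committing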
{code}
The test passes. But is this the right solution -- i.e. guarantee that IW never
commits? Or is this a bug in IW?
> Add Replication module to Lucene
> --------------------------------
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Shai Erera
> Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch,
> LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who
> want to replicate their indexes, e.g. for high availability, taking hot backups,
> etc.
> I will upload a patch soon where I'll describe in general how it works.