FYI, I'm now running this in a loop on my Ubuntu box, without the retry loop, trying to reproduce a failure.
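
For anyone who wants to run a similar loop, here is a minimal sketch of a repeat-until-failure harness in Java. The class `RepeatUntilFailure`, the method `firstFailure`, and the lambda standing in for the test are all hypothetical names for illustration; the real test invocation would go where the stand-in is.

```java
import java.util.function.IntPredicate;

/** Hypothetical harness: rerun a flaky check until it fails or the budget runs out. */
public class RepeatUntilFailure {

    /**
     * Runs {@code attempt} for run indexes 0..maxRuns-1.
     * Returns the index of the first failing run, or -1 if every run passed.
     */
    public static int firstFailure(int maxRuns, IntPredicate attempt) {
        for (int run = 0; run < maxRuns; run++) {
            boolean passed;
            try {
                passed = attempt.test(run);
            } catch (RuntimeException | AssertionError e) {
                // Treat an exception or assertion failure as a failed run.
                passed = false;
            }
            if (!passed) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for the real test (an assumption): fails on run index 19.
        int failedAt = firstFailure(100, run -> run != 19);
        System.out.println("first failing run: " + failedAt);
    }
}
```

Stopping at the first failure, rather than counting failures across all runs, keeps the logs from the failing run easy to find.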

-Yonik
http://www.lucidimagination.com

On Sat, Jul 31, 2010 at 11:52 AM, Yonik Seeley <[email protected]> wrote:
> OK, can you try to reproduce now?
> Since the comments indicated that all the commits were to bump up the
> index version number, I kept them all and just inserted an additional
> commit in the query retry loop.
>
> But actually... there may still be a bug somewhere (even if this fixes
> the test failures).
> Each commit should wait for a new searcher to be registered before
> returning... hence it should be impossible for overlapping warming
> searchers to be responsible for the failure. Hence when the test
> fails, either the doc add, or the commit is failing.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Sat, Jul 31, 2010 at 11:35 AM, Yonik Seeley
> <[email protected]> wrote:
>> Do the logs give any hints?
>> Downside of only logging SEVERE is that it's much harder to
>> investigate the cause of any intermittent failures that do happen.
>>
>> Looking at this test code, you shouldn't have to wait at all. The
>> test disables replication, indexes docs to the slave, commits (and
>> waits for a new searcher to be registered), and then queries the
>> slave.
>>
>> We should just remove that wait loop.
>>
>> Oh... i just figured it out while writing this I think...
>>
>>     index(slaveClient, "id", 551, "name", "name = " + 551);
>>     slaveClient.commit(true, true);
>>     index(slaveClient, "id", 552, "name", "name = " + 552);
>>     slaveClient.commit(true, true);
>>     index(slaveClient, "id", 553, "name", "name = " + 553);
>>     slaveClient.commit(true, true);
>>     index(slaveClient, "id", 554, "name", "name = " + 554);
>>     slaveClient.commit(true, true);
>>     index(slaveClient, "id", 555, "name", "name = " + 555);
>>     slaveClient.commit(true, true);
>>
>> I bet that last commit can fail due to max warming searchers.
>> I'll fix.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> On Sat, Jul 31, 2010 at 8:41 AM, Mark Miller <[email protected]> wrote:
>>>
>>> This looks like it might actually be an issue - it fails once every 20
>>> runs or so as a guess.
>>>
>>> [junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
>>> [junit] Testcase: testReplicateAfterWrite2Slave(org.apache.solr.handler.TestReplicationHandler): FAILED
>>> [junit] expected:<1> but was:<0>
>>> [junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
>>> [junit] at org.apache.solr.handler.TestReplicationHandler.testReplicateAfterWrite2Slave(TestReplicationHandler.java:464)
>>> [junit]
>>> [junit]
>>> [junit] Tests run: 7, Failures: 1, Errors: 0, Time elapsed: 343.909 sec
>>>
>>> At first I tried to extend the wait for it, but that's obviously no help
>>> - in this case the test failed after running for 343 seconds. I've seen it
>>> as high as 968 seconds.
>>>
>>> - Mark
