[
https://issues.apache.org/jira/browse/BOOKKEEPER-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253176#comment-15253176
]
Andrey Yegorov commented on BOOKKEEPER-919:
-------------------------------------------
I got the same error with the patch applied.
Unfortunately, it only happens on the Jenkins server and I cannot reproduce it locally.
This is the first failure I have seen with the patch; 1 out of 3 builds has failed so far.
{noformat}
Error Message
latch should not have completed
Stacktrace
java.lang.AssertionError: latch should not have completed
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertFalse(Assert.java:64)
at org.apache.bookkeeper.replication.AuditorLedgerCheckerTest.testReadOnlyBookieExclusionFromURLedgersCheck(AuditorLedgerCheckerTest.java:281)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.lang.Thread.run(Thread.java:745)
{noformat}
Relevant logs:
{noformat}
2016-04-21 16:21:50,288 - INFO - [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/127.0.1.1:15004
2016-04-21 16:21:50,288 - INFO - [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/127.0.1.1:15005
2016-04-21 16:21:50,289 - INFO - [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/127.0.1.1:15006
2016-04-21 16:21:50,289 - INFO - [AuditorElector-127.0.1.1:15004:Auditor@195] - I'm starting as Auditor Bookie. ID: 127.0.1.1:15004
2016-04-21 16:21:50,290 - INFO - [AuditorElector-127.0.1.1:15004:Auditor@206] - Auditor periodic ledger checking enabled 'auditorPeriodicCheckInterval' 604800 seconds
2016-04-21 16:21:50,291 - INFO - [AuditorElector-127.0.1.1:15004:Auditor@252] - Auditor periodic bookie checking enabled 'auditorPeriodicBookieCheckInterval' 86400 seconds
2016-04-21 16:21:50,294 - INFO - [Time-limited test:Bookie@964] - Transitioning Bookie to ReadOnly mode, and will serve only read requests from clients!
2016-04-21 16:21:50,296 - INFO - [Time-limited test:Bookie@868] - Registered myself in ZooKeeper at /ledgers/available/readonly/127.0.1.1:15006.
2016-04-21 16:21:50,297 - INFO - [AuditorBookie-127.0.1.1:15004:Auditor@330] - Following are the failed bookies: [127.0.1.1:15006] and searching its ledgers for re-replication
2016-04-21 16:21:50,297 - INFO - [AuditorBookie-127.0.1.1:15004:Auditor@348] - Following ledgers: [4] of bookie: 127.0.1.1:15006 are identified as underreplicated
2016-04-21 16:21:50,298 - INFO - [main-EventThread:NetworkTopology@463] - Removing a node: /default-rack/127.0.1.1:15006
2016-04-21 16:21:50,298 - INFO - [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1543b212af40000 type:create cxid:0x37 zxid:0x1f txntype:-1 reqpath:n/a Error Path:/ledgers/underreplication/ledgers/0000/0000/0000/0004 Error:KeeperErrorCode = NoNode for /ledgers/underreplication/ledgers/0000/0000/0000/0004
2016-04-21 16:21:50,300 - INFO - [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1543b212af40000 type:create cxid:0x3b zxid:0x20 txntype:-1 reqpath:n/a Error Path:/ledgers/underreplication/ledgers/0000/0000/0000 Error:KeeperErrorCode = NoNode for /ledgers/underreplication/ledgers/0000/0000/0000
2016-04-21 16:21:50,300 - INFO - [main-EventThread:NetworkTopology@463] - Removing a node: /default-rack/127.0.1.1:15006
2016-04-21 16:21:50,301 - INFO - [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1543b212af40000 type:create cxid:0x3c zxid:0x21 txntype:-1 reqpath:n/a Error Path:/ledgers/underreplication/ledgers/0000/0000 Error:KeeperErrorCode = NoNode for /ledgers/underreplication/ledgers/0000/0000
2016-04-21 16:21:50,301 - INFO - [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x1543b212af40000 type:create cxid:0x3d zxid:0x22 txntype:-1 reqpath:n/a Error Path:/ledgers/underreplication/ledgers/0000 Error:KeeperErrorCode = NoNode for /ledgers/underreplication/ledgers/0000
2016-04-21 16:21:50,306 - INFO - [main-EventThread:AuditorLedgerCheckerTest$ChildWatcher@435] - Received notification for the ledger path : /ledgers/underreplication/ledgers/0000/0000/0000/0004/urL0000000004
2016-04-21 16:21:50,306 - INFO - [main:Auditor@520] - Shutting down auditor
2016-04-21 16:21:50,306 - INFO - [AuditorElector-127.0.1.1:15004:AuditorElector$2@217] - Shutting down AuditorElector
2016-04-21 16:21:50,311 - INFO - [main:BookKeeperClusterTestCase@110] - TearDown
2016-04-21 16:21:50,311 - INFO - [AuditorElector-127.0.1.1:15006:AuditorElector$2@217] - Shutting down AuditorElector
{noformat}
> Auditor is sometimes marking as failed a bookie switching from available to
> read-only mode
> ------------------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-919
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-919
> Project: Bookkeeper
> Issue Type: Bug
> Reporter: Matteo Merli
> Assignee: Matteo Merli
> Priority: Minor
> Fix For: 4.4.0
>
>
> AuditorLedgerCheckerTest.testReadOnlyBookieExclusionFromURLedgersCheck
> is failing intermittently.
> I have seen this test fail on several occasions, too.
> https://builds.apache.org/job/bookkeeper-master-git-pullrequest/59/testReport/junit/org.apache.bookkeeper.replication/AuditorLedgerCheckerTest/testReadOnlyBookieExclusionFromURLedgersCheck_2_/
> {noformat}
> Error Message
> latch should not have completed
> Stacktrace
> java.lang.AssertionError: latch should not have completed
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertFalse(Assert.java:64)
> at org.apache.bookkeeper.replication.AuditorLedgerCheckerTest.testReadOnlyBookieExclusionFromURLedgersCheck(AuditorLedgerCheckerTest.java:279)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}