[ https://issues.apache.org/jira/browse/GEODE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen Nichols closed GEODE-9802.
-------------------------------

> LoggingWithReconnectDistributedTest uses ephemeral port to create servers,
> leading to occasional failures with java.net.BindException: Address already
> in use
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-9802
>                 URL: https://issues.apache.org/jira/browse/GEODE-9802
>             Project: Geode
>          Issue Type: Bug
>    Affects Versions: 1.15.0
>            Reporter: Donal Evans
>            Assignee: Donal Evans
>            Priority: Major
>              Labels: flaky, pull-request-available
>             Fix For: 1.15.0
>
>
> Seen originally in distributed mass test run:
> {noformat}
> > Task :geode-core:distributedTest
> LoggingWithReconnectDistributedTest > logFileContainsBannerOnlyOnce FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest$$Lambda$547/1860776670.run in VM -1 running on Host heavy-lifter-e58d94dc-0688-534f-8361-75ac377b5300.c.apachegeode-ci.internal with 4 VMs
>         at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
>         at org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.logFileContainsBannerOnlyOnce(LoggingWithReconnectDistributedTest.java:141)
>         Caused by:
>         org.apache.geode.distributed.DistributedSystemDisconnectedException: Reconnect attempts terminated due to exception, caused by org.apache.geode.GemFireIOException: While starting cache server CacheServer on port=46103 client subscription config policy=none client subscription config capacity=1 client subscription config overflow directory=.
>             at org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2916)
>             at org.apache.geode.logging.internal.LoggingWithReconnectDistributedTest.lambda$logFileContainsBannerOnlyOnce$bb17a952$2(LoggingWithReconnectDistributedTest.java:147)
>             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>             at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>             at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>             at java.lang.reflect.Method.invoke(Method.java:498)
>             at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
>             at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
>             at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:628)
>             ... 2 more
>             Caused by:
>             org.apache.geode.GemFireIOException: While starting cache server CacheServer on port=46103 client subscription config policy=none client subscription config capacity=1 client subscription config overflow directory=.
>                 at org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2773)
>                 at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2653)
>                 at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
>                 at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
>                 at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
>                 at org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
>                 at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1794)
>                 at java.lang.Thread.run(Thread.java:748)
>                 Caused by:
>                 java.net.BindException: Failed to create server socket on 10.0.0.107[46103]
>                     at org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:75)
>                     at org.apache.geode.internal.net.SCClusterSocketCreator.createServerSocket(SCClusterSocketCreator.java:55)
>                     at org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:524)
>                     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:573)
>                     at org.apache.geode.internal.cache.tier.sockets.AcceptorBuilder.create(AcceptorBuilder.java:291)
>                     at org.apache.geode.internal.cache.CacheServerImpl.createAcceptor(CacheServerImpl.java:420)
>                     at org.apache.geode.internal.cache.CacheServerImpl.start(CacheServerImpl.java:377)
>                     at org.apache.geode.distributed.internal.InternalDistributedSystem.createAndStartCacheServers(InternalDistributedSystem.java:2769)
>                     ... 7 more
>                     Caused by:
>                     java.net.BindException: Address already in use (Bind failed)
>                         at java.net.PlainSocketImpl.socketBind(Native Method)
>                         at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
>                         at java.net.ServerSocket.bind(ServerSocket.java:390)
>                         at org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.createServerSocket(ClusterSocketCreatorImpl.java:72)
>                         ... 14 more
> 8334 tests completed, 1 failed, 414 skipped
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-results/distributedTest/1636187130/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0646/test-artifacts/1636187130/distributedtestfiles-openjdk8-1.15.0-build.0646.tgz
> {noformat}
> The createServer method in LoggingWithReconnectDistributedTest uses a port
> number of 0, which results in an ephemeral port being assigned:
> {noformat}
>   private void createServer(String serverName, File serverDir, int locatorPort) {
>     ServerLauncher.Builder builder = new ServerLauncher.Builder();
>     builder.setMemberName(serverName);
>     builder.setWorkingDirectory(serverDir.getAbsolutePath());
>     builder.setServerPort(0);
>     builder.set(LOCATORS, "localHost[" + locatorPort + "]");
>     builder.set(DISABLE_AUTO_RECONNECT, "false");
>     builder.set(ENABLE_CLUSTER_CONFIGURATION, "false");
>     builder.set(MAX_WAIT_TIME_RECONNECT, "1000");
>     builder.set(MEMBER_TIMEOUT, "2000");
>     serverLauncher = builder.build();
>     serverLauncher.start();
>     system = (InternalDistributedSystem) serverLauncher.getCache().getDistributedSystem();
>   }
> {noformat}
> When the server is restarted, this port may no longer be free, causing the
> BindException.
> The test should be changed to use AvailablePortHelper instead.
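
The failure mode described above can be reproduced outside Geode with plain java.net sockets. The following is only a minimal sketch, not code from the test or from the fix: the class name PortRebindSketch and both method names are hypothetical. It binds one socket to port 0 so the OS assigns an ephemeral port, then tries to bind a second socket to that same port number while the first is still open, which is roughly what happens when a reconnecting server finds its old ephemeral port taken by another process.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical illustration class; not part of Geode or of the test under discussion.
public class PortRebindSketch {

  // Tries to bind a new server socket to the given port.
  // Returns true if the bind is rejected with a BindException.
  static boolean rebindFails(int port) {
    try (ServerSocket second = new ServerSocket(port)) {
      return false; // the port was free, so the bind succeeded
    } catch (BindException e) {
      return true; // the port is held by another socket
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Binds to port 0 (the OS picks an ephemeral port), then checks whether
  // a second bind to that same port is rejected while the first socket is open.
  static boolean portInUseBlocksRebind() {
    try (ServerSocket first = new ServerSocket(0)) {
      return rebindFails(first.getLocalPort());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("rebind rejected while port is in use: " + portInUseBlocksRebind());
  }
}
```

In the test itself the competing socket belongs to some unrelated process that was handed the same ephemeral port between the disconnect and the reconnect, which is why the failure is only occasional. Choosing the port up front with AvailablePortHelper, as the issue suggests, is meant to avoid drawing the server port from the ephemeral range in the first place.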