Update - it seems this is fixed if I updated my activemq.xml config to include: <shutdownHooks> <bean xmlns="http://www.springframework.org/schema/beans" class="org.apache.activemq.hooks.SpringContextHook" /> </shutdownHooks>
I'm not sure why the Puppet ActiveMQ module doesn't set this. Now, I just have to work out why the master is getting the NFS exclusive lock back at all. It shouldn't - the slave should still have an exclusive lock on the lock file, so the master should be polling for it. - Korny On 19 July 2016 at 14:07, Korny Sietsma <ko...@sietsma.com> wrote: > Hi folks - I'm testing a system using ActiveMQ 5.9.1 with an NFS server > for shared filesystem master/slave > > I'm trying to set up and test failover and fail-back. I'm using Jepsen to > fake a network split, where the Master broker can't communicate with the > NFS server - after some headaches I have it at a stage where the Master > times out writing to the file system, and shuts down and restarts - but it > fails to restart cleanly, with a BindException error starting the web app > at http://0.0.0.0:8161/fileserver > > Relevant bits from the activemq logs: > > 2016-07-19 11:18:36,837 [main ] INFO AbstractConnector > - Started SelectChannelConnector@0.0.0.0:8161 > 2016-07-19 11:22:55,991 [KeepAlive Timer] ERROR LockableServiceSupport > - activemq-server.corporate.thoughtworks.com, no longer able to keep > the exclusive lock so giving up being a master > 2016-07-19 11:22:55,992 [KeepAlive Timer] INFO BrokerService > - Apache ActiveMQ 5.9.1 (activemq-server.corporate.thoughtworks.com, > ID:activemq-server-48763-1468927115062-0:1) is shutting down > 2016-07-19 11:23:00,999 [KeepAlive Timer] INFO TransportConnection > - The connection to 'tcp://172.10.10.1:49275' is taking a long time > to shutdown. > ... this repeated several times, every 5 seconds - not quite sure why, nor > if it's relevant, until: > 2016-07-19 11:34:21,195 [KeepAlive Timer] INFO TransportConnection > - The connection to 'tcp://172.10.10.1:49275' is taking a long time > to shutdown. > 2016-07-19 11:34:26,012 [ata File Writer] INFO DataFileAppender > - Journal failed while writing at: 417675 > 2016-07-19 11:34:26,013 [toreAndDispatch] ERROR MessageDatabase > - KahaDB failed to store to Journal > java.io.IOException: Input/output error > at sun.nio.ch.FileDispatcherImpl.force0(Native Method) > at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76) > at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:386) > at > org.apache.activemq.util.RecoverableRandomAccessFile.sync(RecoverableRandomAccessFile.java:402) > at > org.apache.activemq.store.kahadb.disk.journal.DataFileAppender.processQueue(DataFileAppender.java:373) > at > org.apache.activemq.store.kahadb.disk.journal.DataFileAppender$1.run(DataFileAppender.java:193) > 2016-07-19 11:34:26,029 [toreAndDispatch] INFO DefaultIOExceptionHandler > - Stopping BrokerService[activemq-server.corporate.thoughtworks.com] > due to exception, > > ... some I/O errors removed, naturally there are I/O errors as we've > broken the NFS connection > > 2016-07-19 11:34:26,161 [KeepAlive Timer] INFO TransportConnector > - Connector openwire stopped > 2016-07-19 11:34:26,162 [KeepAlive Timer] INFO TransportConnector > - Connector amqp stopped > 2016-07-19 11:34:26,491 [KeepAlive Timer] INFO PListStoreImpl > - PListStore:[/share/nfs_share/db/ > activemq-server.corporate.thoughtworks.com/tmp_storage] stopped > 2016-07-19 11:34:26,494 [KeepAlive Timer] INFO KahaDBStore > - Stopping async queue tasks > 2016-07-19 11:34:26,494 [KeepAlive Timer] INFO KahaDBStore > - Stopping async topic tasks > 2016-07-19 11:34:26,496 [KeepAlive Timer] INFO KahaDBStore > - Stopped KahaDB > 2016-07-19 11:34:26,509 [KeepAlive Timer] INFO MessageDatabase > - KahaDB: Recovering checkpoint thread after death > 2016-07-19 11:34:26,638 [KeepAlive Timer] INFO BrokerService > - Apache ActiveMQ 5.9.1 (activemq-server.corporate.thoughtworks.com, > ID:activemq-server-48763-1468927115062-0:1) uptime 15 minutes > 2016-07-19 11:34:26,638 [KeepAlive Timer] INFO BrokerService > - Apache ActiveMQ 5.9.1 (activemq-server.corporate.thoughtworks.com, > ID:activemq-server-48763-1468927115062-0:1) is shutdown > 2016-07-19 11:34:26,639 [main ] INFO XBeanBrokerFactory$1 > - Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@7f132176: > startup date [Tue Jul 19 11:34:26 UTC 2016]; root of context hierarchy > 2016-07-19 11:34:30,971 [main ] INFO PListStoreImpl > - PListStore:[/share/nfs_share/db/ > activemq-server.corporate.thoughtworks.com/tmp_storage] started > > ... removed a bunch of valid looking startup messages, including > recovering from journal corruption... > > 2016-07-19 11:34:36,754 [main ] INFO StatisticsBrokerPlugin > - Installing StaticsBroker > 2016-07-19 11:34:36,754 [main ] INFO BrokerService > - Apache ActiveMQ 5.9.1 (activemq-server.corporate.thoughtworks.com, > ID:activemq-server-48763-1468927115062-0:2) is starting > 2016-07-19 11:34:36,776 [main ] INFO StatisticsBroker > - Starting StatisticsBroker > 2016-07-19 11:34:36,782 [main ] INFO > TransportServerThreadSupport - Listening for connections at: > tcp://activemq-server:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 > 2016-07-19 11:34:36,783 [main ] INFO TransportConnector > - Connector openwire started > 2016-07-19 11:34:36,788 [main ] INFO > TransportServerThreadSupport - Listening for connections at: > amqp://activemq-server:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600 > 2016-07-19 11:34:36,788 [main ] INFO TransportConnector > - Connector amqp started > 2016-07-19 11:34:36,788 [main ] INFO BrokerService > - Apache ActiveMQ 5.9.1 (activemq-server.corporate.thoughtworks.com, > ID:activemq-server-48763-1468927115062-0:2) started > 2016-07-19 11:34:36,788 [main ] INFO BrokerService > - For help or more information please see: http://activemq.apache.org > 2016-07-19 11:34:37,035 [main ] INFO Server > - jetty-7.6.9.v20130131 > 2016-07-19 11:34:37,478 [main ] INFO ContextHandler > - started > o.e.j.w.WebAppContext{/admin,file:/var/lib/activemq/webapps/admin/} > 2016-07-19 11:34:37,504 [main ] INFO /admin > - Initializing Spring FrameworkServlet 'dispatcher' > > ... removed a bunch of spring mappings > > 2016-07-19 11:34:37,574 [main ] INFO WebAppContext > - ActiveMQ Console at http://0.0.0.0:8161/admin > 2016-07-19 11:34:37,601 [main ] INFO ContextHandler > - started > o.e.j.w.WebAppContext{/camel,file:/var/lib/activemq/webapps/camel} > 2016-07-19 11:34:37,615 [main ] INFO WebAppContext > - WebApp@396485834 at http://0.0.0.0:8161/camel > 2016-07-19 11:34:37,648 [main ] INFO ContextHandler > - started > o.e.j.w.WebAppContext{/demo,file:/var/lib/activemq/webapps/demo} > 2016-07-19 11:34:37,658 [main ] INFO WebAppContext > - WebApp@396485834 at http://0.0.0.0:8161/demo > 2016-07-19 11:34:37,673 [main ] INFO ContextHandler > - started > o.e.j.w.WebAppContext{/fileserver,file:/var/lib/activemq/webapps/fileserver} > 2016-07-19 11:34:37,683 [main ] INFO WebAppContext > - WebApp@396485834 at http://0.0.0.0:8161/fileserver > 2016-07-19 11:34:37,704 [main ] WARN AbstractLifeCycle > - FAILED SelectChannelConnector@0.0.0.0:8161: > java.net.BindException: Address already in use > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:437) > at sun.nio.ch.Net.bind(Net.java:429) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187) > at > org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316) > at > org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265) > at > org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) > at org.eclipse.jetty.server.Server.doStart(Server.java:290) > at > org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1638) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1579) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1509) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:521) > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458) > at > org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:296) > at > org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223) > at > org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:293) > at > org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194) > at > org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:628) > at > org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932) > at > org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479) > at > org.apache.xbean.spring.context.ResourceXmlApplicationContext.<init>(ResourceXmlApplicationContext.java:64) > at > org.apache.xbean.spring.context.ResourceXmlApplicationContext.<init>(ResourceXmlApplicationContext.java:52) > at > org.apache.activemq.xbean.XBeanBrokerFactory$1.<init>(XBeanBrokerFactory.java:104) > at > org.apache.activemq.xbean.XBeanBrokerFactory.createApplicationContext(XBeanBrokerFactory.java:104) > at > org.apache.activemq.xbean.XBeanBrokerFactory.createBroker(XBeanBrokerFactory.java:67) > at > org.apache.activemq.broker.BrokerFactory.createBroker(BrokerFactory.java:71) > at > org.apache.activemq.broker.BrokerFactory.createBroker(BrokerFactory.java:54) > at > org.apache.activemq.console.command.StartCommand.runTask(StartCommand.java:87) > at > org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) > at > org.apache.activemq.console.command.ShellCommand.runTask(ShellCommand.java:150) > at > org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) > at > org.apache.activemq.console.command.ShellCommand.main(ShellCommand.java:104) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.activemq.console.Main.runTaskClass(Main.java:262) > at org.apache.activemq.console.Main.main(Main.java:115) > 2016-07-19 11:34:37,706 [main ] WARN AbstractLifeCycle > - FAILED org.eclipse.jetty.server.Server@7103cb56: > java.net.BindException: Address already in use > > > So it seems that something in the admin app failed to shut down properly? > Or something else? I'm going to try disabling the admin app and trying > again. But any other thoughts would be appreciated. > > - Korny > > -- > Kornelis Sietsma korny at my surname dot com http://korny.info > .fnord { display: none !important; } > -- Kornelis Sietsma korny at my surname dot com http://korny.info .fnord { display: none !important; }