You may be hitting ARTEMIS-3117 [1]. I recommend you upgrade to the latest release (i.e. 2.20.0) and see if that resolves the issue.
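If you want to confirm that before (and after) upgrading: the dump quoted below shows netty threads BLOCKED on the NettyAcceptor monitor while another netty thread holds it in getSslHandler(). A small check along these lines, run inside the JVM hosting the broker, would tell you how many acceptor threads are queuing behind that lock at any given moment. This is only a minimal sketch: the "activemq-netty-threads" name prefix is taken from the dump below, the class name is a placeholder, and the rest is plain java.lang.management.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Sketch: count netty threads currently blocked on the NettyAcceptor monitor.
public class BlockedAcceptorThreadCheck {

    public static void main(String[] args) {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = threadBean.dumpAllThreads(false, false);

        int blockedOnAcceptor = 0;
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue;
            }
            boolean nettyThread = info.getThreadName().contains("activemq-netty-threads");
            boolean blocked = info.getThreadState() == Thread.State.BLOCKED;
            String lock = info.getLockName(); // e.g. "...netty.NettyAcceptor@662692e8"
            if (nettyThread && blocked && lock != null && lock.contains("NettyAcceptor")) {
                blockedOnAcceptor++;
                System.out.printf("%s blocked on %s, owned by %s%n",
                        info.getThreadName(), lock, info.getLockOwnerName());
            }
        }
        System.out.println("Netty threads blocked on NettyAcceptor: " + blockedOnAcceptor);
    }
}

If that count spikes whenever the AMQ224088 handshake timeouts appear, it points at contention on the acceptor lock rather than at the network. Two more sketches, one on the remotingThreads setting and one on reusing client connections, follow below the quoted thread.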
Justin

[1] https://issues.apache.org/jira/browse/ARTEMIS-3117

On Thu, Jan 7, 2021 at 4:08 AM <s...@cecurity.com> wrote:
> Hello,
>
> No, the failed connections are from the external clients (I do not have
> the client environments, nor their code). On the embedded broker, the
> server side uses in-VM connectors, which do not seem to have such issues
> (and do not use netty-ssl).
>
> We made a deployment with a standalone Artemis (2.16) to act as a sort
> of proxy broker for the embedded one. We have connection failures from
> clients on it too. The bridges used to forward locally seem fine (but
> that is a different context; the clients use JMS over OpenWire).
>
> No, I did not do sampling with VisualVM. It happens mostly in a
> production environment, and reliably reproducing the exact problem in
> test has been a mixed bag.
>
> I did capture more stack traces last night, at a point where the issue
> was occurring more frequently, and it seems the netty threads were much
> less free than during previous observations:
>
> Name: Thread-50 (activemq-netty-threads)
> State: BLOCKED on
> org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor@662692e8
> owned by: Thread-95 (activemq-netty-threads)
> Total blocked: 145 739  Total waited: 4 186
>
> Stack trace:
> org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor.getSslHandler(NettyAcceptor.java:492)
> org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor$4.initChannel(NettyAcceptor.java:403)
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129)
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112)
> io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:953)
> io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:610)
> io.netty.channel.DefaultChannelPipeline.access$100(DefaultChannelPipeline.java:46)
> io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1461)
> io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1126)
> io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:651)
> io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:515)
> io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:428)
> io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:487)
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
> io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:333)
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
>
> I found it a bit odd that a lot of netty threads were stuck at this
> point, but I'm not familiar with netty internals.
>
> On 07/01/2021 at 03:12, Tim Bain wrote:
> > For the embedded 2.10.1 broker case, are you saying that connections
> > failed when made from other threads in the process in which the broker
> > was embedded? If so, that would seem to rule out the network, since
> > traffic would never leave the host.
> >
> > You mentioned capturing a stack trace, but have you done CPU sampling
> > via VisualVM or a similar tool?
> > CPU sampling isn't a perfectly accurate technique, but often it gives
> > enough information to narrow in on the cause of a problem (or to rule
> > out certain possibilities).
> >
> > Tim
> >
> > On Wed, Jan 6, 2021, 10:34 AM Sébastien LETHIELLEUX <
> > sebastien.lethiell...@cecurity.com> wrote:
> >
> >> Hello (again),
> >>
> >> I'm trying to find the root cause of a significant number of failed
> >> connection attempts / broken existing connections on an Artemis broker.
> >>
> >> The issue has been observed on an embedded Artemis 2.10.1 and on a
> >> standalone 2.16.0 (Tomcat 9, OpenJDK 11).
> >>
> >> Two types of errors occur: timeouts during handshakes and broken
> >> existing connections, such as:
> >>
> >> 2021-01-04 15:28:53,243 ERROR [org.apache.activemq.artemis.core.server]
> >> AMQ224088: Timeout (10 seconds) on acceptor "netty-ssl" during protocol
> >> handshake with /xxx.xxx.xxx.xxx:41760 has occurred.
> >>
> >> 2021-01-06 16:56:28,016 WARN {Thread-16
> >> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@f493a59)}
> >> [org.apache.activemq.artemis.core.client] : AMQ212037: Connection
> >> failure to /xxx.xxx.xxx.xxx:49918 has been detected: AMQ229014: Did not
> >> receive data from /xxx.xxx.xxx.xxx:49918 within the 30,000ms connection
> >> TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
> >>
> >> Both brokers were deployed on RHEL 7 with artemis-native and libaio (32
> >> logical cores, plenty of RAM). Clients use JMS over OpenWire
> >> (activemq-client).
> >>
> >> The investigation of the network infrastructure came up empty-handed,
> >> so I'm trying to explore the possibility that something is going wrong
> >> inside Artemis itself.
> >>
> >> Is there a possibility that the thread pool configured with
> >> remotingThreads is too small (default values)? Looking at the thread
> >> stacks in JMX seems to show plenty of threads happily idle.
> >>
> >> The clients are known to open and close a lot of connections (we know
> >> it's wrong, and now they know it too, but it should still work). The
> >> number of open connections is usually around 90-100, which hardly seems
> >> like an unbearable burden.
> >>
> >> Any ideas or suggestions on what to check/monitor/etc.?
> >>
> >> Regards,
> >>
> >> SL
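On the remotingThreads question from the original message: for an embedded broker the acceptor thread pool can be sized via a parameter on the acceptor URI. Below is only a minimal sketch; the port, keystore paths and the value 64 are placeholders, not taken from this thread, and it assumes the standard embedded-broker API.

import org.apache.activemq.artemis.core.config.Configuration;
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
import org.apache.activemq.artemis.core.server.embedded.EmbeddedActiveMQ;

public class EmbeddedBrokerWithTunedAcceptor {

    public static void main(String[] args) throws Exception {
        Configuration config = new ConfigurationImpl()
                .setPersistenceEnabled(false)
                .setSecurityEnabled(false)
                // Placeholder acceptor: name, port, keystore and pool size are
                // illustrative only.
                .addAcceptorConfiguration("netty-ssl",
                        "tcp://0.0.0.0:61617?sslEnabled=true"
                                + ";keyStorePath=/path/to/broker.ks"
                                + ";keyStorePassword=changeit"
                                + ";remotingThreads=64");

        EmbeddedActiveMQ broker = new EmbeddedActiveMQ();
        broker.setConfiguration(config);
        broker.start();
    }
}

That said, the dump above shows netty threads blocked on a shared monitor rather than busy, so a bigger pool would mostly mean more threads queuing on the same lock; upgrading looks like the more promising fix.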
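On the clients that open and close a lot of connections: every new connection pays for a fresh TCP + SSL handshake against the acceptor, which is exactly where the broker is contended here. The usual mitigation is to keep one long-lived connection and create sessions per unit of work. A minimal sketch with the plain OpenWire client; the broker URL and queue name are placeholders.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;

public class ReusedConnectionProducer {

    public static void main(String[] args) throws Exception {
        // Placeholder broker URL and queue; adapt to the real environment.
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("ssl://broker.example.com:61617");

        // One long-lived connection (one TCP/SSL handshake) instead of a
        // connection per message; sessions are cheap by comparison.
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("example.queue"));
            for (int i = 0; i < 100; i++) {
                producer.send(session.createTextMessage("payload " + i));
            }
            session.close();
        } finally {
            connection.close();
        }
    }
}

If the client code cannot easily share a connection, a pooled ConnectionFactory (e.g. activemq-pool's PooledConnectionFactory) gives much the same effect behind the javax.jms API without restructuring the callers.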