I hope you don't mind if I reply to this thread. I'd also like to report 
messages getting lost.

I've had two occurrences of message loss while using simple replication (one
live and one backup server) on Artemis v2.22.0.
I have not been able to reproduce the issue, but I think it happened when I
rebooted the live server.

The lost messages were persistent and stored in a durable queue, with no
consumers online. I am not sure whether any producers were connected.

All I see in the logs are warnings like these two:

- 2022-07-01 14:52:16,282 WARN  [org.apache.activemq.artemis.core.server] 
AMQ222092: Connection to the backup node failed, removing replication now: 
ActiveMQRemoteDisconnectException[errorType=REMOTE_DISCONNECT message=null]

- 2022-07-01 14:52:16,295 WARN  [org.apache.activemq.artemis.core.client] 
AMQ212037: Connection failure to 10.108.28.52/10.108.28.52:9000 has been 
detected: AMQ219015: The connection was disconnected because of server shutdown 
[code=DISCONNECTED]

The only possible cause that comes to mind is that I changed the port for
cluster communication from the default 61616 to 9000 (I've experienced some
problems, unrelated to message loss, when changing the port).

Any advice on reproducing the issue, or on where to look for more data, would
be appreciated.

-----Original Message-----
From: Clebert Suconic <clebert.suco...@gmail.com>
Sent: Wednesday, October 19, 2022 4:58 PM
To: users@activemq.apache.org
Subject: Re: Messages getting lost on Artemis 2.25


Basically, I'm telling you how to investigate it, and if you find an issue in
the broker, we will need a way to reproduce it.

I have no other reports of message loss.

(We do have situations where page counters go wrong while paging, which I'm
working on fixing now, but no message loss.)

On Wed, Oct 19, 2022 at 10:55 AM Clebert Suconic <clebert.suco...@gmail.com> 
wrote:
>
> I am not aware of any issues that would lead to message loss...
>
> Garbage Collection itself has no effect on anything regarding paging or 
> journal.
>
>
> Are you able to track down which message is lost in a test?
>
>
> You could use the retention feature and replay the message, and you
> could also look at the output of ./artemis data print to see what
> happened to the message.
>
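> A minimal sketch of enabling retention in broker.xml, assuming a
> hypothetical directory name and limits (adjust both to your disk):
>
> ```xml
> <core xmlns="urn:activemq:core">
>    <!-- Assumed values: keep journal history for 7 days, capped at 10G -->
>    <journal-retention-directory period="7" unit="DAYS" storage-limit="10G">data/retention</journal-retention-directory>
> </core>
> ```
>
> Retention only covers messages journaled after it is enabled, so it
> needs to be on before the loss occurs for replay to be of any use.
>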
>
> One other suggestion I could make is to use Federation instead of
> clustering. Perhaps messages are stranded on the store-and-forward
> queue?
>
>
> Also, you have consumers on all the nodes, so you should use clustering
> with OFF_WITH_REDISTRIBUTION, or use Federation. You should always
> favor the local consumers.
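>
> A rough sketch of that change against the cluster-connection quoted
> below (names are taken from that config; the value is spelled with
> underscores in broker.xml, and it may not be available in older
> broker versions, so check the documentation for your release):
>
> ```xml
> <cluster-connection name="artemis-ato">
>    <connector-ref>artemis</connector-ref>
>    <!-- Route to local consumers only; messages move to another node
>         only through redistribution when no local consumer exists.
>         Other settings of the connection are left unchanged. -->
>    <message-load-balancing>OFF_WITH_REDISTRIBUTION</message-load-balancing>
>    <max-hops>1</max-hops>
>    <discovery-group-ref discovery-group-name="dg-group1"/>
> </cluster-connection>
> ```
>
> Redistribution also depends on redistribution-delay in the matching
> address-setting being 0 or greater.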
>
> On Wed, Oct 19, 2022 at 8:16 AM Walter de Boer <walterdeb...@dbso.nl> wrote:
> >
> > All,
> >
> > This week we lost 23,000 messages within a few days on our
> > production cluster running Artemis 2.26.0; see our settings below.
> > We've reverted to Artemis 2.20.0 just in case.
> >
> > A few observations:
> >
> >   * In versions 2.24.0, 2.25.0 and 2.26.0 running on ZGC, we noticed
> >     messages being produced to a queue without errors that we then could
> >     not find in that queue. At the same time we saw incorrect counters. We
> >     restarted nodes to resolve this, but on one occasion the error
> >     continued for some time afterwards, and we never found the messages
> >     again, not even when exporting the journal files. The errors appeared
> >     after the brokers had been running for a few days.
> >   * In version 2.20.0 running on G1GC and on ZGC, we did not lose any
> >     messages. We did experience memory issues resulting in (too) long
> >     garbage collection times every other week, maybe due to a lack of JVM
> >     tuning on our side. We ran 2.20 on G1GC for several months.
> >
> > We're running a symmetric cluster of 3 live/backup pairs in Docker JRE
> > (Temurin) containers on VMware CentOS 7 hosts. Each live node has
> > around 1,000 producers and consumers connected continuously.
> >
> > I hope the Artemis community can advise us on this.
> >
> > Best Regards,
> >
> > Walter
> >
> >
> > Our setup:
> >
> > *docker-compose.yaml:*
> >
> >       version: "3.8"
> >
> >       services:
> >         artemis:
> >           container_name: 'artemis'
> >           network_mode: "host"
> >           image: "cdplatform/activemq-artemis:2.26.0"
> >           restart: 'always'
> >           hostname: cjiblx8408.ato.cjib.minjus.nl
> >           volumes:
> >             - "/data/artemis/data:/var/lib/artemis/data"
> >             - "/data/artemis/plugins:/var/lib/artemis/lib"
> >             - "/data/artemis/etc:/var/lib/artemis/etc"
> >             - "/data/artemis/etc-override:/var/lib/artemis/etc-override"
> >             - "/logging/artemis:/var/lib/artemis/log"
> >           environment:
> >             ARTEMIS_MIN_MEMORY:  "14051615047"
> >             ARTEMIS_MAX_MEMORY:  "14051615047"
> >             JAVA_XTRA_ARGS: "-XX:ActiveProcessorCount=4 -XX:+UseZGC -XX:+UseDynamicNumberOfGCThreads -XX:+UseStringDeduplication "
> >             BROKER_SETTINGS_FILE: "broker-settings.xml"
> >             ENABLE_JMX:          "true"
> >             JMX_PORT:            "3333"
> >             ENABLE_JMX_EXPORTER: "true"
> >             JMX_RMI_PORT:        "1098"
> >           mem_swappiness: 0
> >           memswap_limit: 20073735782
> >           deploy:
> >             resources:
> >               limits:
> >                 memory: "20073735782"
> >               reservations:
> >                 memory: "20073735782"
> >
> > *Command line options:*
> >
> >     /opt/java/openjdk/bin/java
> >        -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent.jar=9404:/opt/jmx-exporter/etc/jmx-exporter-config.yaml
> >        -Xmx17564518809
> >        -Xms17564518809
> >        -Dcom.sun.management.jmxremote.authenticate=true
> >        -Dcom.sun.management.jmxremote.password.file=/var/lib/artemis/etc/jmxremote.password
> >        -Dcom.sun.management.jmxremote.access.file=/var/lib/artemis/etc/jmxremote.access
> >        -Dcom.sun.management.jmxremote.port=3333
> >        -Dcom.sun.management.jmxremote.rmi.port=1098
> >        -Dcom.sun.management.jmxremote.ssl=false
> >        -Djava.net.preferIPv4Addresses=true
> >        -Djava.net.preferIPv4Stack=true
> >        -XX:ActiveProcessorCount=4
> >        -XX:+UseZGC
> >        -XX:+UseDynamicNumberOfGCThreads
> >        -XX:+UseStringDeduplication
> >        -Dhawtio.realm=activemq
> >        -Dhawtio.offline=true
> >        -Dhawtio.role=gs-auth-Artemis_Admin,gs-auth-Artemis_User
> >        -DPrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal
> >     -Djolokia.policyLocation=file:/var/lib/artemis/etc/jolokia-access.xml
> >        -Dcom.sun.management.jmxremote.ssl=false
> >        -Xbootclasspath/a:/var/lib/artemis/lib/javax.json-1.1.4.jar
> >        -Dhawtio.role=gs-auth-Artemis_Admin,gs-auth-Artemis_User
> >        -Xbootclasspath/a:/opt/apache-artemis/lib/jboss-logmanager-2.1.18.Final.jar:/opt/apache-artemis/lib/wildfly-common-1.5.2.Final.jar:/opt/apache-artemis/lib/javax.json-1.1.4.jar
> >     -Djava.security.auth.login.config=/var/lib/artemis/etc/login.config
> >        -classpath /opt/apache-artemis/lib/artemis-boot.jar
> >        -Dartemis.home=/opt/apache-artemis
> >        -Dartemis.instance=/var/lib/artemis
> >        -Djava.library.path=/opt/apache-artemis/bin/lib/linux-x86_64
> >        -Djava.io.tmpdir=/var/lib/artemis/tmp
> >        -Ddata.dir=/var/lib/artemis/data
> >        -Dartemis.instance.etc=/var/lib/artemis/etc
> >        -Djava.util.logging.manager=org.jboss.logmanager.LogManager
> >     -Dlogging.configuration=file:/var/lib/artemis/etc//logging.properties
> >        -Dartemis.default.sensitive.string.codec.key=
> >     org.apache.activemq.artemis.boot.Artemis
> >        run
> >
> > *broker-settings.xml**:*
> >
> >     <core xmlns="urn:activemq:core">
> >        <global-max-size>2810323009</global-max-size>
> >        <name>xxxxxxx.xxxxxx.xx</name>
> >        <graceful-shutdown-enabled
> >     xmlns="urn:activemq:core">true</graceful-shutdown-enabled>
> >        <graceful-shutdown-timeout
> >     xmlns="urn:activemq:core">10000</graceful-shutdown-timeout>
> >        <management-address
> >     xmlns="urn:activemq:core">activemq.management</management-address>
> >        <persistence-enabled
> >     xmlns="urn:activemq:core">true</persistence-enabled>
> >        <id-cache-size xmlns="urn:activemq:core">20000</id-cache-size>
> >        <persist-id-cache xmlns="urn:activemq:core">true</persist-id-cache>
> >        <paging-directory
> >     xmlns="urn:activemq:core">data/paging</paging-directory>
> >        <bindings-directory
> >     xmlns="urn:activemq:core">data/bindings</bindings-directory>
> >        <large-messages-directory
> >     xmlns="urn:activemq:core">data/large-messages</large-messages-directory>
> >        <journal-directory
> >     xmlns="urn:activemq:core">data/journal</journal-directory>
> >        <journal-type xmlns="urn:activemq:core">ASYNCIO</journal-type>
> >        <journal-datasync xmlns="urn:activemq:core">true</journal-datasync>
> >        <journal-min-files xmlns="urn:activemq:core">2</journal-min-files>
> >        <journal-pool-files xmlns="urn:activemq:core">10</journal-pool-files>
> >        <journal-device-block-size
> >     xmlns="urn:activemq:core">4096</journal-device-block-size>
> >        <journal-file-size xmlns="urn:activemq:core">10MB</journal-file-size>
> >        <journal-buffer-size
> >     xmlns="urn:activemq:core">490KB</journal-buffer-size>
> >        <journal-compact-min-files
> >     xmlns="urn:activemq:core">10</journal-compact-min-files>
> >        <journal-compact-percentage
> >     xmlns="urn:activemq:core">30</journal-compact-percentage>
> >        <journal-lock-acquisition-timeout
> >     xmlns="urn:activemq:core">-1</journal-lock-acquisition-timeout>
> >        <journal-file-open-timeout
> >     xmlns="urn:activemq:core">5</journal-file-open-timeout>
> >        <journal-sync-non-transactional
> >     xmlns="urn:activemq:core">true</journal-sync-non-transactional>
> >        <journal-sync-transactional
> >     xmlns="urn:activemq:core">true</journal-sync-transactional>
> >        <disk-scan-period xmlns="urn:activemq:core">5000</disk-scan-period>
> >        <max-disk-usage xmlns="urn:activemq:core">90</max-disk-usage>
> >        <critical-analyzer xmlns="urn:activemq:core">true</critical-analyzer>
> >        <critical-analyzer-timeout
> >     xmlns="urn:activemq:core">120000</critical-analyzer-timeout>
> >        <critical-analyzer-check-period
> >     xmlns="urn:activemq:core">60000</critical-analyzer-check-period>
> >        <critical-analyzer-policy
> >     xmlns="urn:activemq:core">LOG</critical-analyzer-policy>
> >        <page-sync-timeout
> >     xmlns="urn:activemq:core">548000</page-sync-timeout>
> >        <acceptors xmlns="urn:activemq:core">
> >          <acceptor name="artemis">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;connectionsAllowed=1536;directDeliver=false;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true;protocols=CORE,AMQP,STOMP,HORNETQ,OPENWIRE;</acceptor>
> >        </acceptors>
> >        <connectors xmlns="urn:activemq:core">
> >          <connector name="artemis">tcp://xxxxxxx.xxxxxx.xx:61616</connector>
> >        </connectors>
> >        <cluster-user xmlns="urn:activemq:core">artemis</cluster-user>
> >        <cluster-password
> >     xmlns="urn:activemq:core">xxxxxxxx</cluster-password>
> >        <broadcast-groups xmlns="urn:activemq:core">
> >          <broadcast-group name="bg-group1">
> >            <group-address>231.7.7.10</group-address>
> >            <group-port>9876</group-port>
> >            <broadcast-period>5000</broadcast-period>
> >            <connector-ref>artemis</connector-ref>
> >          </broadcast-group>
> >        </broadcast-groups>
> >        <discovery-groups xmlns="urn:activemq:core">
> >          <discovery-group name="dg-group1">
> >            <group-address>231.7.7.10</group-address>
> >            <group-port>9876</group-port>
> >            <refresh-timeout>10000</refresh-timeout>
> >          </discovery-group>
> >        </discovery-groups>
> >        <cluster-connections xmlns="urn:activemq:core">
> >          <cluster-connection name="artemis-ato">
> >            <connector-ref>artemis</connector-ref>
> >            <retry-interval>2000</retry-interval>
> >     <initial-connect-attempts>1000</initial-connect-attempts>
> >            <reconnect-attempts>1000</reconnect-attempts>
> >     <message-load-balancing>ON_DEMAND</message-load-balancing>
> >            <max-hops>1</max-hops>
> >            <discovery-group-ref discovery-group-name="dg-group1"/>
> >          </cluster-connection>
> >        </cluster-connections>
> >        <ha-policy xmlns="urn:activemq:core">
> >          <replication>
> >            <master>
> >     <check-for-live-server>true</check-for-live-server>
> >     <vote-on-replication-failure>true</vote-on-replication-failure>
> >              <group-name>ato-hapair-1</group-name>
> >            </master>
> >          </replication>
> >        </ha-policy>
> >        <metrics xmlns="urn:activemq:core">
> >          <jvm-memory>true</jvm-memory>
> >          <jvm-gc>true</jvm-gc>
> >          <jvm-threads>true</jvm-threads>
> >          <plugin class-name="org.apache.activemq.artemis.core.server.metrics.plugins.ArtemisPrometheusMetricsPlugin"/>
> >        </metrics>
> >        <security-settings xmlns="urn:activemq:core">
> >          <security-setting match="activemq.management">
> >            <permission type="manage" roles="amq,service"/>
> >          </security-setting>
> >          <security-setting match="#">
> >            <permission type="manage" roles="amq,service"/>
> >            <permission type="send" roles="amq,service,b2bi"/>
> >            <permission type="consume" roles="amq,service,b2bi"/>
> >            <permission type="browse" roles="amq,service"/>
> >            <permission type="createAddress" roles="amq,service"/>
> >            <permission type="deleteAddress" roles="amq,service"/>
> >            <permission type="createDurableQueue" roles="amq,service"/>
> >            <permission type="deleteDurableQueue" roles="amq,service"/>
> >            <permission type="createNonDurableQueue" roles="amq,service"/>
> >            <permission type="deleteNonDurableQueue" roles="amq,service"/>
> >          </security-setting>
> >          <role-mapping from="gs-auth-Artemis_Admin" to="amq"/>
> >          <role-mapping from="gs-auth-Artemis_User" to="service"/>
> >        </security-settings>
> >        <address-settings xmlns="urn:activemq:core">
> >          <address-setting match="activemq.management#">
> >            <dead-letter-address>DLQ</dead-letter-address>
> >            <expiry-address>ExpiryQueue</expiry-address>
> >            <redelivery-delay>0</redelivery-delay>
> >            <message-counter-history-day-limit>10</message-counter-history-day-limit>
> >            <max-size-bytes>-1</max-size-bytes>
> >            <max-size-messages>-1</max-size-messages>
> >     <address-full-policy>PAGE</address-full-policy>
> >            <auto-create-queues>true</auto-create-queues>
> >     <auto-create-addresses>true</auto-create-addresses>
> >     <auto-create-jms-queues>true</auto-create-jms-queues>
> >     <auto-create-jms-topics>true</auto-create-jms-topics>
> >          </address-setting>
> >          <address-setting match="#">
> >            <dead-letter-address>DLQ</dead-letter-address>
> >            <expiry-address>ExpiryQueue</expiry-address>
> >            <redelivery-delay>0</redelivery-delay>
> >            <message-counter-history-day-limit>10</message-counter-history-day-limit>
> >            <max-size-bytes>-1</max-size-bytes>
> >            <max-size-messages>-1</max-size-messages>
> >     <address-full-policy>PAGE</address-full-policy>
> >            <auto-create-queues>true</auto-create-queues>
> >     <auto-create-addresses>true</auto-create-addresses>
> >     <auto-create-jms-queues>true</auto-create-jms-queues>
> >     <auto-create-jms-topics>true</auto-create-jms-topics>
> >          </address-setting>
> >          <address-setting match="jms.#">
> >            <dead-letter-address>DLQ</dead-letter-address>
> >            <expiry-address>ExpiryQueue</expiry-address>
> >     <max-delivery-attempts>5</max-delivery-attempts>
> >            <redelivery-delay>500</redelivery-delay>
> >     <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
> >            <redelivery-collision-avoidance-factor>0.5</redelivery-collision-avoidance-factor>
> >     <redistribution-delay>30000</redistribution-delay>
> >     <send-to-dla-on-no-route>true</send-to-dla-on-no-route>
> >            <max-size-bytes>-1</max-size-bytes>
> >            <max-size-messages>-1</max-size-messages>
> >     <address-full-policy>PAGE</address-full-policy>
> >            <message-counter-history-day-limit>10</message-counter-history-day-limit>
> >            <auto-create-queues>false</auto-create-queues>
> >            <auto-delete-queues>false</auto-delete-queues>
> >     <auto-delete-created-queues>false</auto-delete-created-queues>
> >     <auto-delete-queues-delay>30000</auto-delete-queues-delay>
> >     <config-delete-queues>OFF</config-delete-queues>
> >     <auto-create-addresses>false</auto-create-addresses>
> >     <auto-delete-addresses>false</auto-delete-addresses>
> >     <auto-delete-addresses-delay>30000</auto-delete-addresses-delay>
> >     <config-delete-addresses>OFF</config-delete-addresses>
> >          </address-setting>
> >          <address-setting match="activemq.notifications">
> >            <max-size-bytes>-1</max-size-bytes>
> >            <max-size-messages>-1</max-size-messages>
> >     <address-full-policy>PAGE</address-full-policy>
> >          </address-setting>
> >          <address-setting match="jms.queue.#">
> >     <default-address-routing-type>ANYCAST</default-address-routing-type>
> >     <default-queue-routing-type>ANYCAST</default-queue-routing-type>
> >          </address-setting>
> >          <address-setting match="jms.topic.#">
> >     <default-address-routing-type>MULTICAST</default-address-routing-type>
> >     <default-queue-routing-type>MULTICAST</default-queue-routing-type>
> >          </address-setting>
> >        </address-settings>
> >        <addresses xmlns="urn:activemq:core">
> >          <address name="DLQ">
> >            <anycast>
> >              <queue name="DLQ"/>
> >            </anycast>
> >          </address>
> >          <address name="ExpiryQueue">
> >            <anycast>
> >              <queue name="ExpiryQueue"/>
> >            </anycast>
> >          </address>
> >        </addresses>
> >        <broker-plugins xmlns="urn:activemq:core">
> >          <broker-plugin class-name="org.apache.activemq.artemis.core.server.plugin.impl.LoggingActiveMQServerPlugin">
> >            <property key="LOG_ALL_EVENTS" value="false"/>
> >            <property key="LOG_CONNECTION_EVENTS" value="false"/>
> >            <property key="LOG_SESSION_EVENTS" value="false"/>
> >            <property key="LOG_CONSUMER_EVENTS" value="false"/>
> >            <property key="LOG_DELIVERING_EVENTS" value="false"/>
> >            <property key="LOG_SENDING_EVENTS" value="false"/>
> >            <property key="LOG_INTERNAL_EVENTS" value="false"/>
> >          </broker-plugin>
> >        </broker-plugins>
> >     </core>
> >
> >
> >
>
>
> --
> Clebert Suconic



--
Clebert Suconic
