[ 
https://issues.apache.org/jira/browse/ARTEMIS-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060494#comment-18060494
 ] 

Clebert Suconic commented on ARTEMIS-5895:
------------------------------------------

The fix is part of the PR: https://github.com/apache/artemis/pull/6247

> Message loss during failover switch in shared store configuration
> -----------------------------------------------------------------
>
>                 Key: ARTEMIS-5895
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-5895
>             Project: Artemis
>          Issue Type: Bug
>          Components: OpenWire
>    Affects Versions: 2.44.0
>            Reporter: Claudiu Chioasca
>            Assignee: Clebert Suconic
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: 2372.png, 2373_missing.png, 2374.png, ARTEMIS-5895.zip, 
> FailoverApplicationTests.java, QueueSender.java, artemis.log, 
> failover-queue.png, failover-test-automation.ps1, 
> producer-bug-detected-iteration-82.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sometimes a message producer that is connected via the OpenWire protocol and 
> calls commit() on a transacted session does not receive an exception when a 
> failover switch happens and the commit fails. 
>  
> My test consists of: 
>  
>  - primary/backup Artemis instances (2.44.0) deployed with a shared store 
> configuration
>  
>  - a JDK 21, Spring Boot (4.0.1) based producer:
>  
> <dependency>
>   <groupId>org.springframework.boot</groupId>
>   <artifactId>spring-boot-starter-activemq</artifactId>
> </dependency>
>  
> that connects to the broker via failover url: 
> failover:(ssl://LOCAL-DEV:5176,ssl://LOCAL-DEV:4176)
>  
>  - this scenario: while both primary and backup are up, the producer starts 
> sending 10000 messages to the "failover-queue" destination. During this time 
> the primary instance is shut down using "artemis stop". The producer is 
> configured to retry when session.commit() fails.
>  
>  - a script that repeats the same sequence of steps until message loss is 
> detected: restart the brokers, purge the test destination, execute the Spring 
> Boot test, shut down the primary when messages start to appear in the test 
> destination, and count the messages when the test finishes
>  
> I let the script run for a couple of hours until the loss reproduced; 
> producer-bug-detected-iteration-82.log shows the output of the producer and 
> the script detecting the loss.
> I attached the primary instance log from the time it was stopping and message 
> #2373 was lost. 
> 2373_missing.png is a capture of the Artemis console for the failover-queue 
> destination, where 2372 and 2374 can be seen to be consecutive.
> The producer log shows that the first send of 2374 is rolled back and then 
> retried as expected, but the send of 2373 appears successful.
>  
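
For readers following the scenario above, the retry-on-failed-commit pattern
the producer relies on can be sketched roughly as follows. This is not the
reporter's QueueSender.java and it does not use the JMS or Artemis APIs:
TransactedSession, FakeSession and CommitRetrySketch are hypothetical names
introduced only for illustration, and the fake session simulates a failover
by throwing on one chosen commit.

```java
import java.util.ArrayList;
import java.util.List;

class CommitRetrySketch {
    // Sends messages 1..count, one transaction per message, and retries a
    // message until its commit succeeds or maxAttempts is reached.
    static void sendAll(TransactedSession session, int count, int maxAttempts) throws Exception {
        for (int i = 1; i <= count; i++) {
            String body = "message-" + i;
            int attempt = 0;
            while (true) {
                attempt++;
                try {
                    session.send(body);
                    session.commit();   // the retry only works if a failed commit throws
                    break;
                } catch (Exception e) {
                    session.rollback();
                    if (attempt >= maxAttempts) {
                        throw e;
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        FakeSession session = new FakeSession(3); // third commit fails once
        sendAll(session, 5, 3);
        System.out.println(session.queue);
    }
}

// Hypothetical stand-in for a transacted session; an assumption, not a real API.
interface TransactedSession {
    void send(String body) throws Exception;
    void commit() throws Exception;
    void rollback();
}

// In-memory fake that fails exactly one commit to imitate a failover switch.
class FakeSession implements TransactedSession {
    final List<String> queue = new ArrayList<>();           // committed messages
    private final List<String> pending = new ArrayList<>(); // uncommitted sends
    private final int failOnCommitNumber;
    private int commits = 0;

    FakeSession(int failOnCommitNumber) {
        this.failOnCommitNumber = failOnCommitNumber;
    }

    public void send(String body) {
        pending.add(body);
    }

    public void commit() throws Exception {
        commits++;
        if (commits == failOnCommitNumber) {
            pending.clear();
            throw new Exception("simulated failover during commit");
        }
        queue.addAll(pending);
        pending.clear();
    }

    public void rollback() {
        pending.clear();
    }
}
```

A loop of this shape depends entirely on commit() throwing when the commit
did not succeed. If commit() returns normally although the commit failed,
which is the behavior reported in this issue, the loop moves on to the next
message and the uncommitted one is lost without any retry.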



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
