SeanJi213 opened a new issue, #8158: URL: https://github.com/apache/rocketmq/issues/8158
### Before Creating the Bug Report - [X] I found a bug, not just asking a question, which should be created in [GitHub Discussions](https://github.com/apache/rocketmq/discussions). - [X] I have searched the [GitHub Issues](https://github.com/apache/rocketmq/issues) and [GitHub Discussions](https://github.com/apache/rocketmq/discussions) of this repository and believe that this is not a duplicate. - [X] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ. ### Runtime platform environment MacOS 14.1 ### RocketMQ version 4.9.7 ### JDK Version _No response_ ### Describe the Bug 1. Transaction messages can be committed after check is triggered and returns COMMIT, causing committing the same message with identical messageId twice. 2. Transactional messages can still be committed by the producer after check is triggered and returns ROLLBACK, causing inconsistency. 3. As long as check is triggered before the producer sends the state of local transaction to the broker, check won't stop until it reaches the maximum check times even the producer already sends the final state of local transaction(COMMIT/ROLLBACK) to the broker, which is certainly not expected. 4. If check is triggered and reaches its maximum check times and after this, the producer sends a COMMIT to the broker, this message, instead of being rollbacked as expected, can still be consumed by the consumer. ### Steps to Reproduce ### For ISSUE 1: The producer executes its local transaction for a relatively long time and does not return the state of its local transaction, long enough to let the check be triggered and return COMMIT before the producer does. In this case, there is a consumer subscribing the topic which we send transaction message to. So, as soon as the check method got a COMMIT, this message is consumed by the consumer. Right after check method returns COMMIT, the producer sends COMMIT to the broker, and the same message is consumed once again. 生产端控制台输出: ``` 发送消息:{"body":"dGVzdCB0cmFuc2FjdGlvbmFsIG1lc3NhZ2UgTk8uMzAx","delayTimeLevel":0,"flag":0,"keys":"Trans_Message_Key","properties":{"KEYS":"Trans_Message_Key","WAIT":"true","TAGS":"trans msg tag 301"},"tags":"trans msg tag 301","topic":"test_topic_2","waitStoreMsgOK":true} Executing local transaction... 当前时间戳: 2024-05-17 16:44:17 Executing local transaction... 当前时间戳: 2024-05-17 16:44:27 Executing local transaction... 当前时间戳: 2024-05-17 16:44:36 开始回查的时间戳:2024-05-17 16:44:38 事务开始执行 31 秒后Broker端回查 返回回查结果: COMMIT_MESSAGE 本地事务执行完成,生产端返回事务状态: COMMIT_MESSAGE SendResult [sendStatus=SEND_OK, msgId=7F00000124A87C53A9EB55FD0D130000, offsetMsgId=null, messageQueue=MessageQueue [topic=test_topic_2, brokerName=broker-a, queueId=1], queueOffset=310] ``` 消费端消费到该条消息两次,重复消费: ![Xnip2024-05-17_16-46-30](https://github.com/apache/rocketmq/assets/97076993/fccee4cc-8882-4798-b57a-9c4ba5bccf09) ### For ISSUE 2: The producer executes its local transaction for a relatively long time and does not return the state of its local transaction, long enough to let the check be triggered and return ROLLBACK first. In this case, there is also a consumer subscribing the topic which we send transaction message to. And right after the check method returns ROLLBACK, the producer finishes executing local transaction and sends a COMMIT to the broker, and this message can be consumed by the consumer somehow, which causes severe inconsistency. 生产端控制台输出: ``` 发送消息:{"body":"dGVzdCB0cmFuc2FjdGlvbmFsIG1lc3NhZ2UgTk8uNDU1","delayTimeLevel":0,"flag":0,"keys":"Trans_Message_Key","properties":{"KEYS":"Trans_Message_Key","WAIT":"true","TAGS":"trans msg tag 455"},"tags":"trans msg tag 455","topic":"test_topic_2","waitStoreMsgOK":true} Executing local transaction... 当前时间戳: 2024-05-17 16:51:52 Executing local transaction... 当前时间戳: 2024-05-17 16:52:02 Executing local transaction... 当前时间戳: 2024-05-17 16:52:12 Executing local transaction... 当前时间戳: 2024-05-17 16:52:22 开始回查的时间戳:2024-05-17 16:52:38 事务开始执行 56 秒后Broker端回查 返回回查结果: ROLLBACK_MESSAGE 本地事务执行完成,生产端返回事务状态: COMMIT_MESSAGE SendResult [sendStatus=SEND_OK, msgId=7F0000012B117C53A9EB5603FF1D0000, offsetMsgId=null, messageQueue=MessageQueue [topic=test_topic_2, brokerName=broker-a, queueId=1], queueOffset=312] ``` 消费端仍然消费到了本应被回滚的消息: ![Xnip2024-05-17_16-54-55](https://github.com/apache/rocketmq/assets/97076993/40c0e454-6244-43a3-aa2b-d8c80da8a586) (针对这种情况,如果反之,回查先于生产端返回了COMMIT,在这之后生产端返回了本地事务状态-ROLLBACK,该条消息仍然会被消费到,消费时机为回查返回COMMIT时,造成不一致) ### For ISSUE 3: The producer executes its local transaction for a relatively long time and does not return the state of its local transaction, long enough for the check method to be triggered. In this scenario, the check method will return UNKNOW every time. And after a few checks, the producer finishes executing local transaction and sends a COMMIT to the broker. However, after this, check continues until it reaches the maximum check times, which is by default 15. ``` {"body":"dGVzdCB0cmFuc2FjdGlvbmFsIG1lc3NhZ2UgTk8uNTA1","delayTimeLevel":0,"flag":0,"keys":"Trans_Message_Key","properties":{"KEYS":"Trans_Message_Key","WAIT":"true","TAGS":"trans msg tag 505"},"tags":"trans msg tag 505","topic":"test_topic_2","waitStoreMsgOK":true} Executing local transaction... 当前时间戳: 2024-05-17 17:10:57 Executing local transaction... 当前时间戳: 2024-05-17 17:11:07 Executing local transaction... 当前时间戳: 2024-05-17 17:11:17 Executing local transaction... 当前时间戳: 2024-05-17 17:11:27 Executing local transaction... 当前时间戳: 2024-05-17 17:11:37 开始回查的时间戳:2024-05-17 17:11:38 事务开始执行 51 秒后Broker端回查 返回回查结果: UNKNOW Executing local transaction... 当前时间戳: 2024-05-17 17:11:47 Executing local transaction... 当前时间戳: 2024-05-17 17:11:57 Executing local transaction... 当前时间戳: 2024-05-17 17:12:07 Executing local transaction... 当前时间戳: 2024-05-17 17:12:17 Executing local transaction... 当前时间戳: 2024-05-17 17:12:27 Executing local transaction... 当前时间戳: 2024-05-17 17:12:37 开始回查的时间戳:2024-05-17 17:12:38 事务开始执行 111 秒后Broker端回查 返回回查结果: UNKNOW Executing local transaction... 当前时间戳: 2024-05-17 17:12:47 Executing local transaction... 当前时间戳: 2024-05-17 17:12:57 Executing local transaction... 当前时间戳: 2024-05-17 17:13:07 Executing local transaction... 当前时间戳: 2024-05-17 17:13:17 Executing local transaction... 当前时间戳: 2024-05-17 17:13:27 Executing local transaction... 当前时间戳: 2024-05-17 17:13:37 开始回查的时间戳:2024-05-17 17:13:38 事务开始执行 171 秒后Broker端回查 返回回查结果: UNKNOW Executing local transaction... 当前时间戳: 2024-05-17 17:13:47 Executing local transaction... 当前时间戳: 2024-05-17 17:13:57 Executing local transaction... 当前时间戳: 2024-05-17 17:14:07 Executing local transaction... 当前时间戳: 2024-05-17 17:14:17 Executing local transaction... 当前时间戳: 2024-05-17 17:14:27 Executing local transaction... 当前时间戳: 2024-05-17 17:14:37 开始回查的时间戳:2024-05-17 17:14:38 事务开始执行 231 秒后Broker端回查 返回回查结果: UNKNOW 本地事务执行完成,生产端返回事务状态: COMMIT_MESSAGE SendResult [sendStatus=SEND_OK, msgId=7F0000013AE37C53A9EB561578AC0000, offsetMsgId=null, messageQueue=MessageQueue [topic=test_topic_2, brokerName=broker-a, queueId=1], queueOffset=314] 开始回查的时间戳:2024-05-17 17:15:38 事务开始执行 291 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:16:38 事务开始执行 351 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:17:38 事务开始执行 411 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:18:38 事务开始执行 471 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:19:38 事务开始执行 531 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:20:38 事务开始执行 591 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:21:38 事务开始执行 651 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:22:38 事务开始执行 711 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:23:38 事务开始执行 771 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:24:38 事务开始执行 831 秒后Broker端回查 返回回查结果: UNKNOW 开始回查的时间戳:2024-05-17 17:25:38 事务开始执行 891 秒后Broker端回查 返回回查结果: UNKNOW ``` 消费端能消费到该条消息: ![Xnip2024-05-17_17-45-40](https://github.com/apache/rocketmq/assets/97076993/f865315f-49df-4776-b15a-5e55abd4a713) ### For ISSUE 4: If check is triggered and reaches its maximum check times and after this, the producer sends a COMMIT to the broker, this message, instead of being rollbacked as expected, can still be consumed by the consumer. This is the message trace of this transaction message, indicating the message can be consumed after the number of check times reaches its maximum, which is by default 15. ``` [ { "clientHost": "127.0.0.1", "costTime": 37, "fromTransactionCheck": false, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_Msg_Half", "offSetMsgId": "7F00000100002A9F000000000006F8A5", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149262016, "topic": "test_topic_1", "traceType": "Pub", "transactionState": "NOT_YET_CHECK_STATE" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149274610, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149334622, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149394625, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149454664, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149514694, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149574692, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149634705, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149694709, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149754717, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149814730, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149874738, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149934754, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715149994756, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715150054765, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": true, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715150114782, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "UNKNOW" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": false, "groupName": "TEST_PRODUCER_GROUP", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "msgType": "Trans_msg_Commit", "offSetMsgId": "", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1:10911", "tags": "trans msg tag 553", "timeStamp": 1715150232508, "topic": "test_topic_1", "traceType": "EndTransaction", "transactionId": "7F0000017E837C53A9EB2720D0BF0000", "transactionState": "COMMIT_MESSAGE" }, { "clientHost": "127.0.0.1", "costTime": -1, "fromTransactionCheck": false, "groupName": "trans_msg_consumer_group_621", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "offSetMsgId": "", "requestId": "7F00000129F87C53A9EB272F9FDA000B", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1", "tags": "", "timeStamp": 1715150232539, "topic": "", "traceType": "SubBefore", "transactionState": "NOT_YET_CHECK_STATE" }, { "clientHost": "127.0.0.1", "costTime": 0, "fromTransactionCheck": false, "groupName": "trans_msg_consumer_group_621", "keys": "trans msg key 553", "msgId": "7F0000017E837C53A9EB2720D0BF0000", "offSetMsgId": "", "requestId": "7F00000129F87C53A9EB272F9FDA000B", "retryTimes": 0, "status": "success", "storeHost": "127.0.0.1", "tags": "", "timeStamp": 1715150232539, "topic": "", "traceType": "SubAfter", "transactionState": "NOT_YET_CHECK_STATE" } ] ``` ### What Did You Expect to See? For scenario 1: The message should have been consumed only once. For scenario 2: The message shouldn't have been consumed after being rollbacked. For scenario 3: The check method should have stopped after the producer sends the final state of local transaction(commit/rollback). For scenario 4: The message should have been rollbacked after the number of check times reaches its maximum, which is by default 15. ### What Did You See Instead? For scenario 1: The message is consumed twice. For scenario 2: The message can still be consumed after being rollbacked first. For scenario 3: The check method continues after the producer sends the final state of local transaction(commit/rollback). For scenario 4: The message can still be consumed by sending COMMIT to the broker even after the number of check times reaches its maximum, which is by default 15. ### Additional Context _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@rocketmq.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org