[
https://issues.apache.org/jira/browse/GEODE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358858#comment-17358858
]
ASF subversion and git services commented on GEODE-9248:
--------------------------------------------------------
Commit 7289b77b2ccad9423e5b7dd7e7367951d8574007 in geode's branch
refs/heads/develop from Jakov Varenina
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7289b77 ]
GEODE-9248: Server hosting CQ queue unnecessarily fills bucketToTempQue… (#6477)
* The issue reproduces when all of the following conditions are fulfilled:
- A redundant partition region must be configured
- The number of servers must be greater than the number of redundant copies of
the partition region
- A parallel gateway sender must be configured on the partition region
- A client must register CQs for the region
- Transactions must be used with put operations
- Events must be enqueued in the parallel gateway senders (the remote site is
unavailable)
* The server that is hosting the primary bucket will send TXCommitMessage to
the servers that are hosting secondary buckets, and also to any server that is
hosting a CQ subscription queue (if the CQ condition is fulfilled). The problem
occurs when the server that is hosting the CQ subscription queue does not host
the bucket for which the event is received.
* In this case the server will store the event in bucketToTempQueueMap because
it assumes that the bucket is in the process of being created, which is not
correct.
* Solution:
* A normal put operation (no transaction is used) is distributed to an adjunct
member (one that hosts only the subscription queue, and not the bucket) using a
separate partition.PutMessage, which indicates that it is notificationOnly and
also sets TailKey to -1. The adjunct member will not store the event in the
queue, because TailKey is always expected to be set in the case of a parallel
gateway-sender. Members that host a secondary bucket will receive the
UpdateOperation message with a valid TailKey value.
* So this new solution implements an approach similar to what is done in the
case of a normal put operation. A TXCommitMessage with TailKey set to -1 will
be sent only to an adjunct member that hosts the subscription queue and not a
secondary bucket. Other members will get a valid TailKey value. Also, a member
that hosts both a secondary bucket and the subscription queue will get a valid
TailKey value.
* Fix for NullPointerException
* Added a test case that verifies bucketToTempQueueMap behavior. This test case
verifies that, during bucket recovery, the server enqueues all events intended
for that bucket in the temporary queue, and that after bucket redundancy is
restored the events are transferred from the temporary queue to the bucket
queue.
* Fix for TXEntryStateWithRegionAndKey parameterization warnings
* Minor changes in ParallelGatewaySenderAndCQDurableClientDUnitTest test
Co-authored-by: Jakov Varenina <[email protected]>
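
The TailKey handling described in the commit message above can be sketched
roughly as follows. This is a simplified, hypothetical model and not Geode
code: TailKeySketch, Member, tailKeyFor, and receive are names invented for
illustration, and the tail key value 113 is arbitrary. It only assumes what the
message states, namely that an adjunct member gets TailKey -1 and therefore
does not store the event, while every member hosting the bucket gets the valid
TailKey.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the TailKey selection; not a Geode class. */
class TailKeySketch {
  /** TailKey value that marks a message as notification-only. */
  static final long NOTIFICATION_ONLY_TAIL_KEY = -1L;

  /** A cluster member, reduced to the two properties that matter here. */
  static final class Member {
    final String name;
    final boolean hostsBucket;
    final boolean hostsSubscriptionQueue;
    /** Stand-in for the parallel gateway sender queue on this member. */
    final List<Long> senderQueue = new ArrayList<>();

    Member(String name, boolean hostsBucket, boolean hostsSubscriptionQueue) {
      this.name = name;
      this.hostsBucket = hostsBucket;
      this.hostsSubscriptionQueue = hostsSubscriptionQueue;
    }

    /** Adjunct: hosts the subscription queue but not the bucket. */
    boolean isAdjunct() {
      return hostsSubscriptionQueue && !hostsBucket;
    }

    /** Receiver side: an event without a valid TailKey is never queued. */
    void receive(long tailKey) {
      if (tailKey == NOTIFICATION_ONLY_TAIL_KEY) {
        return; // only used for CQ notification
      }
      senderQueue.add(tailKey);
    }
  }

  /** Sender side: pick the TailKey to put into the message for one recipient. */
  static long tailKeyFor(Member recipient, long validTailKey) {
    return recipient.isAdjunct() ? NOTIFICATION_ONLY_TAIL_KEY : validTailKey;
  }

  public static void main(String[] args) {
    Member secondary = new Member("secondary", true, false);
    Member both = new Member("secondary+cq", true, true);
    Member adjunct = new Member("adjunct", false, true);
    long validTailKey = 113L; // arbitrary example value

    for (Member m : Arrays.asList(secondary, both, adjunct)) {
      m.receive(tailKeyFor(m, validTailKey));
      System.out.println(m.name + " queued " + m.senderQueue.size() + " event(s)");
    }
  }
}
```

In this model the decision is made per recipient on the sending side, so a
member that hosts both the secondary bucket and the subscription queue is
treated like any other bucket host.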
> Server hosting cq subscription queue unnecessarily fills bucketToTempQueueMap
> while in multi site split brain
> -----------------------------------------------------------------------------------------------------------
>
> Key: GEODE-9248
> URL: https://issues.apache.org/jira/browse/GEODE-9248
> Project: Geode
> Issue Type: Bug
> Reporter: Jakov Varenina
> Assignee: Jakov Varenina
> Priority: Major
> Labels: pull-request-available
>
> The problem reproduces when you use transactions and have more servers than
> redundant copies of the partition region, and events are queued in the
> parallel gateway-senders due to an ongoing multi-site split brain. In this
> case all members send events to the member hosting the subscription queue,
> which then fills the variable *bucketToTempQueueMap* with traffic intended for
> buckets that it doesn't host.
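
To illustrate why the map keeps growing, here is a minimal sketch of the
temporary-queue behavior described above. It is a hypothetical model, not the
Geode implementation: TempQueueSketch, enqueue, bucketCreated, and
parkedEventCount are invented names, and only the field name
bucketToTempQueueMap is taken from the issue. Events for a bucket that is not
(yet) hosted are parked, and they are moved to the bucket queue only once that
bucket is created locally.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Hypothetical model of a server's per-bucket sender queues. */
class TempQueueSketch {
  final Set<Integer> hostedBuckets = new HashSet<>();
  final Map<Integer, Queue<String>> bucketQueues = new HashMap<>();
  final Map<Integer, Queue<String>> bucketToTempQueueMap = new HashMap<>();

  /** Queue the event directly if the bucket is hosted, otherwise park it. */
  void enqueue(int bucketId, String event) {
    Map<Integer, Queue<String>> target =
        hostedBuckets.contains(bucketId) ? bucketQueues : bucketToTempQueueMap;
    target.computeIfAbsent(bucketId, id -> new ArrayDeque<>()).add(event);
  }

  /** Called when a bucket is created locally: drain its parked events. */
  void bucketCreated(int bucketId) {
    hostedBuckets.add(bucketId);
    Queue<String> parked = bucketToTempQueueMap.remove(bucketId);
    if (parked != null) {
      bucketQueues.computeIfAbsent(bucketId, id -> new ArrayDeque<>()).addAll(parked);
    }
  }

  /** Total number of events still waiting in temporary queues. */
  int parkedEventCount() {
    int total = 0;
    for (Queue<String> q : bucketToTempQueueMap.values()) {
      total += q.size();
    }
    return total;
  }

  public static void main(String[] args) {
    TempQueueSketch server = new TempQueueSketch();
    server.hostedBuckets.add(0);

    server.enqueue(0, "event-for-hosted-bucket");
    server.enqueue(1, "event-for-recovering-bucket");
    server.enqueue(2, "event-for-bucket-never-hosted");

    server.bucketCreated(1); // redundancy recovery finishes for bucket 1

    // The event for bucket 2 is still parked, because bucket 2 is never
    // created on this server.
    System.out.println("parked events: " + server.parkedEventCount());
  }
}
```

In this model, events for a bucket under recovery are drained as expected,
while events for a bucket the server never hosts stay parked indefinitely.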
--
This message was sent by Atlassian Jira
(v8.3.4#803005)