Barrett Oglesby created GEODE-9664:
--------------------------------------
Summary: Two different clients with the same durable id will both
connect to the servers and receive messages
Key: GEODE-9664
URL: https://issues.apache.org/jira/browse/GEODE-9664
Project: Geode
Issue Type: Bug
Components: client queues
Reporter: Barrett Oglesby
There are two cases:
# The number of queues is the same as the number of servers (e.g. client with
subscription-redundancy=1 and 2 servers)
# The number of queues is less than the number of servers (e.g. client with
subscription-redundancy=0 and 2 servers)
h2. Case 1
In this case, the client first attempts to connect to the primary and fails.
{noformat}
[warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal about to register
clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300])
[warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]); port=61581; primary=true; version=GEODE 1.15.0]
[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing proxy
isPaused=false
[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread
1> tid=0x4b] The requested durable client has the same identifier ( client-a )
as an existing durable client (
CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]); port=61581; primary=true; version=GEODE 1.15.0] ). Duplicate
durable clients are not allowed.
[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread
1> tid=0x4b] CacheClientNotifier: Unsuccessfully registered client with
identifier
identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]) and response code 64
{noformat}
It then attempts to connect to the secondary and succeeds.
{noformat}
[warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal about to register
clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300])
[warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]); port=61578; primary=false; version=GEODE 1.15.0]
[warn 2021/10/01 14:37:56.216 PDT server-2 <Client Queue Initialization Thread
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing proxy
isPaused=true
[warn 2021/10/01 14:37:56.217 PDT server-2 <Client Queue Initialization Thread
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal reinitialized
existing
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]); port=61578; primary=true; version=GEODE 1.15.0]
{noformat}
The previous secondary queue on server-2 is reinitialized and made into a
primary. Both server-1 and server-2 now host a primary queue for client-a, and
both queues will dispatch events.
The CacheClientNotifier.registerClientInternal method invoked when a client
connects does:
{noformat}
if (cacheClientProxy.isPaused()) {
  ...
  cacheClientProxy.reinitialize(...);
} else {
  unsuccessfulMsg = String.format(
      "The requested durable client has the same identifier ( %s ) as an existing durable client...",
      ...);
  logger.warn(unsuccessfulMsg);
}
{noformat}
The CacheClientProxy is paused when the durable client it represents has
disconnected. Unfortunately, a secondary CacheClientProxy is also paused. So,
this check is not good enough to prevent a duplicate durable client from
connecting.
There are a few things that can also be checked. One of them is:
{noformat}
cacheClientProxy.getCommBuffer() == null
{noformat}
With that check added, when the client attempts to connect to the secondary, it
fails just like it does with the primary.
The client then exits with this exception:
{noformat}
geode.cache.NoSubscriptionServersAvailableException:
org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not
initialize a primary queue on startup. No queue servers available.
at
org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:191)
at
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:428)
at
org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:875)
at
org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
at
org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:364)
at
org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3815)
at
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3911)
at
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3890)
at
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3885)
at
org.apache.geode.cache.Region.registerInterestForAllKeys(Region.java:1709)
Caused by: org.apache.geode.cache.NoSubscriptionServersAvailableException:
Could not initialize a primary queue on startup. No queue servers available.
at
org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:575)
at
org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:293)
at
org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:359)
at
org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:183)
at
org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:169)
at
org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:378)
{noformat}
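The Case 1 decision described above can be modeled in a few lines. This is only a sketch under stated assumptions, not Geode's code: Proxy, Result, register and the alsoCheckCommBuffer flag are hypothetical names, and the model assumes that a secondary proxy is paused while its comm buffer is non-null, which is what the proposed getCommBuffer() == null check relies on.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the registration decision; none of these types are Geode classes.
public class DurableRegistrationSketch {

  /** Hypothetical stand-in for CacheClientProxy; holds only the state the check reads. */
  static final class Proxy {
    final String durableId;
    boolean paused;    // true for a disconnected client and also for a secondary queue
    Object commBuffer; // assumption: non-null while a client connection is attached

    Proxy(String durableId, boolean paused, Object commBuffer) {
      this.durableId = durableId;
      this.paused = paused;
      this.commBuffer = commBuffer;
    }
  }

  enum Result { CREATED, REINITIALIZED, REFUSED_DUPLICATE }

  private final Map<String, Proxy> proxies = new HashMap<>();
  private final boolean alsoCheckCommBuffer;

  DurableRegistrationSketch(boolean alsoCheckCommBuffer) {
    this.alsoCheckCommBuffer = alsoCheckCommBuffer;
  }

  void addExisting(Proxy proxy) {
    proxies.put(proxy.durableId, proxy);
  }

  Result register(String durableId) {
    Proxy existing = proxies.get(durableId);
    if (existing == null) {
      proxies.put(durableId, new Proxy(durableId, false, new Object()));
      return Result.CREATED;
    }
    // Current check: paused only. Proposed check: paused and no attached connection.
    boolean reusable = existing.paused
        && (!alsoCheckCommBuffer || existing.commBuffer == null);
    if (reusable) {
      existing.commBuffer = new Object(); // the new connection takes over the queue
      return Result.REINITIALIZED;
    }
    return Result.REFUSED_DUPLICATE;
  }

  public static void main(String[] args) {
    for (boolean extraCheck : new boolean[] {false, true}) {
      DurableRegistrationSketch server2 = new DurableRegistrationSketch(extraCheck);
      // Secondary queue of the first client: paused, but still connected.
      server2.addExisting(new Proxy("client-a", true, new Object()));
      System.out.println("extraCheck=" + extraCheck + " -> " + server2.register("client-a"));
    }
  }
}
```

In this model the paused-only check reinitializes the secondary's proxy for the duplicate client, while the combined check refuses it. A proxy that is paused with a null comm buffer (a client that really disconnected) is still reinitialized in both modes.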
h2. Case 2
In this case, the client first attempts to connect to the primary and fails
just like case 1.
It will then attempt to connect to a server with no existing queue and succeed.
{noformat}
[warn 2021/10/01 15:02:50.798 PDT server-1 <Client Queue Initialization Thread
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal about to register
clientProxyMembershipID=identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300])
[warn 2021/10/01 15:02:50.799 PDT server-1 <Client Queue Initialization Thread
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal existing proxy=null
[warn 2021/10/01 15:02:50.810 PDT server-1 <Client Queue Initialization Thread
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal created
proxy=CacheClientProxy[identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
timeout=300]); port=62094; primary=true; version=GEODE 1.15.0]
{noformat}
One way to address this case would be to prevent the durable client from
retrying against another server if it can't connect to the primary.
That would have to be addressed in QueueManagerImpl.initializeConnections. That
method would have to know that the server refused the connection
(ServerRefusedConnectionException) and then return.
That's a bit more work, since that method currently doesn't get any exceptions
from initializeQueueConnection, which does:
{noformat}
} catch (Exception e) {
  if (logger.isDebugEnabled()) {
    logger.debug("error creating subscription connection to server {}",
        connection.getEndpoint(), e);
  }
}
{noformat}
An exception handler like this would need to be added:
{noformat}
} catch (ServerRefusedConnectionException e) {
  throw e;
}
{noformat}
QueueManagerImpl.initializeConnections would have to handle that exception in a
few places. I'm not sure exactly what should be done in that method.
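As a rough illustration of the Case 2 idea, the loop below models a client trying servers in order. It is a sketch under assumptions, not QueueManagerImpl: Server, RefusedException (standing in for ServerRefusedConnectionException), the propagateRefusal flag and this initializeConnections signature are all hypothetical, and the real method would need to handle the exception in more places.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of queue initialization; none of these types are Geode classes.
public class QueueInitSketch {

  /** Hypothetical stand-in for ServerRefusedConnectionException. */
  static final class RefusedException extends RuntimeException {
    RefusedException(String message) {
      super(message);
    }
  }

  /** Hypothetical server that refuses a durable id it already serves. */
  static final class Server {
    final String name;
    final boolean servesDurableId;

    Server(String name, boolean servesDurableId) {
      this.name = name;
      this.servesDurableId = servesDurableId;
    }

    void connect(String durableId) {
      if (servesDurableId) {
        throw new RefusedException(name + " refused duplicate durable client " + durableId);
      }
    }
  }

  /** Returns the names of the servers on which a queue was created. */
  static List<String> initializeConnections(List<Server> servers, String durableId,
      int queuesWanted, boolean propagateRefusal) {
    List<String> connected = new ArrayList<>();
    for (Server server : servers) {
      if (connected.size() == queuesWanted) {
        break;
      }
      try {
        server.connect(durableId);
        connected.add(server.name);
      } catch (RefusedException e) {
        if (propagateRefusal) {
          throw e; // proposed: a refusal ends initialization instead of trying the next server
        }
        // current behavior in this model: the refusal is swallowed and the loop continues
      } catch (RuntimeException e) {
        // any other failure: try the next server
      }
    }
    return connected;
  }

  public static void main(String[] args) {
    List<Server> servers =
        Arrays.asList(new Server("server-1", true), new Server("server-2", false));
    // subscription-redundancy=0 means one queue is wanted
    System.out.println(initializeConnections(servers, "client-a", 1, false));
    try {
      initializeConnections(servers, "client-a", 1, true);
    } catch (RefusedException e) {
      System.out.println("refused: " + e.getMessage());
    }
  }
}
```

With the refusal swallowed, the duplicate client ends up with a queue on the second server, which is the behavior reported in this case. With the refusal propagated, initialization stops at the first server that refuses.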