Pierre Salagnac created SOLR-17972:
--------------------------------------
Summary: DistributedMultiLock can fail to release some locks if ZK
connection loss occurs
Key: SOLR-17972
URL: https://issues.apache.org/jira/browse/SOLR-17972
Project: Solr
Issue Type: Bug
Reporter: Pierre Salagnac
This bug occurs only when the cluster runs with distributed cluster processing
(no overseer).
If a ZooKeeper connection loss occurs while creating one of the locks of a
{{DistributedMultiLock}}, any other locks of the same multi-lock that were
already created will not be released. The unreleased locks cannot be acquired
again by other operations until the session is lost, which causes the
ephemeral nodes to be removed.
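
The failure mode above can be illustrated with a minimal sketch. It is not the actual Solr code: {{SimpleLock}}, {{FakeLock}} and {{acquireAll}} are hypothetical names invented for illustration, and the lock is an in-memory stand-in rather than a ZooKeeper ephemeral node. It only shows the general pattern of releasing the locks already acquired when a later acquisition throws.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class MultiLockSketch {
    /** Hypothetical stand-in for one lock of a multi-lock (not a Solr interface). */
    interface SimpleLock {
        void acquire() throws Exception;
        void release();
    }

    /** Hypothetical in-memory lock that can simulate a connection loss on acquire. */
    static final class FakeLock implements SimpleLock {
        final boolean failOnAcquire;
        boolean held;

        FakeLock(boolean failOnAcquire) {
            this.failOnAcquire = failOnAcquire;
        }

        @Override
        public void acquire() throws Exception {
            if (failOnAcquire) {
                throw new Exception("simulated connection loss");
            }
            held = true;
        }

        @Override
        public void release() {
            held = false;
        }
    }

    /**
     * Acquires every lock in order. If one acquisition throws, the locks
     * acquired so far are released in reverse order before rethrowing.
     */
    static void acquireAll(List<? extends SimpleLock> locks) throws Exception {
        Deque<SimpleLock> acquired = new ArrayDeque<>();
        try {
            for (SimpleLock lock : locks) {
                lock.acquire();
                acquired.push(lock);
            }
        } catch (Exception e) {
            while (!acquired.isEmpty()) {
                try {
                    acquired.pop().release();
                } catch (RuntimeException releaseFailure) {
                    // Keep releasing the remaining locks; record the failure.
                    e.addSuppressed(releaseFailure);
                }
            }
            throw e;
        }
    }
}
```

Without the cleanup in the {{catch}} block, the first lock in this sketch would stay held after the second acquisition fails, which mirrors the behavior described in this issue.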
Additionally, cluster maintenance operations will wait forever to acquire the
required locks, each consuming a thread from the node's pool for distributed
cluster operations. Eventually, all threads will be in use and future
operations will be rejected because the queue is full.
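
The thread exhaustion described above can be reproduced in isolation with a plain {{java.util.concurrent.ThreadPoolExecutor}}. This is only a sketch under stated assumptions: the pool size (1 thread) and queue capacity (1) are arbitrary values chosen to keep the example small, not Solr's actual configuration, and the latch stands in for a lock that is never released.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {
    /** Returns true if the third submitted task is rejected by the pool. */
    static boolean thirdTaskRejected() {
        // Stands in for a lock that is never released (assumption for the demo).
        CountDownLatch neverReleasedLock = new CountDownLatch(1);
        // One worker thread and a queue of capacity one (illustrative sizes).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        Runnable waitForLock = () -> {
            try {
                neverReleasedLock.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(waitForLock); // occupies the only worker thread
            pool.execute(waitForLock); // fills the queue
            try {
                pool.execute(waitForLock); // no thread and no queue slot left
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } finally {
            // Unblock the waiting tasks so the demo terminates cleanly.
            neverReleasedLock.countDown();
            pool.shutdown();
        }
    }
}
```

With the default rejection policy, the third submission throws {{RejectedExecutionException}} as soon as the worker is blocked and the queue is full.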