[ 
https://issues.apache.org/jira/browse/SOLR-17720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939027#comment-17939027
 ] 

David Smiley commented on SOLR-17720:
-------------------------------------

Collection Properties is definitely ripe for a Curator-oriented overhaul.  We 
shouldn't have so much low-level code around it!

> Deadlock in CollectionPropertiesZkStateReader
> ---------------------------------------------
>
>                 Key: SOLR-17720
>                 URL: https://issues.apache.org/jira/browse/SOLR-17720
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrJ
>    Affects Versions: 9.7
>            Reporter: Houston Putman
>            Priority: Blocker
>             Fix For: 9.9
>
>
> {{CollectionPropertiesZkStateReader}} has multiple different mechanisms for 
> synchronizing when modifying its concurrent data structures.
>  # {{synchronized (getCollectionLock(collection))}} 
>  # {{collectionPropsObservers}} is a ConcurrentHashMap, and therefore locks 
> on updating a single key within the map.
> Unfortunately this can cause a deadlock.
> In {{CollectionPropertiesZkStateReader.removeCollectionPropsWatcher()}},  
> {{collectionPropsObservers.compute(collection, <function>)}} is used which 
> will create a lock in {{collectionPropsObservers}} on the {{collection}} key. 
> Within this locked {{<function>}} command, {{synchronized 
> (getCollectionLock(collection))}} is called.
> In {{CollectionPropertiesZkStateReader.refreshAndWatch()}}, {{synchronized 
> (getCollectionLock(coll))}} is used for the whole method. And within this 
> synchronized block, {{collectionPropsObservers.remove(coll)}} is called 
> (which will obviously get a lock on the {{coll}} key for 
> {{collectionPropsObservers}}).
> So {{CollectionPropertiesZkStateReader.removeCollectionPropsWatcher()}} has 
> the lock for {{collectionPropsObservers}} but is waiting on the lock for 
> {{getCollectionLock(coll)}}. And 
> {{CollectionPropertiesZkStateReader.refreshAndWatch()}} has the lock for 
> {{getCollectionLock(coll)}} and is waiting on the lock for 
> {{collectionPropsObservers}}. Hence deadlock.
> This code is quite complex, and I think it can really be simplified, but 
> that's just a gut reaction. I think moving the {{synchronized 
> (getCollectionLock(collection))}} block in {{removeCollectionPropsWatcher()}} 
> outside of the {{compute()}} call would solve this one deadlock though.
> Hopefully we can really simplify this with Curator though.
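
For illustration, the lock-ordering fix suggested in the description could be sketched roughly as below. This is not the Solr code: the class LockOrderSketch, the watcher counter, and the methods addWatcher, removeWatcher, refresh, watcherCount and runConcurrently are all hypothetical stand-ins. The only point is that both code paths take the per-collection monitor first and touch the ConcurrentHashMap second, so neither can hold the map's key lock while waiting for the monitor.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the reader class; names are assumptions, not Solr's API.
public class LockOrderSketch {
    private final ConcurrentHashMap<String, Object> collectionLocks = new ConcurrentHashMap<>();
    // Assumed shape: collection name -> number of registered watchers.
    private final ConcurrentHashMap<String, Integer> observers = new ConcurrentHashMap<>();

    private Object getCollectionLock(String collection) {
        return collectionLocks.computeIfAbsent(collection, k -> new Object());
    }

    public void addWatcher(String collection) {
        synchronized (getCollectionLock(collection)) {
            observers.merge(collection, 1, Integer::sum);
        }
    }

    // The synchronized block wraps compute() instead of sitting inside its lambda,
    // so the order is always: collection monitor, then the map's key lock.
    public void removeWatcher(String collection) {
        synchronized (getCollectionLock(collection)) {
            observers.compute(collection, (k, count) ->
                (count == null || count <= 1) ? null : count - 1);
        }
    }

    // Stand-in for refreshAndWatch(): same order, monitor first, then the map.
    public void refresh(String collection) {
        synchronized (getCollectionLock(collection)) {
            observers.remove(collection);
        }
    }

    public int watcherCount(String collection) {
        return observers.getOrDefault(collection, 0);
    }

    // Runs both paths against the same key from two threads and reports
    // whether they finished within the timeout.
    public static boolean runConcurrently(int iterations) {
        LockOrderSketch sketch = new LockOrderSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.addWatcher("c1");
                sketch.removeWatcher("c1");
            }
        });
        pool.execute(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.refresh("c1");
            }
        });
        pool.shutdown();
        try {
            return pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runConcurrently(100_000));
    }
}
```

With the original nesting (synchronized inside the compute() lambda), the two threads in runConcurrently would be able to acquire the two locks in opposite orders, which is the condition described above. A consistent acquisition order removes that particular cycle, though it does not address the broader complexity that the description mentions.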
