[
https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705052#comment-14705052
]
Yonik Seeley commented on SOLR-7836:
------------------------------------
bq. pulls out the problematic open searcher in ulog.add to a separate method.
There are a few areas with complex synchronization that should not be changed
unless one is confident about understanding why all the synchronization was
there in the first place. Having the tests pass isn't a high enough bar for
these areas because of the difficulty in actually getting a test to expose
subtle race conditions or thread safety issues. This comes back to my original
"get it back in my head" - I don't feel comfortable messing with this stuff
either until I've really internalized the bigger picture again... and it
doesn't last ;-)
For the specific case above, one can't just take what was one synchronized
block and break it up into two. Doing so certainly creates race conditions and
breaks the invariants we try to keep. The specific invariant here is that if
a document is not in the tlog maps, then it is guaranteed to be in the realtime reader.
Hopefully some of our tests would fail with this latest patch... but it's hard
stuff to test.
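To make the invariant concrete, here is a minimal sketch. It is not Solr's
UpdateLog: the class TinyUpdateLog and all of its fields and methods are
hypothetical names invented for illustration. It assumes a lookup checks the
tlog map first and falls back to the realtime reader, and it shows how
splitting "reopen the reader" and "clear the map" into two separately
synchronized steps opens a window in which a document is in neither place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not Solr's UpdateLog.
class TinyUpdateLog {
    private final Map<String, String> index = new HashMap<>();    // everything written so far
    private Map<String, String> realtimeReader = new HashMap<>(); // snapshot visible to lookups
    private final Map<String, String> tlogMap = new HashMap<>();  // updates newer than the snapshot

    synchronized void add(String id, String value) {
        index.put(id, value);
        tlogMap.put(id, value);
    }

    // Safe: the snapshot is refreshed and the map is cleared while holding
    // the lock once, so no lookup can run between the two steps.
    synchronized void reopenAndClear() {
        realtimeReader = new HashMap<>(index);
        tlogMap.clear();
    }

    // Unsafe split: each method is synchronized on its own, but the lock is
    // released between them, so a lookup can run in the gap.
    synchronized void clearMapOnly() {
        tlogMap.clear();
    }

    synchronized void reopenOnly() {
        realtimeReader = new HashMap<>(index);
    }

    // Invariant relied on here: if the id is not in tlogMap,
    // it must already be visible in realtimeReader.
    synchronized String lookup(String id) {
        String value = tlogMap.get(id);
        return value != null ? value : realtimeReader.get(id);
    }
}
```

Calling clearMapOnly(), then lookup(), then reopenOnly() reproduces the gap
deterministically on a single thread; with real threads the same interleaving
happens only occasionally, which is why tests rarely catch it.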
I worked up a patch that passed down the IndexWriter (it needs to be passed
*all* the way down to SolrCore.openSearcher to actually avoid deadlocks). That
ended up changing more code than I'd like... so now I'm working up a patch to
make IW locking re-entrant. That approach should be less fragile going forward
(i.e. less likely to easily introduce a deadlock through seemingly unrelated
changes).
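As a rough illustration of why re-entrant locking is less fragile, the sketch
below contrasts the two behaviours using only java.util.concurrent. It is not
the patch and not Solr's locking code: WriterLockSketch and its methods are
hypothetical, and a one-permit Semaphore stands in for a non-re-entrant lock.
The point is that a nested call needing a lock the thread already holds
succeeds with a re-entrant lock, but would block forever on a non-re-entrant
one (tryAcquire is used here so the failure is visible instead of hanging).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; names are illustrative and not taken from Solr.
class WriterLockSketch {
    private final ReentrantLock reentrant = new ReentrantLock();
    private final Semaphore nonReentrant = new Semaphore(1);

    // Outer code holds the lock, then calls deeper code that needs it again.
    boolean nestedWithReentrantLock() {
        reentrant.lock();
        try {
            return innerNeedsReentrantLock();
        } finally {
            reentrant.unlock();
        }
    }

    private boolean innerNeedsReentrantLock() {
        // The owning thread may re-acquire; the hold count goes from 1 to 2.
        if (!reentrant.tryLock()) {
            return false;
        }
        try {
            return true;
        } finally {
            reentrant.unlock();
        }
    }

    boolean nestedWithNonReentrantLock() {
        nonReentrant.acquireUninterruptibly();
        try {
            // The single permit is already taken by this thread. A blocking
            // acquire() here would never return, i.e. a self-deadlock.
            if (!nonReentrant.tryAcquire()) {
                return false;
            }
            nonReentrant.release();
            return true;
        } finally {
            nonReentrant.release();
        }
    }
}
```

With a re-entrant lock the nested path does not need the already-held resource
passed down through every intermediate call, which is the fragility the
pass-the-IndexWriter approach ran into.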
> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
> Key: SOLR-7836
> URL: https://issues.apache.org/jira/browse/SOLR-7836
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Fix For: Trunk, 5.4
>
> Attachments: SOLR-7836-reorg.patch, SOLR-7836-synch.patch,
> SOLR-7836.patch, SOLR-7836.patch, SOLR-7836.patch, deadlock_3.res.zip,
> deadlock_5_pass_iw.res.zip, deadlock_test
>
>
> Preliminary patch for what looks like a possible race condition between
> writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.