[ https://issues.apache.org/jira/browse/SOLR-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527522#comment-17527522 ]
Michael Gibney commented on SOLR-16165:
---------------------------------------

Could open this as a separate issue/PR, but I thought it'd be worth checking for other instances of this pattern in the codebase. I attached [^StaticInitializerReferencesSubClass.xml], a report from IntelliJ IDEA that points out other cases (including the SlotAcc case).

I'd guess that the SlotAcc case is particularly likely to manifest as deadlock because the subclass reference is buried near the end of the class. The others seem to be near the _beginning_ of their class, so perhaps there is a narrower window in which they can manifest as deadlock? Or perhaps there's something about the context in which the other cases are called that makes them not vulnerable (or less vulnerable) in practice?

As a proof-of-concept I tried integrating palantir's `baseline-error-prone` gradle plugin, which [adds a check for ClassInitializationDeadlock|https://blog.palantir.com/using-static-analysis-to-prevent-java-class-initialization-deadlocks-c2f31ca967d6]. It caught the DocRouter and TimeSource cases, but not any of the others (including SlotAcc).

> Deadlock in SlotAcc initialization
> ----------------------------------
>
>                 Key: SOLR-16165
>                 URL: https://issues.apache.org/jira/browse/SOLR-16165
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: faceting
>    Affects Versions: 9.0, 8.11.1, 9.1
>            Reporter: Justin Sweeney
>            Assignee: Noble Paul
>            Priority: Minor
>         Attachments: StaticInitializerReferencesSubClass.xml
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The core of the issue is that if a parent class references an instance of its
> own child class as a static field, a deadlock can occur if one thread
> tries to access the parent class while another thread accesses the child class.
>
> h3. Thread A
> "qtp1393828949-98" #98 prio=5 os_prio=0 cpu=294.10ms elapsed=6252.75s
> allocated=53246K defined_classes=243 tid=0x00007fa47c007000 nid=0x349c4e in
> Object.wait() [0x00007f9896620000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.solr.search.facet.SlotAcc.<clinit>(SlotAcc.java:830)
> at org.apache.solr.search.facet.FacetFieldProcessorByHashDV.createCollectAcc(FacetFieldProcessorByHashDV.java:271)
> at org.apache.solr.search.facet.FacetFieldProcessorByHashDV.calcFacets(FacetFieldProcessorByHashDV.java:255)
>
> h3. Thread B
> "qtp1393828949-2379" #2379 prio=5 os_prio=0 cpu=34.52ms elapsed=6013.46s
> allocated=20426K defined_classes=0 tid=0x00007fa49c081800 nid=0x34a58b in
> Object.wait() [0x00007f5fcfae7000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.solr.search.facet.FacetFieldProcessorByArray.createCollectAcc(FacetFieldProcessorByArray.java:85)
> at org.apache.solr.search.facet.FacetFieldProcessorByArray.calcFacets(FacetFieldProcessorByArray.java:144)
> at org.apache.solr.search.facet.FacetFieldProcessorByArray.process(FacetFieldProcessorByArray.java:94)
> ...
>
> # Thread A: FacetFieldProcessorByHashDV.java:271, {{indexOrderAcc = new
> SlotAcc(fcontext) {}}, accesses class {{SlotAcc}}. It takes the class init
> lock on {{SlotAcc}} (assuming this is the first time {{SlotAcc}} is loaded
> in the classloader) but has not yet reached line SlotAcc.java:830.
> # Thread B: FacetFieldProcessorByArray.java:85, {{countAcc = new
> SweepingCountSlotAcc(numSlots, this);}}, accesses {{SweepingCountSlotAcc}}
> (also assuming this is the first time {{SweepingCountSlotAcc}} is loaded in
> the classloader). It loads and initializes along the hierarchy
> {{SweepingCountSlotAcc}} -> {{CountSlotArrAcc}} -> {{CountSlotAcc}} ->
> {{SlotAcc}}: it obtains the init locks for {{SweepingCountSlotAcc}},
> {{CountSlotArrAcc}} and {{CountSlotAcc}}, but blocks on loading/initializing
> the parent class {{SlotAcc}}, since Thread A holds that lock and is already
> initializing it.
> # Thread A: reaches line 830, {{static final CountSlotAcc DEV_NULL_SLOT_ACC =
> new CountSlotAcc(null)...}}. On encountering {{CountSlotAcc}}, it attempts to
> load/initialize {{CountSlotAcc}} as well, but that lock is held by Thread B.
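
For illustration, the three steps above can be reduced to a small Java sketch. It shows the hazardous shape only as a comment (so the example cannot hang) and then one common way to avoid it, an initialization-on-demand holder. This is not necessarily the change made for this issue; `InitDeadlockSketch`, `Base`, `Derived`, `Holder` and `devNull()` are hypothetical names standing in for the SlotAcc hierarchy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class InitDeadlockSketch {

    // Hazardous shape described in the issue (comment only, never compiled):
    //
    //   abstract static class Parent {
    //       static final Parent DEFAULT = new Child(); // Parent.<clinit> needs Child
    //   }
    //   static class Child extends Parent {}           // Child init needs Parent
    //
    // One thread initializing Parent and another initializing Child can each
    // hold one class init lock while waiting for the other.

    // Alternative shape (hypothetical names): Base's own static initializer
    // never touches a subclass; the subclass instance lives in a nested holder
    // that is initialized only on first use.
    abstract static class Base {
        static Base devNull() {
            return Holder.DEV_NULL;
        }

        private static final class Holder {
            static final Base DEV_NULL = new Derived();
        }
    }

    static class Derived extends Base {}

    // Triggers initialization of Base and Derived from two threads at roughly
    // the same time and reports whether both tasks finished within the timeout.
    static boolean initializeConcurrently() {
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { start.await(); return Base.devNull(); });
        pool.submit(() -> { start.await(); return new Derived(); });
        start.countDown();
        pool.shutdown();
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + initializeConcurrently());
    }
}
```

A passing run does not prove the absence of a deadlock in general, since the race window is timing dependent. The point of the holder is structural: initializing `Base` no longer requires the init lock of any subclass, so the cycle described above cannot form.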