On Fri, 24 Sep 2021 14:36:07 GMT, Jaikiran Pai <j...@openjdk.org> wrote:
> Can I please get a review for this change which proposes to fix the issue > reported in https://bugs.openjdk.java.net/browse/JDK-8273790? > > As noted in that issue, trying to class load > `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` > concurrently in separate threads can lead to a deadlock because of the cyclic > nature of the code in their static initialization. More specifically, > consider this case: > > - Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. > - This gives T1 the implicit class init lock on `CalendarSystem`. > - Consider thread T2 which at the same time initiates a classload on > `sun.util.calendar.Gregorian` class. > - This gives T2 a implicit class init lock on `Gregorian`. > - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` > since it wants to create a (singleton) instance of `Gregorian` and assign it > to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class > init lock on `Gregorian`, T1 ends up waiting > - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` > itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, > T2 starts travelling up the class hierarchy and asks for a lock on > `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up > waiting on T1 which is waiting on T2. That triggers this deadlock. > > The linked JBS issue has a thread dump which shows this in action. > > The commit here delays the instance creation of `Gregorian` by moving that > instance creation logic from the static initialization of the > `CalendarSystem` class, to the first call to > `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` > from needing a lock on `Gregorian` during its static init (of course, unless > some code in this static init flow calls > `CalendarSystem#getGregorianCalendar()`, in which case it is back to square > one. I have verified, both manually and through the jtreg test, that the code > in question doesn't have such calls) > > A new jtreg test has been introduced to reproduce the issue and verify the > fix. The test in addition to loading these 2 classes in question, also > additionally loads a few other classes concurrently. These classes have > specific static initialization which leads the calls to > `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. > Including these classes in the tests ensures that this deadlock hasn't > "moved" to a different location. I have run multiple runs (approximately 25) > of this test with the fix and I haven't seen it deadlock anymore. Thanks, Jaikiran. Looks good. Some minor comments. src/java.base/share/classes/sun/util/calendar/CalendarSystem.java line 123: > 121: */ > 122: public static Gregorian getGregorianCalendar() { > 123: var gCal = GREGORIAN_INSTANCE; Do we need the local variable `gCal`? test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 43: > 41: * @run main/othervm CalendarSystemDeadLockTest > 42: * @run main/othervm CalendarSystemDeadLockTest > 43: * @run main/othervm CalendarSystemDeadLockTest Just curious, before the fix, how many test instances caused the deadlock? I'd think these 5 runs are arbitrary numbers, Just wanted to have those 5 runs are appropriate. test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 75: > 73: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); > 74: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); > 75: final ExecutorService executor = > Executors.newFixedThreadPool(tasks.size()); Asserting `tasks.size() == numTasks` may help here. test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 171: > 169: } > 170: } > 171: } Need a new line at the EOF. ------------- PR: https://git.openjdk.java.net/jdk/pull/5683