Chuck,
On 24. 10. 2016, at 18:23, Chuck Hill <[email protected]> wrote:
> Running “sudo jstack –F <process id>” should dump a trace of all threads.
> Should…
Luckily, it did; just it is sort of weird that it identifies threads by name in
the deadlock report, but by some unknown ID (perhaps hash of the Thread
instance, looks like that) in the traces, which makes for sort of difficult
parsing. I guess I'm not the first one to swear :)
> ... As for the hang, the code in the logging does seem like a likely culprit.
Actually it looks like the culprit is log4j itself?!? What the heck! Anyroad,
here's the deadlock report:
===
"model.ReportPDFTask@5f6ce9a5":
waiting to lock Monitor@0x000000012c0008a8 (Object@0x00000003f7efb980, a
org/apache/log4j/spi/RootLogger),
which is held by "WorkerThread6"
"WorkerThread6":
waiting for ownable synchronizer 0x00000003f9538960, (a
java/util/concurrent/locks/ReentrantLock$NonfairSync),
which is held by "model.ReportPDFTask@5f6ce9a5"
===
the WorkerThread6 problem happens in the “log.info("$tch")” part, it looks like
this (and although changesFromCommittedSnapshot happens to be part of the
culprit, I believe the problem is actually caused by the logger itself):
instead of rendering its argument immediately, as soon as it is determined that
we should log, it does that self-evidently under a lock:
=== Worker6 (cleaned up considerably) ===
- er.extensions.eof.ERXEC.lockObjectStore() // the lock W6 [2] hangs on,
caused by...
- er.extensions.eof.ERXGenericRecord.changesFromCommittedSnapshot() // ...
changesFromCommittedSnapshot ...
- cz.ocs.model.OCSEnterpriseObject.toString() // ... which yours truly dumbly
uses in toString, alas!
- org.apache.log4j.or.DefaultRenderer.doRender(java.lang.Object) // ... this
thing happens INSIDE of log4j code
... lot of frames here; self-evidently, some of them acquires and holds
Monitor@0x000000012c0008a8 [1] ...
- org.apache.log4j.Category.info(java.lang.Object) // this is the
log.info("$tch") of mine
... the loop and the other standard stuff up to dispatchRequest and
WOWorkerThread.run() ...
===
The "model.ReportPDFTask" actually does not do anything wrong (far as I can
say) -- it just fetches. A fetch presumably (and understandably) would lock the
object store. With locked object store, it... logs! (I do not think the stack
trace is important here; anyway, I have attached it at the end of this message
for reference.)
Nothing wrong there I believe, it feels right to be able to log with a locked
object store; but it causes deadlock, for the log tries to acquire its
Monitor@0x000000012c0008a8 -- which is held by Worker6.
Now, although in my case the culprit happens to be unneeded (and generally
dangerous) changesFromCommittedSnapshot, I believe that
(a) whatever which locks object store in the log would cause deadlock just as
well
(b) which can be e.g., a fault fired, or lots of other perfectly valid things
to log.
I do understand why log4j methods accept generic Object and render the result
inside -- it is reasonable not to render anything unless we want to log with
the current setting.
What seems to me to be a grave bug though is that this rendering happens under
a lock, which causes a deadlock if two threads log concurrently and it so
happens that
(i) one of them logs under object store lock (which I believe is valid and
correct)
(ii) the other's log contents causes an object store lock (which again I
believe is valid and correct, see (a) and (b) above).
So, well, is there indeed a grave bug inside of the ubiquitous log4j, or am I
overlooking something of importance?
And if there is a bug, is there anything better one can do to work around it
than using toStrings for all non-trivial logs, like
“log.info("non-trivial-contents".toString())”, which would affect efficiency
pretty bad?
Thanks and all the best,
OC
=== "model.ReportPDFTask@5f6ce9a5" slightly cleaned up ===
- org.apache.log4j.Category.callAppenders(org.apache.log4j.spi.LoggingEvent)
@bci=12, line=204 (Compiled frame) // hangs on Monitor@0x000000012c0008a8,
which is held by [1] above
- org.apache.log4j.Category.forcedLog(java.lang.String,
org.apache.log4j.Priority, java.lang.Object, java.lang.Throwable) @bci=14,
line=391 (Compiled frame)
- org.apache.log4j.Category.info(java.lang.Object) @bci=38, line=666 (Compiled
frame)
-
er.extensions.eof.ERXEOAccessUtilities.logExpression(com.webobjects.eoaccess.EOAdaptorChannel,
com.webobjects.eoaccess.EOSQLExpression, long) @bci=406, line=1307 (Compiled
frame)
-
er.extensions.eof.ERXAdaptorChannelDelegate.adaptorChannelDidEvaluateExpression(com.webobjects.eoaccess.EOAdaptorChannel,
com.webobjects.eoaccess.EOSQLExpression) @bci=76, line=81 (Compiled frame)
- com.webobjects.foundation._NSDelegate.perform(java.lang.String,
java.lang.Object, java.lang.Object) @bci=14, line=163 (Compiled frame)
-
er.extensions.jdbc.ERXJDBCAdaptor$Channel.evaluateExpression(com.webobjects.eoaccess.EOSQLExpression)
@bci=2, line=286 (Compiled frame)
-
com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
com.webobjects.eocontrol.EOEditingContext) @bci=97, line=488 (Compiled frame)
// somewhere in this code probably object store is locked [2], which sort of
makes sense
-
com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
com.webobjects.eocontrol.EOEditingContext) @bci=79, line=4069 (Interpreted
frame)
-
er.extensions.eof.ERXEC.objectsWithFetchSpecification(com.webobjects.eocontrol.EOFetchSpecification,
com.webobjects.eocontrol.EOEditingContext) @bci=10, line=1308 (Interpreted
frame)
- model._DBReport.fetchDBReports(com.webobjects.eocontrol.EOEditingContext,
com.webobjects.eocontrol.EOQualifier, com.webobjects.foundation.NSArray)
@bci=103, line=141 (Interpreted frame) // my code which fetches
... the rest, I believe, is irrelevant ...
===
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list ([email protected])
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com
This email sent to [email protected]