[ https://issues.apache.org/jira/browse/FLINK-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391298#comment-16391298 ]
ASF GitHub Bot commented on FLINK-8856:
---------------------------------------

Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5658#discussion_r173170798

    --- Diff: flink-core/src/main/java/org/apache/flink/util/ExceptionUtils.java ---
    @@ -81,12 +81,15 @@ public static String stringifyException(final Throwable e) {
     	 * <p>Currently considered fatal exceptions are Virtual Machine errors indicating
     	 * that the JVM is corrupted, like {@link InternalError}, {@link UnknownError},
     	 * and {@link java.util.zip.ZipError} (a special case of InternalError).
    +	 * The {@link ThreadDeath} exception is also treated as a fatal error, because when
    +	 * a thread is forcefully stopped, there is a high chance that parts of the system
    +	 * is in an inconsistent state.
    --- End diff --

    nit/typo: "are in an inconsistent state"?

> Move all interrupt() calls to TaskCanceler
> ------------------------------------------
>
>                 Key: FLINK-8856
>                 URL: https://issues.apache.org/jira/browse/FLINK-8856
>             Project: Flink
>          Issue Type: Bug
>          Components: TaskManager
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Blocker
>             Fix For: 1.5.0, 1.6.0
>
>
> We need this to work around the following JVM bug:
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8138622
> To circumvent this problem, the {{TaskCancelerWatchDog}} must not call
> {{interrupt()}} at all, but only join on the executing thread (with timeout)
> and cause a hard exit once cancellation takes too long.
> A user affected by this problem reported this in FLINK-8834.
> Personal note: The Thread.join(...) method unfortunately is not 100% reliable
> either, because it uses {{System.currentTimeMillis()}} rather than
> {{System.nanoTime()}}. Because of that, sleeps can take overly long when the
> clock is adjusted. I wonder why the JDK authors do not follow their own
> recommendations and use {{System.nanoTime()}} for all relative time
> measures...
> EDIT: I am not the only one wondering why:
> https://stackoverflow.com/questions/42544387/why-does-thread-join-use-currenttimemillis
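
For illustration only, the approach described in the issue (join on the executing thread with a timeout, never call interrupt(), trigger a hard exit when the timeout expires) might be sketched roughly as below. This is not the code from the pull request: the class name CancellationWatchdog, the method joinUntilDeadline, and the onTimeout callback are hypothetical names chosen for this sketch. The deadline is tracked with System.nanoTime(), but each individual Thread.join(millis) call still relies on whatever clock the JDK uses internally, so the sketch only narrows the problem from the personal note and does not remove it.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a join-only watchdog; not Flink's actual TaskCancelerWatchDog. */
final class CancellationWatchdog implements Runnable {

    private final Thread executer;     // thread running the task that is being cancelled
    private final long timeoutNanos;   // how long to wait before giving up
    private final Runnable onTimeout;  // hard-exit action, e.g. () -> Runtime.getRuntime().halt(-1)

    CancellationWatchdog(Thread executer, long timeout, TimeUnit unit, Runnable onTimeout) {
        this.executer = executer;
        this.timeoutNanos = unit.toNanos(timeout);
        this.onTimeout = onTimeout;
    }

    /**
     * Waits for the thread to terminate without ever interrupting it.
     * The overall deadline is measured with System.nanoTime().
     * Returns true if the thread is no longer alive when the method returns.
     */
    static boolean joinUntilDeadline(Thread thread, long timeoutNanos) throws InterruptedException {
        final long deadline = System.nanoTime() + timeoutNanos;
        long remaining = timeoutNanos;
        while (thread.isAlive() && remaining > 0) {
            // join(0) would wait forever, so always wait at least 1 ms
            long millis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remaining));
            // join in short slices so the nanoTime deadline is re-checked regularly
            thread.join(Math.min(millis, 100L));
            remaining = deadline - System.nanoTime();
        }
        return !thread.isAlive();
    }

    @Override
    public void run() {
        try {
            if (!joinUntilDeadline(executer, timeoutNanos)) {
                // cancellation took too long: escalate instead of interrupting
                onTimeout.run();
            }
        } catch (InterruptedException e) {
            // the watchdog itself was interrupted; restore the flag and stop watching
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch the watchdog would be started on its own thread next to the cancellation request, with onTimeout set to whatever fatal-error handling the process uses. Passing the action as a Runnable is only a convenience so the timeout path can be exercised without terminating the JVM.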