[ https://issues.apache.org/jira/browse/FLINK-33121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Panagiotis Garefalakis closed FLINK-33121.
------------------------------------------
    Release Note: Closing in favor of https://issues.apache.org/jira/browse/FLINK-34922
      Resolution: Won't Fix

> Failed precondition in JobExceptionsHandler due to concurrent global failures
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-33121
>                 URL: https://issues.apache.org/jira/browse/FLINK-33121
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>
> We make the assumption that Global Failures (with null Task name) may only be
> RootExceptions, while Local/Task exceptions may be part of the concurrent
> exceptions list (see {{JobExceptionsHandler#createRootExceptionInfo}}).
> However, when the Adaptive scheduler is in a Restarting phase due to an
> existing failure (that is now the new Root), we can still, on rare occasions,
> capture new Global failures, violating this condition (with an assertion being
> thrown as part of {{assertLocalExceptionInfo}}), seeing something like:
> {code:java}
> The taskName must not be null for a non-global failure.
> {code}
> We want to ignore Global failures while in a Restarting phase on the
> Adaptive scheduler until we properly support multiple Global failures in the
> Exception History as part of https://issues.apache.org/jira/browse/FLINK-34922
> Note: DefaultScheduler does not suffer from this issue as it treats failures
> directly as HistoryEntries (no conversion step).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)