Guoliang Sun created KYLIN-6059:
-----------------------------------

             Summary: When the model is in a "broken" state, the build task status becomes abnormal
                 Key: KYLIN-6059
                 URL: https://issues.apache.org/jira/browse/KYLIN-6059
             Project: Kylin
          Issue Type: Bug
    Affects Versions: 5.0.0
            Reporter: Guoliang Sun
            Assignee: Guoliang Sun
             Fix For: 5.0.2

During the execution of the build task, the fact table of the model was deleted, causing the model to enter a "broken" state. The build task should have transitioned to the "discard" state via `suicideJob`, but an exception occurred instead. The `kylin.log` is as follows:

{code:java}
2024-12-12T14:26:42,651 WARN  [JobCheckThreadPool] runners.JobCheckUtil : [UNEXPECTED_THINGS_HAPPENED] job e6a33368-6f6f-eac4-8295-62d04d30e443-ce504fd8-30e7-67b9-9670-dc67d59e8ecd should be suicidal but discard failed
org.apache.kylin.common.persistence.metadata.PersistException: persist messages failed
    at org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:149) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:122) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:84) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTxAndRetry(JdbcUtil.java:64) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:292) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.util.JobContextUtil.withTxAndRetry(JobContextUtil.java:287) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1706) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:99) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.runners.JobCheckUtil.markSuicideJob(JobCheckUtil.java:91) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.runners.JobCheckRunner.markSuicideForErrorOrPausedJobs(JobCheckRunner.java:134) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.runners.JobCheckRunner.run(JobCheckRunner.java:116) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.lang.NullPointerException: current thread is not accompanied by a UnitOfWork
    at org.apache.kylin.guava30.shaded.common.base.Preconditions.checkNotNull(Preconditions.java:897) ~[kylin-external-guava30-5.0.0.jar:?]
    at org.apache.kylin.common.persistence.transaction.UnitOfWork.get(UnitOfWork.java:227) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.transaction.UnitOfWork.isReadonly(UnitOfWork.java:369) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.InMemResourceStore.checkEnv(InMemResourceStore.java:231) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.InMemResourceStore.deleteResourceImpl(InMemResourceStore.java:178) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.ResourceStore.deleteResource(ResourceStore.java:346) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:306) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cachesync.CachedCrudAssist.delete(CachedCrudAssist.java:297) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.Manager.delete(Manager.java:195) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cube.model.NDataflowManager.lambda$updateDataflowWithoutIndex$21(NDataflowManager.java:686) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:600) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflowWithoutIndex(NDataflowManager.java:661) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.metadata.cube.model.NDataflowManager.updateDataflow(NDataflowManager.java:648) ~[kylin-core-metadata-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.engine.spark.job.NSparkCubingJob.cancelJob(NSparkCubingJob.java:274) ~[kylin-engine-spark-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.execution.ExecutableManager.suicideJob(ExecutableManager.java:1741) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.job.execution.ExecutableManager.lambda$suicideJob$88(ExecutableManager.java:1708) ~[kylin-core-job-5.0.0-SNAPSHOT.jar:?]
    at org.apache.kylin.common.persistence.metadata.jdbc.JdbcUtil.withTransaction(JdbcUtil.java:133) ~[kylin-core-common-5.0.0-SNAPSHOT.jar:?]
    ... 17 more
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
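The `Caused by` section of the trace suggests that the metadata delete triggered from `cancelJob` ran on a `JobCheckThreadPool` thread that had no `UnitOfWork` bound to it. The following is a minimal, self-contained sketch of that general failure pattern, assuming the check is backed by a per-thread context. It is not Kylin code: `UnitOfWorkSketch`, `doInUnitOfWork`, `delete`, and `put` are hypothetical names, and only the exception message is taken from the log.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these types or methods are Kylin classes.
final class UnitOfWorkSketch {

    // Per-thread marker that a transactional scope is currently active.
    private static final ThreadLocal<Object> CURRENT = new ThreadLocal<>();

    // Toy in-memory store whose writes are guarded by the thread-local check.
    private static final Map<String, String> STORE = new ConcurrentHashMap<>();

    // Runs the action with a unit of work bound to the calling thread.
    static <T> T doInUnitOfWork(Supplier<T> action) {
        CURRENT.set(new Object());
        try {
            return action.get();
        } finally {
            CURRENT.remove(); // always unbind, even if the action throws
        }
    }

    static void put(String key, String value) {
        STORE.put(key, value);
    }

    // Throws NullPointerException when the calling thread has no unit of work.
    static boolean delete(String key) {
        Objects.requireNonNull(CURRENT.get(),
                "current thread is not accompanied by a UnitOfWork");
        return STORE.remove(key) != null;
    }

    public static void main(String[] args) {
        put("dataflow/segment-1", "building");
        try {
            // Plain call, as a scheduler thread would make without a wrapper.
            delete("dataflow/segment-1");
        } catch (NullPointerException e) {
            System.out.println("unwrapped delete failed: " + e.getMessage());
        }
        // Same call, now inside a bound unit of work.
        boolean removed = doInUnitOfWork(() -> delete("dataflow/segment-1"));
        System.out.println("wrapped delete removed entry: " + removed);
    }
}
```

In this sketch the same `delete` call fails when invoked directly and succeeds once wrapped. Whether the actual fix for this issue takes that form is not stated in the ticket.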