Zhiting Guo created KYLIN-5646:
----------------------------------

             Summary: The build job reports an error at the step of detecting 
time partition columns in the Yarn Cluster mode
                 Key: KYLIN-5646
                 URL: https://issues.apache.org/jira/browse/KYLIN-5646
             Project: Kylin
          Issue Type: Bug
          Components: Tools, Build and Test
    Affects Versions: 5.0-alpha
            Reporter: Zhiting Guo
             Fix For: 5.0-alpha


When a build job runs in Spark YARN-Cluster mode, the step that detects the 
incremental time partition column fails while initializing KylinConfig, with 
the error "Didn't find KYLIN_CONF or KYLIN_HOME".

*Reproduce method*

Incrementally build a model on a partitioned table in Spark YARN-Cluster mode 
with kylin.engine.check-partition-col-enabled=true (true is the default 
value).

*Root Cause*

[KYLIN-5571] modified autoSetShufflePartitions for pushdown queries. Before 
that change, the build task did not need to run autoSetShufflePartitions when 
detecting the incremental time column format (only the pushdown query itself 
was executed).

After the change, autoSetShufflePartitions is executed asynchronously, and the 
following two methods obtain KylinConfig through 
KylinConfig.getInstanceFromEnv:
 * ResourceDetectUtils.getResourceSizeWithTimeoutByConcurrency
 * ResourceDetectUtils.getResourceSizBySerial

The new thread that runs the asynchronous work cannot use the KylinConfig that 
was already built, so it tries to initialize a new one.

However, the build task JVM and the KE main process do not run on the same 
machine, so KYLIN_CONF and KYLIN_HOME cannot be obtained there and the build 
task fails.

*Fix Design*

Any logic that runs in a newly started thread and needs KylinConfig must not 
call KylinConfig.getInstanceFromEnv(). Instead, the outer (calling) thread 
obtains the KylinConfig once and passes it to wherever it is needed.
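
The following is a minimal sketch of the failure and of the fix design, not 
the actual Kylin patch. SketchConfig is a hypothetical stand-in for 
KylinConfig, and it assumes that the config prepared by the build job is 
visible only to the thread that set it (modeled here with a ThreadLocal). The 
method names detectWithEnvLookup and detectWithPassedConfig and the metadata 
URL value are also invented for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConfigPassingSketch {

    /** Hypothetical stand-in for KylinConfig; NOT the real Kylin class. */
    static final class SketchConfig {
        // Assumption: the built config is only visible to the thread that set it.
        private static final ThreadLocal<SketchConfig> THREAD_LOCAL = new ThreadLocal<>();

        final String metadataUrl;

        SketchConfig(String metadataUrl) {
            this.metadataUrl = metadataUrl;
        }

        static void setThreadLocal(SketchConfig config) {
            THREAD_LOCAL.set(config);
        }

        /** Mimics a lookup that falls back to the environment and fails on the build node. */
        static SketchConfig getInstanceFromEnv() {
            SketchConfig config = THREAD_LOCAL.get();
            if (config != null) {
                return config;
            }
            // On the build node neither variable is set, so initialization fails.
            throw new IllegalStateException("Didn't find KYLIN_CONF or KYLIN_HOME");
        }
    }

    /** Problem: the worker thread looks the config up by itself. */
    static String detectWithEnvLookup(ExecutorService pool) throws InterruptedException {
        Future<String> result = pool.submit(() -> SketchConfig.getInstanceFromEnv().metadataUrl);
        try {
            return result.get();
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        }
    }

    /** Fix: the caller resolves the config and hands it to the worker thread. */
    static String detectWithPassedConfig(ExecutorService pool, SketchConfig config)
            throws InterruptedException, ExecutionException {
        Future<String> result = pool.submit(() -> config.metadataUrl);
        return result.get();
    }

    public static void main(String[] args) throws Exception {
        // The "build job" thread prepares its config.
        SketchConfig.setThreadLocal(new SketchConfig("hypothetical_metadata_url"));
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The worker thread cannot see the caller's config, so the lookup fails.
            System.out.println(detectWithEnvLookup(pool));
            // The caller obtains the config on its own thread and passes it in.
            SketchConfig config = SketchConfig.getInstanceFromEnv();
            System.out.println(detectWithPassedConfig(pool, config));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the first call reports the missing-environment error because 
the pool thread has no config of its own, while the second call succeeds 
because the config travels with the task instead of being looked up again.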



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
