Zhiting Guo created KYLIN-5636:
----------------------------------

             Summary: Automatically clean up dependency files after the build task
                 Key: KYLIN-5636
                 URL: https://issues.apache.org/jira/browse/KYLIN-5636
             Project: Kylin
          Issue Type: Improvement
          Components: Tools, Build and Test
    Affects Versions: 5.0-alpha
            Reporter: Zhiting Guo
             Fix For: 5.0-alpha


*Problem:*
Files uploaded under spark.kubernetes.file.upload.path are not automatically 
deleted.
1: When Spark creates a driver pod, it uploads dependencies to the configured 
path. The build task runs in cluster mode and therefore needs to create a 
driver pod, so running build tasks repeatedly accumulates a large number of 
files under this path.
2: The upload.path we currently configure (s3a://kylin/spark-on-k8s) is a 
fixed path; Spark creates a spark-upload-uuid subdirectory under it and 
stores the dependencies there.
*Dev design:*
Core idea: add a dynamic subdirectory under the original upload.path and 
delete the entire subdirectory when the task finishes.
Build task: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/uuid)
Delete the dependency directory when the build task is finished.
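A minimal sketch of the per-job path derivation (UploadPathUtil and its method name are hypothetical, not existing Kylin code): the configured base upload.path is combined with the build jobId so the whole subdirectory can be deleted in one call when the task finishes.

```java
// Hypothetical helper: derive a per-job upload path so each build task's
// dependencies land in their own subdirectory under upload.path.
public class UploadPathUtil {

    // basePath is the configured spark.kubernetes.file.upload.path,
    // e.g. "s3a://kylin/spark-on-k8s"; jobId identifies one build task.
    public static String perJobUploadPath(String basePath, String jobId) {
        // Normalize a trailing slash so the result has exactly one separator.
        String trimmed = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return trimmed + "/" + jobId;
    }
}
```

When the build task completes, the entire directory returned here would be deleted recursively (e.g. via the Hadoop FileSystem API), rather than tracking individual uploaded files.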
 
The automatic-deletion routine only runs on normal completion; if the process 
is killed with kill -9, it is never invoked. A fallback garbage-collection 
policy is therefore needed, e.g. automatically deleting upload directories 
older than three months.
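The fallback policy could be sketched as a simple age check that a periodic sweeper applies to each leftover subdirectory (StaleUploadCleaner and the 90-day constant are hypothetical; the source only says "greater than three months").

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sweeper predicate for orphaned upload directories left
// behind when a task was killed before its cleanup could run.
public class StaleUploadCleaner {

    // Assumed threshold approximating the proposal's "three months".
    static final Duration MAX_AGE = Duration.ofDays(90);

    // True when the directory's last-modified time is old enough that
    // the periodic garbage-collection job should delete it.
    public static boolean isStale(Instant lastModified, Instant now) {
        return Duration.between(lastModified, now).compareTo(MAX_AGE) > 0;
    }
}
```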



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
