[ 
https://issues.apache.org/jira/browse/FLINK-35833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888190#comment-17888190
 ] 

Mate Czagany edited comment on FLINK-35833 at 10/10/24 8:03 AM:
----------------------------------------------------------------

I looked into this yesterday, and I think we could move the call to 
ArtifactUtils.createMissingParents into HttpArtifactFetcher and 
FsArtifactFetcher, as advised here. Alternatively, we could leave the call 
where it is and, instead of crashing the program on an IOException while 
creating that directory, just log the error.
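
To make the two options concrete, here is a minimal sketch. It is not Flink's 
actual code: the Fetcher interface, LocalFetcher, RemoteFetcher and 
createDirOrWarn are hypothetical stand-ins, and the download itself is 
replaced by a placeholder write. Option 1 has only the fetchers that write to 
disk create the directory; option 2 keeps a central call but logs instead of 
throwing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class FetcherDirSketch {

    /** Hypothetical stand-in for a fetcher; not Flink's real interface. */
    interface Fetcher {
        Path fetch(String uri, Path targetDir) throws IOException;
    }

    /** Option 1a: a local artifact is already on disk, so no directory is created. */
    static final class LocalFetcher implements Fetcher {
        @Override
        public Path fetch(String uri, Path targetDir) {
            // Strip the scheme and return the existing path; targetDir is never touched.
            return Path.of(uri.replaceFirst("^local://", ""));
        }
    }

    /** Option 1b: a fetcher that writes files creates the directory itself. */
    static final class RemoteFetcher implements Fetcher {
        @Override
        public Path fetch(String uri, Path targetDir) throws IOException {
            Files.createDirectories(targetDir); // fails here only if a download is really needed
            String fileName = uri.substring(uri.lastIndexOf('/') + 1);
            Path target = targetDir.resolve(fileName);
            Files.writeString(target, "placeholder"); // the real download is omitted in this sketch
            return target;
        }
    }

    /** Option 2: keep a central directory creation, but log the failure instead of crashing. */
    static boolean createDirOrWarn(Path dir) {
        try {
            Files.createDirectories(dir);
            return true;
        } catch (IOException e) {
            System.err.println("Could not create artifact dir " + dir + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path baseDir = Files.createTempDirectory("artifacts").resolve("ns").resolve("job");
        System.out.println(new LocalFetcher().fetch("local:///opt/flink/usrlib/app.jar", baseDir));
        System.out.println(new RemoteFetcher().fetch("https://example.com/app.jar", baseDir));
    }
}
```

With option 1 a local jar on a read-only filesystem never triggers a directory 
creation; with option 2 the failure is still visible in the logs, but a later 
remote fetch would fail at write time instead.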

WDYT [~fcsaky] ?


was (Author: JIRAUSER298834):
I looked into this yesterday, and I think we could move the call to 
ArtifactUtils.createMissingParents into HttpArtifactFetcher and 
FsArtifactFetcher, as advised in the other ticket. Alternatively, we could 
leave the call where it is and, instead of crashing the program on an 
IOException while creating that directory, just log the error.

WDYT [~fcsaky] ?

> ArtifactFetchManager always creates artifact dir
> ------------------------------------------------
>
>                 Key: FLINK-35833
>                 URL: https://issues.apache.org/jira/browse/FLINK-35833
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.19.0, 1.19.1
>            Reporter: Dylan Meissner
>            Priority: Critical
>
> FLINK-28915 added support for remote job jar fetching (HTTPS, S3, etc.) but 
> broke the default behavior of a local jar when running an application on 
> non-writable filesystems. ArtifactFetchManager always attempts to create an 
> artifact directory, even when the jar uses the "local" protocol.
> Running an application on a non-writable filesystem is a common scenario in 
> environments where the jar is published with the Docker container image.
> A local jar has no need to be fetched to an intermediate directory, since 
> it's already available on the local filesystem. The LocalArtifactFetcher does 
> not write to the filesystem. However, the ArtifactFetchManager always 
> attempts to create a directory before fetching, regardless of which fetcher 
> would do the work. On non-writable filesystems and in environments lacking 
> permissions, the outcome is a runtime exception:
> {{java.lang.RuntimeException: org.apache.flink.util.FlinkRuntimeException: 
> Failed}}
> {{to create parent(s) for given base dir:}}
> {{/opt/flink/artifacts/<namespace>/<job name>}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:158)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgramRetriever(KubernetesApplicationClusterEntrypoint.java:129)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgram(KubernetesApplicationClusterEntrypoint.java:111)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.lambda$main$0(KubernetesApplicationClusterEntrypoint.java:85)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:85)
>  [flink-dist-1.19.1.jar:1.19.1]}}
> {{Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to create 
> parent(s) for given base dir: 
> /opt/flink/artifacts/app07772/sample-app-flink-1-19}}
> {{    at 
> org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:50)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    ... 5 more}}
> {{Caused by: java.io.IOException: Cannot create directory 
> '/opt/flink/artifacts/<namespace>'.}}
> {{    at org.apache.commons.io.FileUtils.mkdirs(FileUtils.java:2289) 
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at org.apache.commons.io.FileUtils.forceMkdir(FileUtils.java:1376) 
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.commons.io.FileUtils.forceMkdirParent(FileUtils.java:1394) 
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:46)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    at 
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156)
>  ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{    ... 5 more}}
> A workaround is to always configure a location where the process is allowed 
> to create directories, e.g., user.artifacts.base-dir: /tmp/foo.
> A proposed solution is to let each fetcher decide whether to create the 
> intermediate directory or fail.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
