[ https://issues.apache.org/jira/browse/FLINK-33992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823159#comment-17823159 ]
Ahmed Soliman commented on FLINK-33992:
---------------------------------------

Hello [~skala],

I have some thoughts here; please correct me if I am wrong.

In Kubernetes, an {{[initContainer|https://kubernetes.io/docs/concepts/workloads/pods/init-containers/]}} is a special kind of container that runs to completion before the main container in a Pod starts. It is often used for setup tasks that must finish before the main container can start.

If you're using an {{initContainer}} to download the JAR file, you need to make sure that the main container can access the downloaded file. This is where Kubernetes [volumes|https://kubernetes.io/docs/concepts/storage/volumes/] come in. A Kubernetes volume is essentially a directory that is accessible to the containers in a Pod that mount it. Data in a volume is preserved across container restarts, and it can be shared between multiple containers in a Pod.

That being said, you might use a volume to share the JAR file between the {{initContainer}} and the main container:

# Define a volume in your Pod spec. This could be an {{emptyDir}} volume, which is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node.
# In the {{initContainer}} spec, specify a volume mount that points to the volume you defined. Download the JAR file to a path in this volume.
# In the main container spec, specify a volume mount that points to the same volume. The main container will now be able to access the JAR file downloaded by the {{initContainer}}.

This way, the {{initContainer}} can download the JAR file and store it in a location that the main container can access, allowing the main container to use the JAR file when it starts.

cc: [~gyfora]

Do you think the explanation makes sense? If so, consider a case where a session cluster has tens of session jobs with different job jars (if this is a valid use case).
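
As an aside, the three volume-sharing steps above could look roughly like the following fragment of the session cluster's FlinkDeployment. This is only a sketch under assumptions: the volume name {{job-jars}}, the mount paths, the {{curlimages/curl}} image, the repository URL, and the Secret {{repo-credentials}} with its keys are all placeholders I made up for illustration, and required FlinkDeployment fields (image, flinkVersion, resources, and so on) are left out.

```yaml
# Hypothetical fragment, not a complete FlinkDeployment.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: session-cluster
spec:
  podTemplate:
    spec:
      # Step 1: define a volume shared by the containers in the Pod.
      volumes:
        - name: job-jars            # placeholder name
          emptyDir: {}
      # Step 2: the initContainer mounts the volume and downloads the JAR into it.
      initContainers:
        - name: fetch-jar           # placeholder name
          image: curlimages/curl    # assumption: any image that ships curl
          env:
            - name: REPO_USER
              valueFrom:
                secretKeyRef:
                  name: repo-credentials   # placeholder Secret
                  key: username
            - name: REPO_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: repo-credentials
                  key: password
          command:
            - sh
            - -c
            # Placeholder URL; credentials are read from the Secret above.
            - curl -fsSL -u "$REPO_USER:$REPO_PASSWORD" -o /jars/job.jar https://repo.example.com/path/to/job.jar
          volumeMounts:
            - name: job-jars
              mountPath: /jars
      # Step 3: the main container mounts the same volume and sees the JAR.
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: job-jars
              mountPath: /opt/flink/artifacts   # placeholder path
```

With this layout the JAR would appear at {{/opt/flink/artifacts/job.jar}} inside the Flink containers. Note that the stack trace in the issue shows the operator's own {{ArtifactManager}} opening the {{file://}} path, so the JAR may also need to be reachable from the operator pod, which this fragment does not cover.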
Is it worth implementing a way to download from a private repo in the job spec, other than this initContainer approach? I have some thoughts on how to implement it, if we agree that the feature makes sense.

> Add option to fetch the jar from private repository in FlinkSessionJob
> ----------------------------------------------------------------------
>
>                 Key: FLINK-33992
>                 URL: https://issues.apache.org/jira/browse/FLINK-33992
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>            Reporter: Sweta Kalakuntla
>            Priority: Major
>
> FlinkSessionJob spec does not have a capability to download job jar from
> remote private repository. It can currently only download from public
> repositories.
> Adding capability to supply credentials to the *spec.job.jarURI* in
> FlinkSessionJob, will solve that problem.
> If I use initContainer to download the jar in FlinkDeployment and try to
> access that in FlinkSessionJob, the operator is unable to find the jar in the
> defined path.
> ---
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkSessionJob
> metadata:
>   name: job1
> spec:
>   deploymentName: session-cluster
>   job:
>     jarURI: file:///opt/flink/job.jar
>     parallelism: 4
>     upgradeMode: savepoint
> (edited)
> caused by: java.io.FileNotFoundException: /opt/flink/job.jar (No such file or directory)
>     at java.base/java.io.FileInputStream.open0(Native Method)
>     at java.base/java.io.FileInputStream.open(Unknown Source)
>     at java.base/java.io.FileInputStream.<init>(Unknown Source)
>     at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
>     at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:134)
>     at org.apache.flink.kubernetes.operator.artifact.FileSystemBasedArtifactFetcher.fetch(FileSystemBasedArtifactFetcher.java:44)
>     at org.apache.flink.kubernetes.operator.artifact.ArtifactManager.fetch(ArtifactManager.java:63)
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.uploadJar(AbstractFlinkService.java:707)
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.submitJobToSessionCluster(AbstractFlinkService.java:212)
>     at org.apache.flink.kubernetes.operator.reconciler.sessionjob.SessionJobReconciler.deploy(SessionJobReconciler.java:73)
>     at org.apache.flink.kubernetes.operator.reconciler.sessionjob.SessionJobReconciler.deploy(SessionJobReconciler.java:44)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractFlinkResourceReconciler.reconcile(AbstractFlinkResourceReconciler.java:120)
>     at org.apache.flink.kubernetes.operator.controller.FlinkSessionJobController.reconcile(FlinkSessionJobController.java:109)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)