[ 
https://issues.apache.org/jira/browse/FLINK-33926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916171#comment-17916171
 ] 

Trystan commented on FLINK-33926:
---------------------------------

This makes sense to me, though I'm not sure how a change like this is handled 
with Flink 2.0 on the horizon.

This issue still crops up and bites us from time to time, so I think this would 
be a positive change.

> Can't start a job with a jar in the system classpath in native k8s mode
> -----------------------------------------------------------------------
>
>                 Key: FLINK-33926
>                 URL: https://issues.apache.org/jira/browse/FLINK-33926
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.6.0
>            Reporter: Trystan
>            Assignee: Gantigmaa Selenge
>            Priority: Major
>              Labels: pull-request-available
>
> It appears that the combination of running operator-controlled jobs in 
> native k8s + application mode + using a job jar in the classpath is invalid. 
> Avoiding dynamic classloading (as specified in the 
> [docs|https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code])
>  is beneficial for some jobs. This affects at least Flink 1.16.1 and 
> Kubernetes Operator 1.6.0.
>  
> FLINK-29288 seems to have addressed this for standalone mode. If I am 
> misunderstanding how to correctly build jars for this native k8s scenario, 
> apologies for the noise and any pointers would be appreciated!
>  
> Perhaps related, the [spec 
> documentation|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#jobspec]
>  declares *jarURI* optional, but isn't clear about the conditions under 
> which that applies.
>  * Putting the jar in the system classpath and pointing *jarURI* to that jar 
> leads to linkage errors.
>  * Not including *jarURI* leads to NullPointerExceptions in the operator:
> {code:java}
> {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"java.lang.NullPointerException","stackTrace":"org.apache.flink.kubernetes.operator.exception.ReconciliationException:
>  java.lang.NullPointerException\n\tat 
> org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:148)\n\tat
>  
> org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:56)\n\tat
>  
> io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:138)\n\tat
>  
> io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:96)\n\tat
>  
> org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)\n\tat
>  
> io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:95)\n\tat
>  
> io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:139)\n\tat
>  
> io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:119)\n\tat
>  
> io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)\n\tat
>  
> io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)\n\tat
>  
> io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)\n\tat
>  java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
> Source)\n\tat 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
> Source)\n\tat java.base/java.lang.Thread.run(Unknown Source)\nCaused by: 
> java.lang.NullPointerException\n\tat 
> org.apache.flink.kubernetes.utils.KubernetesUtils.checkJarFileForApplicationMode(KubernetesUtils.java:407)\n\tat
>  
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployApplicationCluster(KubernetesClusterDescriptor.java:207)\n\tat
>  
> org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)\n\tat
>  
> org.apache.flink.kubernetes.operator.service.NativeFlinkService.deployApplicationCluster(Native","additionalMetadata":{},"throwableList":[{"type":"java.lang.NullPointerException","additionalMetadata":{}}]}
>   {code}
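
For illustration only: the trace above suggests the NullPointerException surfaces in KubernetesUtils.checkJarFileForApplicationMode when no jar is configured. A minimal sketch of a null-tolerant lookup, assuming an absent jarURI should mean "no user jar to ship", might look like the following. The class name, method name, and example path are hypothetical and are not taken from Flink or the operator.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration; not the Flink or Kubernetes Operator source. */
public final class JarUriCheck {

    private JarUriCheck() {}

    /**
     * Returns the user jars to ship, or an empty list when no jarURI is set
     * (for example because the job classes already sit on the system classpath).
     */
    public static List<URI> resolveUserJars(String jarUri) {
        // Treat a missing or blank jarURI as "no user jar" instead of dereferencing null.
        return Optional.ofNullable(jarUri)
                .filter(s -> !s.trim().isEmpty())
                .map(s -> List.of(URI.create(s)))
                .orElse(List.of());
    }

    public static void main(String[] args) {
        // Assumed example inputs: no jarURI, then a made-up local jar path.
        System.out.println(resolveUserJars(null));
        System.out.println(resolveUserJars("local:///opt/flink/usrlib/job.jar"));
    }
}
```

Whether an empty list is the right behaviour for native k8s application mode is exactly the open question in this issue; the sketch only shows the shape of a guard that avoids the NPE.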



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
