So when spinning it up on minikube and then ssh-ing into one of the JobManager pods, I see the following for the commands you mentioned:
flink@local-thoros-jobmanager-6l9lz:~$ id
uid=9999(flink) gid=9999(flink) groups=9999(flink)
flink@local-thoros-jobmanager-6l9lz:~$ ls -la $FLINK_HOME/plugins/s3-fs-presto/
ls: cannot access '/opt/flink/plugins/s3-fs-presto/': No such file or directory

It turns out this had to do with minikube using a cached version of the Docker image, so the new changes (i.e. adding the plugin) were never reflected. After pulling down the latest image, the error stopped :)

Best,
Jonas

On Wed, Aug 25, 2021 at 9:17 PM Thms Hmm <thms....@gmail.com> wrote:

> Can you check what is the output of those commands?
>
> $ id
> $ ls -la $FLINK_HOME/plugins/s3-fs-presto/
>
> On Wed, Aug 25, 2021 at 4:17 PM jonas eyob <jonas.e...@gmail.com> wrote:
>
>> The exception is showing up in both the TM and JM.
>>
>> This, however, only seemed to appear when running on my local Kubernetes
>> setup.
>>
>> > I'd also recommend setting "kubernetes.namespace" option, unless
>> > you're using "default" namespace.
>>
>> Yes, good point - I now see why that was needed.
>>
>> On Wed, Aug 25, 2021 at 11:37 AM David Morávek <d...@apache.org> wrote:
>>
>>> Hi Jonas,
>>>
>>> Where does the exception pop up? In the job driver, the TM, or the JM?
>>> You need to make sure that the plugin folder is set up for all of them,
>>> because they all may need to access S3 at some point.
>>>
>>> Best,
>>> D.
>>>
>>> On Wed, Aug 25, 2021 at 11:54 AM jonas eyob <jonas.e...@gmail.com> wrote:
>>>
>>>> Hey Thms,
>>>>
>>>> I tried the s3p:// option as well - same issue.
>>>>
>>>> > Also check if your user that executes the process is able to read
>>>> > the jars.
>>>>
>>>> Not exactly sure how to do that? The user "flink" in the Docker image
>>>> is able to read the contents as far as I understand. But maybe that's
>>>> not how I would check it?
>>>>
>>>> On Wed, Aug 25, 2021 at 10:12 AM Thms Hmm <thms....@gmail.com> wrote:
>>>>
>>>>> Hey Jonas,
>>>>>
>>>>> You could also try to use the 's3p://' scheme to directly specify
>>>>> that Presto should be used.
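The stale-image problem described above is easy to hit on minikube: two common remedies are building against minikube's Docker daemon (`eval $(minikube docker-env)` before `docker build`) or running `minikube image load <image>` after a local build, combined with a unique tag per build. A sketch of the pod-spec fields involved (image name and tag are hypothetical placeholders, not from the thread):

```yaml
# Deployment container snippet (illustrative; names are placeholders).
containers:
  - name: jobmanager
    image: my-flink-app:build-42     # unique tag per build defeats the node's image cache
    imagePullPolicy: IfNotPresent    # with a reused fixed tag, this policy keeps serving the stale image
```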
>>>>> Also check if the user that executes the process is able to read
>>>>> the jars.
>>>>>
>>>>> On Wed, Aug 25, 2021 at 10:01 AM jonas eyob <jonas.e...@gmail.com> wrote:
>>>>>
>>>>>> Thanks David for the quick response!
>>>>>>
>>>>>> *face palm* - Thanks a lot, that seems to have addressed the
>>>>>> NullPointerException issue.
>>>>>> May I also suggest that this [1] page be updated, since it says the
>>>>>> key is "high-availability.cluster-id"?
>>>>>>
>>>>>> This led me to another issue, however:
>>>>>> "org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>>>>> Could not find a file system implementation for scheme 's3'"
>>>>>>
>>>>>> The section [2] describes how I can either use environment variables,
>>>>>> e.g. ENABLE_BUILT_IN_PLUGINS, or bake the plugin into the image by
>>>>>> copying the provided plugins from opt/ into /plugins.
>>>>>>
>>>>>> Dockerfile (snippet):
>>>>>>
>>>>>> # Configure Flink-provided plugin for S3 access
>>>>>> RUN mkdir -p $FLINK_HOME/plugins/s3-fs-presto
>>>>>> RUN cp $FLINK_HOME/opt/flink-s3-fs-presto-*.jar $FLINK_HOME/plugins/s3-fs-presto/
>>>>>>
>>>>>> When bashing into the image:
>>>>>>
>>>>>> flink@dd86717a92a0:~/plugins/s3-fs-presto$ ls
>>>>>> flink-s3-fs-presto-1.12.5.jar
>>>>>>
>>>>>> Any idea?
>>>>>>
>>>>>> [1]
>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#high-availability
>>>>>> [2]
>>>>>> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins
>>>>>>
>>>>>> On Wed, Aug 25, 2021 at 8:00 AM David Morávek <d...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Jonas,
>>>>>>>
>>>>>>> This exception is raised because "kubernetes.cluster-id" [1] is not
>>>>>>> set. I'd also recommend setting "kubernetes.namespace" option,
>>>>>>> unless you're using "default" namespace.
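For reference, a minimal flink-conf.yaml fragment for Kubernetes HA along the lines David describes; the cluster id, namespace, and bucket names below are illustrative placeholders, not values from the thread:

```yaml
# Kubernetes HA settings (flink-conf.yaml); values are placeholders.
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://my-bucket/recovery   # shared storage for HA metadata
kubernetes.cluster-id: my-flink-cluster                 # note: kubernetes.cluster-id, not high-availability.cluster-id
kubernetes.namespace: my-namespace                      # set unless you deploy into "default"
```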
>>>>>>> I've filed FLINK-23961 [2] so we provide a more descriptive
>>>>>>> warning for this issue next time ;)
>>>>>>>
>>>>>>> [1]
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#example-configuration
>>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-23961
>>>>>>>
>>>>>>> Best,
>>>>>>> D.
>>>>>>>
>>>>>>> On Wed, Aug 25, 2021 at 8:48 AM jonas eyob <jonas.e...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hey, I've been struggling with this problem for some days now -
>>>>>>>> it's driving me crazy.
>>>>>>>>
>>>>>>>> I have a standalone Kubernetes Flink (1.12.5) setup using the
>>>>>>>> application cluster mode approach.
>>>>>>>>
>>>>>>>> *The problem*
>>>>>>>> I am getting a NullPointerException when specifying the FQN of the
>>>>>>>> Kubernetes HA service factory class, i.e.
>>>>>>>> *org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory*
>>>>>>>>
>>>>>>>> What other configurations besides the ones specified (here
>>>>>>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions>)
>>>>>>>> may I be missing?
>>>>>>>>
>>>>>>>> Details:
>>>>>>>> * we are using a custom image with flink:1.12
>>>>>>>> <https://hub.docker.com/layers/flink/library/flink/1.12/images/sha256-4b4290888e30d27a28517bac3b1678674cd4b17aa7b8329969d1d12fcdf68f02?context=explore>
>>>>>>>> as the base image
>>>>>>>>
>>>>>>>> flink-conf.yaml -- thought this may be useful?
>>>>>>>> flink-conf.yaml: |+
>>>>>>>>   jobmanager.rpc.address: {{ $fullName }}-jobmanager
>>>>>>>>   jobmanager.rpc.port: 6123
>>>>>>>>   jobmanager.memory.process.size: 1600m
>>>>>>>>   taskmanager.numberOfTaskSlots: 2
>>>>>>>>   taskmanager.rpc.port: 6122
>>>>>>>>   taskmanager.memory.process.size: 1728m
>>>>>>>>   blob.server.port: 6124
>>>>>>>>   queryable-state.proxy.ports: 6125
>>>>>>>>   parallelism.default: 2
>>>>>>>>   scheduler-mode: reactive
>>>>>>>>   execution.checkpointing.interval: 10s
>>>>>>>>   restart-strategy: fixed-delay
>>>>>>>>   restart-strategy.fixed-delay.attempts: 10
>>>>>>>>   high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>>>>>>>>   high-availability.cluster-id: /{{ $fullName }}
>>>>>>>>   high-availability.storageDir: s3://redacted-flink-dev/recovery
>>>>>>>>
>>>>>>>> *Snippet of Job Manager pod log*
>>>>>>>>
>>>>>>>> 2021-08-25 06:14:20,652 INFO
>>>>>>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting
>>>>>>>> StandaloneApplicationClusterEntryPoint down with application status
>>>>>>>> FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not
>>>>>>>> create the ha services from the instantiated
>>>>>>>> HighAvailabilityServicesFactory
>>>>>>>> org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
>>>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268)
>>>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178)
>>>>>>>> at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
>>>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585)
>>>>>>>> at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:85)
>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>> at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
>>>>>>>> at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:85)
>>>>>>>> at org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory.fromConfiguration(FlinkKubeClientFactory.java:106)
>>>>>>>> at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
>>>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265)
>>>>>>>> ... 9 more
>>>>>>>>
>>>>>>>> --
>>>>>>>> Many thanks,
>>>>>>>> Jonas
>>>>>>
>>>>>> --
>>>>>> *Kind regards*
>>>>>> *Jonas Eyob*
>>>>
>>>> --
>>>> *Kind regards*
>>>> *Jonas Eyob*
>>
>> --
>> *Kind regards*
>> *Jonas Eyob*

--
*Kind regards*
*Jonas Eyob*
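To close the loop on the "can the user read the jars" question earlier in the thread, here is a small sketch of a check one could run inside a pod. The helper name check_plugin_readable is made up; the directory layout matches the $FLINK_HOME/plugins/s3-fs-presto setup from the thread. (Per the S3 docs page linked above, setting the ENABLE_BUILT_IN_PLUGINS environment variable on the container is the alternative to baking the plugin into the image.)

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can read the
# flink-s3-fs-presto plugin jar under a given plugins directory.
check_plugin_readable() {
    dir="$1/s3-fs-presto"
    # Expand the glob; if nothing matches, "$1" stays the literal pattern
    # and the -r test below fails.
    set -- "$dir"/flink-s3-fs-presto-*.jar
    if [ -r "$1" ]; then
        echo "readable: $1"
    else
        echo "missing or unreadable: $dir"
    fi
}

# Inside a JobManager/TaskManager pod one would run, e.g.:
#   check_plugin_readable "$FLINK_HOME/plugins"
```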