[ https://issues.apache.org/jira/browse/FLINK-35695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861156#comment-17861156 ]

Mate Czagany commented on FLINK-35695:
--------------------------------------

h2. Setup

Build flink and flink-docker using the documentation found in the corresponding 
repositories.

To enable S3 support, I moved the `flink-s3-fs-hadoop` library to the plugins/ folder locally.
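
For reference, a minimal sketch of that step from the root of the Flink distribution (the exact JAR name depends on the version you built; Flink picks up plugins from their own subdirectory under plugins/):

{code:bash}
# Place the S3 filesystem JAR in its own plugin subdirectory,
# where the plugin classloader will find it
mkdir -p plugins/s3-fs-hadoop
cp opt/flink-s3-fs-hadoop-1.20-SNAPSHOT.jar plugins/s3-fs-hadoop/
{code}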

We also need MinIO; on Minikube I applied the following YAML and added `minio` to /etc/hosts:

{code}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  selector:
    matchLabels:
      app: minio
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: minio
    spec:
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: minio
      containers:
      - name: minio
        volumeMounts:
        - name: data 
          mountPath: "/data"
        image: minio/minio:latest
        args:
        - server
        - /data
        - --console-address 
        - ":9001"
        env:
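        # Legacy credential variables; newer MinIO images expect MINIO_ROOT_USER / MINIO_ROOT_PASSWORD instead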
        - name: MINIO_ACCESS_KEY
          value: "admin"
        - name: MINIO_SECRET_KEY
          value: "password"
        ports:
        - containerPort: 9000
          name: data
        - containerPort: 9001
          name: web

---
apiVersion: v1
kind: Service
metadata:
  name: minio
  labels:
    app: minio
spec:
  ports:
  - port: 9000
    name: data
  - port: 9001
    name: web
  clusterIP: None
  selector:
    app: minio
{code}
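
The buckets referenced below ("test" and, for the second scenario, "other-bucket" with `udf2.jar` in it) have to exist up front. A possible sketch with the MinIO client, assuming `minio:9000` is reachable from where you run it (e.g. through the /etc/hosts entry above, or a `kubectl port-forward`); the alias name and the local udf2.jar path are just examples:

{code:bash}
# Register the MinIO server under the alias "minio" (credentials from the Deployment above)
mc alias set minio http://minio:9000 admin password

# Create the buckets used by the two test scenarios
mc mb minio/test
mc mb minio/other-bucket

# The second scenario expects udf2.jar to already exist in other-bucket;
# any small JAR will do for the test
mc cp ./udf2.jar minio/other-bucket/udf2.jar
{code}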
 

In all Flink applications, I added the following to the default config in `config.yaml`:

{code}
s3:
  access-key: admin
  secret-key: password
  endpoint: http://minio:9000
  path.style.access: true
{code}
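
The same settings can also be passed per submission as dynamic properties instead of editing `config.yaml`; a sketch of the equivalent flattened form:

{code:bash}
# Equivalent -D flags on submission (merge with the options used in the scenarios below)
./bin/flink run-application \
    --target kubernetes-application \
    -Ds3.access-key=admin \
    -Ds3.secret-key=password \
    -Ds3.endpoint=http://minio:9000 \
    -Ds3.path.style.access=true \
    local://$(pwd)/examples/streaming/StateMachineExample.jar
{code}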
 
h2. Deploy local job JAR as the only dependency
{code:bash}
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.service-account=flink \
    -Dkubernetes.container.image.ref=flink:1.20 \
    -Dkubernetes.artifacts.local-upload-enabled=true \
    -Dkubernetes.artifacts.local-upload-target=s3://test/ \
    -Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.20-SNAPSHOT.jar \
    -Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.20-SNAPSHOT.jar \
    local://$(pwd)/examples/streaming/StateMachineExample.jar
{code}
!image-2024-07-01-14-54-17-770.png|width=862,height=261!
 * Verified in the logs that the JAR was uploaded to `s3://test/StateMachineExample.jar`.
 * Verified that the job is running.
 * Verified the following can be found in the JM pod logs:
{code}
INFO  [] - Loading configuration property: pipeline.jars, ['s3://test/StateMachineExample.jar']
{code}
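
A quick cross-check from the host, using the `mc` alias from the setup sketch and the labels Flink's native K8s integration puts on the JobManager pod (the label names are my assumption here; verify against your pods):

{code:bash}
# The uploaded job JAR should show up in the bucket
mc ls minio/test/

# The effective pipeline.jars should point at the uploaded artifact
kubectl logs -l component=jobmanager,type=flink-native-kubernetes | grep 'pipeline.jars'
{code}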

h2. Deploy job with a local job JAR, and further dependencies

This scenario also pulls `udf2.jar` from the "other-bucket" bucket, so that bucket needs to be created (and `udf2.jar` uploaded to it) beforehand; see the `mc` sketch in the Setup section.

{code:bash}
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.service-account=flink \
    -Dkubernetes.container.image.ref=flink:1.20 \
    -Dkubernetes.artifacts.local-upload-enabled=true \
    -Dkubernetes.artifacts.local-upload-target=s3://test/ \
    -Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.20-SNAPSHOT.jar \
    -Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.20-SNAPSHOT.jar \
    -Duser.artifacts.artifact-list=local://$(pwd)/examples/table/GettingStartedExample.jar\;s3://other-bucket/udf2.jar \
    local://$(pwd)/examples/streaming/StateMachineExample.jar
{code}
!image-2024-07-01-15-04-53-764.png|width=859,height=284!
 * Verified that "GettingStartedExample.jar" and "StateMachineExample.jar" can be found in the "test" bucket.
 * Verified that the job is running.
 * Verified that "GettingStartedExample.jar", "StateMachineExample.jar" and "udf2.jar" can be found in the JM pod under "/opt/flink/artifacts/default/flink-cluster-xxx/".
 * Verified the following lines can be found in the JM pod logs:
{code}
INFO  [] - Loading configuration property: pipeline.jars, ['s3://test/StateMachineExample.jar']
...
INFO  [] - Loading configuration property: user.artifacts.artifact-list, ['s3://test/GettingStartedExample.jar', 's3://other-bucket/udf2.jar']
{code}
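
To look at the fetched artifacts directly, one option (same assumed JobManager labels as in the sketch above):

{code:bash}
# Resolve the JobManager pod name, then list the artifact fetch directory
JM_POD=$(kubectl get pods -l component=jobmanager,type=flink-native-kubernetes -o jsonpath='{.items[0].metadata.name}')
kubectl exec "$JM_POD" -- ls -lR /opt/flink/artifacts/default/
{code}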
 

 

> Release Testing: Verify FLINK-32315: Support local file upload in K8s mode
> --------------------------------------------------------------------------
>
>                 Key: FLINK-35695
>                 URL: https://issues.apache.org/jira/browse/FLINK-35695
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Client / Job Submission
>            Reporter: Ferenc Csaky
>            Assignee: Mate Czagany
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.20.0
>
>         Attachments: image-2024-07-01-14-54-17-770.png, 
> image-2024-07-01-15-04-53-764.png
>
>
> Follow up the test for FLINK-32315.
> In Flink 1.20, we introduced a local file upload possibility for Kubernetes 
> deployments. To verify this feature, you can check the relevant 
> [PR|https://github.com/apache/flink/pull/24303], which includes the docs, and 
> examples for more information.
> To test this feature, it is required to have an available Kubernetes cluster 
> to deploy to, and some DFS where Flink can deploy the local JAR. For a 
> sandbox setup, I recommend installing {{minikube}}. The flink-k8s-operator 
> [quickstart 
> guide|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/#prerequisites]
>  explains that pretty well ({{helm}} is not needed here). For the DFS, I have 
> a gist to set up Minio on a K8s pod 
> [here|https://gist.github.com/ferenc-csaky/fd7fee71d89cd389cac2da4a4471ab65].
> The following two main use cases should be handled correctly:
> # Deploy job with a local job JAR, but without further dependencies
> {code:bash}
> $ ./bin/flink run-application \
>     --target kubernetes-application \
>     -Dkubernetes.cluster-id=my-first-application-cluster \
>     -Dkubernetes.container.image=flink:1.20 \
>     -Dkubernetes.artifacts.local-upload-enabled=true \
>     -Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
>     local:///path/to/TopSpeedWindowing.jar
> {code}
> # Deploy job with a local job JAR, and further dependencies (e.g. a UDF 
> included in a separate JAR).
> {code:bash}
> $ ./bin/flink run-application \
>     --target kubernetes-application \
>     -Dkubernetes.cluster-id=my-first-application-cluster \
>     -Dkubernetes.container.image=flink:1.20 \
>     -Dkubernetes.artifacts.local-upload-enabled=true \
>     -Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
>     -Duser.artifacts.artifact-list=local:///tmp/my-flink-udf1.jar\;s3://my-bucket/my-flink-udf2.jar \
>     local:///tmp/my-flink-job.jar
> {code}


