Hello everybody, does anyone have experience with setting up a Flink cluster to execute Beam pipelines via the Python SDK?
Because rootful containers are not allowed where I want to run the cluster, I am trying to do this with podman/podman-compose. I am trying to get a test pipeline working: https://github.com/matleicht/podman_beam_flink

Environment: Debian 11 with podman, podman-compose, and Python 3.9.2 (with apache-beam==2.38.0) installed.

The cluster is defined in https://github.com/matleicht/podman_beam_flink/blob/master/docker-compose.yml and consists of:

- 1x flink-jobmanager (Flink 1.14)
- 1x flink-taskmanager
- 1x Python SDK harness

I chose to create the SDK harness container manually because I don't have Docker installed, and Flink predictably fails when it tries to spawn the SDK container via Docker itself.

When I try to run the pipeline (https://github.com/matleicht/podman_beam_flink/blob/master/pipeline_test.py), the SDK worker gets stuck; the container log shows:

```
2022/12/04 16:13:02 Starting worker pool 1: python -m apache_beam.runners.worker.worker_pool_main --service_port=50000 --container_executable=/opt/apache/beam/boot
Starting worker with command ['/opt/apache/beam/boot', '--id=1-1', '--logging_endpoint=localhost:45087', '--artifact_endpoint=localhost:35323', '--provision_endpoint=localhost:36435', '--control_endpoint=localhost:33237']
2022/12/04 16:16:31 Failed to obtain provisioning information: failed to dial server at localhost:36435 caused by: context deadline exceeded
```

I suspect there is an error in the network setup, or some configuration is missing for the harness worker, but I could not figure out the problem. Has anyone tried this with podman, or can anyone spot the error in the container configuration?

My ultimate goal is a streaming pipeline that reads data from a Kafka instance, processes it, and writes back to it.

Thanks for your help in advance.

Best regards
Matthis
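P.S. In case it helps to reproduce: this is roughly how the pipeline targets the cluster. It is a sketch, not a copy of pipeline_test.py (see the repo for the real code); the runner choice, host names, and ports are assumptions based on the compose setup and the worker-pool log above.

```python
# Illustrative sketch: submit a trivial pipeline to the Flink cluster with an
# EXTERNAL environment pointing at the manually started SDK worker pool.
# Host names/ports are placeholders and depend on the compose network.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=FlinkRunner",
    "--flink_master=localhost:8081",          # jobmanager REST endpoint (assumed)
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",   # worker pool from the SDK container log
    # Note: this endpoint must be reachable *from the taskmanager container*,
    # so "localhost" only works if the containers share a network namespace.
])

with beam.Pipeline(options=options) as p:
    (p
     | beam.Create(["a", "b", "c"])
     | beam.Map(print))
```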
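And this is the sort of Kafka streaming pipeline I am ultimately aiming for, again only as a sketch: topic names and the bootstrap server are placeholders, and I am aware Beam's Kafka IO in Python is a cross-language transform that additionally needs a Java expansion service.

```python
# Illustrative sketch of the end goal: read from Kafka, process, write back.
# "kafka:9092", "input-topic" and "output-topic" are placeholder names.
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka, WriteToKafka
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(["--streaming"])  # plus the runner options from above

with beam.Pipeline(options=options) as p:
    (p
     | ReadFromKafka(
           consumer_config={"bootstrap.servers": "kafka:9092"},
           topics=["input-topic"])                      # yields (key, value) bytes pairs
     | beam.MapTuple(lambda key, value: (key, value))   # placeholder for real processing
     | WriteToKafka(
           producer_config={"bootstrap.servers": "kafka:9092"},
           topic="output-topic"))
```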