Hi Sumeet,

Is there a problem with the documented approaches for submitting the
Python program (i.e. are they not working for you), or are you asking in
general? Given the documentation, I would assume that you can configure the
requirements.txt via `set_python_requirements`.

I am also pulling in Dian who might be able to tell you more about the
Python deployment options.

If you are not running on a session cluster, then you can also build a K8s
image that contains your user code. That way, you ship your job together
with the cluster when deploying it.

Cheers,
Till

On Wed, Apr 28, 2021 at 10:17 AM Sumeet Malhotra <sumeet.malho...@gmail.com>
wrote:

> Hi,
>
> I have a PyFlink job that consists of:
>
>    - Multiple Python files.
>    - Multiple third-party Python dependencies, specified in a
>    `requirements.txt` file.
>    - A few Java dependencies, mainly for external connectors.
>    - An overall job config YAML file.
>
> Here's a simplified structure of the code layout.
>
> flink/
> ├── deps
> │   ├── jar
> │   │   ├── flink-connector-kafka_2.11-1.12.2.jar
> │   │   └── kafka-clients-2.4.1.jar
> │   └── pip
> │       └── requirements.txt
> ├── conf
> │   └── job.yaml
> └── job
>     ├── some_file_x.py
>     ├── some_file_y.py
>     └── main.py
>
> I'm able to execute this job running it locally i.e. invoking something
> like:
>
> python main.py --config <path_to_job_yaml>
>
> I'm loading the jars inside the Python code, using env.add_jars(...).
>
> Now, the next step is to submit this job to a Flink cluster running on
> K8S. I'm looking for any best practices in packaging and specifying
> dependencies that people tend to follow. As per the documentation here [1],
> various Python files, including the conf YAML, can be specified using the
> --pyFiles option and Java dependencies can be specified using --jarfile
> option.
>
> So, how can I specify third-party Python package dependencies? According to
> another piece of documentation here [2], I should be able to specify the
> requirements.txt directly inside the code and submit it via the --pyFiles
> option. Is that right?
>
> Are there any other best practices folks use to package/submit jobs?
>
> Thanks,
> Sumeet
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/cli.html#submitting-pyflink-jobs
> [2]
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/python/table-api-users-guide/dependency_management.html#python-dependency-in-python-program
>