XComp commented on a change in pull request #14367:
URL: https://github.com/apache/flink/pull/14367#discussion_r540949456
##########
File path: docs/deployment/cli.md
##########
@@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`.

 For more details on the commands and the available options, please refer to the Resource Provider-specific
 pages of the documentation.

+### Submitting PyFlink Jobs
+
+Currently, users are able to submit a PyFlink job via the CLI. Unlike Java job submission, it does not
+require specifying the JAR file path or the entry main class.
+
+<span class="label label-info">Note</span> When submitting a Python job via `flink run`, Flink will run the command "python". Please run the following command to confirm that the python executable in the current environment points to a supported Python version of 3.5+.
+{% highlight bash %}
+$ python --version
+# the version printed here must be 3.5+
+{% endhighlight %}
+
+The following commands show different PyFlink job submission use-cases:
+
+- Run a Python Table job:
+{% highlight bash %}
+$ ./bin/flink run --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table job with pyFiles:
+{% highlight bash %}
+$ ./bin/flink run \
+      --python examples/python/table/batch/word_count.py \
+      --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt
+{% endhighlight %}
+
+- Run a Python Table job with a JAR file:
+{% highlight bash %}
+$ ./bin/flink run \
+      --python examples/python/table/batch/word_count.py \
+      --jarfile <jarFile>
+{% endhighlight %}
+
+- Run a Python Table job with pyFiles and pyModule:
+{% highlight bash %}
+$ ./bin/flink run \
+      --pyModule batch.word_count \
+      --pyFiles examples/python/table/batch
+{% endhighlight %}
+
+- Submit a Python Table job on a specific JobManager running on host `<jobmanagerHost>` (adapt the command accordingly):
+{% highlight bash %}
+$ ./bin/flink run \
+      --jobmanager <jobmanagerHost>:8081 \
+      --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):

Review comment:
```suggestion
- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#per-job-cluster-mode):
```

##########
File path: docs/deployment/cli.md
##########
@@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job:

Review comment:
Thinking about it: Why do we call it "Python Table job"? Isn't "PyFlink job" the correct way of labelling it? This applies to all the occurrences below as well.

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):

Review comment:
```suggestion
- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.zh.md %}#run-a-single-flink-job-on-hadoop-yarn):
```

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):
+{% highlight bash %}
+$ ./bin/flink run \
+      --target yarn-per-job \
+      --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table application on a native Kubernetes cluster having the cluster ID `<ClusterId>`:
+{% highlight bash %}
+$ ./bin/flink run-application \
+      --target kubernetes-application \
+      --parallelism 8 \
+      -Dkubernetes.cluster-id=<ClusterId> \
+      -Dtaskmanager.memory.process.size=4096m \
+      -Dkubernetes.taskmanager.cpu=2 \
+      -Dtaskmanager.numberOfTaskSlots=4 \
+      -Dkubernetes.container.image=<PyFlinkImageName> \
+      --pyModule word_count \
+      --pyFiles /opt/flink/examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %})

Review comment:
```suggestion
To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.zh.md %})
```

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %})
+or [YARN]({% link deployment/resource-providers/yarn.md %}) which are described in more detail in the

Review comment:
```suggestion
or [YARN]({% link deployment/resource-providers/yarn.zh.md %}) which are described in more detail in the
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
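The documentation hunk under review tells users to eyeball `python --version` for the 3.5+ requirement. That check can also be scripted; a minimal sketch (the `is_supported` helper is my own illustration, not part of the PR or of Flink):

```shell
# Check whether a Python version string satisfies PyFlink's 3.5+ requirement.
# is_supported is a hypothetical helper, not part of the Flink docs.
is_supported() {
  major=${1%%.*}      # text before the first dot, e.g. "3"
  rest=${1#*.}        # text after the first dot, e.g. "8.10"
  minor=${rest%%.*}   # minor component, e.g. "8"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }
}

# In practice the argument would come from:
#   python --version 2>&1 | awk '{print $2}'
is_supported "3.8.10" && echo "3.8.10: supported"
is_supported "2.7.18" || echo "2.7.18: not supported"
```

Note that `python --version` wrote to stderr on some older interpreters, hence the `2>&1` in the comment above.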