XComp commented on a change in pull request #14367:
URL: https://github.com/apache/flink/pull/14367#discussion_r540949456
##########
File path: docs/deployment/cli.md
##########
@@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`.

 For more details on the commands and the available options, please refer to the Resource Provider-specific
 pages of the documentation.

+### Submitting PyFlink Jobs
+
+Currently, users are able to submit a PyFlink job via the CLI. Unlike Java job submission, it does not
+require specifying the JAR file path or the entry main class.
+
+<span class="label label-info">Note</span> When submitting a Python job via `flink run`, Flink will run the command "python". Please run the following command to confirm that the python executable in the current environment points to a supported Python version of 3.5+.
+{% highlight bash %}
+$ python --version
+# the version printed here must be 3.5+
+{% endhighlight %}
+
+The following commands show different PyFlink job submission use-cases:
+
+- Run a Python Table job:
+{% highlight bash %}
+$ ./bin/flink run --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table job with pyFiles:
+{% highlight bash %}
+$ ./bin/flink run \
+      --python examples/python/table/batch/word_count.py \
+      --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt
+{% endhighlight %}
+
+- Run a Python Table job with a JAR file:
+{% highlight bash %}
+$ ./bin/flink run \
+      --python examples/python/table/batch/word_count.py \
+      --jarfile <jarFile>
+{% endhighlight %}
+
+- Run a Python Table job with pyFiles and pyModule:
+{% highlight bash %}
+$ ./bin/flink run \
+      --pyModule batch.word_count \
+      --pyFiles examples/python/table/batch
+{% endhighlight %}
+
+- Submit a Python Table job on a specific JobManager running on host `<jobmanagerHost>` (adapt the command accordingly):
+{% highlight bash %}
+$ ./bin/flink run \
+      --jobmanager <jobmanagerHost>:8081 \
+      --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):

Review comment:
```suggestion
- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#per-job-cluster-mode):
```

##########
File path: docs/deployment/cli.md
##########
@@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job:

Review comment:
Thinking about it: Why do we call it "Python Table job"? Isn't "PyFlink job" the correct way of labelling it? This applies to all the occurrences below as well.

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):

Review comment:
```suggestion
- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.zh.md %}#run-a-single-flink-job-on-hadoop-yarn):
```

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn):
+{% highlight bash %}
+$ ./bin/flink run \
+      --target yarn-per-job \
+      --python examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+- Run a Python Table application on a native Kubernetes cluster having the cluster ID `<ClusterId>`:
+{% highlight bash %}
+$ ./bin/flink run-application \
+      --target kubernetes-application \
+      --parallelism 8 \
+      -Dkubernetes.cluster-id=<ClusterId> \
+      -Dtaskmanager.memory.process.size=4096m \
+      -Dkubernetes.taskmanager.cpu=2 \
+      -Dtaskmanager.numberOfTaskSlots=4 \
+      -Dkubernetes.container.image=<PyFlinkImageName> \
+      --pyModule word_count \
+      --pyFiles /opt/flink/examples/python/table/batch/word_count.py
+{% endhighlight %}
+
+To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %})

Review comment:
```suggestion
To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.zh.md %})
```

##########
File path: docs/deployment/cli.zh.md
##########
@@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`.
+To learn more about the available options, please refer to [Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %})
+or [YARN]({% link deployment/resource-providers/yarn.md %}) which are described in more detail in the

Review comment:
```suggestion
or [YARN]({% link deployment/resource-providers/yarn.zh.md %}) which are described in more detail in the
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
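The documentation hunk under review tells users to eyeball `python --version` for the 3.5+ requirement. That check can also be scripted; a minimal sketch (the `is_supported` helper is my own illustration, not part of the PR or of Flink):

```shell
# Check whether a Python version string satisfies PyFlink's 3.5+ requirement.
# is_supported is a hypothetical helper, not part of the Flink docs.
is_supported() {
  major=${1%%.*}      # text before the first dot, e.g. "3"
  rest=${1#*.}        # text after the first dot, e.g. "8.10"
  minor=${rest%%.*}   # minor component, e.g. "8"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }
}

# In practice the argument would come from:
#   python --version 2>&1 | awk '{print $2}'
is_supported "3.8.10" && echo "3.8.10: supported"
is_supported "2.7.18" || echo "2.7.18: not supported"
```

Note that `python --version` wrote to stderr on some older interpreters, hence the `2>&1` in the comment above.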