dianfu commented on a change in pull request #11448: [FLINK-16666][python][table] Support new Python dependency configuration options in flink-java, flink-streaming-java and flink-table. URL: https://github.com/apache/flink/pull/11448#discussion_r396238669
########## File path: flink-core/src/main/java/org/apache/flink/configuration/PythonOptions.java ########## @@ -81,4 +79,60 @@ "buffer of a Python worker. The memory will be accounted as managed memory if the " + "actual memory allocated to an operator is no less than the total memory of a Python " + "worker. Otherwise, this configuration takes no effect."); + + public static final ConfigOption<String> PYTHON_FILES = ConfigOptions + .key("python.files") + .stringType() + .noDefaultValue() + .withDescription("Attach custom Python files to the job. These files will " + + "be added to the PYTHONPATH of both the local client and the remote Python UDF " + + "worker. The standard Python resource file suffixes such as .py/.egg/.zip, as well " + + "as directories, are all supported. A comma (',') can be used as the separator to " + + "specify multiple files. The option is equivalent to the command line option " + + "\"-pyfs\"."); + + public static final ConfigOption<String> PYTHON_REQUIREMENTS = ConfigOptions + .key("python.requirements") + .stringType() + .noDefaultValue() + .withDescription("Specify a requirements.txt file which defines the third-party " + + "dependencies. These dependencies will be installed and added to the PYTHONPATH of " + + "the Python UDF worker. A directory which contains the installation packages of " + + "these dependencies can optionally be specified. Use '#' as the separator if the " + + "optional parameter exists. The option is equivalent to the command line option " + + "\"-pyreq\"."); + + public static final ConfigOption<String> PYTHON_ARCHIVES = ConfigOptions + .key("python.archives") + .stringType() + .noDefaultValue() + .withDescription("Add Python archive files to the job. The archive files will be extracted " + + "to the working directory of the Python UDF worker. Currently only the zip format is " + + "supported. For each archive file, a target directory can be specified. 
If the target " + "directory name is specified, the archive file will be extracted to a " + "directory with the specified name. Otherwise, the archive file will be extracted to " + "a directory with the same name as the archive file. The files uploaded via this " + "option are accessible via relative paths. '#' can be used as the separator between " + "the archive file path and the target directory name. A comma (',') can be used as " + "the separator to specify multiple archive files. This option can be used to upload " + "a virtual environment and the data files used in Python UDFs. The data files can be " + "accessed in Python UDFs, e.g.: f = open('data/data.txt', 'r'). The option is " + "equivalent to the command line option \"-pyarch\"."); + + public static final ConfigOption<String> PYTHON_EXECUTABLE = ConfigOptions + .key("python.executable") + .stringType() + .noDefaultValue() + .withDescription("Specify the path of the Python interpreter used to execute the Python " + + "UDF worker. The Python UDF worker depends on Python 3.5+, Apache Beam " + + "(version == 2.19.0), Pip (version >= 7.1.0) and SetupTools (version >= 37.0.0). " + + "Please ensure that the specified environment meets the above requirements. The " + + "option is equivalent to the command line option \"-pyexec\"."); + + public static final ConfigOption<String> PYTHON_CLIENT_EXECUTABLE = ConfigOptions + .key("python.client.executable") + .defaultValue("python") + .withDescription("The Python interpreter used to launch the Python process when compiling " + + "the jobs containing Python UDFs. Equivalent to the environment variable PYFLINK_EXECUTABLE. " + + "The precedence is: 1. configuration in job source code. 2. environment variable. " + Review comment: Do you mean the priority? If so, I suggest changing "precedence" to "priority" to make it clearer: `The priority is as follows: 1. the configuration 'python.client.executable' defined in the source code; 2. 
the environment variable PYFLINK_EXECUTABLE; 3. the configuration 'python.client.executable' defined in flink-conf.yaml`
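For context, the resolution order being discussed can be sketched in plain Java. This is a hypothetical illustration of the suggested priority (source-code configuration, then the PYFLINK_EXECUTABLE environment variable, then flink-conf.yaml, then the default "python"), not Flink's actual implementation; the `resolve` method and its parameters are assumptions made for this example.

```java
import java.util.Optional;

public class ClientExecutableResolution {

    // Hypothetical resolver mirroring the priority suggested in the review:
    // 1. the option set in the job source code,
    // 2. the PYFLINK_EXECUTABLE environment variable,
    // 3. the value from flink-conf.yaml,
    // 4. the documented default "python".
    static String resolve(String fromSourceCode, String envVar, String fromConfYaml) {
        return Optional.ofNullable(fromSourceCode)
                .or(() -> Optional.ofNullable(envVar))
                .or(() -> Optional.ofNullable(fromConfYaml))
                .orElse("python");
    }

    public static void main(String[] args) {
        // Env var wins only when the source code leaves the option unset.
        System.out.println(resolve(null, "/opt/py37/bin/python", "/usr/bin/python3"));
        // Source-code configuration takes the highest priority.
        System.out.println(resolve("venv/bin/python", "/opt/py37/bin/python", null));
        // Nothing set anywhere: fall back to the default "python".
        System.out.println(resolve(null, null, null));
    }
}
```

Making the documented order explicit like this (rather than the ambiguous word "precedence") is exactly what the reviewer is asking the description text to convey.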