Perfect, thanks for the background info! I also found this section now, which 
mentions that it comes from Hadoop: 
https://spark.apache.org/docs/latest/running-on-yarn.html#important-notes.

I think the proposed changes are good!

Best,
Aljoscha

> On 4. Dec 2019, at 04:34, Wei Zhong <weizhong0...@gmail.com> wrote:
> 
> Hi Aljoscha,
> 
> Thanks for your reply! Before bringing up this discussion I did some research 
> on commonly used separators for options that take multiple values. I 
> considered ",", ":" and "#", and in the end chose "#" as the separator for 
> "--pyRequirements".
> 
> "," is the most widely used separator. Many projects use it to separate 
> values at the same level, e.g. "-Dexcludes" in Maven, "--files" in Spark and 
> "-pyFiles" in Flink. But the second parameter of "--pyRequirements", the 
> requirements cache directory, is not at the same level as its first 
> parameter (the requirements file). It is secondary and is only needed when 
> the packages in the requirements file cannot be downloaded from the package 
> index server.
> 
> ":" is used as a path separator in most cases, e.g. the main arguments of 
> scp (secure copy), "--volume" in Docker and "-cp" in Java. But since we 
> accept a URI as the file path, and a URI contains ":" in most cases, ":" 
> cannot be used as the separator for "--pyRequirements".
> 
> "#" is rarely used as a separator for multiple values. The only example I 
> found is Spark, which uses "#" in the "--files" and "--archives" options to 
> separate the file path from the target file/directory name. After some 
> research I found that this usage comes from the URI fragment: a secondary 
> resource can be appended to a URI as its fragment after a number sign ("#") 
> character. As we treat user file paths as URIs when parsing the command 
> line, using "#" as the separator for "--pyRequirements" makes sense to me: 
> the second parameter is the fragment of the first parameter. The definition 
> of the URI fragment can be found here [1].
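
A minimal Python sketch of the separator reasoning above. It is not Flink's actual option parsing; the helper name `split_on_fragment` and the example paths are hypothetical and only illustrate how a "file#secondary" value splits at the first "#", and why ":" would be ambiguous for URI-style paths.

```python
def split_on_fragment(value):
    """Split 'path#secondary' at the first '#', the way a URI fragment is delimited.

    Returns (path, secondary); secondary is None when no '#' is present.
    """
    path, sep, secondary = value.partition("#")
    return path, (secondary if sep else None)


# Hypothetical option values, for illustration only.
with_cache = "hdfs:///tmp/requirements.txt#cached_dir"
without_cache = "file:///tmp/requirements.txt"

print(split_on_fragment(with_cache))
print(split_on_fragment(without_cache))

# Splitting on ':' instead is ambiguous, because the URI scheme
# already contains a ':' of its own.
print("hdfs:///tmp/requirements.txt:cached_dir".split(":"))
```

With "#", the optional second value can simply be left out, which matches the idea that it is secondary to the first.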
> 
> The reason for using "#" in "--pyArchives" as the separator between the file 
> path and the target directory name is the same as above.
> 
> Best,
> Wei
> 
> [1] https://tools.ietf.org/html/rfc3986#section-3.5
> 
>> On 3 Dec 2019, at 22:02, Aljoscha Krettek <aljos...@apache.org> wrote:
>> 
>> Hi,
>> 
>> Yes, I think it’s a good idea to make the options uniform. Using ‘#’ as a 
>> separator for options that take two values seems a bit strange to me. Did 
>> you research whether any other CLI tools have this convention?
>> 
>> Side note: I don’t like that our options use camel-case, I think that’s very 
>> non-standard. But that’s how it is now…
>> 
>> Best,
>> Aljoscha
>> 
>>> On 3. Dec 2019, at 10:14, jincheng sun <sunjincheng...@gmail.com> wrote:
>>> 
>>> Thanks for bringing up this discussion, Wei!
>>> I think this is very important for Flink users; we should include these
>>> changes in Flink 1.10.
>>> +1 for the optimization, from the perspective of user convenience and the
>>> unified use of Flink command line parameters.
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> On Mon, 2 Dec 2019 at 3:26 PM, Wei Zhong <weizhong0...@gmail.com> wrote:
>>> 
>>>> Hi everyone,
>>>> 
>>>> I wanted to bring up the discussion of improving the PyFlink command line
>>>> options.
>>>> 
>>>> A few command line options were introduced in FLIP-78 [1], e.g.
>>>> "python-executable-path", "python-requirements" and "python-archive".
>>>> There are a few problems with these options, such as the naming style
>>>> and the variable-argument options.
>>>> 
>>>> We want to make some adjustments to FLIP-78 to improve the newly introduced
>>>> command line options. Here is the design doc:
>>>> 
>>>> https://docs.google.com/document/d/1R8CaDa3908V1SnTxBkTBzeisWqBF40NAYYjfRl680eg/edit?usp=sharing
>>>> 
>>>> Looking forward to your feedback!
>>>> 
>>>> Best,
>>>> Wei
>>>> 
>>>> [1]
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management
>>>> 
>>>> 
>> 
> 
