Hi All,

Thanks for the feedback @Chesnay Schepler <ches...@apache.org> @Dian!

Using `apache-flink` for the project name also makes sense to me, as we
should always keep in mind that Flink is owned by Apache. (Beam also
uses this pattern, `apache-beam`, for its Python API.)

Regarding releasing the Python API together with the Java JARs, I think
the guiding principle should be the convenience of the user. So, thanks
for the explanation @Dian! And you're right @Chesnay Schepler
<ches...@apache.org>, we can't make a hasty decision and we need more
people's opinions!
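To make the convenience argument concrete: if the jars are not bundled,
the Python package has to locate a manually installed Flink
distribution at runtime, e.g. via FLINK_HOME. A minimal sketch of such
a lookup (all names here are hypothetical; this is not actual PyFlink
code):

    import glob
    import os

    def find_flink_jars():
        # Prefer jars bundled inside the installed Python package
        # (hypothetical layout: <package>/lib/*.jar).
        bundled = os.path.join(os.path.dirname(__file__), "lib")
        jars = glob.glob(os.path.join(bundled, "*.jar"))
        if jars:
            return jars
        # Otherwise fall back to a manually installed Flink
        # distribution located via FLINK_HOME.
        flink_home = os.environ.get("FLINK_HOME")
        if flink_home is None:
            raise RuntimeError(
                "jars are not bundled and FLINK_HOME is not set")
        return glob.glob(os.path.join(flink_home, "lib", "*.jar"))

Bundling the jars removes this whole fallback path, and with it the
risk of a version mismatch between Flink and pyflink.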
I think using `apache-flink` for the project name also makes sense to me. due to we should always keep in mind that Flink is owned by Apache. (And beam also using this pattern `apache-beam` for Python API) Regarding the Python API release with the JAVA JARs, I think the principle of consideration is the convenience of the user. So, Thanks for the explanation @Dian! And your right @Chesnay Schepler <ches...@apache.org> we can't make a hasty decision and we need more people's opinions! So, I appreciate it if anyone can give us feedback and suggestions! Best, Jincheng Chesnay Schepler <ches...@apache.org> 于2019年7月3日周三 下午8:46写道: > So this would not be a source release then, but a full-blown binary > release. > > Maybe it is just me, but I find it a bit suspect to ship an entire java > application via PyPI, just because there's a Python API for it. > > We definitely need input from more people here. > > On 03/07/2019 14:09, Dian Fu wrote: > > Hi Chesnay, > > > > Thanks a lot for the suggestions. > > > > Regarding “distributing java/scala code to PyPI”: > > The Python Table API is just a wrapper of the Java Table API and without > the java/scala code, two steps will be needed to set up an environment to > execute a Python Table API program: > > 1) Install pyflink using "pip install apache-flink" > > 2) Download the flink distribution and set the FLINK_HOME to it. > > Besides, users have to make sure that the manually installed Flink is > compatible with the pip installed pyflink. > > > > Bundle the java/scala code inside the Python package will eliminate step > 2) and makes it more simple for users to install pyflink. There was a short > discussion <https://issues.apache.org/jira/browse/SPARK-1267> on this in > Spark community and they finally decide to package the java/scala code in > the python package. (BTW, PySpark only bundle the jars of scala 2.11). > > > > Regards, > > Dian > > > >> 在 2019年7月3日,下午7:13,Chesnay Schepler <ches...@apache.org> 写道: > >> > >> The existing artifact in the pyflink project was neither released by > the Flink project / anyone affiliated with it nor approved by the Flink PMC. > >> > >> As such, if we were to use this account I believe we should delete it > to not mislead users that this is in any way an apache-provided > distribution. Since this goes against the users wishes, I would be in favor > of creating a separate account, and giving back control over the pyflink > account. > >> > >> My take on the raised points: > >> 1.1) "apache-flink" > >> 1.2) option 2 > >> 2) Given that we only distribute python code there should be no reason > to differentiate between scala versions. We should not be distributing any > java/scala code and/or modules to PyPi. Currently, I'm a bit confused about > this question and wonder what exactly we are trying to publish here. > >> 3) The should be treated as any other source release; i.e., it needs a > LICENSE and NOTICE file, signatures and a PMC vote. My suggestion would be > to make this part of our normal release process. There will be _one_ source > release on dist.apache.org encompassing everything, and a separate python > of focused source release that we push to PyPi. The LICENSE and NOTICE > contained in the python source release must also be present in the source > release of Flink; so basically the python source release is just the > contents of flink-python module the maven pom.xml, with no other special > sauce added during the release process. 
> >>
> >> On 02/07/2019 05:42, jincheng sun wrote:
> >>> Hi all,
> >>>
> >>> With the effort of FLIP-38 [1], the Python Table API (without UDF
> >>> support for now) will be supported in the coming release-1.9.
> >>> As described in "Build PyFlink" [2], if users want to use the
> >>> Python Table API, they can manually install it using the command:
> >>> "cd flink-python && python3 setup.py sdist && pip install
> >>> dist/*.tar.gz".
> >>>
> >>> This is non-trivial for users, and it would be better if we could
> >>> follow the Python way and publish PyFlink to PyPI, which is a
> >>> repository of software for the Python programming language. Then
> >>> users can use the standard Python package manager "pip" to install
> >>> PyFlink: "pip install pyflink". So, there are some topics that need
> >>> to be discussed, as follows:
> >>>
> >>> 1. How to publish PyFlink to PyPI
> >>>
> >>> 1.1 Project Name
> >>> We need to decide the project name to use on PyPI, for example,
> >>> apache-flink, pyflink, etc.
> >>>
> >>> Regarding the name "pyflink", it has already been registered by
> >>> @ueqt, and there is already a package '1.0' released under this
> >>> project which is based on flink-libraries/flink-python.
> >>>
> >>> @ueqt has kindly agreed to give this project back to the community.
> >>> And he has requested that the released package '1.0' not be
> >>> removed, as it is already used in their company.
> >>>
> >>> So we need to decide whether to use the name 'pyflink'. If yes, we
> >>> need to figure out how to handle the package '1.0' under this
> >>> project.
> >>>
> >>> From my point of view, "pyflink" is the better project name for us,
> >>> and we can keep the 1.0 release; maybe more people will want to use
> >>> it.
> >>>
> >>> 1.2 PyPI account for release
> >>> We also need to decide which account to use to publish packages to
> >>> PyPI.
> >>>
> >>> There are two permissions in PyPI: owner and maintainer:
> >>>
> >>> 1) The owner can upload releases, and delete files, releases or the
> >>> entire project.
> >>> 2) The maintainer can also upload releases. However, they cannot
> >>> delete files, releases, or the project.
> >>>
> >>> So there are two options in my mind:
> >>>
> >>> 1) Create an account such as 'pyflink' as the owner, share it with
> >>> all the release managers, and then release managers publish the
> >>> package to PyPI using this account.
> >>> 2) Create an account such as 'pyflink' as the owner (only the PMC
> >>> can manage it) and add the release managers' accounts as
> >>> maintainers of the project. Release managers publish the package to
> >>> PyPI using their own accounts.
> >>>
> >>> As far as I know, PySpark takes option 1) and Apache Beam takes
> >>> option 2).
> >>>
> >>> From my point of view, I prefer option 2) as it is safer: it
> >>> eliminates the risk of accidentally deleting old releases and at
> >>> the same time keeps a trace of who is operating.
> >>>
> >>> 2. How to handle Scala 2.11 and Scala 2.12
> >>>
> >>> The PyFlink package bundles the jars in the package. As we know,
> >>> there are two versions of the jars for each module: one for Scala
> >>> 2.11 and the other for Scala 2.12. So theoretically there will be
> >>> two PyFlink packages. We need to decide whether to publish one of
> >>> them or both to PyPI. If both packages are published, we may need
> >>> two separate projects, such as pyflink_211 and pyflink_212, and
> >>> maybe more in the future, such as pyflink_213.
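> >>>
> >>> For illustration only, bundling the jars for one Scala version
> >>> could look roughly like the following setup.py sketch
> >>> (hypothetical, not actual build code):
> >>>
> >>>     from setuptools import setup
> >>>
> >>>     SCALA_VERSION = "2.11"  # "2.12" for a pyflink_212 project
> >>>
> >>>     setup(
> >>>         # one PyPI project per Scala version
> >>>         name="pyflink_" + SCALA_VERSION.replace(".", ""),
> >>>         version="1.9.0",
> >>>         packages=["pyflink"],
> >>>         # assumes the jars built against SCALA_VERSION have been
> >>>         # copied into pyflink/lib/ before packaging
> >>>         package_data={"pyflink": ["lib/*.jar"]},
> >>>     )
> >>>
> >>> The same sources would then be packaged once per supported Scala
> >>> version, differing only in the bundled jars and the project name.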
> >>>
> >>> (BTW, I think we should bring up a discussion about dropping Scala
> >>> 2.11 in the Flink 1.10 release, as Scala 2.13 became available in
> >>> early June.)
> >>>
> >>> From my point of view, for now we can release only the Scala 2.11
> >>> version, as Scala 2.11 is our default version in Flink.
> >>>
> >>> 3. Legal problems of publishing to PyPI
> >>>
> >>> As @Chesnay Schepler <ches...@apache.org> pointed out in
> >>> FLINK-13011 [3], publishing PyFlink to PyPI means that we will
> >>> publish binaries to a distribution channel not owned by Apache. We
> >>> need to figure out if there are legal problems. From my point of
> >>> view, there are none, as a few Apache projects such as Spark, Beam,
> >>> etc. have already done it. Frankly speaking, I am not familiar with
> >>> this area, so feedback from anybody who is more familiar with it is
> >>> very welcome.
> >>>
> >>> Great thanks to @ueqt for being willing to donate the PyPI project
> >>> name `pyflink` to the Apache Flink community!!!
> >>> Great thanks to @Dian for the offline effort!!!
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>> [1]
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API
> >>> [2]
> >>> https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#build-pyflink
> >>> [3] https://issues.apache.org/jira/browse/FLINK-13011