Sorry for chiming in so late. I would be in favor of option #2. For option #1, I guess the PMC would need to hand the credentials to the release manager. With option #2, the PMC could instead add the release manager as a maintainer, which ensures that only the PMC can delete artifacts.

Cheers,
Till
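For concreteness, the release flow under option #2 would look roughly like this for a release manager, assuming the standard twine-based upload (the artifact name and version below are placeholders):

    # Build the source distribution from the flink-python module
    cd flink-python && python3 setup.py sdist
    # Upload using the release manager's own PyPI account, which only
    # needs maintainer rights on the project; twine prompts for
    # credentials (or reads ~/.pypirc).
    twine upload dist/apache-flink-1.9.0.tar.gz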
On Wed, Jul 24, 2019 at 12:33 PM jincheng sun <sunjincheng...@gmail.com> wrote:
> Hi all,
>
> Thanks for all of your replies!
>
> Hi Stephan, thanks for the reply and for spelling out the details we need to pay attention to, such as the README and trademark compliance. Regarding the PyPI account for releases, #1 carries the risk that our release packages can be deleted by anyone who knows the account password, and in that case the PMC would have no means to correct the problem. So I think #2 is the safer option for the Flink community.
>
> Hi Jeff & Dian, thanks for sharing your thoughts. The Python API is just a language entry point, and I think the choice of which binaries the release contains should be consistent with the Java release policy. So, currently we do not add the Hadoop or connector JARs to the release package.
>
> Hi Chesnay, agreed that we should ship the usual binary in the future if the Java side has already made that decision.
>
> So, our current consensus is:
> 1. Should we publish PyFlink to PyPI ---> YES
> 2. PyPI project name ---> apache-flink
> 3. How to handle Scala 2.11 and Scala 2.12 ---> We only release one binary, built with the default Scala version from Flink's default config.
>
> We still need to discuss how to manage the PyPI account for releases:
> --------
> 1) Create an account such as 'pyflink' as the owner, share it with all the release managers, and then release managers can publish packages to PyPI using this account.
> 2) Create an account such as 'pyflink' as the owner (only the PMC can manage it) and add the release managers' accounts as maintainers of the project. Release managers publish packages to PyPI using their own accounts.
> --------
> Stephan likes #1 but wants the PMC to be able to correct problems (which sounds like #2) -- can you confirm that, @Stephan?
> Chesnay and I prefer #2.
>
> Best,
> Jincheng
>
> On Wed, Jul 24, 2019 at 3:57 PM, Chesnay Schepler <ches...@apache.org> wrote:
> >
> > If we ship a binary, we should ship the binary we usually ship, not some highly customized version.
> >
> > On 24/07/2019 05:19, Dian Fu wrote:
> > > Hi Stephan & Jeff,
> > >
> > > Thanks a lot for sharing your thoughts!
> > >
> > > Regarding the bundled jars, currently only the jars of the Flink binary distribution are packaged in the pyflink package. It may be a good idea to also bundle other jars such as flink-hadoop-compatibility. We may also need to consider whether to bundle the format jars such as flink-avro, flink-json and flink-csv, and the connector jars such as flink-connector-kafka, etc.
> > >
> > > If FLINK_HOME is set, the binary distribution specified by FLINK_HOME will be used instead.
> > >
> > > Regards,
> > > Dian
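To make the FLINK_HOME override above concrete, here is a minimal sketch of the resolution order it implies; the helper name and the bundled-distribution argument are illustrative assumptions, not pyflink's actual internals:

    import os

    def resolve_flink_home(bundled_dist_path):
        # Assumed lookup order: an explicitly set FLINK_HOME wins;
        # otherwise fall back to the distribution bundled in the
        # pip-installed package.
        flink_home = os.environ.get("FLINK_HOME")
        if flink_home and os.path.isdir(flink_home):
            return flink_home
        return bundled_dist_path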
> > >
> > >> On Jul 24, 2019, at 9:47 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> > >>
> > >> +1 for publishing pyflink to PyPI.
> > >>
> > >> Regarding including jars, I just want to make sure which Flink binary distribution we would ship with pyflink, since we have multiple Flink binary distributions (with/without Hadoop). Personally, I prefer to use the Hadoop-included binary distribution.
> > >>
> > >> And I just want to confirm whether it is possible for users to use a different Flink binary distribution as long as they set the env variable FLINK_HOME.
> > >>
> > >> Besides that, I hope that there will be bi-directional links between the Flink docs and the PyPI docs.
> > >>
> > >> On Wed, Jul 24, 2019 at 12:07 AM, Stephan Ewen <se...@apache.org> wrote:
> > >>
> > >>> Hi!
> > >>>
> > >>> Sorry for the late involvement. Here are some thoughts from my side:
> > >>>
> > >>> Definitely +1 to publishing to PyPI, even if it is a binary release. Community growth into other communities is great, and if this is the natural way to reach developers in the Python community, let's do it. This is not about our convenience, but about reaching users.
> > >>>
> > >>> I think the way to look at this is that this is a convenience distribution channel, courtesy of the Flink community. It is not an Apache release; we make this clear in the README. Of course, this doesn't mean we don't try to uphold similar standards as for our official releases (like proper license information).
> > >>>
> > >>> Concerning credentials sharing, I would be fine with whatever option. The PMC doesn't own it (it is an initiative by some community members), but the PMC needs to ensure trademark compliance, so slight preference for option #1 (PMC would have means to correct problems).
> > >>>
> > >>> I believe there is no need to differentiate between Scala versions, because this is merely a convenience thing for pure Python users. Users that mix Python and Scala (and thus depend on specific Scala versions) can still download from Apache or build themselves.
> > >>>
> > >>> Best,
> > >>> Stephan
> > >>>
> > >>> On Thu, Jul 4, 2019 at 9:51 AM jincheng sun <sunjincheng...@gmail.com> wrote:
> > >>>
> > >>>> Hi All,
> > >>>>
> > >>>> Thanks for the feedback @Chesnay Schepler <ches...@apache.org> @Dian!
> > >>>>
> > >>>> Using `apache-flink` for the project name also makes sense to me, since we should always keep in mind that Flink is owned by Apache. (Beam also uses this pattern, `apache-beam`, for its Python API.)
> > >>>>
> > >>>> Regarding releasing the Python API with the Java JARs, I think the guiding principle should be the convenience of the user. So, thanks for the explanation @Dian!
> > >>>>
> > >>>> And you're right @Chesnay Schepler <ches...@apache.org>, we can't make a hasty decision and we need more people's opinions!
> > >>>>
> > >>>> So, I would appreciate it if anyone can give us feedback and suggestions!
> > >>>>
> > >>>> Best,
> > >>>> Jincheng
> > >>>>
> > >>>> On Wed, Jul 3, 2019 at 8:46 PM, Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>
> > >>>>> So this would not be a source release then, but a full-blown binary release.
> > >>>>>
> > >>>>> Maybe it is just me, but I find it a bit suspect to ship an entire Java application via PyPI, just because there's a Python API for it.
> > >>>>>
> > >>>>> We definitely need input from more people here.
> > >>>>>
> > >>>>> On 03/07/2019 14:09, Dian Fu wrote:
> > >>>>>> Hi Chesnay,
> > >>>>>>
> > >>>>>> Thanks a lot for the suggestions.
> > >>>>>>
> > >>>>>> Regarding "distributing java/scala code to PyPI":
> > >>>>>> The Python Table API is just a wrapper of the Java Table API, and without the java/scala code, two steps are needed to set up an environment that can execute a Python Table API program:
> > >>>>>> 1) Install pyflink using "pip install apache-flink"
> > >>>>>> 2) Download the Flink distribution and set FLINK_HOME to it.
> > >>>>>> Besides, users have to make sure that the manually installed Flink is compatible with the pip-installed pyflink.
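Spelled out as commands, the two-step setup described above would look like this (the FLINK_HOME path is a placeholder; the Flink version must match the pip-installed pyflink):

    # 1) Install pyflink from PyPI
    pip install apache-flink
    # 2) Download a Flink binary distribution and point FLINK_HOME at it
    export FLINK_HOME=/path/to/flink-1.9.0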
> > >>>>>>
> > >>>>>> Bundling the java/scala code inside the Python package will eliminate step 2) and make it simpler for users to install pyflink. There was a short discussion <https://issues.apache.org/jira/browse/SPARK-1267> on this in the Spark community and they finally decided to package the java/scala code in the Python package. (BTW, PySpark only bundles the Scala 2.11 jars.)
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Dian
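A minimal sketch of what such bundling could look like in flink-python's setup.py; the version and jar layout are illustrative assumptions, not the actual build configuration:

    from setuptools import find_packages, setup

    setup(
        name="apache-flink",
        version="1.9.0",  # illustrative
        packages=find_packages(),
        # Assumed layout: the jars of the Flink binary distribution are
        # copied under pyflink/lib and pyflink/opt before building, so
        # "pip install apache-flink" needs no separate Flink download.
        package_data={"pyflink": ["lib/*.jar", "opt/*.jar"]},
    )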
> > >>>>>>
> > >>>>>>> On Jul 3, 2019, at 7:13 PM, Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>> The existing artifact in the pyflink project was neither released by the Flink project / anyone affiliated with it, nor approved by the Flink PMC.
> > >>>>>>> As such, if we were to use this account, I believe we should delete it so as not to mislead users into thinking this is in any way an Apache-provided distribution. Since this goes against the user's wishes, I would be in favor of creating a separate account, and giving back control over the pyflink account.
> > >>>>>>>
> > >>>>>>> My take on the raised points:
> > >>>>>>> 1.1) "apache-flink"
> > >>>>>>> 1.2) option 2
> > >>>>>>> 2) Given that we only distribute Python code, there should be no reason to differentiate between Scala versions. We should not be distributing any java/scala code and/or modules to PyPI. Currently, I'm a bit confused about this question and wonder what exactly we are trying to publish here.
> > >>>>>>> 3) This should be treated as any other source release; i.e., it needs a LICENSE and NOTICE file, signatures and a PMC vote. My suggestion would be to make this part of our normal release process. There will be _one_ source release on dist.apache.org encompassing everything, and a separate Python-focused source release that we push to PyPI. The LICENSE and NOTICE contained in the Python source release must also be present in the source release of Flink; so basically the Python source release is just the contents of the flink-python module plus the maven pom.xml, with no other special sauce added during the release process.
> > >>>>>>>
> > >>>>>>> On 02/07/2019 05:42, jincheng sun wrote:
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> With the effort of FLIP-38 [1], the Python Table API (without UDF support for now) will be supported in the coming release-1.9. As described in "Build PyFlink" [2], if users want to use the Python Table API, they can manually install it using the command: "cd flink-python && python3 setup.py sdist && pip install dist/*.tar.gz". This is non-trivial for users, and it would be better if we followed the Python way and published PyFlink to PyPI, which is a repository of software for the Python programming language. Then users can use the standard Python package manager "pip" to install PyFlink: "pip install pyflink". So, there are some topics that need to be discussed, as follows:
> > >>>>>>>>
> > >>>>>>>> 1. How to publish PyFlink to PyPI
> > >>>>>>>>
> > >>>>>>>> 1.1 Project Name
> > >>>>>>>> We need to decide which project name to use on PyPI, for example apache-flink, pyflink, etc.
> > >>>>>>>>
> > >>>>>>>> Regarding the name "pyflink", it has already been registered by @ueqt, and there is already a package '1.0' released under this project which is based on flink-libraries/flink-python.
> > >>>>>>>>
> > >>>>>>>> @ueqt has kindly agreed to give this project back to the community, and he has requested that the released package '1.0' not be removed, as it is already used in their company.
> > >>>>>>>>
> > >>>>>>>> So we need to decide whether to use the name 'pyflink'. If yes, we need to figure out how to deal with the package '1.0' under this project. From my point of view, "pyflink" is the better project name, and we can keep the 1.0 release; maybe more people will want to use it.
> > >>>>>>>>
> > >>>>>>>> 1.2 PyPI account for releases
> > >>>>>>>> We also need to decide which account to use to publish packages to PyPI. There are two permission levels on PyPI: owner and maintainer:
> > >>>>>>>>
> > >>>>>>>> 1) The owner can upload releases and delete files, releases or the entire project.
> > >>>>>>>> 2) The maintainer can also upload releases. However, they cannot delete files, releases, or the project.
> > >>>>>>>>
> > >>>>>>>> So there are two options in my mind:
> > >>>>>>>>
> > >>>>>>>> 1) Create an account such as 'pyflink' as the owner, share it with all the release managers, and then release managers can publish packages to PyPI using this account.
> > >>>>>>>> 2) Create an account such as 'pyflink' as the owner (only the PMC can manage it) and add the release managers' accounts as maintainers of the project. Release managers publish packages to PyPI using their own accounts.
> > >>>>>>>>
> > >>>>>>>> As far as I know, PySpark takes option 1) and Apache Beam takes option 2). From my point of view, I prefer option 2) as it is safer: it eliminates the risk of accidentally deleting old releases and at the same time keeps a trace of who is operating.
> > >>>>>>>>
> > >>>>>>>> 2. How to handle Scala 2.11 and Scala 2.12
> > >>>>>>>>
> > >>>>>>>> The PyFlink package bundles the jars in the package. As we know, there are two versions of the jars for each module: one for Scala 2.11 and the other for Scala 2.12. So theoretically there would be two PyFlink packages. We need to decide which one to publish to PyPI, or both. If both packages are published to PyPI, we may need two projects, such as pyflink_211 and pyflink_212, and maybe more in the future, such as pyflink_213. (BTW, I think we should bring up a discussion about dropping Scala 2.11 in the Flink 1.10 release, since Scala 2.13 became available in early June.)
> > >>>>>>>>
> > >>>>>>>> From my point of view, for now we should release only the Scala 2.11 version, since Scala 2.11 is our default version in Flink.
> > >>>>>>>>
> > >>>>>>>> 3. Legal problems of publishing to PyPI
> > >>>>>>>>
> > >>>>>>>> As @Chesnay Schepler <ches...@apache.org> pointed out in FLINK-13011 [3], publishing PyFlink to PyPI means that we will publish binaries to a distribution channel not owned by Apache. We need to figure out whether there are legal problems. From my point of view there are none, as a few Apache projects such as Spark, Beam, etc. have already done it. Frankly speaking, I am not familiar with this problem, so I welcome any feedback from somebody who is more familiar with it.
> > >>>>>>>>
> > >>>>>>>> Great thanks to @ueqt for being willing to donate the PyPI project name `pyflink` to the Apache Flink community!!!
> > >>>>>>>> Great thanks to @Dian for the offline effort!!!
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Jincheng
> > >>>>>>>>
> > >>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API
> > >>>>>>>> [2] https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#build-pyflink
> > >>>>>>>> [3] https://issues.apache.org/jira/browse/FLINK-13011
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang