Sounds good to me. Thanks for driving this discussion.

Cheers,
Till

On Mon, Jul 29, 2019 at 9:24 AM jincheng sun <sunjincheng...@gmail.com>
wrote:

> Yes Till, I think you are correct that we should make sure that the
> published Flink Python API cannot be arbitrarily deleted.
>
> So, it seems that our current consensus is:
>
> 1. Should we republish PyFlink to PyPI --> YES
> 2. PyPI project name --> apache-flink
> 3. How to handle Scala 2.11 and Scala 2.12 --> We only release one
> binary, built with the default Scala version, the same as Flink's
> default config.
> 4. PyPI account for release --> Create an account such as 'pyflink' as
> owner (only the PMC can manage it) and add the release managers'
> accounts as maintainers of the project. Release managers publish the
> package to PyPI using their own accounts but cannot delete releases.
>
> So, if there are no other comments, I think we should initiate a voting thread.
>
> What do you think?
>
> Best, Jincheng
>
>
> On Wed, Jul 24, 2019 at 1:17 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
> > Sorry for chiming in so late. I would be in favor of option #2.
> >
> > I guess that the PMC would need to give the credentials to the release
> > manager for option #1. Hence, the PMC could also add the release manager
> > as a maintainer, which makes sure that only the PMC can delete artifacts.
> >
> > Cheers,
> > Till
> >
> > On Wed, Jul 24, 2019 at 12:33 PM jincheng sun <sunjincheng...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Thanks for all of your replies!
> > >
> > > Hi Stephan, thanks for the reply and for spelling out the details we
> > > need to pay attention to, such as the README and trademark compliance.
> > > Regarding the PyPI account for release, #1 carries the risk that our
> > > release packages can be deleted by anyone who knows the password of
> > > the account, and in that case the PMC would have no means to correct
> > > problems. So I think #2 is pretty safe for the Flink community.
> > >
> > > Hi Jeff & Dian, thanks for sharing your thoughts. The Python API is
> > > just a language entry point. I think the decision about which binaries
> > > are contained in the release should be consistent with the Java
> > > release policy. So, currently we do not add the Hadoop or connector
> > > JARs to the release package.
> > >
> > > Hi Chesnay, agreed that we should ship the binary we usually ship in
> > > the future, since the Java side has already made that decision.
> > >
> > > So, our current consensus is:
> > > 1. Should we republish PyFlink to PyPI --> YES
> > > 2. PyPI project name --> apache-flink
> > > 3. How to handle Scala 2.11 and Scala 2.12 --> We only release one
> > > binary, built with the default Scala version, the same as Flink's
> > > default config.
> > >
> > > We still need to discuss how to manage the PyPI account for releases:
> > > --------
> > > 1) Create an account such as 'pyflink' as the owner, share it with
> > > all the release managers, and then release managers can publish the
> > > package to PyPI using this account.
> > > 2) Create an account such as 'pyflink' as owner (only the PMC can
> > > manage it) and add the release managers' accounts as maintainers of
> > > the project. Release managers publish the package to PyPI using their
> > > own accounts.
> > > --------
> > > Stephan likes #1 but wants the PMC to be able to correct problems
> > > (which sounds like #2). Can you confirm that, @Stephan?
> > > Chesnay and I prefer #2.
> > >
> > > Best, Jincheng
> > >
> > > On Wed, Jul 24, 2019 at 3:57 PM Chesnay Schepler <ches...@apache.org> wrote:
> > >
> > > > If we ship a binary, we should ship the binary we usually ship, not
> > > > some highly customized version.
> > > >
> > > > On 24/07/2019 05:19, Dian Fu wrote:
> > > > > Hi Stephan & Jeff,
> > > > >
> > > > > Thanks a lot for sharing your thoughts!
> > > > >
> > > > > Regarding the bundled jars, currently only the jars in the Flink
> > > > > binary distribution are packaged in the pyflink package. It may be
> > > > > a good idea to also bundle other jars such as
> > > > > flink-hadoop-compatibility. We may also need to consider whether
> > > > > to bundle the format jars such as flink-avro, flink-json and
> > > > > flink-csv, and the connector jars such as flink-connector-kafka,
> > > > > etc.
> > > > >
> > > > > If FLINK_HOME is set, the binary distribution specified by
> > > > > FLINK_HOME will be used instead.
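> > > > >
> > > > > To illustrate, that lookup can be a few lines of Python. This is
> > > > > only a sketch; the function name and the bundled "deps" layout are
> > > > > made up for illustration, not the actual pyflink code:
> > > > >
> > > > > import os
> > > > >
> > > > > def _find_flink_home():
> > > > >     """Prefer an explicit FLINK_HOME; otherwise fall back to the
> > > > >     jars bundled inside the installed package."""
> > > > >     flink_home = os.environ.get("FLINK_HOME")
> > > > >     if flink_home and os.path.isdir(flink_home):
> > > > >         return flink_home
> > > > >     # Hypothetical layout: a "deps" directory shipped in the package.
> > > > >     bundled = os.path.join(os.path.dirname(os.path.abspath(__file__)), "deps")
> > > > >     if os.path.isdir(bundled):
> > > > >         return bundled
> > > > >     raise RuntimeError("No Flink distribution found; set FLINK_HOME "
> > > > >                        "or reinstall pyflink.")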
> > > > >
> > > > > Regards,
> > > > > Dian
> > > > >
> > > > >> On Jul 24, 2019, at 9:47 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> > > > >>
> > > > >> +1 for publishing pyflink to pypi.
> > > > >>
> > > > >> Regarding including the jars, I just want to make sure which
> > > > >> Flink binary distribution we would ship with pyflink, since we
> > > > >> have multiple Flink binary distributions (with/without Hadoop).
> > > > >> Personally, I prefer to use the Hadoop-included binary
> > > > >> distribution.
> > > > >>
> > > > >> And I just want to confirm whether it is possible for users to
> > > > >> use a different Flink binary distribution as long as they set the
> > > > >> FLINK_HOME environment variable.
> > > > >>
> > > > >> Besides that, I hope that there will be bi-directional links
> > > > >> between the Flink docs and the PyPI docs.
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Wed, Jul 24, 2019 at 12:07 AM Stephan Ewen <se...@apache.org> wrote:
> > > > >>
> > > > >>> Hi!
> > > > >>>
> > > > >>> Sorry for the late involvement. Here are some thoughts from my
> > > > >>> side:
> > > > >>>
> > > > >>> Definitely +1 to publishing to PyPI, even if it is a binary
> > > > >>> release. Community growth into other communities is great, and
> > > > >>> if this is the natural way to reach developers in the Python
> > > > >>> community, let's do it. This is not about our convenience, but
> > > > >>> about reaching users.
> > > > >>>
> > > > >>> I think the way to look at this is that this is a convenience
> > > > >>> distribution channel, courtesy of the Flink community. It is not
> > > > >>> an Apache release, and we make this clear in the README.
> > > > >>> Of course, this doesn't mean we don't try to uphold similar
> > > > >>> standards as for our official releases (like proper license
> > > > >>> information).
> > > > >>>
> > > > >>> Concerning credentials sharing, I would be fine with whatever
> > > > >>> option. The PMC doesn't own it (it is an initiative by some
> > > > >>> community members), but the PMC needs to ensure trademark
> > > > >>> compliance, so slight preference for option #1 (PMC would have
> > > > >>> means to correct problems).
> > > > >>>
> > > > >>> I believe there is no need to differentiate between Scala
> > > > >>> versions, because this is merely a convenience thing for pure
> > > > >>> Python users. Users that mix Python and Scala (and thus depend
> > > > >>> on specific Scala versions) can still download from Apache or
> > > > >>> build themselves.
> > > > >>>
> > > > >>> Best,
> > > > >>> Stephan
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Thu, Jul 4, 2019 at 9:51 AM jincheng sun
> > > > >>> <sunjincheng...@gmail.com> wrote:
> > > > >>>
> > > > >>>> Hi All,
> > > > >>>>
> > > > >>>> Thanks for the feedback @Chesnay Schepler <ches...@apache.org>
> > > > >>>> @Dian!
> > > > >>>>
> > > > >>>> I think using `apache-flink` for the project name also makes
> > > > >>>> sense to me, since we should always keep in mind that Flink is
> > > > >>>> owned by Apache. (Beam also uses this pattern, `apache-beam`,
> > > > >>>> for its Python API.)
> > > > >>>>
> > > > >>>> Regarding releasing the Python API with the Java JARs, I think
> > > > >>>> the guiding principle should be the convenience of the user.
> > > > >>>> So, thanks for the explanation @Dian!
> > > > >>>>
> > > > >>>> And you're right @Chesnay Schepler <ches...@apache.org>, we
> > > > >>>> can't make a hasty decision and we need more people's opinions!
> > > > >>>>
> > > > >>>> So, I would appreciate it if anyone could give us feedback and
> > > > >>>> suggestions!
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Jincheng
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Wed, Jul 3, 2019 at 8:46 PM Chesnay Schepler <ches...@apache.org> wrote:
> > > > >>>>
> > > > >>>>> So this would not be a source release then, but a full-blown
> > > > >>>>> binary release.
> > > > >>>>>
> > > > >>>>> Maybe it is just me, but I find it a bit suspect to ship an
> > > > >>>>> entire Java application via PyPI, just because there's a
> > > > >>>>> Python API for it.
> > > > >>>>>
> > > > >>>>> We definitely need input from more people here.
> > > > >>>>>
> > > > >>>>> On 03/07/2019 14:09, Dian Fu wrote:
> > > > >>>>>> Hi Chesnay,
> > > > >>>>>>
> > > > >>>>>> Thanks a lot for the suggestions.
> > > > >>>>>>
> > > > >>>>>> Regarding “distributing java/scala code to PyPI”:
> > > > >>>>>> The Python Table API is just a wrapper of the Java Table API,
> > > > >>>>>> and without the Java/Scala code, two steps are needed to set
> > > > >>>>>> up an environment to execute a Python Table API program:
> > > > >>>>>> 1) Install pyflink using "pip install apache-flink"
> > > > >>>>>> 2) Download the Flink distribution and point FLINK_HOME to it.
> > > > >>>>>> Besides, users have to make sure that the manually installed
> > > > >>>>>> Flink is compatible with the pip-installed pyflink.
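> > > > >>>>>>
> > > > >>>>>> (For context, "a Python Table API program" here means
> > > > >>>>>> something along these lines; this assumes the 1.9-era API
> > > > >>>>>> shape and is only an illustration:)
> > > > >>>>>>
> > > > >>>>>> from pyflink.dataset import ExecutionEnvironment
> > > > >>>>>> from pyflink.table import BatchTableEnvironment
> > > > >>>>>>
> > > > >>>>>> # Works only once both steps above are done: pyflink installed
> > > > >>>>>> # via pip and a compatible Flink distribution on FLINK_HOME.
> > > > >>>>>> env = ExecutionEnvironment.get_execution_environment()
> > > > >>>>>> t_env = BatchTableEnvironment.create(env)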
> > > > >>>>>> Bundling the Java/Scala code inside the Python package will
> > > > >>>>>> eliminate step 2) and make it simpler for users to install
> > > > >>>>>> pyflink. There was a short discussion
> > > > >>>>>> <https://issues.apache.org/jira/browse/SPARK-1267> on this in
> > > > >>>>>> the Spark community and they finally decided to package the
> > > > >>>>>> Java/Scala code in the Python package. (BTW, PySpark only
> > > > >>>>>> bundles the jars for Scala 2.11.)
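> > > > >>>>>>
> > > > >>>>>> The bundling itself would just be standard setuptools
> > > > >>>>>> packaging. A rough sketch of what it could look like (the
> > > > >>>>>> "pyflink/lib" layout and version below are made up for
> > > > >>>>>> illustration, not the real flink-python setup.py):
> > > > >>>>>>
> > > > >>>>>> # setup.py -- hypothetical sketch, not the actual flink-python one
> > > > >>>>>> from setuptools import setup, find_packages
> > > > >>>>>>
> > > > >>>>>> setup(
> > > > >>>>>>     name="apache-flink",
> > > > >>>>>>     version="1.9.0",
> > > > >>>>>>     packages=find_packages(),
> > > > >>>>>>     # Ship the jars of the default Scala build inside the package,
> > > > >>>>>>     # so "pip install apache-flink" needs no separate download.
> > > > >>>>>>     package_data={"pyflink": ["lib/*.jar", "opt/*.jar"]},
> > > > >>>>>>     install_requires=["py4j"],
> > > > >>>>>> )
> > > > >>>>>>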
> > > > >>>>>> Regards,
> > > > >>>>>> Dian
> > > > >>>>>>
> > > > >>>>>>> On Jul 3, 2019, at 7:13 PM, Chesnay Schepler <ches...@apache.org> wrote:
> > > > >>>>>>>
> > > > >>>>>>> The existing artifact in the pyflink project was neither
> > > > >>>>>>> released by the Flink project / anyone affiliated with it,
> > > > >>>>>>> nor approved by the Flink PMC.
> > > > >>>>>>> As such, if we were to use this account I believe we should
> > > > >>>>>>> delete it so as not to mislead users into thinking this is
> > > > >>>>>>> in any way an Apache-provided distribution. Since this goes
> > > > >>>>>>> against the user's wishes, I would be in favor of creating a
> > > > >>>>>>> separate account and giving back control over the pyflink
> > > > >>>>>>> account.
> > > > >>>>>>> My take on the raised points:
> > > > >>>>>>> 1.1) "apache-flink"
> > > > >>>>>>> 1.2) option 2
> > > > >>>>>>> 2) Given that we only distribute Python code, there should
> > > > >>>>>>> be no reason to differentiate between Scala versions. We
> > > > >>>>>>> should not be distributing any Java/Scala code and/or
> > > > >>>>>>> modules to PyPI. Currently, I'm a bit confused about this
> > > > >>>>>>> question and wonder what exactly we are trying to publish
> > > > >>>>>>> here.
> > > > >>>>>>> 3) This should be treated as any other source release;
> > > > >>>>>>> i.e., it needs a LICENSE and NOTICE file, signatures and a
> > > > >>>>>>> PMC vote. My suggestion would be to make this part of our
> > > > >>>>>>> normal release process. There will be _one_ source release
> > > > >>>>>>> on dist.apache.org encompassing everything, and a separate
> > > > >>>>>>> Python-focused source release that we push to PyPI. The
> > > > >>>>>>> LICENSE and NOTICE contained in the Python source release
> > > > >>>>>>> must also be present in the source release of Flink; so
> > > > >>>>>>> basically the Python source release is just the contents of
> > > > >>>>>>> the flink-python module minus the Maven pom.xml, with no
> > > > >>>>>>> other special sauce added during the release process.
> > > > >>>>>>> On 02/07/2019 05:42, jincheng sun wrote:
> > > > >>>>>>>> Hi all,
> > > > >>>>>>>>
> > > > >>>>>>>> With the effort of FLIP-38 [1], the Python Table API
> > > > >>>>>>>> (without UDF support for now) will be supported in the
> > > > >>>>>>>> coming release 1.9.
> > > > >>>>>>>> As described in "Build PyFlink" [2], if users want to use
> > > > >>>>>>>> the Python Table API, they can manually install it using
> > > > >>>>>>>> the command:
> > > > >>>>>>>> "cd flink-python && python3 setup.py sdist && pip install dist/*.tar.gz".
> > > > >>>>>>>> This is non-trivial for users, and it would be better if we
> > > > >>>>>>>> could follow the Python way and publish PyFlink to PyPI,
> > > > >>>>>>>> which is a repository of software for the Python
> > > > >>>>>>>> programming language. Then users can use the standard
> > > > >>>>>>>> Python package manager "pip" to install PyFlink:
> > > > >>>>>>>> "pip install pyflink". So, there are some topics that need
> > > > >>>>>>>> to be discussed, as follows:
> > > > >>>>>>>>
> > > > >>>>>>>> 1. How to publish PyFlink to PyPI
> > > > >>>>>>>>
> > > > >>>>>>>> 1.1 Project Name
> > > > >>>>>>>>      We need to decide which PyPI project name to use, for
> > > > >>>>>>>> example, apache-flink, pyflink, etc.
> > > > >>>>>>>>
> > > > >>>>>>>>      Regarding the name "pyflink", it has already been
> > > > >>>>>>>> registered by @ueqt and there is already a package '1.0'
> > > > >>>>>>>> released under this project, which is based on
> > > > >>>>>>>> flink-libraries/flink-python.
> > > > >>>>>>>>
> > > > >>>>>>>>      @ueqt has kindly agreed to give this project back to
> > > > >>>>>>>> the community, and he has requested that the released
> > > > >>>>>>>> package '1.0' not be removed, as it is already being used
> > > > >>>>>>>> in their company.
> > > > >>>>>>>>
> > > > >>>>>>>>      So we need to decide whether to use the name
> > > > >>>>>>>> 'pyflink'. If yes, we need to figure out how to handle the
> > > > >>>>>>>> package '1.0' under this project.
> > > > >>>>>>>>      From my point of view, "pyflink" is the better name
> > > > >>>>>>>> for our project, and we can keep the 1.0 release; maybe
> > > > >>>>>>>> more people will want to use it.
> > > > >>>>>>>>
> > > > >>>>>>>> 1.2 PyPI account for release
> > > > >>>>>>>>      We also need to decide which account to use to publish
> > > > >>>>>>>> packages to PyPI.
> > > > >>>>>>>>      There are two permission levels in PyPI: owner and
> > > > >>>>>>>> maintainer:
> > > > >>>>>>>>
> > > > >>>>>>>>      1) The owner can upload releases, and delete files,
> > > > >>>>>>>> releases or the entire project.
> > > > >>>>>>>>      2) The maintainer can also upload releases. However,
> > > > >>>>>>>> they cannot delete files, releases, or the project.
> > > > >>>>>>>>
> > > > >>>>>>>>      So there are two options in my mind:
> > > > >>>>>>>>
> > > > >>>>>>>>      1) Create an account such as 'pyflink' as the owner,
> > > > >>>>>>>> share it with all the release managers, and then release
> > > > >>>>>>>> managers can publish the package to PyPI using this
> > > > >>>>>>>> account.
> > > > >>>>>>>>      2) Create an account such as 'pyflink' as owner (only
> > > > >>>>>>>> the PMC can manage it) and add the release managers'
> > > > >>>>>>>> accounts as maintainers of the project. Release managers
> > > > >>>>>>>> publish the package to PyPI using their own accounts.
> > > > >>>>>>>>
> > > > >>>>>>>>      As far as I know, PySpark takes option 1) and Apache
> > > > >>>>>>>> Beam takes option 2).
> > > > >>>>>>>>      From my point of view, I prefer option 2) as it is
> > > > >>>>>>>> safer: it eliminates the risk of accidentally deleting old
> > > > >>>>>>>> releases and at the same time keeps a trace of who is
> > > > >>>>>>>> operating.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. How to handle Scala 2.11 and Scala 2.12
> > > > >>>>>>>>
> > > > >>>>>>>> The PyFlink package bundles the jars in the package. As we
> > > > >>>>>>>> know, there are two versions of the jars for each module:
> > > > >>>>>>>> one for Scala 2.11 and the other for Scala 2.12. So there
> > > > >>>>>>>> will theoretically be two PyFlink packages. We need to
> > > > >>>>>>>> decide which one to publish to PyPI, or both. If both
> > > > >>>>>>>> packages are published to PyPI, we may need two separate
> > > > >>>>>>>> projects, such as pyflink_211 and pyflink_212, and maybe
> > > > >>>>>>>> more in the future, such as pyflink_213.
> > > > >>>>>>>>      (BTW, I think we should bring up a discussion about
> > > > >>>>>>>> dropping Scala 2.11 in the Flink 1.10 release, since Scala
> > > > >>>>>>>> 2.13 became available in early June.)
> > > > >>>>>>>>
> > > > >>>>>>>>      From my point of view, for now we can release only the
> > > > >>>>>>>> Scala 2.11 version, since Scala 2.11 is our default version
> > > > >>>>>>>> in Flink.
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Legal problems of publishing to PyPI
> > > > >>>>>>>>
> > > > >>>>>>>> As @Chesnay Schepler <ches...@apache.org> pointed out in
> > > > >>>>>>>> FLINK-13011 [3], publishing PyFlink to PyPI means that we
> > > > >>>>>>>> will publish binaries to a distribution channel not owned
> > > > >>>>>>>> by Apache. We need to figure out whether there are legal
> > > > >>>>>>>> problems. From my point of view, there are no problems, as
> > > > >>>>>>>> a few Apache projects such as Spark, Beam, etc. have
> > > > >>>>>>>> already done it. Frankly speaking, I am not familiar with
> > > > >>>>>>>> this problem, so I welcome any feedback from somebody who
> > > > >>>>>>>> is more familiar with it.
> > > > >>>>>>>>
> > > > >>>>>>>> Great thanks to @ueqt for being willing to donate the PyPI
> > > > >>>>>>>> project name `pyflink` to the Apache Flink community!!!
> > > > >>>>>>>> Great thanks to @Dian for the offline effort!!!
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Jincheng
> > > > >>>>>>>>
> > > > >>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API
> > > > >>>>>>>> [2] https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#build-pyflink
> > > > >>>>>>>> [3] https://issues.apache.org/jira/browse/FLINK-13011
> > > > >>>>>>>>
> > > > >>>>>
> > > > >>
> > > > >> --
> > > > >> Best Regards
> > > > >>
> > > > >> Jeff Zhang
> > > > >
> > > >
> > > >
> > >
> >
>
