How do other projects solve this problem?

Cheers,
Till

On Wed, Mar 17, 2021 at 3:45 AM Xingbo Huang <hxbks...@gmail.com> wrote:

> Hi Chesnay,
>
> Yes, in most cases we can indeed download the required jars in `setup.py`,
> which is also the solution I originally considered for reducing the size of
> the wheel packages. However, I'm afraid it will not work when access to
> external networks is not possible, which is very common in production
> clusters.
>
> Best,
> Xingbo
>
> Chesnay Schepler <ches...@apache.org> wrote on Tue, Mar 16, 2021 at 8:32 PM:
>
> > This proposed apache-flink-libraries package would just contain the
> > binary, right? And effectively be unusable to the Python audience on
> > its own.
> >
> > Essentially we are just abusing PyPI for shipping a Java binary. Is
> > there no way for us to download the jars when the Python package is
> > being installed (e.g., in setup.py)?
> >
> > On 3/16/2021 1:23 PM, Dian Fu wrote:
> > > Yes, the size of the .whl file in PyFlink will also be about 3MB if we
> > > split the package. Currently the package is big because we bundle the
> > > jar files in it.
> > >
> > >> On Mar 16, 2021, at 8:13 PM, Chesnay Schepler <ches...@apache.org> wrote:
> > >>
> > >> key difference being that the beam .whl files are 3mb large, aka 60x
> > >> smaller.
> > >>
> > >> On 3/16/2021 1:06 PM, Dian Fu wrote:
> > >>> Hi Chesnay,
> > >>>
> > >>> We will publish binary packages separately for:
> > >>> 1) Python 3.5 / 3.6 / 3.7 / 3.8 (since 1.12)
> > >>> 2) Linux / Mac
> > >>>
> > >>> Besides, there is also a source package, which is used when none of the
> > >>> above binary packages is usable, e.g. for Windows users.
> > >>>
> > >>> PS: Publishing multiple binary packages is very common in the Python
> > >>> world, e.g. Beam published 22 packages in 2.28 [1], Pandas published 16
> > >>> packages in 1.2.3 [2]. We could also publish more packages if we
> > >>> split the package, as the cost of adding another package will be very
> > >>> small.
> > >>>
> > >>> Regards,
> > >>> Dian
> > >>>
> > >>> [1] https://pypi.org/project/apache-beam/#files
> > >>> [2] https://pypi.org/project/pandas/#files
> > >>>
> > >>>
> > >>> Hi Xintong,
> > >>>
> > >>> Yes, you are right that there are 9 packages in 1.12, as we added
> > >>> Python 3.8 support in 1.12.
> > >>>
> > >>> Regards,
> > >>> Dian
> > >>>
> > >>>> On Mar 16, 2021, at 7:45 PM, Xintong Song <tonysong...@gmail.com> wrote:
> > >>>>
> > >>>> And it's not only uploaded to PyPI, but the ASF mirrors as well.
> > >>>>
> > >>>> https://dist.apache.org/repos/dist/release/flink/flink-1.12.2/python/
> > >>>>
> > >>>> Thank you~
> > >>>>
> > >>>> Xintong Song
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Tue, Mar 16, 2021 at 7:41 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>>>
> > >>>>> Actually, I think it's 9 packages, not 7.
> > >>>>>
> > >>>>> Check here for the 1.12.2 packages.
> > >>>>> https://pypi.org/project/apache-flink/#files
> > >>>>>
> > >>>>> Thank you~
> > >>>>>
> > >>>>> Xintong Song
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Tue, Mar 16, 2021 at 7:08 PM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>
> > >>>>>> Am I reading this correctly that we publish 7 different artifacts just
> > >>>>>> for python?
> > >>>>>> What does the release matrix look like?
> > >>>>>>
> > >>>>>> On 3/16/2021 3:45 AM, Dian Fu wrote:
> > >>>>>>> Hi Xingbo,
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Thanks a lot for bringing up this discussion. Actually, the size limit
> > >>>>>>> already became an issue when releasing 1.11.3 and 1.12.1. It blocked us
> > >>>>>>> from publishing the PyFlink packages to PyPI during the release, as
> > >>>>>>> there was not enough space left (PS: the packages were published after
> > >>>>>>> the size limit was increased).
> > >>>>>>> Considering that the total package size is about 1.5GB (220MB * 7) for
> > >>>>>>> each release, it makes sense to split the PyFlink package. It could
> > >>>>>>> reduce the total package size to about 250MB (3MB * 7 + 220MB) for each
> > >>>>>>> release. We won't need to increase the size limit again in the next few
> > >>>>>>> years, as we currently still have about 7.5GB of space left.
> > >>>>>>> So +1 from my side.
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>> Dian
> > >>>>>>>
> > >>>>>>>> On Mar 12, 2021, at 2:30 PM, Xingbo Huang <hxbks...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi everyone,
> > >>>>>>>>
> > >>>>>>>> Since release-1.11, PyFlink has had Cython support, and we release
> > >>>>>>>> 7 packages (for different platforms and Python versions) to PyPI for
> > >>>>>>>> each release. Each package is more than 200MB, as we need to bundle
> > >>>>>>>> the jar files into it. The entire project space on PyPI grows very
> > >>>>>>>> fast, and we frequently need to apply to PyPI for more project space.
> > >>>>>>>> Please refer to [https://github.com/pypa/pypi-support/issues/831]
> > >>>>>>>> for more details.
> > >>>>>>>>
> > >>>>>>>> The root cause of this problem is that we bundle the jar files in each
> > >>>>>>>> package. This is actually unnecessary if we extract the jar files
> > >>>>>>>> into a separate package dedicated to holding them.
> > >>>>>>>>
> > >>>>>>>> I'd like to propose splitting the pyflink package into two packages:
> > >>>>>>>> the original apache-flink and apache-flink-libraries (any suggestions
> > >>>>>>>> for the name?). The package apache-flink-libraries contains only jar
> > >>>>>>>> files, and there is only one apache-flink-libraries package for each
> > >>>>>>>> release.
> > >>>>>>>> The package apache-flink depends on apache-flink-libraries. Users
> > >>>>>>>> still only need to install apache-flink, and nothing changes for them
> > >>>>>>>> compared to before. We still need to release multiple wheel packages
> > >>>>>>>> of apache-flink. However, they will be very small, as they no longer
> > >>>>>>>> contain the jar files.
> > >>>>>>>>
> > >>>>>>>> Looking forward to your feedback.
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>>
> > >>>>>>>> Xingbo
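
The size estimates quoted in the thread can be checked with a few lines of Python. This is only a rough sketch built on the approximate figures from the messages above (7 wheels per release, about 220MB per wheel with bundled jars, about 3MB per wheel without them, about 7.5GB of project space left). The constant and function names are made up for illustration, and treating 1GB as 1000MB is an assumption.

```python
# Approximate figures quoted in the thread; none of them are exact measurements.
WHEELS_PER_RELEASE = 7            # the thread later notes 9 wheels for 1.12
BUNDLED_WHEEL_MB = 220            # one wheel with the jar files bundled
SLIM_WHEEL_MB = 3                 # one wheel without the jar files
LIBRARIES_PACKAGE_MB = 220        # the single jar-only package per release
REMAINING_QUOTA_MB = 7.5 * 1000   # "about 7.5 GB space left"; 1 GB = 1000 MB is assumed


def bundled_release_mb(wheels=WHEELS_PER_RELEASE):
    """Space used per release when every wheel bundles the jars."""
    return wheels * BUNDLED_WHEEL_MB


def split_release_mb(wheels=WHEELS_PER_RELEASE):
    """Space used per release when the jars live in one separate package."""
    return wheels * SLIM_WHEEL_MB + LIBRARIES_PACKAGE_MB


def releases_that_fit(release_mb, quota_mb=REMAINING_QUOTA_MB):
    """Number of whole releases that fit into the remaining project space."""
    return int(quota_mb // release_mb)


if __name__ == "__main__":
    for label, size in (("bundled", bundled_release_mb()),
                        ("split", split_release_mb())):
        print(f"{label}: {size} MB per release, "
              f"{releases_that_fit(size)} more releases fit")
```

With the quoted figures this gives 1540MB per release for the bundled layout and 241MB for the split layout, which matches the "about 1.5GB" and "about 250MB" estimates. Passing `wheels=9` shows the same comparison for the 1.12 release matrix.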