[ https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-32758:
-----------------------------------
    Labels: pull-request-available  (was: )

> PyFlink bounds are overly restrictive and outdated
> --------------------------------------------------
>
>                 Key: FLINK-32758
>                 URL: https://issues.apache.org/jira/browse/FLINK-32758
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Python
>    Affects Versions: 1.17.1
>            Reporter: Deepyaman Datta
>            Priority: Major
>              Labels: pull-request-available
>
> Hi! I am part of a team building the Flink backend for Ibis
> ([https://github.com/ibis-project/ibis]). We would like to leverage PyFlink
> under the hood for execution; however, PyFlink's requirements are
> incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's
> outdated and restrictive requirements prevent it from being used alongside
> the most recent releases of Python data libraries.
> Some of the major libraries we (and likely others in the Python community
> interested in using PyFlink alongside other libraries) need compatibility
> with:
> * PyArrow (at least >=10.0.0, but there's no reason not to also be
> compatible with the latest release)
> * pandas (should be compatible with the 2.x series, but also probably with
> 1.4.x, released January 2022, and 1.5.x)
> * numpy (1.22 was released in December 2022)
> * Newer releases of Apache Beam
> * Newer releases of cython
> Furthermore, uncapped dependencies could be more generally preferable, as
> they avoid the need for frequent PyFlink releases as newer versions of
> libraries are released.
> A common (and great) argument for not upper-bounding
> dependencies, especially for libraries:
> [https://iscinumpy.dev/post/bound-version-constraints/]
> I am currently testing removing upper bounds in
> [https://github.com/apache/flink/pull/23141]; so far, builds pass without
> issue in
> [b65c072|https://github.com/apache/flink/pull/23141/commits/b65c0723ed66e01e83d718f770aa916f41f34581],
> and I'm currently waiting on
> [c8eb15c|https://github.com/apache/flink/pull/23141/commits/c8eb15cbc371dc259fb4fda5395f0f55e08ea9c6]
> to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed
> dependencies results in:
> {{#}}
> {{# This file is autogenerated by pip-compile with Python 3.8}}
> {{# by the following command:}}
> {{#}}
> {{#    pip-compile --config=pyproject.toml --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt}}
> {{#}}
> {{apache-beam==2.49.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{avro-python3==1.10.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{certifi==2023.7.22}}
> {{    # via requests}}
> {{charset-normalizer==3.2.0}}
> {{    # via requests}}
> {{cloudpickle==2.2.1}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{crcmod==1.7}}
> {{    # via apache-beam}}
> {{cython==3.0.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{dill==0.3.1.1}}
> {{    # via apache-beam}}
> {{dnspython==2.4.1}}
> {{    # via pymongo}}
> {{docopt==0.6.2}}
> {{    # via hdfs}}
> {{exceptiongroup==1.1.2}}
> {{    # via pytest}}
> {{fastavro==1.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{fasteners==0.18}}
> {{    # via apache-beam}}
> {{find-libpython==0.3.0}}
> {{    # via pemja}}
> {{grpcio==1.56.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{grpcio-tools==1.56.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{hdfs==2.7.0}}
> {{    # via apache-beam}}
> {{httplib2==0.22.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{idna==3.4}}
> {{    # via requests}}
> {{iniconfig==2.0.0}}
> {{    # via pytest}}
> {{numpy==1.24.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{    #   pyarrow}}
> {{objsize==0.6.1}}
> {{    # via apache-beam}}
> {{orjson==3.9.2}}
> {{    # via apache-beam}}
> {{packaging==23.1}}
> {{    # via pytest}}
> {{pandas==2.0.3}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pemja==0.3.0 ; platform_system != "Windows"}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pluggy==1.2.0}}
> {{    # via pytest}}
> {{proto-plus==1.22.3}}
> {{    # via apache-beam}}
> {{protobuf==4.23.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{    #   proto-plus}}
> {{py4j==0.10.9.7}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pyarrow==11.0.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{pydot==1.4.2}}
> {{    # via apache-beam}}
> {{pymongo==4.4.1}}
> {{    # via apache-beam}}
> {{pyparsing==3.1.1}}
> {{    # via}}
> {{    #   httplib2}}
> {{    #   pydot}}
> {{pytest==7.4.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{python-dateutil==2.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{pytz==2023.3}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{regex==2023.6.3}}
> {{    # via apache-beam}}
> {{requests==2.31.0}}
> {{    # via}}
> {{    #   apache-beam}}
> {{    #   hdfs}}
> {{six==1.16.0}}
> {{    # via}}
> {{    #   hdfs}}
> {{    #   python-dateutil}}
> {{tomli==2.0.1}}
> {{    # via pytest}}
> {{typing-extensions==4.7.1}}
> {{    # via apache-beam}}
> {{tzdata==2023.3}}
> {{    # via pandas}}
> {{urllib3==2.0.4}}
> {{    # via requests}}
> {{wheel==0.41.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{zstandard==0.21.0}}
> {{    # via apache-beam}}
> {{# The following packages are considered to be unsafe in a requirements file:}}
> {{# pip}}
> {{# setuptools}}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
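
The pins in the solved requirements quoted above can be checked against the minimum versions the issue asks for. What follows is a minimal sketch, not part of PyFlink or the linked pull request. The helper names (`parse_pins`, `satisfies`, `unmet_bounds`) are hypothetical, the pandas and numpy floors are a loose reading of the bullet list rather than PyFlink's actual bounds, and the version comparison handles only plain dotted integers, so it is not a substitute for a full PEP 440 parser.

```python
import re

# Floors read from the issue text. Only PyArrow's ">=10.0.0" is stated
# explicitly; the pandas and numpy values are assumptions for illustration.
LOWER_BOUNDS = {"pyarrow": "10.0.0", "pandas": "1.4", "numpy": "1.22"}

# A few pins copied from the compiled requirements quoted in the issue.
COMPILED = """\
numpy==1.24.4
    # via pandas
pandas==2.0.3
pemja==0.3.0 ; platform_system != "Windows"
pyarrow==11.0.0
"""


def parse_version(text):
    """Turn '1.24.4' into (1, 24, 4); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def parse_pins(requirements):
    """Collect name -> version from 'name==version' lines, skipping comments."""
    pins = {}
    for line in requirements.splitlines():
        line = line.split(";")[0].strip()  # drop environment markers
        match = re.fullmatch(r"([A-Za-z0-9_.-]+)==([0-9]+(?:\.[0-9]+)*)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def satisfies(version, floor):
    """Return True when version >= floor, padding the shorter one with zeros."""
    a, b = parse_version(version), parse_version(floor)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet_bounds(pins, bounds):
    """List the packages that are missing or pinned below their floor."""
    return sorted(
        name
        for name, floor in bounds.items()
        if name not in pins or not satisfies(pins[name], floor)
    )


if __name__ == "__main__":
    print(unmet_bounds(parse_pins(COMPILED), LOWER_BOUNDS))
```

An empty result means every listed floor is met by the resolved set. Because the check only looks at lower bounds, it keeps passing as newer releases appear, which is the property the issue argues for when it suggests leaving dependencies uncapped.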