Ah. And also to add - I created this issue in the Datastax Jira asking them to add libev support to the compiled .whl packages they release:
[6] cassandra-driver for Python 3.12 Linux is compiled without libev support:
https://datastax-oss.atlassian.net/jira/software/c/projects/PYTHON/issues/PYTHON-1378

On Wed, Feb 21, 2024 at 10:26 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello dear Cassandra community,
>
> I am a fellow PMC member of Apache Airflow and recently we started to look
> at our Cassandra provider in the context of the Python 3.12 migration,
> and the integration raised my interest.
>
> TL;DR: I am quite confused about which client we should use to be
> future-proof, and I would appreciate the advice of the community on it.
> I would also like to understand why there is no community-managed client,
> as it seems that with the current approach any Python project (including
> ASF ones) is pretty much forced to use a 3rd-party managed way to use
> Cassandra, which I find rather strange.
>
> Context:
>
> So far in Apache Airflow we have been using
> https://github.com/datastax/python-driver/ to connect to Cassandra. While
> working on Python 3.12 compatibility and looking at it more closely, I
> discovered something strange.
>
> This driver is published on PyPI as "Cassandra driver" [1], which raises a
> bit of a question about trademark - so far I was convinced this driver is
> managed by the Cassandra community, but on closer inspection it turned
> out that it is - in fact - a Datastax driver. I find it pretty confusing to
> be honest, and with all the debate about ASF trademarks, this should IMHO
> raise a few eyebrows and a PMC reaction - if you ask me. As a PMC member of
> Apache Airflow I am responsible for raising trademark issues if I see them,
> and this one seems to be at odds with the ASF rules. And if I am confused
> by the PyPI naming, then I am pretty sure many of the users are as well.
>
> Note that I am not attacking anyone with that, I just noticed that this
> should likely be handled by the PMC somehow (or that would be my advice at
> least, as a fellow ASF member and PMC member of a friendly ASF project).
>
> But that's a bit tangential to the problem. Coming back to the main
> problem.
>
> I did quite a bit of research and it turned out that the driver still uses
> the stdlib asyncore module by default (which is removed in Python 3.12),
> and even if theoretically we could use the libev reactor, it does not work
> out of the box with the released .whl even if the proper libraries are
> installed - you really have to take an sdist and build the package with
> gcc configured and libev4/libev-devel installed.
>
> Another option, as far as I understand, is to use the asyncio reactor [2] -
> but as I understand from the issue [3], this support is still experimental
> and it's not ready for prime time.
>
> This is all captured in the PR [4] where I work on Python 3.12
> compatibility, and Cassandra is - literally - the last remaining provider
> for which we have to make a decision on what to do.
>
> Building from the sdist is rather useless for us - because we would not
> only complicate our testing / tooling setup (we have ~90 providers and a
> pretty complicated system to manage dependencies already), but it would
> also require our users who want to use Python 3.12 to do the same, which
> is quite a blocker. And handling user issues in this case would become
> rather tiring.
>
> In the same PR, Israel Fruchter - who helped us with the Cassandra issue -
> suggested that another option is to use the Scylladb driver, which is
> 100% compatible and is published and released by Scylla [5]. I tested it
> and the .whl packages work nicely with libev installed - as expected (and
> initially Israel thought the datastax driver would work similarly).
> From Israel's explanation, Datastax and Scylla are cooperating on the
> driver (in fact the Scylla one is a fork of the Datastax one), but there is
> no insight into who builds the packages and how (which also raised my
> eyebrow, because it seems that - unlike in the ASF - the process of
> building and releasing the package is not transparent and verifiable).
>
> Now - we have two choices:
>
> 1) We can use "cassandra-driver" (which really is a "datastax driver") and
> disable the Cassandra provider for the users of Airflow on Python 3.12
> until Datastax fixes the compatibility with Python 3.12
>
> 2) We can switch to the Scylla driver and release the next provider with
> Python 3.12 support
>
> So, having provided all the context, I have two questions:
>
> Q1: What would be the solution recommended by the community here? I
> understand the community has no impact on Datastax decisions and effort on
> releasing those drivers, so you can at most ask Datastax to fix the
> compatibility issue. As a user I have no insight into what the relations
> are between the Cassandra community, Datastax and Scylla, so I am reaching
> out here as the place to advise me on which option is best. (This I am
> asking as a confused user.)
>
> Q2: I find it pretty worrying that such an important interface (the data
> world is driven by Python) is not under the community "umbrella" - it seems
> that a very important thing for the users of Cassandra is managed and
> controlled by 3rd-parties, and the users (as in this case) are pretty much
> left at the "mercy" (for lack of a better word) of those 3rd-parties -
> they are the parties that decide whether Python 3.12 users are able to use
> Cassandra. If I had such a situation in Airflow, I would be deeply worried
> as a PMC member. What adds to that is the potential trademark issue that
> might confuse the users.
> If I saw such a situation, I'd certainly reach out to
> tradema...@apache.org to check if that usage of the name is acceptable
> (and I am pretty sure the answer would be "no" - looking at some recent
> discussions). I wonder if there were earlier discussions about it and
> whether the PMC is aware of the potential confusion it can create.
>
> Again - especially for point Q2 - I know this might be treated as some way
> of complaining, but it's more the concern of a fellow user and ASF member
> that is at play here - I just find it quite confusing and likely bad for
> the community. Maybe I do not understand the context, and there are other
> options I am not aware of, but I simply approached it as a user, did quite
> deep research and arrived at those conclusions. So if anything, I think it
> would be good if other users who come by the same route are not as
> confused as I am.
>
> [1] "Cassandra" driver - https://pypi.org/project/cassandra-driver/
>
> [2] Cassandra Asyncio reactor:
> https://docs.datastax.com/en/developer/python-driver/3.25/api/cassandra/io/asyncioreactor/
>
> [3] Consider making asyncio reactor the default:
> https://datastax-oss.atlassian.net/browse/PYTHON-1375
>
> [4] Python 3.12 support:
> https://github.com/apache/airflow/pull/36755#issuecomment-1954688181
>
> [5] Scylladb driver on PyPI: https://pypi.org/project/scylla-driver/
>
> J
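
PS. For anyone who wants to check which reactors their installed package actually supports, here is a minimal sketch of how I would probe for them. The module and class names come from the driver's cassandra.io package; the helper name `available_reactors`, the `CANDIDATES` list and the ordering are my own assumptions for illustration, not anything the driver recommends. It only tries the imports, it does not connect to a cluster.

```python
import importlib

# Reactor modules shipped in the driver's cassandra.io package.
# The order below is an assumption made for this sketch only.
CANDIDATES = [
    ("cassandra.io.libevreactor", "LibevConnection"),
    ("cassandra.io.asyncioreactor", "AsyncioConnection"),
    ("cassandra.io.asyncorereactor", "AsyncoreConnection"),
]


def available_reactors(candidates=CANDIDATES):
    """Return the (module_name, class_name) pairs that import cleanly."""
    found = []
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
            getattr(module, class_name)
        except Exception:
            # Broad on purpose: a missing C extension (libev) or a missing
            # stdlib module (asyncore on Python 3.12) may surface as
            # different exception types depending on the driver version.
            continue
        found.append((module_name, class_name))
    return found


if __name__ == "__main__":
    reactors = available_reactors()
    if reactors:
        for module_name, class_name in reactors:
            print(f"usable: {module_name}.{class_name}")
    else:
        print("no usable reactor found (or the driver is not installed)")
```

A reactor class that shows up as usable can then be passed as `connection_class` when creating a `Cluster`. Keep in mind that, per [3], the asyncio reactor is still considered experimental, so an import that succeeds says nothing about production readiness.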