Hello dear Cassandra community,

I am a PMC member of Apache Airflow. We recently started to look at our Cassandra provider in the context of the Python 3.12 migration, and the integration raised my interest.
TL;DR: I am quite confused about which client we should use to be future-proof, and I would appreciate the community's advice on it. I would also like to understand why there is no community-managed client: with the current approach, any Python project (including ASF ones) is pretty much forced to use a 3rd-party-managed way to use Cassandra, which I find rather strange.

Context: So far in Apache Airflow we have been using https://github.com/datastax/python-driver/ to connect to Cassandra. While working on Python 3.12 compatibility, I discovered something strange.

This driver is published on PyPI as "Cassandra driver" [1], which raises a bit of a question about trademarks - I was so far convinced this driver was managed by the Cassandra community, but on closer inspection it turned out that it is - in fact - a Datastax driver. I find it pretty confusing to be honest, and with all the debate about ASF trademarks, this should IMHO raise a few eyebrows and a PMC reaction - if you ask me. As a PMC member of Apache Airflow I am responsible for raising trademark issues if I see them, and this one seems to be at odds with the ASF rules. And if I am confused by the PyPI naming, then I am pretty sure many of the users are as well. Note that I am not attacking anyone with that, I just noticed that this should likely be handled by the PMC somehow (or that would be my advice at least, as a fellow ASF member and PMC member of a friendly ASF project).

But that's a bit tangential to the problem. Coming back to the main problem:

I did quite some research and it turned out that the driver still uses the stdlib asyncore module by default (which is removed in Python 3.12), and even if theoretically we could use the libev reactor, it does not work out of the box with the released .whl, even if the proper libraries are installed - you really have to take the sdist and build the package with gcc configured and libev4/libev-devel installed.
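For reference, here is a minimal sketch of how one could check which of the driver's event loops is actually importable in a given environment. The module paths and connection class names are the ones shipped in the driver (including the asyncio reactor from [2]); the helper name `find_usable_reactor` and the preference order are assumptions made for illustration only, not something the driver or Airflow provides.

```python
import importlib
import sys

# Reactor modules and connection classes as shipped in the driver.
# The preference order is an assumption for illustration.
CANDIDATE_REACTORS = [
    ("cassandra.io.libevreactor", "LibevConnection"),
    ("cassandra.io.asyncioreactor", "AsyncioConnection"),
    ("cassandra.io.asyncorereactor", "AsyncoreConnection"),
]


def find_usable_reactor(candidates=CANDIDATE_REACTORS):
    """Return (connection_class, failures) for the first importable reactor.

    connection_class is None when no candidate could be imported.
    failures maps each module that failed to import to a short error message.
    """
    failures = {}
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, class_name), failures
        except Exception as exc:  # driver versions raise different error types
            failures[module_name] = f"{type(exc).__name__}: {exc}"
    return None, failures


if __name__ == "__main__":
    connection_class, failures = find_usable_reactor()
    print("Python", sys.version.split()[0])
    for module_name, reason in failures.items():
        print(f"  not usable: {module_name} ({reason})")
    if connection_class is None:
        print("No usable event loop found for the driver")
    else:
        print("Usable connection class:", connection_class.__name__)
        # The class could then be passed explicitly, for example:
        # from cassandra.cluster import Cluster
        # cluster = Cluster(["127.0.0.1"], connection_class=connection_class)
```

Running it inside the target environment prints the reason each reactor could not be imported, which is a quick way to tell a missing libev build apart from the missing asyncore module.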
Another option is to use the asyncio reactor [2], but as far as I understand from the issue [3], this support is still experimental and it's not ready for prime time. This is all captured in the PR [4] where I work on Python 3.12 compatibility, and Cassandra is - literally - the last remaining provider for which we have to make a decision on what to do.

That makes it rather useless for us - because not only would we complicate our testing / tooling setup (we have ~90 providers and a pretty complicated system to manage dependencies already), but it would also require our users who want to use Python 3.12 to do the same, which is quite a blocker. And handling user issues in this case would become rather tiring.

In the same PR, Israel Fruchter - who helped us with the Cassandra issue - suggested that another option is to use the Scylladb driver, which is 100% compatible and is published and released by Scylla [5]. I tested it and the .whl packages work nicely with libev installed - as expected (and initially Israel thought the datastax driver would work similarly). From Israel's explanation, Datastax and Scylla are cooperating on the driver (in fact the Scylla one is a fork of the Datastax one), but there is no insight into who builds the packages and how (which also raised my eyebrow, because it seems that - unlike in the ASF - the process of building and releasing the package is not transparent and verifiable).

Now - we have two choices:

1) We can use "cassandra-driver" (which really is a "datastax driver") and disable the Cassandra provider for users of Airflow on Python 3.12 until Datastax fixes the compatibility with Python 3.12
2) We can switch to the Scylla driver and release the next provider with Python 3.12 support

So ... Having provided all the context, I have two questions:

Q1: What would be the solution recommended by the community here?
I understand the community has no influence on Datastax's decisions and effort in releasing those drivers, so you can at most ask Datastax to fix the compatibility issue. As a user I have no insight into what the relations are between the Cassandra community, Datastax and Scylla, so I am reaching out here as the place to advise me on which option is best. (This I am asking as a confused user.)

Q2: I find it pretty worrying that such an important interface (the data world is driven by Python) is not under the community "umbrella" - it seems that a very important thing for the users of Cassandra is managed and controlled by 3rd parties, and the users (as in this case) are pretty much left at the "mercy" (for lack of a better word) of those 3rd parties - they are the ones that decide whether Python 3.12 users are able to use Cassandra. If I had such a situation in Airflow, I would be deeply worried in the PMC. What adds to that is the potential trademark issue that might confuse the users. If I saw such a situation, I'd certainly reach out to tradema...@apache.org to check if that usage of the name is acceptable (and I am pretty sure the answer would be "no" - looking at some recent discussions). I wonder if there were earlier discussions about it and whether the PMC is aware of the potential confusion it can create.

Again - especially for point Q2 - I know this might be treated as some way of complaining, but it's more the concern of a fellow user and ASF member that is at play here - I just find it quite confusing and likely bad for the community. Maybe I do not understand the context, and there are other options I am not aware of, but I simply approached it as a user, did quite deep research and arrived at those conclusions, so if anything, I think it would be good if other users who come the same route are not as confused as I am.
[1] "Cassandra" driver: https://pypi.org/project/cassandra-driver/
[2] Cassandra asyncio reactor: https://docs.datastax.com/en/developer/python-driver/3.25/api/cassandra/io/asyncioreactor/
[3] Consider making asyncio reactor the default: https://datastax-oss.atlassian.net/browse/PYTHON-1375
[4] Python 3.12 support: https://github.com/apache/airflow/pull/36755#issuecomment-1954688181
[5] Scylladb driver on PyPI: https://pypi.org/project/scylla-driver/

J