Ah. And also to add - I created this issue in the Datastax tracker asking
them to add libev support to the compiled .whl package they release:

[6] cassandra-driver for Python 3.12 Linux is compiled without libev
support:
https://datastax-oss.atlassian.net/jira/software/c/projects/PYTHON/issues/PYTHON-1378
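
A minimal sketch for checking locally which reactor modules an installed
driver can actually load. The module paths below are the ones I understand
the Datastax python-driver to ship (treat them as assumptions if your version
differs), and first_importable is a hypothetical helper name, not part of
the driver:

```python
import importlib

# Assumed module paths of the driver's reactors, in order of preference.
CANDIDATE_REACTORS = [
    "cassandra.io.libevreactor",     # needs the compiled libev C extension
    "cassandra.io.asyncioreactor",   # described as experimental in [3]
    "cassandra.io.asyncorereactor",  # relies on asyncore, removed in Python 3.12
]


def first_importable(module_names):
    """Return the first module name that imports cleanly, or None."""
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # A missing package, a missing C extension or a removed stdlib
            # module all end up here; we only care that the import failed.
            continue
        return name
    return None


if __name__ == "__main__":
    print(first_importable(CANDIDATE_REACTORS))
```

If this prints None on Python 3.12 with the released .whl, that matches the
situation described in the issue above.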

On Wed, Feb 21, 2024 at 10:26 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello dear Cassandra community,
>
> I am a fellow PMC member of Apache Airflow and recently we started to look
> at the Cassandra provider of ours in the context of Python 3.12 migration
> and the integration raised my interest.
>
> TL;DR: I am quite confused about which client we should use to be
> future-proof, and I would appreciate the advice of the community on it. I
> would also like to understand why there is no community-managed client, as
> it seems that with the current approach any Python project (including ASF
> ones) is pretty much forced to use a 3rd-party managed way to use
> Cassandra, which I find rather strange.
>
> Context:
>
> So far in Apache Airflow we have been using
> https://github.com/datastax/python-driver/ to connect to Cassandra. While
> working on Python 3.12 compatibility, I discovered something strange.
>
> This driver is published on PyPI as "Cassandra driver" [1], which raises a
> bit of a question about trademark - I was so far convinced this driver is
> managed by the Cassandra community, but on closer inspection it turned
> out that it is - in fact - a Datastax driver. I find it pretty confusing to
> be honest, and with all the debate about ASF trademarks, this should IMHO
> raise a few eyebrows and a PMC reaction - if you ask me. As a PMC member of
> Apache Airflow I am responsible for raising trademark issues if I see them,
> and that one seems to be at odds with the ASF rules. And if I am confused by
> the PyPI naming, then I am pretty sure many of the users are as well.
>
> Note that I am not attacking anyone with that, I just noticed that this
> should likely be handled by the PMC somehow (or that would be my advice at
> least as a fellow ASF member and PMC member of a friendly ASF project).
>
> But that's a bit tangential to the problem. Coming back to the main
> problem.
>
> I did quite some research and it turned out that the driver still uses the
> asyncore stdlib module by default (which is removed in Python 3.12), and
> even if theoretically we could use the libev reactor, it does not work out
> of the box with the released .whl even if the proper libraries are
> installed - you really have to take an sdist and build the package with gcc
> configured and libev4/libev-devel installed.
>
> Another option is to use the asyncio reactor [2], but as I understand from
> the issue [3], this support is still experimental and it's not ready for
> prime time.
>
> This is all captured in the PR [4] where I work on Python 3.12
> compatibility and Cassandra is - literally - the last remaining provider
> that we have to make a decision on what to do.
>
> Having to build from an sdist makes the driver rather useless for us -
> because we would not only complicate our testing / tooling setup (we have
> ~90 providers and a pretty complicated system to manage dependencies
> already), but it would also require our users who want to use Python 3.12
> to do the same, which is quite a blocker. And handling user issues in this
> case would become rather tiring.
>
> In the same PR Israel Fruchter, who helped us with the Cassandra issue,
> suggested that another option is to use the Scylladb driver, which is
> 100% compatible and published and released by Scylla [5]. I tested it and
> the .whl packages work nicely with libev installed - as expected (and
> initially Israel thought the Datastax driver would work similarly). From
> Israel's explanation, Datastax and Scylla are cooperating on the driver (in
> fact the Scylla one is a fork of the Datastax one), but there is no insight
> into who builds the packages and how (which also raised my eyebrow because
> it seems that, unlike in the ASF, the process of building and releasing the
> package is not transparent and verifiable).
>
> Now - we have two choices:
>
> 1) We can use "cassandra-driver" (which really is a "datastax driver") and
> disable the Cassandra provider for the users of Airflow for Python 3.12
> until Datastax fixes the compatibility with Python 3.12
>
> 2) We can switch to the Scylla driver and release the next provider with
> Python 3.12 support
>
> So ... Having provided all the context, I have two questions:
>
> Q1: What would be the solution recommended by the community here? I
> understand the community has no impact on Datastax decisions and effort on
> releasing those drivers, so you can at most ask Datastax to fix the
> compatibility issue. As a user I have no insight into what the relations
> are between the Cassandra community, Datastax and Scylla, so I am reaching
> out here as the place to advise me on which option is best. (This I am
> asking as a confused user)
>
> Q2: I find it pretty worrying that such an important interface (the data
> world is driven by Python) is not under the community "umbrella" - it seems
> that a very important thing for the users of Cassandra is managed and
> controlled by 3rd parties, and the users (as in this case) are pretty much
> left at the "mercy" (for lack of a better word) of those 3rd parties - they
> are the parties that decide whether Python 3.12 users are able to use
> Cassandra. If I had such a situation in Airflow, I would be deeply worried
> as a PMC member. Adding to that is the potential trademark issue that might
> confuse the users. If I saw such a situation,
> I'd certainly reach out to tradema...@apache.org to check if that usage
> of the name is acceptable (and I am pretty sure the answer would be "no" -
> looking at some recent discussions). I wonder if there were earlier
> discussions about it and whether the PMC is aware of the potential
> confusion it can create.
>
> Again - especially for point Q2 - I know this might be treated as
> some way of complaining, but it's more a concern of a fellow user and ASF
> member that is at play here - I just find it quite confusing and
> likely bad for the community. Maybe I do not understand the context, and
> there are other options I am not aware of, but I simply approached it as a
> user, did quite deep research and arrived at those conclusions, so if
> anything, I think it would be good if other users who come the same way
> are not as confused as I am.
>
>
> [1] "Cassandra" driver - https://pypi.org/project/cassandra-driver/
>
> [2] Cassandra Asyncio reactor:
> https://docs.datastax.com/en/developer/python-driver/3.25/api/cassandra/io/asyncioreactor/
>
> [3] Consider making asyncio reactor the default:
> https://datastax-oss.atlassian.net/browse/PYTHON-1375
>
> [4] Python 3.12 support:
> https://github.com/apache/airflow/pull/36755#issuecomment-1954688181
>
> [5] Scylladb driver on PyPI: https://pypi.org/project/scylla-driver/
>
> J
>
>
>
>
>
