As a user, I really appreciate your efforts, Jeff & Brad. I would *love*
for the C* project to officially support this.
In our environment we have a lot of client machines that all share
common NFS mounted directories. It's much easier for us to create a
Python virtual environment on a file server with the cqlsh PyPI package
installed than it is to install the Cassandra RPMs on every single
machine. Before I discovered your PyPI package, our developers had to
log in to a Cassandra node in order to run cqlsh. The cqlsh
PyPI package, however, is in our standard "python dev tools" virtual
environment -- along with Ansible, black, isort and various other Python
packages -- which means it's accessible to everyone, everywhere.
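A minimal sketch of how such a shared environment could be built, assuming
a POSIX file server with network access to PyPI. The helper names and the
example path are hypothetical, not taken from our actual setup:

```python
import subprocess
import venv
from pathlib import Path


def pip_install_command(env_dir, packages):
    """Build the pip command that installs packages into the venv at env_dir."""
    # POSIX layout assumed; on Windows the interpreter is under Scripts\ instead.
    python = Path(env_dir) / "bin" / "python"
    return [str(python), "-m", "pip", "install", *packages]


def create_shared_env(env_dir, packages=("cqlsh",)):
    """Create a virtual environment at env_dir and install packages into it."""
    # with_pip=True bootstraps pip inside the new environment.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(pip_install_command(env_dir, packages), check=True)


# Example call (hypothetical NFS-mounted path, needs write access and network):
# create_shared_env("/nfs/tools/python-dev-tools")
```

Clients that mount the same directory can then run the environment's
`bin/cqlsh` directly, provided a compatible Python interpreter exists at the
same path on each machine.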
I agree that this should not /replace/ packaging cqlsh in the Cassandra
RPM, so much as provide an additional /option/ for installing cqlsh without
the baggage of installing the full Cassandra package.
Thanks again for your work, Jeff & Brad.
- Max
On 7/6/2023 5:55 PM, Jeff Widman wrote:
Brad Schoening and I currently maintain
https://pypi.org/project/cqlsh/ which repackages the CQLSH that ships with
every Cassandra release.
This way:
* anyone who wants a lightweight client to talk to a remote
cassandra can simply `pip install cqlsh` without having to
download the full cassandra source, unzip it, etc.
* it's very easy for folks to use it as scaffolding in their python
scripts/tooling since they can simply include it in the list of
their required dependencies.
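As a rough illustration of the second point, here is a minimal sketch of a
script that shells out to an installed `cqlsh`. The function names and the
example host are hypothetical; the only cqlsh features assumed are the
`-e`/`--execute` option and the positional host and port arguments:

```python
import subprocess


def build_cqlsh_command(host, statement, port=9042):
    """Return the argv list for running one CQL statement through cqlsh."""
    # -e/--execute runs the statement non-interactively and exits;
    # host and port are passed as positional arguments.
    return ["cqlsh", "-e", statement, host, str(port)]


def run_cql(host, statement, port=9042):
    """Run statement against host via cqlsh and return its standard output."""
    result = subprocess.run(
        build_cqlsh_command(host, statement, port),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if cqlsh exits non-zero
    )
    return result.stdout


# Example call (hypothetical host, requires a reachable Cassandra node):
# print(run_cql("cassandra.example.com", "DESCRIBE KEYSPACES"))
```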
We currently handle the packaging by waiting for a release, then
manually copy/pasting the code out of the cassandra source tree into
https://github.com/jeffwidman/cqlsh which has some additional
build/python package configuration files, then using standard
python tooling to publish to PyPI.
Given that our project is simply a build/packaging project, I wanted
to start a conversation about upstreaming this into core Cassandra. I
realize that Cassandra has no interest in maintaining lots of build
targets... but given that cqlsh is written in Python and publishing to
PyPI enables DBAs to share more complicated tooling built on top of
it, this seems like a natural fit for core cassandra rather than a
standalone project.
Goal:
When a Cassandra release happens, the build/release process
automatically publishes cqlsh to https://pypi.org/project/cqlsh/.
Non-Goal: This is _not_ about having cassandra itself rely on PyPI.
There was some initial chatter about that in
https://issues.apache.org/jira/browse/CASSANDRA-18654, but that adds a
lot of complexity, and I'm honestly not sure it's a great idea. Even
if folks later want to go that route, the first hurdle is publishing
to PyPI, so for now let's keep the scope of the discussion limited to
treating PyPI purely as a release target, and not as an ingredient to
a release.
From an implementation perspective, this should be very
straightforward. We don't have any differences from the CQLSH source
that's in cassandra; instead, we point folks to make changes to cqlsh
in the Cassandra source. In fact we've made multiple contributions
back to `cqlsh` ourselves and have drastically cleaned up the code:
<https://github.com/search?q=repo%3Aapache%2Fcassandra%20is%3Apr%20author%3Ajeffwidman%20author%3Abschoening&type=pullrequests>.
So the only real change is adding the package config files and the
build / release pipeline.
We realize the Cassandra team members aren't python/PyPI experts, so we'd be
more than happy to help wire this up and maintain it. I am also a
maintainer of kazoo and kafka-python which are both popular python
clients for other distributed databases. So I'm very familiar with
open source, python, and distributed databases.
My one hesitation around this discussion is that I'm a little
concerned that we might lose the nimbleness we've currently got from
having a separate project. I.e., if something is screwed up on PyPI or
in the build process, we can quickly get it fixed and get a new release
out so that users aren't blocked. Would it be possible, as part of this
process, for Brad and me to keep commit rights to the PyPI build
process? To be clear, I'm not asking for commit rights to the
Java code or anything outside of Python, I just want to be sure that
if we go to the trouble of working with you to upstream this that
there's a commitment from Cassandra to keeping this build working, or
to letting us fix the build. Otherwise there's no point in
upstreaming it only for it to go unmaintained, leaving us looking on
helplessly from the sidelines. I'm very flexible here on the solution.
Thoughts?
--
*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><