Brad Schoening and I currently maintain https://pypi.org/project/cqlsh/,
which repackages the CQLSH that ships with every Cassandra release.

This way:

   - anyone who wants a lightweight client to talk to a remote Cassandra
   can simply `pip install cqlsh` without having to download the full
   Cassandra source, unzip it, etc.
   - it's very easy for folks to use it as scaffolding in their python
   scripts/tooling since they can simply include it in the list of their
   required dependencies.
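
As a rough illustration of the second point, here is a minimal sketch of a
script that shells out to a pip-installed cqlsh. The helper names
(`build_cqlsh_command`, `run_cql`) are hypothetical and only for
illustration; the sketch assumes `cqlsh` is on the PATH and relies on its
`-e` option to execute a single statement and exit.

```python
import shutil
import subprocess


def build_cqlsh_command(host, port, statement):
    """Return the argv list for running one CQL statement through cqlsh."""
    # -e makes cqlsh execute the statement and exit instead of opening a prompt.
    return ["cqlsh", host, str(port), "-e", statement]


def run_cql(host, port, statement):
    """Run a statement via the installed cqlsh and return its stdout."""
    if shutil.which("cqlsh") is None:
        raise RuntimeError("cqlsh not found on PATH; try `pip install cqlsh`")
    result = subprocess.run(
        build_cqlsh_command(host, port, statement),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if cqlsh exits non-zero
    )
    return result.stdout


# Example (needs a reachable node, so it is left commented out):
# print(run_cql("127.0.0.1", 9042, "DESCRIBE KEYSPACES"))
```

A tool built this way would only need `cqlsh` listed among its required
dependencies.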

We currently handle the packaging by waiting for a release, then manually
copying the code out of the Cassandra source tree into
https://github.com/jeffwidman/cqlsh, which has some additional build/Python
package configuration files, and then using standard Python tooling to
publish to PyPI.
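
To make the manual step concrete, the copy could be scripted roughly as
follows. This is only a sketch under stated assumptions: the source paths
(`bin/cqlsh.py`, `pylib/cqlshlib`) and the flat destination layout are
guesses that would need checking against the actual trees, and `sync_cqlsh`
is a hypothetical helper, not something that exists in either repo.

```python
import shutil
from pathlib import Path

# Assumed locations of cqlsh inside a Cassandra checkout; verify per release.
DEFAULT_PATHS = ("bin/cqlsh.py", "pylib/cqlshlib")


def sync_cqlsh(cassandra_root, package_root, paths=DEFAULT_PATHS):
    """Copy cqlsh sources from a Cassandra checkout into the packaging repo."""
    copied = []
    for rel in paths:
        src = Path(cassandra_root) / rel
        # Assumption: files land at the top level of the packaging repo.
        dst = Path(package_root) / Path(rel).name
        if src.is_dir():
            # Replace the old copy wholesale so files deleted upstream vanish.
            shutil.rmtree(dst, ignore_errors=True)
            shutil.copytree(src, dst)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

After a sync like this, publishing with standard tooling would typically be
`python -m build` followed by `twine upload dist/*`.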

Given that our project is simply a build/packaging project, I wanted to
start a conversation about upstreaming this into core Cassandra. I realize
that Cassandra has no interest in maintaining lots of build targets... but
given that cqlsh is written in Python, and publishing to PyPI enables DBAs
to share more complicated tooling built on top of it, this seems like a
natural fit for core Cassandra rather than a standalone project.

Goal:
When a Cassandra release happens, the build/release process automatically
publishes cqlsh to https://pypi.org/project/cqlsh/.

Non-Goal: This is _not_ about having Cassandra itself rely on PyPI. There
was some initial chatter about that in
https://issues.apache.org/jira/browse/CASSANDRA-18654, but that adds a lot
of complexity, and I'm honestly not sure it's a great idea. Even if folks
later want to go that route, the first hurdle is publishing to PyPI, so for
now let's keep the scope of the discussion limited to treating PyPI purely
as a release target, and not as an ingredient to a release.

From an implementation perspective, this should be very straightforward. We
don't have any differences from the CQLSH source that's in Cassandra;
instead, we point folks to make changes to cqlsh in the Cassandra source. In
fact, we've made multiple contributions back to `cqlsh` ourselves and have
drastically cleaned up the code:
https://github.com/search?q=repo%3Aapache%2Fcassandra%20is%3Apr%20author%3Ajeffwidman%20author%3Abschoening&type=pullrequests.
So the only real change is adding the package config files and the build /
release pipeline.

We realize the Cassandra team aren't Python/PyPI experts, so we'd be more
than happy to help wire this up and maintain it. I am also a maintainer of
kazoo and kafka-python, which are both popular Python clients for other
distributed databases, so I'm very familiar with open source, Python, and
distributed databases.

My one hesitation around this discussion is that we might lose the
nimbleness we currently get from having a separate project. I.e., if
something is screwed up on PyPI / the build process, we can quickly get it
fixed and get a new release out so that users aren't blocked. Would it be
possible, as part of this process, for Brad and me to keep commit rights
to the PyPI build process? To be clear, I'm not asking for commit rights to
the Java code or anything outside of Python. I just want to be sure that if
we go to the trouble of working with you to upstream this, there's a
commitment from Cassandra to keep this build working, or to let us fix the
build. Otherwise there's no point in upstreaming it only for it to go
unmaintained, leaving us looking on helplessly from the sidelines. I'm very
flexible here on the solution.

Thoughts?

-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><
