On Wed, Mar 30, 2011 at 3:15 AM, Eric Evans <eev...@rackspace.com> wrote:

> The client space as a whole *is* a mess, despite heroic efforts on the
> part of our third-party API maintainers, but forcing them in-tree is not
> going to solve anything.  In fact, it would very likely make it worse by
> adding unnecessary overhead to contribution, and discouraging
> innovation.
>

I can understand your reluctance to do the clients "in-tree" since it will
be a lot of work and people will no doubt be upset if their client is not
chosen for a given language.  But I think this is the wrong approach for
three reasons.

First, the client libraries in different languages should look alike.
Having libraries that look different in different languages is a very bad
idea indeed.

Second, users should not have to ask which client library version goes with
which Cassandra version.  Nor should they have to deal with different
features appearing at different times in different libraries for different
languages. And getting everyone to version everything the same way, so users
won't have to scratch their heads, isn't going to work.

Third, if you think transitioning to maintaining official client libraries
in-tree is going to be hard and a lot of work now, just wait another year
and it is going to be even harder. Just like it is a lot harder to do now
than, say, 2 years ago. The sooner this gets done, the better.  This is
technical debt.  And if it doesn't get paid, people will go elsewhere.


> The root cause goes back to your first point, the RPC interface is
> baroque, and too tightly coupled to Cassandra's internals.  The
> third-party library maintainers can only do so much to paper over that;
> The Fail shines through.
>

Well, yes and no.  Yes, the RPC interface is baroque and somewhat unhelpful.
It worries me that nobody found it baroque and unhelpful enough to "paper
over" early on.

And no, it isn't really that hard to paper over, but the problem is that
nobody wants to write yet another Cassandra client library to provide a way
to talk to Cassandra which is both reasonably terse *and* at the same time
flexible enough to let you do more complex stuff.

One of the first urges I felt when starting to play with Cassandra was to
write my own client library.  That urge did not subside as I tried some of
the other (Java) client libraries.  (I'll be nice and just postulate that
they are roughly equally ugly so everyone is equally offended :-)).  The
problem is that writing a client library that I'd actually want to use would
not really be that helpful for the Cassandra project or for me.  It would be
yet another project competing for attention and consuming precious developer
resources.  That is, if anyone bothered to use it.  Either way
I'd probably get stuck maintaining my own library, or at best, have to
dedicate a considerable amount of time to coordinating a development effort
until someone actually interested in being a client lib maintainer could
take over.

Sure, having to maintain the client libraries in-tree is going to slow down
the release cycle somewhat, but which would you rather have:  more frequent
releases or a better product that appeals to developers?

And sure, there are a lot of *really* nice people that will probably feel
that they have wasted effort on building a client library if Cassandra were
to offer official client libraries, but at this point people are beginning
to depend on Cassandra, and paying attention to users (that is: programmers)
is going to be very important.  One day the Hadoop/HBase people are going to
get their shit together and *then what*?

I would much rather the client library maintainers for various libraries in
various languages teamed up with someone who is good at designing APIs that
need to work across languages, spent some time making a rough outline, did
some hacking to see what it would be like in different languages, adjusted
the common design guideline and got to work on maintaining in-tree client
libraries rather than everyone and their grandmother maintaining their own
client library.  Egos might get bruised, people may trot off in a huff, but
in the long run I think it is better that we do not encourage
effort to be duplicated and talent to be spread thin across different
projects.

I feel like a bit of a dick for saying things this bluntly, but I think that
someone has to be the unpopular jerk who says these things out loud.  You
have all done *great* work on Cassandra, I am grateful for it, and I am in
no way belittling your effort.  But you need an outside voice to tell you
that there are some things that aren't as they should be.

Part of the problem here is that everyone is just too nice to each other
:-).  I'll be happy to be the jerk if I think it'd help the project.


> The solution here is the same as for point #1 above, CQL.  And, the idea
> is to include in-tree "drivers", basically, the minimum amount of common
> code that all third-party libs would need to implement (think connection
> pooling, parameter substitution, etc).


A minimal client for each language with a less sucky API would be a start.

Gathering the talent that is spread across different client projects today
and getting them to maintain official libraries that offer some consistency
across languages would be better.

Next time you have a Cassandra summit (or similar), invite the client
library maintainers, put them in a nice room with enough fresh air, give
them a stack of pizzas and a whiteboard and let them have a couple of hours
to discuss a) if they are willing to pull together, and b) how they would go
about building better, official client libraries.  Add drugs as needed
(alcohol, Red Bull, carrots, etc.).



>  We already have drivers for
> Java, Python, and Twisted, and folks are working PHP and Perl (that I'm
> aware of).
>
> >    - It is buggy and the solution seems to be to just go to the next
> >    release.  And the next.  And the next.  Which would be okay if you
> > could upgrade all the time, but what to do once you hit production?
>
> 0.7 has been a rough ride, no doubt.  We spent too much time pushing in
> too many features, and didn't do a good job of drawing a line in the
> sand when it came time to release.  Our track record prior to 0.7 was
> Not Horrible, and trending toward Better And Better, and we've made some
> adjustment to the release process, so I'm hopeful we'll get back on
> track.
>

I'm glad to hear that.  Quality and reliability are going to make or break
Cassandra over the next 1-2 years.


> > In any case, thanks for all the effort that went into Cassandra.  I
> > will check back from time to time and perhaps in a year or so it'll be
> > time to re-evaluate Cassandra.
>
> In a year we'll have achieved Total World Domination. :)
>

Well, I sincerely hope so even though I have (for now) jumped ship :-)

But as I mentioned earlier, if the Hadoop guys get their shit together it
may be them that dominate the world.  The flock is restless and unfaithful.

~G
