On 2018-04-19 20:33, Ariel Weisberg wrote:
Hi,

That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit well).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.
Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.
I think it's unrealistic to expect the next version of the protocol spec to 
include functionality that is not supported by either the server or drivers 
once a version of the server or driver supporting that protocol version is 
released. Putting something in the spec is making a hard commitment for the 
driver and server without also specifying who will do the work.

So yes a temporary fork is fine, but then you run into things like "we" don't 
like the spec change and find we want to change it again. For us it's fine because we 
never committed to supporting the fork either way. For the driver maintainers it's fine 
because they probably never accepted the spec change either and didn't update the 
drivers. This is because the maintainers aren't going to accept changes that are 
incompatible with what the Cassandra server implements.

So if you have a temporary fork of the spec you might also be committing to a 
temporary fork of the drivers as well as the headaches that come with the final 
version of the spec not matching your fork. We would do what we can to avoid 
that by having the conversation around the protocol design up front.

What I am largely getting at is that I think Apache Cassandra and its drivers 
can only truly commit to a spec where there is a released implementation in the 
server and drivers.

The drivers are not part of Cassandra, so what "the server" is for drivers is up to their maintainer.

Up until that point the spec is subject to change. We are less likely to 
change it if there is an implementation because we have already done the work 
and dug up most of the issues.

For sharding this is thorny and I think Ben makes a really good suggestion RE 
leveraging CASSANDRA-7544.  For paging state and timeouts I think it's likely 
we could stick to what we work out spec wise and we are happy to have the 
discussion and learn from ScyllaDB de-risking protocol changes, but if no one 
commits to doing the work you might find we release the next protocol version 
without the tentative spec changes.

So I think my proposed mode 1 (where the protocol, but not the server, is updated in cassandra.git) is rejected. Let's discuss the two remaining options:

mode 2: cassandra.git reserves the prefix "SCYLLA" for the OPTIONS/SUPPORTED message and, when it comes to implementing a protocol extension, it will consider Scylla extensions and incorporate them into cassandra.git if they are found to be technically acceptable (but may of course extend the protocol in a different way if there is a technical reason).

mode 3: cassandra.git ignores Scylla
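
As a rough illustration of what the reserved prefix in mode 2 could mean for a driver, here is a minimal sketch in Python. It assumes the SUPPORTED body has already been decoded into a string multimap (a dict of lists). The function name split_supported and the key SCYLLA_EXAMPLE_EXTENSION are hypothetical and are not taken from any spec or driver.

```python
from typing import Dict, List, Tuple

# Prefix proposed for reservation in mode 2.
VENDOR_PREFIX = "SCYLLA"

Options = Dict[str, List[str]]


def split_supported(options: Options, prefix: str = VENDOR_PREFIX) -> Tuple[Options, Options]:
    """Split decoded SUPPORTED options into standard and vendor-prefixed keys.

    Hypothetical helper: a driver unaware of the extensions could simply
    ignore the second dict, while an aware driver could inspect it.
    """
    standard: Options = {}
    vendor: Options = {}
    for key, values in options.items():
        target = vendor if key.startswith(prefix) else standard
        target[key] = values
    return standard, vendor


# Example SUPPORTED payload; the SCYLLA_* key is made up for illustration.
supported = {
    "CQL_VERSION": ["3.4.4"],
    "COMPRESSION": ["snappy", "lz4"],
    "SCYLLA_EXAMPLE_EXTENSION": ["1"],
}

standard, vendor = split_supported(supported)
```

Under this sketch, a server that does not implement any extension just never sends prefixed keys, so the same driver code would work against both servers.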


For Cassandra, the advantage of mode 2 is that if driver maintainers add support for the change (on their own or by merging changes authored by Scylla developers), then Cassandra developers get driver support with less effort.


Ariel
On Thu, Apr 19, 2018, at 12:53 PM, Avi Kivity wrote:

On 2018-04-19 19:10, Ariel Weisberg wrote:
Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.
That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit well).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.

There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.
Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.

If you refer to a potential politically-motivated rejection by the
DataStax-maintained drivers, then those drivers should and will be
forked. That's not true open source. However, I'm not assuming that will
happen.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges, but it's a leaky abstraction to have a paging 
interface that isn't a vanilla ResultSet interface. Serial vs. parallel is kind 
of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should be 
simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and 
Python drivers.
Those were just given as examples. Each would be discussed on its own,
assuming we are able to find a way to cooperate.


These are relatively simple and it wouldn't be hard for us to patch
Cassandra. But I want to find a way to make more complicated protocol
changes where it wouldn't be realistic for us to modify Cassandra.

RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in by 
defining a spec we haven't implemented, tested, and decided we are satisfied 
with. Having it in ScyllaDB de-risks it to a certain extent, but what if 
Cassandra decides to go a different direction in some way?
Such a proposal would include negotiation about the sharding algorithm
used to prevent Cassandra being boxed in. Of course it's impossible to
guarantee that a new idea won't come up that requires more changes.

I don't think there is much discussion to be had without an example of the 
changes to the CQL specification to look at, but even then if it looks risky I 
am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
On 2018/04/19 07:19:27, kurt greaves <k...@instaclustr.com> wrote:
1. The protocol change is developed using the Cassandra process in
     a JIRA ticket, culminating in a patch to
     doc/native_protocol*.spec when consensus is achieved.
I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.
Thanks. That is our view, and it is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of what they may look like.

Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
additional commands, e-mail: dev-h...@cassandra.apache.org
