Re: Evolving the client protocol

Dor Laor Mon, 23 Apr 2018 17:19:22 -0700

On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli <[email protected]>
wrote:


> Is ScyllaDB, which is using Cassandra but not contributing back, one of
> the “abuses” of the Apache license?
>

It's not as if we have a private version of Cassandra and hold all or some
of it back.

We didn't contribute because we have a different server code base. We always
contribute where it makes sense.
I'll be happy to have several beers or emails about the pros and cons of
open source licensing, but I don't think
this is the case here. The discussion is about whether the community wishes to
accept our contributions. We initiated it,
didn't we?

Let's be practical. I think it's not reasonable to commit C* protocol
changes that the community doesn't intend
to implement in C* in the short term (such as thread-per-core), and it's not
reasonable to expect Scylla to contribute
such a huge effort to the C* server. It is reasonable to collaborate on
protocol enhancements that are acceptable,
even without code, and to make sure the protocol can be extended in a way
that is forward compatible.
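
To make "forward compatible" concrete, here is a minimal sketch of a client
that skips options it does not recognize instead of failing. It assumes the
server advertises its options as a map from name to list of strings (the
shape of the string multimap in a SUPPORTED message). The "X_" option names
and the negotiate() helper are hypothetical, invented for illustration only.

```python
# Hypothetical sketch: a client that tolerates unknown protocol extensions.
# Option names starting with "X_" are made up for this example.

KNOWN_OPTIONS = {"CQL_VERSION", "COMPRESSION"}


def negotiate(supported, wanted_extensions):
    """Split advertised options into those the client uses and those it skips.

    supported: dict mapping option name -> list of string values.
    wanted_extensions: set of extension names this client knows how to use.
    """
    used = {}
    ignored = []
    for name, values in supported.items():
        if name in KNOWN_OPTIONS or name in wanted_extensions:
            used[name] = values
        else:
            # Forward compatibility: an unknown option is skipped, not an error.
            ignored.append(name)
    return used, sorted(ignored)


server_options = {
    "CQL_VERSION": ["3.4.4"],
    "COMPRESSION": ["lz4", "snappy"],
    "X_SHARD_COUNT": ["64"],     # hypothetical extension the client understands
    "X_FUTURE_FEATURE": ["1"],   # hypothetical extension the client ignores
}
used, ignored = negotiate(server_options, {"X_SHARD_COUNT"})
print(sorted(used))
print(ignored)
```

A server that does not implement an extension simply never advertises it, so
the same client code works against both kinds of server.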


> Happy to be proved wrong as I am not a lawyer and don’t understand various
> licenses.
>
> > On Apr 23, 2018, at 16:55, Dor Laor <[email protected]> wrote:
> >
> >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad <[email protected]>
> wrote:
> >>
> >> From where I stand it looks like you've got only two options for any
> >> feature that involves updating the protocol:
> >>
> >> 1. Don't build the feature
> >> 2. Build it in Cassandra & ScyllaDB, update the drivers accordingly
> >>
> >> I don't think you have a third option, which is to build it only in
> ScyllaDB,
> >> because that means you have to fork *all* the drivers and make it work,
> >> then maintain them.  Your business model appears to be built on not
> doing
> >> any of the driver work yourself, and you certainly aren't giving back to
> >> the open source community via a permissive license on ScyllaDB itself,
> so
> >> I'm a bit lost here.
> >>
> >
> > It's totally not about business model.
> > Scylla itself is 99% open source under the AGPL license, which prevents
> > abuse and forces changes to be contributed back to the project. We also
> > have our core engine (Seastar) licensed under Apache, since it needs to be
> > integrated with the core application.
> > Recently one of our community members even created a new Seastar based,
> C++
> > driver.
> >
> > Scylla chose to be compatible with the drivers in order to leverage the
> > existing infrastructure
> > and (let's be frank) in order to allow smooth migration.
> > We would have loved to contribute more to the drivers, but until recently
> > we:
> > 1. Were in over our heads with the server
> > 2. Were happy w/ the existing drivers
> > 3. Developed extensions - GoCQLX - our own contribution
> >
> > Now we can finally contribute back to the same driver project, and we want
> > to do it the right way, without forking and without duplicated effort.
> >
> > Many times, having a private fork is way easier than proper open source
> > work, so from a pure business perspective we are not taking the shortest
> > path.
> >
> >
> >>
> >> To me it looks like you're asking a bunch of volunteers that work on
> >> Cassandra to accommodate you.  What exactly do we get out of this
> >> relationship?  What incentive do I or anyone else have to spend time
> >> helping you instead of working on something that interests me?
> >>
> >
> > Jon, this is certainly not the case.
> > We genuinely wish to make true *open source* work on:
> > a. Cassandra drivers
> > b. Client protocol
> > c. Scylla server side.
> > d. Cassandra community related work: mailing list, Jira, design
> >
> > But not
> > e. Cassandra server side
> >
> > While I wouldn't mind doing the Cassandra server work, we don't have the
> > resources or
> > the expertise. The Cassandra _developer_ community is welcome to decide
> > whether
> > we get to contribute a/b/c/d. Avi has enumerated the options of
> > cooperation, passive cooperation
> > and zero cooperation (below).
> >
> > 1. The protocol change is developed using the Cassandra process in a JIRA
> > ticket, culminating in a patch to doc/native_protocol*.spec when
> consensus
> > is achieved.
> > 2. The protocol change is developed outside the Cassandra process.
> > 3. No cooperation.
> >
> > Look, I can understand the hostility and suspicion; however, from the C*
> > project POV, it makes no sense to ignore this, otherwise we'll fork the
> > drivers and you won't get anything back. There is at least one other
> > vendor today with their own server fork and driver fork, and it makes
> > sense to keep the protocol unified in an extensible way and to discuss
> > new features _together_.
> >
> >
> >
> >>
> >> Jon
> >>
> >>
> >> On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead <[email protected]>
> wrote:
> >>
> >>>>>> This doesn't work without additional changes, for RF>1. The token
> >> ring
> >>>> could place two replicas of the same token range on the same physical
> >>>> server, even though those are two separate cores of the same server.
> >> You
> >>>> could add another element to the hierarchy (cluster -> datacenter ->
> >> rack
> >>>> -> node -> core/shard), but that generates unneeded range movements
> >> when
> >>> a
> >>>> node is added.
> >>>>> I have seen rack awareness used/abused to solve this.
> >>>>>
> >>>>
> >>>> But then you lose real rack awareness. It's fine for a quick hack, but
> >>>> not a long-term solution.
> >>>>
> >>>> (it also creates a lot more tokens, something nobody needs)
> >>>>
> >>>
> >>> I'm having trouble understanding how you lose "real" rack awareness,
> as
> >>> these shards are in the same rack anyway, because the address and port
> >> are
> >>> on the same server in the same rack. So it behaves as expected. Could
> you
> >>> explain a situation where the shards on a single server would be in
> >>> different racks (or fault domains)?
> >>>
> >>> If you wanted to support a situation where you have a single rack per
> DC
> >>> for simple deployments, extending NetworkTopologyStrategy to behave the
> >> way
> >>> it did before https://issues.apache.org/jira/browse/CASSANDRA-7544
> with
> >>> respect to treating InetAddresses as servers rather than the address
> and
> >>> port would be simple. Both this implementation in Apache Cassandra and
> >> the
> >>> respective load balancing classes in the drivers are explicitly
> designed
> >> to
> >>> be pluggable so that would be an easier integration point for you.
> >>>
> >>> I'm not sure how it creates more tokens? If a server normally owns 256
> >>> tokens, each shard on a different port would just advertise ownership
> of
> >>> 256/# of cores (e.g. 4 tokens if you had 64 cores).
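
The arithmetic above can be checked with a short sketch. The helper name
tokens_per_shard is hypothetical, and an even split with integer division is
an assumption made for illustration, not something either project specifies.

```python
def tokens_per_shard(tokens_per_server, shards):
    """Tokens each shard advertises if a server's tokens are split evenly."""
    if shards <= 0:
        raise ValueError("shards must be positive")
    return tokens_per_server // shards


# 64 cores is the example from the message; 160 is the shard count of the
# POWER machines mentioned elsewhere in this thread.
for shards in (8, 64, 160):
    per_shard = tokens_per_shard(256, shards)
    leftover = 256 - per_shard * shards
    print(shards, per_shard, leftover)
```

With 160 shards the split is no longer even: each shard gets one token and 96
tokens are left over, so the per-server token count would have to change.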
> >>>
> >>>
> >>>>
> >>>>> Regards,
> >>>>> Ariel
> >>>>>
> >>>>>> On Apr 22, 2018, at 8:26 AM, Avi Kivity <[email protected]> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On 2018-04-19 21:15, Ben Bromhead wrote:
> >>>>>>> Re #3:
> >>>>>>>
> >>>>>>> Yup I was thinking each shard/port would appear as a discrete
> >> server
> >>>> to the
> >>>>>>> client.
> >>>>>> This doesn't work without additional changes, for RF>1. The token
> >> ring
> >>>> could place two replicas of the same token range on the same physical
> >>>> server, even though those are two separate cores of the same server.
> >> You
> >>>> could add another element to the hierarchy (cluster -> datacenter ->
> >> rack
> >>>> -> node -> core/shard), but that generates unneeded range movements
> >> when
> >>> a
> >>>> node is added.
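
A minimal sketch of the placement problem described above, assuming a
simplified "walk the ring clockwise and take the next distinct endpoints"
rule. This is not Cassandra's actual replication code; the host names, ports
and token values are made up for illustration.

```python
from bisect import bisect_left


def replicas(ring, token, rf):
    """Collect the first `rf` distinct endpoints clockwise from `token`.

    ring: list of (token, endpoint) pairs sorted by token.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    chosen = []
    for i in range(len(ring)):
        endpoint = ring[(start + i) % len(ring)][1]
        if endpoint not in chosen:
            chosen.append(endpoint)
        if len(chosen) == rf:
            break
    return chosen


# Two physical hosts with two shards each; every shard is its own endpoint
# (address + port), as in the port-per-shard proposal.
ring = sorted([
    (10, ("hostA", 9042)),
    (20, ("hostA", 9043)),
    (30, ("hostB", 9042)),
    (40, ("hostB", 9043)),
])

picked = replicas(ring, token=5, rf=2)
hosts = {host for host, _port in picked}
print(picked)
print(len(hosts))
```

Both endpoints are distinct as far as the ring is concerned, yet they share
one physical host, so RF=2 gives no protection against losing that host.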
> >>>>>>
> >>>>>>> If the per port suggestion is unacceptable due to hardware
> >>>> requirements,
> >>>>>>> remembering that Cassandra is built with the concept of scaling
> >>>> *commodity*
> >>>>>>> hardware horizontally, you'll have to spend your time and energy
> >>>> convincing
> >>>>>>> the community to support a protocol feature it has no (current) use
> >>>> for or
> >>>>>>> find another interim solution.
> >>>>>> Those servers are commodity servers (not x86, but still commodity).
> >> In
> >>>> any case 60+ logical cores are common now (hello AWS i3.16xlarge or
> >> even
> >>>> i3.metal), and we can only expect logical core count to continue to
> >>>> increase (there are 48-core ARM processors now).
> >>>>>>
> >>>>>>> Another way, would be to build support and consensus around a clear
> >>>>>>> technical need in the Apache Cassandra project as it stands today.
> >>>>>>>
> >>>>>>> One way to build community support might be to contribute an Apache
> >>>>>>> licensed thread per core implementation in Java that matches the
> >>>> protocol
> >>>>>>> change and shard concept you are looking for ;P
> >>>>>> I doubt I'll survive the egregious top-posting that is going on in
> >>> this
> >>>> list.
> >>>>>>
> >>>>>>>
> >>>>>>>> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <[email protected]
> >>>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> So at a technical level I don't understand this yet.
> >>>>>>>>
> >>>>>>>> So you have a database consisting of single threaded shards and a
> >>>> socket
> >>>>>>>> for accept that is generating TCP connections and in advance you
> >>>> don't know
> >>>>>>>> which connection is going to send messages to which shard.
> >>>>>>>>
> >>>>>>>> What is the mechanism by which you get the packets for a given TCP
> >>>>>>>> connection delivered to a specific core? I know that a given TCP
> >>>> connection
> >>>>>>>> will normally have all of its packets delivered to the same queue
> >>>> from the
> >>>>>>>> NIC because the tuple of source address + port and destination
> >>>> address +
> >>>>>>>> port is typically hashed to pick one of the queues the NIC
> >>> presents. I
> >>>>>>>> might have the contents of the tuple slightly wrong, but it always
> >>>> includes
> >>>>>>>> a component you don't get to control.
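
The queue selection described above can be sketched as a hash of the
connection 4-tuple. zlib.crc32 is only a stand-in here (real NICs commonly
use a Toeplitz hash for receive-side scaling), and the function name,
addresses and queue count are assumptions for illustration.

```python
import zlib


def rx_queue(src_ip, src_port, dst_ip, dst_port, n_queues):
    """Map a TCP 4-tuple to a receive queue index in [0, n_queues)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    # Stand-in hash; the point is only that the result depends on all four
    # fields, including the source port picked by the client's OS.
    return zlib.crc32(key) % n_queues


# The same connection always maps to the same queue.
q1 = rx_queue("10.0.0.5", 51000, "10.0.0.9", 9042, 8)
q2 = rx_queue("10.0.0.5", 51000, "10.0.0.9", 9042, 8)

# Connections that differ only in the ephemeral source port can land on
# different queues, which the client does not get to choose.
queues = {
    rx_queue("10.0.0.5", port, "10.0.0.9", 9042, 8)
    for port in range(50000, 50064)
}
print(q1 == q2)
print(sorted(queues))
```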
> >>>>>>>>
> >>>>>>>> Since it's hashing how do you manipulate which queue packets for a
> >>> TCP
> >>>>>>>> connection go to and how is it made worse by having an accept
> >> socket
> >>>> per
> >>>>>>>> shard?
> >>>>>>>>
> >>>>>>>> You also mention 160 ports as bad, but it doesn't sound like a big
> >>>> number
> >>>>>>>> resource wise. Is it an operational headache?
> >>>>>>>>
> >>>>>>>> RE tokens distributed amongst shards. The way that would work
> >> right
> >>>> now is
> >>>>>>>> that each port number appears to be a discrete instance of the
> >>>> server. So
> >>>>>>>> you could have shards be actual shards that are simply colocated
> >> on
> >>>> the
> >>>>>>>> same box, run in the same process, and share resources. I know
> >> this
> >>>> pushes
> >>>>>>>> more of the complexity into the server vs the driver as the server
> >>>> expects
> >>>>>>>> all shards to share some client-visible state, like system tables and
> >>> certain
> >>>>>>>> identifiers.
> >>>>>>>>
> >>>>>>>> Ariel
> >>>>>>>>> On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> >>>>>>>>> Port-per-shard is likely the easiest option but it's too ugly to
> >>>>>>>>> contemplate. We run on machines with 160 shards (IBM POWER
> >>> 2s20c160t
> >>>>>>>>> IIRC), it will be just horrible to have 160 open ports.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It also doesn't fit well with the NIC's ability to automatically
> >>>>>>>>> distribute packets among cores using multiple queues, so the
> >> kernel
> >>>>>>>>> would have to shuffle those packets around. Much better to have
> >>> those
> >>>>>>>>> packets delivered directly to the core that will service them.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> (also, some protocol changes are needed so the driver knows how
> >>>> tokens
> >>>>>>>>> are distributed among shards)
> >>>>>>>>>
> >>>>>>>>>> On 2018-04-19 19:46, Ben Bromhead wrote:
> >>>>>>>>>> WRT to #3
> >>>>>>>>>> To fit in the existing protocol, could you have each shard
> >> listen
> >>>> on a
> >>>>>>>>>> different port? Drivers are likely going to support this due to
> >>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> >>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm
> >> not
> >>>> super
> >>>>>>>>>> familiar with the ticket so there might be something I'm missing
> >>>> but it
> >>>>>>>>>> sounds like a potential approach.
> >>>>>>>>>>
> >>>>>>>>>> This would give you a path forward at least for the short term.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg <
> >>> [email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I think that updating the protocol spec to Cassandra puts the
> >>> onus
> >>>> on
> >>>>>>>> the
> >>>>>>>>>>> party changing the protocol specification to have an
> >>> implementation
> >>>>>>>> of the
> >>>>>>>>>>> spec in Cassandra as well as the Java and Python driver (those
> >>> are
> >>>>>>>> both
> >>>>>>>>>>> used in the Cassandra repo). Until it's implemented in
> >> Cassandra
> >>> we
> >>>>>>>> haven't
> >>>>>>>>>>> fully evaluated the specification change. There is no
> >> substitute
> >>>> for
> >>>>>>>> trying
> >>>>>>>>>>> to make it work.
> >>>>>>>>>>>
> >>>>>>>>>>> There are also realities to consider as to what the maintainers
> >>> of
> >>>> the
> >>>>>>>>>>> drivers are willing to commit.
> >>>>>>>>>>>
> >>>>>>>>>>> RE #1,
> >>>>>>>>>>>
> >>>>>>>>>>> I am +1 on the fact that we shouldn't require an extra hop for
> >>>> range
> >>>>>>>> scans.
> >>>>>>>>>>> In JIRA Jeremiah made the point that you can still do this from
> >>> the
> >>>>>>>> client
> >>>>>>>>>>> by breaking up the token ranges, but it's a leaky abstraction
> >> to
> >>>> have
> >>>>>>>> a
> >>>>>>>>>>> paging interface that isn't a vanilla ResultSet interface.
> >> Serial
> >>>> vs.
> >>>>>>>>>>> parallel is kind of orthogonal as the driver can do either.
> >>>>>>>>>>>
> >>>>>>>>>>> I agree it looks like the current specification doesn't make
> >> what
> >>>>>>>> should
> >>>>>>>>>>> be simple as simple as it could be for driver implementers.
> >>>>>>>>>>>
> >>>>>>>>>>> RE #2,
> >>>>>>>>>>>
> >>>>>>>>>>> +1 on this change assuming an implementation in Cassandra and
> >> the
> >>>>>>>> Java and
> >>>>>>>>>>> Python drivers.
> >>>>>>>>>>>
> >>>>>>>>>>> RE #3,
> >>>>>>>>>>>
> >>>>>>>>>>> It's hard to be +1 on this because we don't benefit by boxing
> >>>>>>>> ourselves in
> >>>>>>>>>>> by defining a spec we haven't implemented, tested, and decided
> >> we
> >>>> are
> >>>>>>>>>>> satisfied with. Having it in ScyllaDB de-risks it to a certain
> >>>>>>>> extent, but
> >>>>>>>>>>> what if Cassandra decides to go a different direction in some
> >>> way?
> >>>>>>>>>>>
> >>>>>>>>>>> I don't think there is much discussion to be had without an
> >>> example
> >>>>>>>> of the
> >>>>>>>>>>> changes to the CQL specification to look at, but even then
> >> if
> >>>> it
> >>>>>>>> looks
> >>>>>>>>>>> risky I am not likely to be in favor of it.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Ariel
> >>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Apr 19, 2018, at 9:33 AM, [email protected] wrote:
> >>>>>>>>>>>> On 2018/04/19 07:19:27, kurt greaves <[email protected]>
> >>>> wrote:
> >>>>>>>>>>>>>> 1. The protocol change is developed using the Cassandra
> >>> process
> >>>> in
> >>>>>>>>>>>>>>     a JIRA ticket, culminating in a patch to
> >>>>>>>>>>>>>>     doc/native_protocol*.spec when consensus is achieved.
> >>>>>>>>>>>>> I don't think forking would be desirable (for anyone) so this
> >>>> seems
> >>>>>>>>>>>>> the most reasonable to me. For 1 and 2 it certainly makes
> >> sense
> >>>> but
> >>>>>>>>>>>>> can't say I know enough about sharding to comment on 3 -
> >> seems
> >>>> to me
> >>>>>>>>>>>>> like it could be locking in a design before anyone truly
> >> knows
> >>>> what
> >>>>>>>>>>>>> sharding in C* looks like. But hopefully I'm wrong and there
> >>> are
> >>>>>>>>>>>>> devs out there that have already thought that through.
> >>>>>>>>>>>> Thanks. That is our view and is great to hear.
> >>>>>>>>>>>>
> >>>>>>>>>>>> About our proposal number 3: In my view, good protocol designs
> >>> are
> >>>>>>>>>>>> future proof and flexible. We certainly don't want to propose
> >> a
> >>>>>>>> design
> >>>>>>>>>>>> that works just for Scylla, but would support reasonable
> >>>>>>>>>>>> implementations regardless of what they may look like.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Do we have driver authors who wish to support both projects?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Surely, but I imagine it would be a minority. 
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>>>>> To unsubscribe, e-mail: [email protected]
> >>>>>>>>>>>> For additional commands, e-mail: [email protected]
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>> Ben Bromhead
> >>>>>>>>>> CTO | Instaclustr <https://www.instaclustr.com/>
> >>>>>>>>>> +1 650 284 9692
> >>>>>>>>>> Reliability at Scale
> >>>>>>>>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>> Ben Bromhead
> >>>>>>> CTO | Instaclustr <https://www.instaclustr.com/>
> >>>>>>> +1 650 284 9692
> >>>>>>> Reliability at Scale
> >>>>>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>> Ben Bromhead
> >>> CTO | Instaclustr <https://www.instaclustr.com/>
> >>> +1 650 284 9692
> >>> Reliability at Scale
> >>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >>>
> >>
>
>
>
