Thanks Frank and Stefan.

@Patrick great suggestion and worthwhile getting everything on the table.

One minor change I would advocate for: the SIG has been great for iterating
and interacting on the details, but given the nature of the content I really
think this conversation needs to be on the mailing list. The mailing list is
our system of record and the most accessible forum we have.

It gives folks time to think and digest, it's asynchronous and easily
searchable and, let's be honest, the majority of stakeholders in this are
not US based, so the timing issue goes away and it becomes easier for
people to participate. I feel like we've made a lot more progress by
simply having this discussion here.

So instead of a presentation, maybe just an email to the ML addressing the
headings that Patrick identified?


On Fri, Sep 25, 2020 at 7:55 AM Stefan Miklosovic <
stefan.mikloso...@instaclustr.com> wrote:

> Hi,
>
> Patrick's suggestion seems good to me.
>
> I won't go into specifics here as I need to genuinely prepare for
> this. It is quite hard to dig deep into the solutions of others and
> offer constructive criticism, because it takes a lot of time to
> study them and everybody has some "whys" behind their choices.
>
> To summarize my goals and concerns:
>
> 1) We should be as "Kubernetes operator idiomatic" as possible:
> industry standards, no custom brainchild of this or that group
> adopted because they think it is cool or they just didn't know any
> better. I do NOT say it is like that right now, I just want to be
> as ruthless as possible when it comes to functionality and why it
> is done the way it is. It is awesome that we already have something
> up to date (thanks to John) that adheres to the latest releases. I
> personally had a hard time keeping up with all the releases: once I
> finished something and aligned it, a week or two later there was
> already another release where things were different. It is a very
> fast-moving space and I hope that by the time we develop something it
> will not be obsolete.
>
> 2) It may be easier said than done, but it is guaranteed that people
> will get emotional (it's their precious, etc.), so please let's go into
> this with good intentions, not trying to push one solution over the
> other just because they would like to see it there ... I will have an
> equally hard time complying with this point. My plan is to explain
> what is _wrong_ with our solution: where we made mistakes and what
> should have been done differently but is now "too late" to change. It
> is quite hard to describe your own work and effort in this light, but
> without saying what is wrong we cannot decide what is good, imho.
>
> 3) We should put something together fast enough so we can call it a
> release. We can always iterate on it for eternity. But the foundations
> need to be there. Here I want to say that I especially like what John
> did. I looked through these specs and it was obvious it has been
> written with care and attention. It looked _solid_. I am not sure how
> hard it would be to put everything else on top of that, I truly am not,
> and here I think we would have to reinvent the wheel if we want to
> proceed, because I cannot imagine what it would take to retrofit e.g.
> CassKop on top of John's specs. It is like putting square pegs into
> round holes: maybe some chunks could be reused easily, but otherwise I
> worry we would be back at square one.
>
> One specific feeling I have as I read this is that even if there is
> the will to create the fourth operator, the respective parties will
> not be able to drop their own repository. The whole point behind this
> effort, to me, is to have a solid, community driven, stable, modern
> and feature complete operator people are truly using. I can see that
> once this is real, we will _really_ sunset our operator, redirecting
> people to the new operator in the main README etc.; we truly mean it.
> Sure, if somebody comes along and a bug fix is needed, we will fix it,
> but the whole point of doing this is to stop using what we have
> currently, over time; otherwise we are just splitting this space even
> more. If CassKop is not sure they will use it because they do not
> know whether that operator will be "enough" for them, aren't we just
> doing it wrong? To exaggerate, they should be fine with deleting their
> whole repository and using just this Cassandra one we are going to
> make; otherwise I don't see the point of working on this ...
>
> On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jmcken...@apache.org>
> wrote:
> >
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does
> >
> >
> > We should all talk a lot more, but this is 100% a mistake - I take the
> > blame for that. The intention has long been to offer cass-operator for
> > donation but it slipped through the cracks and your email yesterday made
> > me double-take.
> >
> > We have since resolved this misalignment. DataStax would be happy to
> > donate any and all of cass-operator to the ASF and C* project if it's
> > what we all agree best serves our collective Cassandra users. I'm also
> > cognizant that an immense amount of effort has gone into CassKop and we
> > seem to have something of an embarrassment of riches.
> >
> > I'm given to understand (haven't dug in personally) that the two
> > operators express pretty different opinions when it comes to frameworks,
> > designs, supported versions, etc. I think a discrete enumeration of the
> > feature set and "identities" of both could really help navigate this
> > conversation going forward.
> >
> > Also - thanks for that context Franck. It's always helpful to know where
> > other people are coming from when we're all working together towards a
> > common goal.
> >
> >
> > On Thu, Sep 24, 2020 at 12:23 PM, <franck.de...@orange.com> wrote:
> >
> > > I can share Orange’s view of the situation, sorry it is a long story!
> > >
> > > We started CassKop at the end of 2018 after betting on K8S, which was
> > > not so simple as far as C* was concerned: lack of support for local
> > > storage, IPs that change all the time, different network plugins to try
> > > to implement a non-standard K8s way of having nodes see each other from
> > > different DCs… We hesitated over Mesos but could not have both, and K8S
> > > was already gaining so much traction that you could not not choose it.
> > >
> > > Anyway, we looked around and did not see anyone with such requirements,
> > > so we said: why not try it ourselves, but on GitHub, so that we may give
> > > it back to the community. We have used C* for quite a few years with
> > > great success in production, with massive load and perfect
> > > availability. We love C* @ Orange :) Thanks!
> > >
> > > So we started writing support for mono-DC clusters (CassKop) and added
> > > multi-DC support with MultiCassKop, which is another operator included
> > > in the CassKop repo. For more details, we tried to document our designs
> > > as much as possible here:
> > > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> > >
> > > In the middle of last year we had some talks with Datastax about
> > > working together around their new management sidecar. Their position on
> > > open source was not clear at that time, so we said: please come back
> > > when you have decided to go open source with it. Which they did at the
> > > beginning of this year. But by that time I guess work had started on
> > > cass-operator, so we kept to our separate ways.
> > >
> > > Since the beginning of the year, we have been working with our OPS
> > > team to have it in production. It is not simple as the team has to
> > > learn K8S and trust a newborn operator. This takes time, especially as
> > > our internal cluster has been tweaked for multi-tenancy with obscure
> > > options being set by our K8s team…
> > >
> > > We also developed with Instaclustr the Backup & Restore functionality
> > > (we have new CRDs (Custom Resource Definitions) for backup and restore
> > > and a reconcile loop that calls out to the Instaclustr sidecar for these
> > > operations). We now support multiple backups in parallel and can write
> > > to S3, Google or Azure (but Stefan could give more details here if
> > > needed).
> > >
> > > During the SIG calls we mentioned our desire to donate CassKop once it
> > > satisfies our basic requirements (v1 coming just now, but I have said
> > > that too many times already). I am actually not sure Datastax mentioned
> > > their desire to donate cass-operator, but we decided to compare the
> > > designs and the functionalities based on the respective CRDs. The CRD
> > > is the interface with the user, as it is where you describe the cluster
> > > that you want to have. These talks were very interesting and we found
> > > out that the CassKop team had made good choices most of the time but
> > > was maybe too open. Indeed, our intention was to give our OPS team all
> > > the possibilities they needed to work. This includes:
> > > - very open topology definition using any configuration of labels to
> > > map DCs / racks and nodes to labels on clusters (we have labels on DCs
> > > / rooms / rows and server racks so we can map C* racks to storage or
> > > network arrays internally)
> > > - possibility to have multiple C* nodes on a single K8S host (because
> > > internal clouds are not really clouds, they have limited resources)
> > > - custom C* image selection,
> > > - custom bootstrap script that lets you configure C* as you want using
> > > ConfigMaps,
> > > - the ability to mount different volumes wherever they wanted,
> > > - the possibility to run any number of sidecars alongside C* for custom
> > > probes in our case
> > >
> > > This makes CassKop quite powerful and flexible.
> > > We made sure that none of those options is enabled by default, so one
> > > can just pop up a simple 3-node cluster quickly.
> > >
> > > On the other hand, cass-operator had an interesting way of configuring
> > > C* just inside the CRD using cass-config. This is simple and elegant, so
> > > we are implementing it as well for the support of C* 4.
> > > Now for the future, there are 3 choices in my opinion:
> > > - start from scratch (or John’s repo) by cherry picking bits from all
> > > operators. This is possible but will take some time / effort to have
> > > something usable. And then it will be compared to cass-operator and
> > > CassKop. I don’t see Orange contributing too much here as we believe
> > > CassKop to be a much better starting point
> > > - choose cass-operator: it is not on offer right now so let’s see if it
> > > does. I think Orange could contribute some bits inherited from CassKop
> > > if it is agreed by the community. Not sure it would be enough for us to
> > > use it.
> > > - choose CassKop: we would be delighted to donate it and contribute
> > > with some committers (including the original author who now works for
> > > AWS). It would then become the community operator but there would
> > > probably be cass-operator alongside. Cass-operator is made to make it
> > > easier for Datastax to manage customer clusters by imposing some
> > > configuration. It makes sense for their needs, so maybe 2 operators. We
> > > don’t know how backup/restore will be handled here with medusa being
> > > adapted to K8s.
> > >
> > > Sorry again for being long but 2 years of work deserve some lines of
> > > text :)
> > >
> > > I just saw your message Patrick but this was written already so we
> > > gain a week.
> > >
> > > Franck
> > >
> > > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.le...@datastax.com>
> > > wrote:
> > >
> > > I realise there are meeting logs, but getting a wider discourse with
> > > non-stakeholder input might help to build a community consensus? It
> > > doesn't seem like it can hurt at this point, anyway.
> > >
> > > +1
> > >
> > > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith
> > > <bened...@apache.org> wrote:
> > >
> > > Perhaps it helps to widen the field of discussion to the dev list?
> > >
> > > It might help if each of the stakeholder organisations state their view
> > > on the situation, including why they would or would not support a given
> > > approach/operator, and what (preferably specific) circumstances might
> > > lead them to change their mind?
> > >
> > > I realise there are meeting logs, but getting a wider discourse with
> > > non-stakeholder input might help to build a community consensus? It
> > > doesn't seem like it can hurt at this point, anyway.
> > >
> > > On 23/09/2020, 17:13, "John Sanda" <john.sa...@gmail.com> wrote:
> > >
> > > I want to point out that pretty much everything being discussed in this
> > > thread has been discussed at length during the SIG meetings. I think it
> > > is worth noting because we are pretty much still having the same
> > > conversation.
> > >
> > > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith
> > > <bened...@apache.org> wrote:
> > >
> > > I don't think there's anything about a code drop that's not "The Apache
> > > Way"
> > >
> > > If there's a consensus (or even strong majority) amongst invested
> > > parties, I don't see why we could not adopt an operator directly into
> > > the project.
> > >
> > > It's possible a green field approach might lead to fewer hard feelings,
> > > as everyone is in the same boat. Perhaps all operators are also
> > > suboptimal and could be improved with a rewrite? But I think
> > > coordinating a lot of different entities around an empty codebase is
> > > particularly challenging. I actually think it could be better for
> > > cohesion and collaboration to have a suboptimal but substantive starting
> > > point.
> > >
> > > On 23/09/2020, 16:11, "Stefan Miklosovic"
> > > <stefan.mikloso...@instaclustr.com> wrote:
> > >
> > > I think that from Instaclustr it was stated quite clearly multiple
> > > times that we are "fine to throw it away" if there is something better
> > > and more widespread. Indeed, we have invested a lot of time in the
> > > operator, but it was not useless at all; we gained a lot of quite unique
> > > knowledge of how to put all the pieces together. However, I think that
> > > this space is going to be quite fragmented and "balkanized", which is
> > > not always a bad thing, but in an area as narrow as a Kubernetes
> > > operator, I just do not see how 4 operators are going to be beneficial
> > > for ordinary people (the "official" one from the community, ours, the
> > > Datastax one and CassKop, in no particular order). Sure, innovation and
> > > healthy competition are important, but to what extent ...
> > > One can start a Cassandra cluster on Kubernetes in only so many
> > > different ways, and nobody really likes vendor lock-in. People wanting
> > > to run a cluster on K8S realise that there are three operators, each
> > > backed by a private business entity, and the community operator is not
> > > there ... Huh, interesting ... One may even start to question what is
> > > wrong with these folks that it takes three companies to build their
> > > own solution.
> > >
> > > Having said that, to my perception, the Cassandra community just does
> > > not have enough engineers or contributors to keep 4 operators alive at
> > > the same time (I wish I was wrong), so the idea of selecting the best
> > > one, or merging the obvious things and approaches together, is
> > > understandable, even if it means we eventually sunset ours. In
> > > addition, nobody from the big players is going to contribute to the
> > > code base of another one, for obvious reasons, so channeling and
> > > directing this effort into something common for the community seems to
> > > be the only reasonable way of cooperating.
> > >
> > > It is quite hard to bootstrap this if the donation of the code in big
> > > chunks / whole repo is out of the question as it is not the "Apache way"
> > > (there was a thread running here about this in more depth a while
> > > ago) and we basically need to start from scratch, which is quite
> > > demotivating: we are just reinventing the wheel and nobody is up for it.
> > > It is like people are waiting for that to happen so they can jump in
> > > "once it is the thing", but it will never materialise, or at least the
> > > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > > in this heavily if there is already a working operator from the
> > > companies mentioned above. As I understood it, one reason for not
> > > choosing to donate it all is that "the learning and community building
> > > should happen in an organic manner and we just cannot accept the
> > > donation", but isn't it true that it is easier to build a community
> > > around something which is already there rather than trying to build it
> > > around an idea, which is quite hard to dedicate oneself to?
> > >
> > > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie <jmcken...@apache.org>
> > > wrote:
> > >
> > > I think there's significant value to the community in trying to
> > > coalesce on a single approach,
> > >
> > > I agree. Unfortunately in this case, the parties with a vested interest
> > > and written operators came to the table and couldn't agree to coalesce
> > > on a single approach. John Sanda attempted to start an initiative to
> > > write a best-of-breed combining choice parts of each operator, but that
> > > effort did not gain traction.
> > >
> > > Which is where my hypothesis comes from that if there were a clear
> > > "better fit" operator to start from we wouldn't be in a deadlock; the
> > > correct choice would be obvious. Reasonably so, every engineer that's
> > > written something is going to want that something to be used and not
> > > thrown away in favor of another something without strong evidence as to
> > > why that's the better choice.
> > >
> > > As far as I know, nobody has made a clear case as to a more compelling
> > > place to start in terms of an operator donation the project then
> > > collaborates on. There's no mass adoption evidence nor feature
> > > enumeration that I know of for any of the approaches anyone's taken, so
> > > the discussions remain stalled.
> > >
> > > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith
> > > <bened...@apache.org> wrote:
> > >
> > > I think there's significant value to the community in trying to
> > > coalesce on a single approach, sooner rather than later. This is an
> > > opportunity to expand the number of active organisations involved
> > > directly in the Apache Cassandra project, as well as to more quickly
> > > expand the project's functionality into an area we consider urgent and
> > > important. I think it would be a real shame to waste this opportunity.
> > > No doubt it will be hard, as organisations have certain built-in
> > > investments in their own approaches.
> > >
> > > I haven't participated in these calls as I do not consider myself to
> > > have the relevant experience and expertise, and have other focuses on
> > > the project. I just wanted to voice a vote in favour of trying to bring
> > > the different organisations together on a single approach if possible.
> > > Is there anything the project can do to help this happen?
> > >
> > > On 23/09/2020, 03:04, "Ben Bromhead" <b...@instaclustr.com> wrote:
> > >
> > > I think there is certainly an appetite to donate and standardise on a
> > > given operator (as mentioned in this thread).
> > >
> > > I personally found the SIG hard to participate in due to time zones and
> > > the synchronous nature of it.
> > >
> > > So while it was a great forum to dive into certain details for a subset
> > > of participants and a worthwhile endeavour, I wouldn't paint it as an
> > > accurate reflection of community intent.
> > >
> > > I don't think that any participants want to continue down the path of
> > > "let a thousand flowers bloom". That's why we are looking towards
> > > CassKop (as well as for a number of technical reasons).
> > >
> > > Some of the recorded meetings and outputs can also be found here if you
> > > are interested in some primary sources:
> > > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> > >
> > > From what I understand second-hand from talking to people on the SIG
> > > calls, there was a general inability to agree on an existing operator
> > > as a starting point and not much engagement on taking best of breed
> > > from the various to combine them. Seems to leave us in the "let a
> > > thousand flowers bloom" stage of letting operators grow in the ecosystem
> > > and seeing which ones meet the needs of end users before talking about
> > > adopting one into the foundation.
> > >
> > > Great to hear that you folks are joining forces though! Bodes well for
> > > C* users that are wanting to run things on k8s.
> > >
> > > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead <b...@instaclustr.com>
> > > wrote:
> > >
> > > For what it's worth, a quick update from me:
> > >
> > > CassKop now has at least two organisations working on it substantially
> > > (Orange and Instaclustr) as well as the numerous other contributors.
> > >
> > > Internally we will also start pointing others towards CassKop once a
> > > few things get merged. While we are not sunsetting our operator yet, it
> > > is certainly looking that way.
> > >
> > > I'd love to see the community adopt it as a starting point for working
> > > towards whatever level of functionality is desired.
> > >
> > > Cheers
> > >
> > > Ben
> > >
> > > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <john.sa...@gmail.com>
> > > wrote:
> > >
> > > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie <jmcken...@apache.org>
> > > wrote:
> > >
> > > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or
> > > more operators in the ecosystem. Has one of them hit a clear
> > > supermajority of adoption that makes it the de facto default and makes
> > > sense to pull it into the project?
> > >
> > > We as a project community were pretty slow to move on building a PoV
> > > around kubernetes so we find ourselves in a situation with a bunch of
> > > contenders for inclusion in the project. It's not clear to me what
> > > heuristics we'd use to gauge which one would be the best fit for
> > > inclusion outside letting community adoption speak.
> > >
> > > ---
> > > Josh McKenzie
> > >
> > > We actually talked a good bit on the SIG call earlier today about
> > > heuristics. We need to document what functionality an operator should
> > > include at level 0, level 1, etc. We did discuss this a good bit during
> > > some of the initial SIG meetings, but I guess it wasn't really a focal
> > > point at the time. I think we should also provide references to existing
> > > operator projects and possibly other related projects. This would
> > > benefit both community users as well as people working on these
> > > projects.
> > >
> > > - John
> > >
> > > --
> > >
> > > Ben Bromhead
> > >
> > > Instaclustr | www.instaclustr.com | @instaclustr
> > > <http://twitter.com/instaclustr> | (650) 284 9692
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > > --
> > >
> > > - John
> > >
> > >
> > > This message and its attachments may contain confidential or privileged
> > > information that may be protected by law; they should not be
> > > distributed, used or copied without authorisation. If you have received
> > > this email in error, please notify the sender and delete this message
> > > and its attachments. As emails may be altered, Orange is not liable for
> > > messages that have been modified, changed or falsified. Thank you.
> > >
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692
