Re: Cassandra storage: Some thoughts

Vangelis Koukis Tue, 13 Mar 2018 05:39:12 -0700

On Sat, Mar 10, 2018 at 04:35:05am +0100, Oleksandr Shulgin wrote:
> On 9 Mar 2018 16:56, "Vangelis Koukis" <vkou...@arrikto.com> wrote:
> > 
> > Hello all,
> > 
> > My name is Vangelis Koukis and I am a Founder and the CTO of Arrikto.
> > 
> > I'm writing to share our thoughts on how people run distributed,
> > stateful applications such as Cassandra on modern infrastructure,
> > and would love to get the community's feedback and comments.
> > 
>
> Thanks, that sounds interesting.
>


Thank you Alex.

> > At Arrikto we are building decentralized storage to tackle this problem
> > for cloud-native apps. Our software, Rok
>
> 
> Do I understand correctly that there is only white paper available, but not
> any source code?
> 

I assume you are referring to the resources which are publicly available
on our website.

Yes, Rok is not open source, at least not for now.

> >  In this case,
> >       Cassandra only has to recover the changed parts, which is just a
> >       small fraction of the node data, and does not cause CPU load on
> >       the whole cluster.
> > 
> 
> How if not running a repair? And if it's a repair why would it not put CPU
> load on other nodes?
> 

Rok will present new local storage to the node, which will be a thin,
instant clone of its latest snapshot, and will be hydrated in the
background.

This process only involves the node under recovery, which communicates
with the Rok snapshot store directly, and happens with predictable
performance without impacting the rest of the cluster. Currently, we
recover at ~1GB/s when storing snapshots on S3.
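
As a rough back-of-the-envelope sketch of what that rate means for
background hydration time: the ~1 GB/s figure is the one quoted above,
while the 2 TB node size and the function name are purely illustrative
assumptions, not measurements.

```python
def hydration_seconds(node_data_gb: float, throughput_gb_per_s: float = 1.0) -> float:
    """Estimate how long background hydration of a node's snapshot takes.

    node_data_gb:        size of the node's data set in GB (hypothetical input)
    throughput_gb_per_s: sustained recovery rate; defaults to the ~1 GB/s
                         mentioned in this mail for snapshots stored on S3
    """
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return node_data_gb / throughput_gb_per_s


if __name__ == "__main__":
    # Assumed example: a node holding 2 TB (2000 GB) of data.
    seconds = hydration_seconds(2000)
    print(f"estimated hydration time: {seconds / 60:.1f} minutes")
```

Since the clone is thin and instant, the node can be started before
hydration completes; the estimate only covers the background transfer.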

At this point, the node has been recovered to a previous point in time,
so yes, a repair needs to run to bring it up to date with the rest of
the cluster.

We assume this will be an incremental repair, in which case the load on
the other nodes will be minimal: instead of having to transfer the full
data set, they will only have to provide the recovered node with the
data that has changed since the snapshot was last taken. Moreover, to
participate in the repair they will be reading their SSTables from
local, NVMe-based storage which has very good performance.

Since Rok can take snapshots of the whole Cassandra deployment
periodically, the amount of data to be transferred depends on the
snapshot frequency. If we assume a standard snapshot frequency of once
every 15 minutes, we can expect that only a small fraction of the whole
data set will need to be repaired for the node to be up to date again.
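
To make the "small fraction" concrete, here is a minimal sketch of the
arithmetic. The 15-minute interval is the one assumed above; the node
size, the per-node write rate, and the function name are hypothetical
values chosen only for illustration. It also assumes the worst case,
where the node fails just before the next snapshot would be taken.

```python
def stale_fraction(node_data_gb: float,
                   write_rate_mb_per_s: float,
                   snapshot_interval_min: float = 15.0) -> float:
    """Worst-case fraction of a node's data written since its last snapshot.

    All inputs are illustrative assumptions:
      node_data_gb          total data held by the node, in GB
      write_rate_mb_per_s   sustained write rate landing on the node, in MB/s
      snapshot_interval_min time between snapshots (15 minutes as in the mail)
    """
    if node_data_gb <= 0:
        raise ValueError("node_data_gb must be positive")
    # Data written during one full snapshot interval, converted MB -> GB.
    changed_gb = write_rate_mb_per_s * snapshot_interval_min * 60.0 / 1000.0
    # The repair can never need more than the whole data set.
    return min(1.0, changed_gb / node_data_gb)


if __name__ == "__main__":
    # Assumed example: 2 TB node, 10 MB/s of incoming writes.
    fraction = stale_fraction(2000, 10)
    print(f"at most {fraction:.2%} of the node's data needs repair")
```

Under these assumed numbers, roughly 9 GB out of 2000 GB would have to
be streamed from the other nodes, instead of the full data set.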

Hope the above answers your question,
Vangelis.

-- 
Vangelis Koukis
CTO, Arrikto Inc.
3505 El Camino Real, Palo Alto, CA 94306
www.arrikto.com
