These are my personal opinions, reflecting both my long experience with
database systems, and my newness to Cassandra...

[tl;dr]

The Cassandra contributors, having made its history, tend to describe it in
terms of implementation rather than action. And its implementation has a
history, all relatively recent, that many know, but which to newcomers like
me is obscure and, frankly, not particularly relevant.

Note: we are all trying to understand Crimea now, and to really understand,
you have to ingest several hundred years of history. Luckily, Cassandra has
not been around quite so long!

But Cassandra's history creeps into the nomenclature of CQL3. So what might
logically be called a 'hash key' is called a 'partition key', and what is
called a 'clustering key' might be better termed a 'range key', IMHO.

The 'official' terms in the nomenclature are important to know; they are
just not descriptive of what you actually do with them as a user. However,
they have meaning to those who have 'lived' the history of Cassandra, and
form an important bridge to the past.

As a new user I found them non-intuitive. Amazon has done a much better job
with DynamoDB - muddled, however, by bad syntax choices.

But you adjust and mentally map... I am still bumfuzzled when people talk
of slices and other C* cruft, but I just let it slide by like lectures from
my mother. That and Thrift can just fade into history with Gopher and Lynx
as far as I am concerned - CQL3 is where it's at.

But another thing to remember is that performance is king - and to get
performance you fly 'close to the metal': Cassandra does that and you
should know the code paths, the physical structures, and the
characteristics of your 'metal' to understand how to build high-performing
apps.

***

The answer to both asterisks is Yes. You should use the term 'clustering
column' because that is what is in the docs - but you should think 'range
key' for how you use it. Similarly 'partition key' : 'hash key'.
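
To make that mental map concrete, here is a toy sketch in Python. It is
not how Cassandra is implemented: the names (ToyTable, insert,
range_query) are made up for illustration, MD5-modulo stands in for the
real partitioner and token ring, and replication is ignored. It only shows
the two actions: the partition key is hashed to decide where a row lives,
and the clustering column keeps rows sorted inside that partition so you
can ask for a range.

```python
import bisect
import hashlib


class ToyTable:
    """Toy model only: hash the partition key, sort by the clustering key."""

    def __init__(self, node_count=3):
        # Each "node" maps partition key -> list of (clustering_key, value),
        # kept sorted by clustering_key.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, partition_key):
        # 'hash key' behaviour: the hash of the key picks the node.
        # (Assumption: MD5 modulo node count is a stand-in, not the real scheme.)
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def insert(self, partition_key, clustering_key, value):
        rows = self._node_for(partition_key).setdefault(partition_key, [])
        keys = [k for k, _ in rows]
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i] = (clustering_key, value)  # same key: overwrite (upsert)
        else:
            rows.insert(i, (clustering_key, value))  # keep the partition sorted

    def range_query(self, partition_key, low, high):
        # 'range key' behaviour: rows with low <= clustering_key < high,
        # found by binary search within a single partition.
        rows = self._node_for(partition_key).get(partition_key, [])
        keys = [k for k, _ in rows]
        return rows[bisect.bisect_left(keys, low):bisect.bisect_left(keys, high)]


table = ToyTable()
for ts in (30, 10, 20):
    table.insert("sensor-a", ts, "reading@%d" % ts)
print(table.range_query("sensor-a", 10, 30))
# prints [(10, 'reading@10'), (20, 'reading@20')]
```

Note that the range query only works once you have named the partition:
you hash to find the partition, then range over the sorted rows inside it.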

Good luck,

ml
