I know that adding a new vnode-enabled DC is the recommended method to
convert an existing cluster to vnodes, and that the cassandra-shuffle
utility has been removed.
That said, I've done some testing and it appears to be possible to
perform an in place conversion as long as all nodes contain
Hello all,
I have read a lot about Cassandra, including key-value pairs,
partition keys, clustering keys, etc.
Do the key in a key-value pair and the partition key refer to the same thing, or are
they different?
CREATE TABLE corpus.bigram_time_category_ordered_frequency (
id bigint,
word1 va
Correction: year and category form a “composite partition key”.
frequency, word1, and word2 are “clustering columns”.
The combination of a partition key with clustering columns is a “compound
primary key”.
Every CQL row will have a partition key by definition, and may optionally have
Hi Jack,
So what will be the keys and values of the following CF instance?
year | category | frequency | word1| word2 | id
2014 |N | 1 |සියළුම | යුද්ධ | 664
2014 |
For the first row, the key is: (2014, N, 1, සියළුම, යුද්ධ) and the value-part
is (664).
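
To make the key/value split above concrete, here is a minimal Python sketch. It assumes the column roles given in the correction earlier in this thread (year and category as the composite partition key; frequency, word1 and word2 as clustering columns); the helper name split_row is hypothetical and not part of any driver.

```python
# Column roles as described in this thread; split_row is a hypothetical helper.
PARTITION_KEY = ("year", "category")
CLUSTERING_COLUMNS = ("frequency", "word1", "word2")


def split_row(row):
    """Split a CQL row (given as a dict) into partition key, clustering key and values."""
    partition = tuple(row[c] for c in PARTITION_KEY)
    clustering = tuple(row[c] for c in CLUSTERING_COLUMNS)
    key_columns = set(PARTITION_KEY) | set(CLUSTERING_COLUMNS)
    # Everything that is not part of the primary key is the "value part".
    values = {c: v for c, v in row.items() if c not in key_columns}
    return partition, clustering, values


row = {"year": 2014, "category": "N", "frequency": 1,
       "word1": "සියළුම", "word2": "යුද්ධ", "id": 664}
partition, clustering, values = split_row(row)
print("key:", partition + clustering, "value part:", values)
```

The partition key decides which nodes hold the row; the clustering columns only order rows within that partition.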
Jens Rantil
Backend engineer
Tink AB
Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se
On Tue, Dec 16, 2014 at 2:25 PM, Chamila Wijayarathn
Hi Jens,
Thank You!
On Tue, Dec 16, 2014 at 7:03 PM, Jens Rantil wrote:
> For the first row, the key is: (2014, N, 1, සියළුම, යුද්ධ) and the
> value-part is (664).
> Cheers,
> Jens
> ——— Jens Rantil Backend engineer Tink AB Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32 Web: www.tink
Hello all,
I am trying to test my application using cassandra-unit with the
schema and data given below.
CREATE TABLE corpus.bigram_time_category_ordered_frequency (
id bigint,
word1 varchar,
word2 varchar,
year int,
category varchar,
frequency int,
> You are, of course, free to use batches in your application
I'm not looking to justify the use of batches, I'm looking for the path
forward that will give us the Best Results™ both near and long term, for
some definition of Best (which would be a balance of client throughput and
cluster pressure
Howdy all,
Unfortunately, our use of Cassandra involves lots of deletes. Yes, I
know that C* is not well suited to this kind of workload, but that's where
we are, and before I go looking for an entirely new data layer I would
rather explore whether C* could be tuned to work well for us.
No, deletes are always written as a tombstone no matter the consistency.
This is because data at rest is written to sstables which are immutable
once written. The tombstone marks that a record in another sstable is now
deleted, and so a read of that value should be treated as if it doesn't
Tombstones have to be created. The SSTables are immutable, so the data cannot
be deleted. Therefore, a tombstone is required. The value you deleted will be
physically removed during compaction.
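
The behaviour described above can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: sstables are plain dicts mapping a key to a (timestamp, value) pair, timestamps are in seconds, and compaction always merges every sstable at once. Real Cassandra compaction is more involved, and the names read, compact and TOMBSTONE are made up for this illustration.

```python
TOMBSTONE = object()                 # sentinel standing in for a delete marker
GC_GRACE_SECONDS = 10 * 24 * 3600    # default gc_grace_seconds (10 days)


def read(sstables, key):
    """Return the newest value for key across all sstables, or None if deleted or absent."""
    newest = None
    for table in sstables:
        cell = table.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    if newest is None or newest[1] is TOMBSTONE:
        return None
    return newest[1]


def compact(sstables, now):
    """Merge all sstables into a new one, keeping the newest cell per key.

    Tombstones older than GC_GRACE_SECONDS are dropped, which is the point
    at which the deleted data is physically gone.
    """
    merged = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    return {key: cell for key, cell in merged.items()
            if not (cell[1] is TOMBSTONE and now - cell[0] > GC_GRACE_SECONDS)}


older = {"k1": (100, "v1"), "k2": (100, "v2")}
newer = {"k1": (200, TOMBSTONE)}     # the delete is written as a new cell
sstables = [older, newer]

print(read(sstables, "k1"))          # the tombstone shadows the older value
print(compact(sstables, now=200 + GC_GRACE_SECONDS + 1))
```

Neither input dict is modified: the delete lives in a newer sstable, and only compaction produces a file without the old value.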
My workload sounds similar to yours in some respects, and I was able to get C*
working for me. I have
Ah, makes sense. Thanks for the explanations!
- Ian
On Tue, Dec 16, 2014 at 10:53 AM, Robert Wille wrote:
> Tombstones have to be created. The SSTables are immutable, so the data
> cannot be deleted. Therefore, a tombstone is required. The value you
> deleted will be physically removed duri
When you say “no need for tombstones”, did you actually read that somewhere or
were you just speculating? If the former, where exactly?
-- Jack Krupansky
From: Ian Rose
Sent: Tuesday, December 16, 2014 10:22 AM
To: user
Subject: does consistency=ALL for deletes obviate the need for tombstones?
Nope. I added millions of records and several GB to the cluster while one node
was down, and then ran "nodetool flush system hints" on a couple of nodes that
were up, and system/hints has less than 200K in it.
Here’s the relevant part of "nodetool cfstats system.hints":
Keyspace: system
Hi Jonathan, QUORUM = (sum_of_replication_factors / 2) + 1. For us, QUORUM =
(2/2) + 1 = 2.
The default CL is ONE, and we have RF=2 with two nodes in the cluster. (I am a little
confused: what is my read CL and what is my write CL?)
So, does it mean that every WRITE will be written to both nodes?
and For eve
I have been having a few exchanges with contributors to the project around what
is possible with Cassandra and a common response that comes up when I describe
functionality as broken or missing is that I am not modelling my data
correctly. Unfortunately, I cannot seem to find comprehensive d
With RF=2, CL QUORUM is equivalent to ALL: writes require
acknowledgement from both nodes, and reads go to both nodes.
CL ONE still writes to both replicas but returns success as soon as the first
one responds; reads go to one node (the load balancing strategy
determines which one).
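
The quorum arithmetic from this thread can be checked with a few lines of Python. This is a minimal sketch for a single datacenter; the function names are made up for illustration, and the division is integer division as in the formula quoted earlier.

```python
def quorum(replication_factor):
    """QUORUM = (replication_factor / 2) + 1, using integer division."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor):
    """How many replicas can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(f"RF={rf}: QUORUM={quorum(rf)}, "
          f"replicas that may be down={tolerated_replica_failures(rf)}")
```

With RF=2 a quorum needs both replicas, so one node being down fails QUORUM requests; with RF=3 a quorum is still 2, so one replica can be down.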
Thanks Ryan.
So, as Jonathan recommended, we should have RF=3 with three nodes.
Then QUORUM = 2, so I need the CL to be set to TWO (or QUORUM), and I will not
need the downgrading retry policy if one of my nodes goes down.
I can dynamically add a New node to my Cluster.
Can I change my RF to 3,
You'll have to run repair, and that will involve some load and streaming,
but this is a normal use case for Cassandra, and your cluster should be
sized load-wise to allow repair and bootstrapping of new nodes. Otherwise,
when you're overwhelmed, you won't be able to add more nodes easily.
If you ne
Thanks Ryan. We will get a new node and add it to the cluster. I will mail
if I have any questions about it.
On Tue, Dec 16, 2014 at 10:52 PM, Ryan Svihla wrote:
> you'll have to run repair and that will involve some load and streaming,
> but this is a normal use case for cassandra..a
Data Modeling a distributed application could be a book unto itself.
However, I will add, modeling by restriction is basically the entire
thought process in Cassandra data modeling since it's a distributed hash
table and a core aspect of that sort of application is you need to be able
to quickly lo
I'd ask the author of cassandra-unit. I've not personally used that project.
On Tue, Dec 16, 2014 at 8:00 AM, Chamila Wijayarathna <
cdwijayarat...@gmail.com> wrote:
> Hello all,
> I am trying to test my application using cassandra-unit with following
> schema and data given below.
Repair's performance varies heavily with a large number of factors.
Hours for one node to finish is within the range of what I see in the wild, but again,
there are so many factors that it's impossible to speculate on whether that is good
or bad for your cluster. Factors that matter include:
1. speed of dis
Thanks for the response. It offers a bit more clarity.
I think a series of blog posts with good real world examples would go a long
way to increasing usability of Cassandra. Right now I find the process like
going through a mine field because I only discover what is not possible after
There is a lot of stuff out there, and the best thing you can do today is
watch Patrick McFadden's series. This is what I used before I started
at DataStax. Planet Cassandra has a data modeling playlist of videos you
can watch
I was speculating. From the responses above, it now appears to me that
tombstones serve (at least) 2 distinct roles:
1. When reading within a single cassandra instance, they mark a new version
of a value (that value being "deleted"). Without this, the prior version
would be the most recent and s
I have a three-node cluster that has been sitting at a load of 4 (for each
node), 100% CPU utilization (although 92% nice) for the last 12 hours,
ever since some significant writes finished. I'm trying to determine what
tuning I should be doing to get it out of this state. The debug log is just
What version of Cassandra are you running?
If it's 2.0, we recently experienced something similar with 8447 [1],
which 8485 [2] should hopefully resolve.
Please note that 8447 is not related to tombstones. Tombstone processing
can put a lot of pressure on the heap as well. Why do y
What's heap usage at?
On Tue, Dec 16, 2014 at 1:04 PM, Arne Claassen wrote:
> I have a three-node cluster that has been sitting at a load of 4 (for each
> node), 100% CPU utilization (although 92% nice) for the last 12 hours,
> ever since some significant writes finished. I'm trying to determi
I'm running 2.0.10.
The data is all time series data, and as we change our pipeline, we've
periodically been reprocessing the data sources, which causes each time
series to be overwritten, i.e. every row per partition key is deleted and
re-written, so I assume I've been collecting a bunch of t
What are the CPU, RAM, storage layer, and data density per node? Exact heap
settings would be nice. In the logs, look for TombstoneOverwhelmingException.
On Tue, Dec 16, 2014 at 1:36 PM, Arne Claassen wrote:
> I'm running 2.0.10.
> The data is all time series data and as we change our pipeline, we've
AWS r3.xlarge, 30GB, but only using a Heap of 10GB, new 2GB because we
might go c3.2xlarge instead if CPU is more important than RAM
Storage is optimized EBS SSD (but iostat shows no real IO going on)
Each node only has about 10GB with ownership of 67%, 64.7% & 68.3%.
The node on which I set the H
Sorry, I meant 15GB heap on the one machine that has less nice CPU% now.
The others are 6GB
On Tue, Dec 16, 2014 at 12:50 PM, Arne Claassen wrote:
> AWS r3.xlarge, 30GB, but only using a Heap of 10GB, new 2GB because we
> might go c3.2xlarge instead if CPU is more important than RAM
> Storage i
Changed the 15GB node to 25GB heap and the nice CPU is down to ~20% now.
Checked my dev cluster to see if the ParNew log entries are just par for
the course, but not seeing them there. However, both have the following
every 30 seconds:
DEBUG [BatchlogTasks:1] 2014-12-16 21:00:44,898 BatchlogManage
A heap of that size without some tuning will create a number of problems
(high CPU usage being one of them). I suggest either an 8GB heap and 400MB ParNew
(which I'd only set that low for such a low CPU count), or attempting the
tunings indicated in https://issues.apache.org/jira/browse/CASSANDRA-8150
On T
Also, based on the replayed batches: are you using batches to load data?
On Tue, Dec 16, 2014 at 3:12 PM, Ryan Svihla wrote:
> So heap of that size without some tuning will create a number of problems
> (high cpu usage one of them), I suggest either 8GB heap and 400mb parnew
> (which I'd only set th
I have a time series table consisting of frame information for media. The
table is partitioned on the media ID and uses time and some other frame-level
keys as clustering keys, i.e. all frames for one piece of media are
really one column family "row", even though it is represented in CQL as a
The starting configuration I had, which is still running on two of the
nodes, was 6GB heap, 1024MB ParNew, which is close to what you are
suggesting, and those have been pegged at load 4 for over 12 hours with
hardly any read or write traffic. I will set one to 8GB/400MB and see if
its load chang
So 1024MB is still a good 2.5 times what I'm suggesting, and 6GB is hardly enough
to run Cassandra well in, especially if you're going full bore on loads.
However, you may just flat out be CPU bound on your write throughput. How
many TPS and what size writes do you have? Also, what is your widest row?
Actually not sure why the machine was originally configured at 6GB since we
even started it on an r3.large with 15GB.
Re: Batches
Not using batches. I actually have that as a separate question on the list.
Currently I fan out async single inserts and I'm wondering if batches are
better since my d
Can you define what "virtually no traffic" means? Sorry to be repetitive about
that, but I've worked on a lot of clusters in the past year and people have
wildly different ideas of what that means.
Unlogged batches of the same partition key are definitely a performance
optimization. Typically async is much
No problem with the follow up questions. I'm on a crash course here trying
to understand what makes C* tick so I appreciate all feedback.
We reprocessed all media (1200 partition keys) last night where partition
keys had somewhere between 4k and 200k "rows". After that completed, no
traffic went t
OK, based on those numbers I have a theory.
Can you show me nodetool tpstats for all 3 nodes?
On Tue, Dec 16, 2014 at 4:04 PM, Arne Claassen wrote:
> No problem with the follow up questions. I'm on a crash course here trying
> to understand what makes C* tick so I appreciate all feedback.
> W
Of course QA decided to start a test batch (still relatively low traffic),
so I hope it doesn't throw the tpstats off too much
Node 1:
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0       13804928         0
So you've got some blocked flush writers, but you have an incredibly large
number of dropped mutations. Are you using secondary indexes, and if so, how
many? What is your flush queue set to?
On Tue, Dec 16, 2014 at 4:43 PM, Arne Claassen wrote:
> Of course QA decided to start a test batch (still r
Not using any secondary indices, and memtable_flush_queue_size is the default 4.
But let me tell you how data is "mutated" right now; maybe that will give you
some insight into how this is happening.
Basically the frame data table has the following primary key: PRIMARY KEY
((id), trackid, "timestamp
So a delete is really another write that is kept for gc_grace_seconds (default 10 days).
If you get enough tombstones, it can make managing your cluster a challenge
as is. Open up cqlsh, turn on tracing and try a few queries. How many
tombstones are scanned for a given query? It's possible the heap problems
I just did a wide set of selects and ran across no tombstones. But while on the
subject of gc_grace_seconds, is there any reason, on a small cluster, not to set it to
something low like a single day? It seems like 10 days is only needed for large
clusters undergoing long partition splits, or am I misundersta
Manual forced compactions create more problems than they solve. If you have
no evidence of tombstones in your selects (which seems odd; can you share
some of the tracing output?), then I'm not sure what it would solve for you.
Compaction running could explain a high load, logs messages with ERRORS
Looking at the output of "nodetool netstats", I see that the bootstrapping nodes
are pulling from only two of the nine nodes currently in the datacenter. That
surprises me: I'd think the vnodes it pulls from would be randomly spread
across the existing nodes. We're using Cassandra 2.0.11 with 256
That's just the thing. There is nothing in the logs except the constant ParNew
collections like
DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line 118) GC
for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634888
But the load is staying continuously high.
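
For readers who want to track these GCInspector lines over time, here is a small Python sketch that parses the line quoted above and derives the average pause per collection and the heap usage. The regular expression is written against that one sample line only and is an assumption about the format; parse_gc_line is a hypothetical helper, not part of Cassandra.

```python
import re

# Pattern written against the single sample line in this thread (assumed format).
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Extract collector name, average pause and heap usage from a GCInspector line."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    total_ms, count = int(match["ms"]), int(match["count"])
    used, maximum = int(match["used"]), int(match["max"])
    return {
        "collector": match["collector"],
        "avg_pause_ms": total_ms / count,
        "heap_used_pct": 100 * used / maximum,
    }


line = ("DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line 118) "
        "GC for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634888")
stats = parse_gc_line(line)
print(stats)
```

For the sample line this gives an average of 16.6 ms per ParNew collection with the heap about 55% full.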
What version of Cassandra?
On Dec 16, 2014 6:36 PM, "Arne Claassen" wrote:
> That's just the thing. There is nothing in the logs except the constant
> ParNew collections like
> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line
> 118) GC for ParNew: 166 ms for 10 collection
Cassandra 2.0.10 and Datastax Java Driver 2.1.1
On Dec 16, 2014, at 4:48 PM, Ryan Svihla wrote:
> What version of Cassandra?
> On Dec 16, 2014 6:36 PM, "Arne Claassen" wrote:
> That's just the thing. There is nothing in the logs except the constant
> ParNew collections like
> DEBUG [Sche
When I set the consistency to QUORUM on the cqlsh command line, it says
the consistency is set to QUORUM.
cqlsh:testdb> CONSISTENCY QUORUM ;
Consistency level set to QUORUM.
However when I check it back using CONSISTENCY command on the prompt
it says consistency is 4. However it should be 2 as my replic
Maybe checking which thread(s) would hint what's going on? (see
On Wed, Dec 17, 2014 at 1:51 AM, Arne Claassen wrote:
> Cassandra 2.0.10 and Datastax Java Driver 2.1.1
> On Dec 16, 2014, at 4:48 PM, Ry