They are evenly distributed: 5 nodes * 40 connections each using Hector, and I
can confirm that all 200 were active when this happened (from Hector's
perspective, from graphing the Hector JMX data), and all 5 nodes saw roughly 40
connections, and all were receiving traffic over those connections
We'll solve #2890 and we should have done it sooner.
That being said, a quick question: how do you do your inserts from the
clients? Are you evenly distributing the inserts among the nodes? Or are
you always hitting the same coordinator?
Because provided the nodes are correctly distributed on
It was exactly due to #2890, and the fact that the first replica is always the
one with the lowest value IP address. I patched Cassandra to pick a random
node out of the replica set in StorageProxy.java findSuitableEndpoint:
Random rng = new Random();
return endpoints.get(rng.nextInt(endpoints.
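The random-pick idea in the patch above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `pickRandomEndpoint` and the string-valued endpoint list are hypothetical stand-ins, not the actual signature of `findSuitableEndpoint` in Cassandra.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the replica selection step; not Cassandra's real API.
final class RandomReplicaPick {
    // One shared generator instead of a new Random per call.
    private static final Random RNG = new Random();

    // Returns a uniformly chosen element of the replica list,
    // rather than always returning the first (lowest-IP) entry.
    static <T> T pickRandomEndpoint(List<T> endpoints, Random rng) {
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("no endpoints to choose from");
        }
        return endpoints.get(rng.nextInt(endpoints.size()));
    }

    public static void main(String[] args) {
        // Example addresses are made up for illustration.
        List<String> replicas = Arrays.asList("10.0.0.54", "10.0.0.101", "10.0.0.102");
        System.out.println(pickRandomEndpoint(replicas, RNG));
    }
}
```

Spreading the choice across the replica set this way keeps any single node from receiving every replicate-on-write read, which is the imbalance described in this message.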
Does it always pick the node with the lowest IP address? All of my hosts are
in the same /24. The fourth node in the 5 node cluster has the lowest value in
the 4th octet (54). I erased the cluster and rebuilt it from scratch as a 3
node cluster using the first 3 nodes, and now the ReplicateOn
That ticket explains a lot, looking forward to a resolution on it.
(Sorry I don't have a patch to offer)
Ian
On Fri, Sep 2, 2011 at 12:30 AM, Sylvain Lebresne wrote:
> On Thu, Sep 1, 2011 at 8:52 PM, David Hawthorne wrote:
>> I'm curious... digging through the source, it looks like replicate on
That's interesting. I did an experiment wherein I added some entropy to the
row name based on the time when the increment came in (e.g. row = row + "/" +
(timestamp - (timestamp % 300))) and now not only is the load (in GB) on my
cluster more balanced, the performance has not decayed and has s
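The time-bucketed row name from the experiment above could be written as a small helper like this. It is a sketch, assuming the timestamp is in seconds (so each bucket spans 300 seconds); the message does not state the unit, and the names `BucketedRowKey` and `bucketedRowKey` are hypothetical.

```java
// Hypothetical helper illustrating the row-name bucketing described in the message.
final class BucketedRowKey {
    // Bucket width taken from the "% 300" in the message; unit assumed to be seconds.
    static final long BUCKET_WIDTH = 300L;

    // Appends the start of the timestamp's bucket to the base row name,
    // so increments arriving in different windows land in different rows.
    static String bucketedRowKey(String row, long timestamp) {
        long bucketStart = timestamp - (timestamp % BUCKET_WIDTH);
        return row + "/" + bucketStart;
    }

    public static void main(String[] args) {
        long nowSeconds = System.currentTimeMillis() / 1000L;
        System.out.println(bucketedRowKey("pageviews", nowSeconds));
    }
}
```

Because each bucket becomes its own row, a counter row stops growing once its window closes, and reads have to combine the rows for the windows of interest.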
On Thu, Sep 1, 2011 at 8:52 PM, David Hawthorne wrote:
> I'm curious... digging through the source, it looks like replicate on write
> triggers a read of the entire row, and not just the columns/supercolumns that
> are affected by the counter update. Is this the case? It would certainly
> exp
sorry i mean cf * row
if you look in the code, db.cf is just basically a set of columns
On Sep 1, 2011 1:36 PM, "Ian Danforth" wrote:
> I'm not sure I understand the scalability of this approach. A given
> column family can be HUGE with millions of rows and columns. In my
> cluster I have a sin
disk.
- Original Message -
From: "Ian Danforth"
To: user@cassandra.apache.org
Sent: Thursday, September 1, 2011 4:35:33 PM
Subject: Re: Replicate On Write behavior
I'm not sure I understand the scalability of this approach. A given
column family can be HUGE with millions of r
I'm not sure I understand the scalability of this approach. A given
column family can be HUGE with millions of rows and columns. In my
cluster I have a single column family that accounts for 90GB of load
on each node. Not only that, but the column family is distributed over the
entire ring.
Clearly I'm
when Cassandra reads, the entire CF is always read together; only at the
hand-over to the client does the pruning happen
On Thu, Sep 1, 2011 at 11:52 AM, David Hawthorne wrote:
> I'm curious... digging through the source, it looks like replicate on write
> triggers a read of the entire row, and not
I'm curious... digging through the source, it looks like replicate on write
triggers a read of the entire row, and not just the columns/supercolumns that
are affected by the counter update. Is this the case? It would certainly
explain why my inserts/sec decay over time and why the average inse