In order to split the nodes.
SimpleGeo has a maximum of 1,000 records (i.e. places) on each node in the tree; if
the number is >1,000 they split the node.
In order to prevent more than 1 process from editing/splitting the node, a
transaction is needed.
On Jul 22, 2011 1:01 AM, "aaron morton" wrote:
>> But
ERROR [pool-2-thread-3] 2011-07-22 10:34:59,102 Cassandra.java (line 3294)
Internal error processing insert
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut
down
at
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecu
One of the main reasons for regularly running repair is to make sure
deletes are propagated in the cluster, ie, data is not resurrected if a
node never received the delete call.
And repair-on-read takes care of repairing inconsistencies "on-the-fly".
So if I were to set a universal TTL on al
UnavailableException is raised server side when fewer than CL nodes are UP
when the request starts.
It seems odd to get it in this case because the default replication factor used
by the stress test is 1. How many nodes do you have, and have you made any changes
to the RF?
Also check the serv
You can use something like ZooKeeper to coordinate processes doing page splits.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 22 Jul 2011, at 19:05, Eldad Yamin wrote:
> In order to split the nodes.
> SimpleGeo have max 1
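The split-under-lock idea from this thread can be sketched as follows. This is only an illustration, not SimpleGeo's or ZooKeeper's actual code: `Node`, `add_record` and `MAX_RECORDS` are hypothetical names, and a local `threading.Lock` stands in for the distributed lock that a coordinator such as ZooKeeper would provide across processes.

```python
import threading

MAX_RECORDS = 1000  # split threshold quoted in the thread


class Node:
    """Hypothetical tree node; the lock is a local stand-in for a distributed lock."""

    def __init__(self, records=None):
        self.records = list(records or [])
        self.children = []  # empty while the node is still a leaf
        self.lock = threading.Lock()


def add_record(node, record):
    """Add a record to a leaf and split it once it holds more than MAX_RECORDS."""
    with node.lock:  # only one worker at a time may edit or split this node
        if node.children:
            raise ValueError("node was already split; descend to a child")
        node.records.append(record)
        if len(node.records) > MAX_RECORDS:
            ordered = sorted(node.records)
            mid = len(ordered) // 2
            # Move the records into two new children and empty the parent.
            node.children = [Node(ordered[:mid]), Node(ordered[mid:])]
            node.records = []
```

Because the size check and the split happen while the lock is held, two workers cannot both decide to split the same node.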
Something has shut down the mutation stage thread pool. This happens during
drain or decommission / move.
Restart the service and it should be OK.
If it happens again without anyone running something like drain, decommission
or move, let us know.
Cheers
-
Aaron Morton
Freelan
It happened again. I turned off compaction by setting the max and min compaction
thresholds to zero and ran 5 insert threads; after the database reached 27GB,
Cassandra failed with the same error. OS: Windows Server 2008 Datacenter; the JVM has a
1.5 GB heap; Cassandra version 0.8.1; all parameters in the conf file are
defa
Read repair will only repair data that is read on the nodes that are up at that
time, and does not guarantee that any changes it detects will be written back
to the nodes. The diff mutations are async fire and forget messages which may
go missing or be dropped or ignored by the recipient just li
Running only one node. I don't think it is coming from the replication
factor... I will try to sort this out. Any other suggestions from your
side are always helpful.
:) Thank you
On 22 July 2011 14:36, aaron morton wrote:
> UnavailableException is raised server side when there is les
Good points, Aaron. I realize now how expensive repair on reads is. I'm
going to keep doing repairs regularly but still have a max TTL on all
columns to make sure we don't have really old data we no longer need
getting buried in the cluster.
On , aaron morton wrote:
Read repair will only r
What does nodetool ring say?
On Fri, Jul 22, 2011 at 12:43 AM, Nilabja Banerjee
wrote:
> Hi All,
>
> I am following this link "
> http://www.datastax.com/docs/0.7/utilities/stress_java " for a stress test.
> I am getting this notification after running this command
>
> xxx.xxx.xxx.xx= m
>
>
> Short answer, yes it's safe to kill cassandra during a repair. It's one of
> the nice things about never mutating data.
>
> Longer answer: If nodetool compactionstats says there are no Validation
> compactions running (and the compaction queue is empty) and netstats says
> there is nothing s
Hi everyone
I've been struggling trying to get the data volume ("load") to equalize across
a balanced cluster, and I'm not sure what else I can try.
Background: This was originally a 5-node cluster. We re-balanced the 3 faster
machines across the ring, and decommissioned the 2 older ones. We
As of Cassandra 0.8.1, are counter increments and decrements idempotent? If,
for example, a client sends an increment request and the increment occurs,
but the network subsequently fails and reports a failure to the client, will
Cassandra retry the increment (thus leading to an overcount and incons
Are you trying to balance "load" or "owns"? "owns" looks fine:
33.33% each, which to me says balanced.
How did you calculate your tokens?
On Fri, Jul 22, 2011 at 4:37 PM, Mina Naguib
wrote:
>
> Address Status State Load Owns Token
> xx.xx.x.105 Up Normal
On Fri, Jul 22, 2011 at 4:52 PM, Kenny Yu wrote:
> As of Cassandra 0.8.1, are counter increments and decrements idempotent? If,
> for example, a client sends an increment request and the increment occurs,
> but the network subsequently fails and reports a failure to the client, will
> Cassandra re
I'm trying to balance Load ( 41.98GB vs 59.4GB vs 74.65GB )
Owns looks ok. They're all 33.33% which is what I want. It was calculated
simply by 2^127 / num_nodes. The only reason the first one doesn't start at 0
is that I've actually carved the ring planning for 9 machines (2 new data
cente
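The "2^127 / num_nodes" calculation mentioned in this message can be sketched as below. It assumes evenly spaced tokens on a 0 to 2^127 ring (the RandomPartitioner token space) with the first token at 0; `balanced_tokens` is a hypothetical helper name, not a Cassandra tool.

```python
def balanced_tokens(num_nodes):
    """Return num_nodes evenly spaced tokens on a ring of size 2**127."""
    ring_size = 2 ** 127
    # Integer division keeps the tokens exact (no float rounding).
    return [i * ring_size // num_nodes for i in range(num_nodes)]


if __name__ == "__main__":
    # A ring carved for 9 machines, as described in the message.
    for token in balanced_tokens(9):
        print(token)
```

Evenly spaced tokens equalize "Owns", not "Load": the on-disk size can still differ between nodes, which is the situation described in this thread.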
With the current implementation of CompositeType in Cassandra 0.8.1,
is it recommended practice to try to use a CompositeType as the key?
Or are both, column and key, equally well supported?
The documentation on CompositeType is light, well non-existent really, with
key_validation_class set to Co
If you are using OPP, then you can use CompositeType on both key and
column name; otherwise (RandomPartitioner), just use it for columns.
On 22/07/2011 17:10, Patrick Julien wrote:
With the current implementation of CompositeType in Cassandra 0.8.1,
is it recommended practice to try to use a Compo
I can still use it for keys if I don't need ranges then? Because for
what we are doing we can always re-assemble keys
On Fri, Jul 22, 2011 at 11:38 AM, Donal Zang wrote:
> If you are using OPP, then you can use CompositeType on both key and column
> name; otherwise(Random Partition), just use it
btw, this "issue" of not knowing whether a write is persisted or not
when client reports error, is not limited to counters, for regular
columns, it's the same: if client reports write failure, the value may
well be replicated to all replicas later. this is even the same with
all other systems: Z
If that's the case, your client is being misleading. Cassandra
distinguishes between Unavailable (we knew we couldn't achieve CL
before we started, and nothing changed) and TimedOut (didn't get reply
in a timely fashion; it may or may not have gone through).
TimedOut != Failed.
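A minimal sketch of how a client might act on that distinction. The two exception classes are local stand-ins defined here for illustration, and `write_with_retry` is a hypothetical helper, not part of any Cassandra client library.

```python
class UnavailableException(Exception):
    """Stand-in: too few replicas were up, so the write was never started."""


class TimedOutException(Exception):
    """Stand-in: no timely reply; the write may or may not have been applied."""


def write_with_retry(do_write, idempotent, max_attempts=3):
    """Call do_write() and return 'ok', 'not_applied' or 'unknown'."""
    for _ in range(max_attempts):
        try:
            do_write()
            return "ok"
        except UnavailableException:
            return "not_applied"  # nothing changed, safe to report as failed
        except TimedOutException:
            if not idempotent:
                # e.g. a counter increment: a blind retry could overcount
                return "unknown"
            # idempotent write (same column, same value): retrying is harmless
    return "unknown"
```

Rewriting a regular column with the same value and timestamp can be retried after a timeout; a counter increment cannot be retried without risking a double count, so the sketch reports its outcome as unknown.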
On Fri, Jul 22, 2
On 22/07/2011 17:56, Patrick Julien wrote:
I can still use it for keys if I don't need ranges then? Because for
what we are doing we can always re-assemble keys
Yes, but why would you use CompositeType if you don't need range queries?
On Fri, Jul 22, 2011 at 11:38 AM, Donal Zang wrote:
If you a
On 22/07/2011 18:08, Yang wrote:
btw, this "issue" of not knowing whether a write is persisted or not
when client reports error, is not limited to counters, for regular
columns, it's the same: if client reports write failure, the value may
well be replicated to all replicas later. this is even
> yes,but why would you use CompositeType if you don't need range query?
If you were doing composite keys anyway (common approach with time
series data for example), you would not have to write parsing and
concatenation code. Particularly useful if you had mixed types in the
key.
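For illustration, this is roughly the hand-written concatenation and parsing code that a composite type saves you from. The key layout (a sensor id plus a day, joined by ":") and the names `make_key` and `parse_key` are made up for this sketch.

```python
SEP = ":"  # hypothetical separator; must not appear in the last component


def make_key(sensor_id, day):
    """Build a row key such as 'sensor-7:20110722' from its two parts."""
    return f"{sensor_id}{SEP}{day:08d}"


def parse_key(key):
    """Split a row key back into (sensor_id, day)."""
    sensor_id, day = key.rsplit(SEP, 1)
    return sensor_id, int(day)
```

With mixed types in the key, every reader and writer has to agree on this encoding by hand, which is the chore the message refers to.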
Exactly. In any case, I just answered my own question. If I need
ranges, I can just make another column family where the column names are
these keys
On Fri, Jul 22, 2011 at 12:37 PM, Nate McCall wrote:
>> yes,but why would you use CompositeType if you don't need range query?
>
> If you were doing
In order to be predictable at big data scale, the intensity and periodicity of
STW garbage collection have to be brought down. Assume that SLABS (Cass 2252)
will be available in the main line at some time and assume that this will
have the impact that other projects (HBase etc.) are reporting. I wonder
The Cassandra team is pleased to announce the release of Apache Cassandra
version 0.7.8.
This version is a bug fix release[1] and in particular it fixes a regression
in Cassandra 0.7.7 that prevented hinted handoff delivery from being triggered
automatically (you could still force delivery through JMX).
On Fri, Jul 22, 2011 at 12:05 AM, Eldad Yamin wrote:
> In order to split the nodes.
> SimpleGeo has a maximum of 1,000 records (i.e. places) on each node in the tree; if
> the number is >1,000 they split the node.
> In order to prevent more than 1 process from editing/splitting the node, a
> transaction i
Yes, I am wondering more about the yaml file and the settings like the
autobootstrap setting and such.
I guess I will find out once they enable my amazon service and I can get
running with it.
NOTE: anyone doing 1.0 or prototype I think constantly uses start/stop whole
cluster to upgrade/install
is there such an option?
in some cases I want to distribute some small lookup tables to all the
nodes, so that everyone has a local copy, and loaded in memory. so the
lookup is fast. supposedly I want to write to all N nodes, but that
exposes me to failure in case of just one node down.
so I'd lik
On Fri, Jul 22, 2011 at 3:24 PM, Yang wrote:
> is there such an option?
>
> in some cases I want to distribute some small lookup tables to all the
> nodes, so that everyone has a local copy, and loaded in memory. so the
> lookup is fast. supposedly I want to write to all N nodes, but that
> expos
I don't see a JVM crashlog ( hs_err_pid[pid].log) in
~/brisk/resources/cassandra/bin or /tmp. So maybe JVM didn't crash?
We're running a pretty up-to-date Sun Java:
ubuntu@ip-10-2-x-x:/tmp$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpo
I've not tried this, but a speculative schema would probably be
something like the following:
Super Col family for structure
hash(nodeId): {
root: { left="nodeId1", right="nodeId2" }
nodeId1: { left="nodeId3", right="nodeId4" }
On Fri, Jul 22, 2011 at 9:27 AM, Donal Zang wrote:
> On 22/07/2011 18:08, Yang wrote:
>>
>> btw, this "issue" of not knowing whether a write is persisted or not
>> when client reports error, is not limited to counters, for regular
>> columns, it's the same: if client reports write failure, the v
Ideally, we would want to have a replication factor of 4, and a minimum
write consistency of 2 (which looking at the default in cassandra.yaml is to
memory first with asynch to disk...perfect so far!!!)
Now, obviously, I can get the partitioner setup to make sure I get 2
replicas in each data cent
I'm not sure if this is the answer, but try a major compaction on each node
for each column family. I suspect the data shuffle has left quite a few
deleted keys which may get cleaned out by a major compaction. As I
remember, major compaction doesn't run automatically in 0.7.x; I'm not sure if
it is triggered by r
Hi,
I just noticed that count(*) in CQL seems to return the wrong answer: when I
have only one row, count(*) returns two.
Below are the commands I tried:
cqlsh> SELECT COUNT(*) FROM UserProfile USING CONSISTENCY QUORUM WHERE KEY IN
('00D760DB1730482D81BC6845F875A97D');
(2,)
cqlsh> selec
On Fri, 2011-07-22 at 14:18 -0700, Hefeng Yuan wrote:
> Hi,
>
> I just noticed that the count(*) in CQL seems to be having wrong answer, when
> I have only one row, the count(*) returns two.
>
> Below are the commands I tried:
>
> cqlsh> SELECT COUNT(*) FROM UserProfile USING CONSISTENCY QUORUM
It sounds like what you're looking for is write consistency of local_quorum:
http://www.datastax.com/docs/0.8/consistency/index#write-consistency
local_quorum would mean the write has to be successful on a majority of
nodes in DC1 (so 2) before it is considered successful.
If you use just quorum
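The majority arithmetic behind these consistency levels can be sketched as follows, assuming the layout from the question (replication factor 4, with 2 replicas in each of two data centers). `quorum` is a hypothetical helper, not a Cassandra API.

```python
def quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


replicas_per_dc = {"DC1": 2, "DC2": 2}  # assumed layout from the question
total_replicas = sum(replicas_per_dc.values())

local_quorum_dc1 = quorum(replicas_per_dc["DC1"])  # majority within DC1 only
cluster_quorum = quorum(total_replicas)            # majority across all replicas
```

Under these assumptions a local quorum in DC1 needs 2 acknowledgements, while a majority of all 4 replicas needs 3, so at least one of them must come from the other data center.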
This is a common pattern used in RDBMSs;
is there some existing idiom to do it in Cassandra?
If the size of "select * from A where id == a" is very large, and
similarly for B, while the join of A.id == a and B.id == b is small,
then doing a get() for both and then merging seems excessively slow.
Yes, this is broken. We'll fix this for
https://issues.apache.org/jira/browse/CASSANDRA-2474
On Fri, Jul 22, 2011 at 4:18 PM, Hefeng Yuan wrote:
> Hi,
>
> I just noticed that the count(*) in CQL seems to be having wrong answer, when
> I have only one row, the count(*) returns two.
>
> Below are
Hi Peter
That was precisely it. Thank you :)
Doing a major compaction on the heaviest node (74.65GB) reduced it to 33.55GB.
I'll compact the other 2 nodes as well. I anticipate they will also settle
around that size.
On 2011-07-22, at 5:00 PM, Peter Tillotson wrote:
> I'm not sure if this