http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningEC2_c.html
From the link:
EBS volumes are not recommended for Cassandra data volumes for the following
reasons:
• EBS volumes contend directly for network throughput with standard
packets. Th
+Cassandra DL
We have Cassandra nodes in three datacenters - dc1, dc2 and dc3 - and the
cluster name is DataCluster. Our application code is also deployed in the
same three datacenters and accesses Cassandra. Now I want to make sure
whether an application call is coming from `dc1`
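A quick way to check which DC a given coordinator belongs to (a sketch; <node-address> is a placeholder for whichever node the application connects to, and system.local is available from Cassandra 1.2 on):
cqlsh <node-address> <<'CQL'
-- the connected node reports its own data center and rack here
SELECT data_center, rack FROM system.local;
CQL
# or view the DC/rack layout of the whole cluster as Cassandra sees it
nodetool -h <node-address> status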
Hey that helped! Just to quell your curiosity here's my
snitch: endpoint_snitch: SimpleSnitch
thanks!
On Wed, Jun 18, 2014 at 11:03 PM, Marcelo Elias Del Valle <
marc...@s1mbi0se.com.br> wrote:
>
> Is "replication_factor" your DC name?
>
> Here is what I would use:
>
>
> CREATE KEYSPACE IF NO
Is "replication_factor" your DC name?
Here is what I would use:
CREATE KEYSPACE IF NOT EXISTS animals
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
'DC1' : 3 };
But in my case, I am using GossipingPropertyFileSnitch and DC1 is
configured there, so Cassandra knows which nodes are i
hey all,
I know that something pretty basic must be wrong here. But what is the
mistake I'm making in creating this keyspace?
cqlsh> create keyspace animals with replication = { 'class':
'NetworkTopologyStrategy', 'replication_factor' : 3};
Bad Request: Error constructing replication strategy cla
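For what it's worth, the error above comes from NetworkTopologyStrategy taking data center names as its options, while 'replication_factor' belongs to SimpleStrategy; a sketch of both working forms (the DC name must match whatever your snitch reports, e.g. in nodetool status):
# SimpleStrategy is the class that understands 'replication_factor'
cqlsh <<'CQL'
CREATE KEYSPACE IF NOT EXISTS animals
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
CQL
# NetworkTopologyStrategy wants per-DC counts instead
cqlsh <<'CQL'
CREATE KEYSPACE IF NOT EXISTS animals
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };
CQL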
Amen. I believe the whole seed node/bootstrapping confusion goes against
the "Why Cassandra", quoted from
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra
*Operational simplicity* – with all nodes in a cluster being the same,
there is no complex configu
I don't think I have the space to run a major compaction right now (I'm
above 50% disk space used already) and compaction can take extra space I
think?
On Wed, Jun 18, 2014 at 3:24 PM, Robert Coli wrote:
> On Wed, Jun 18, 2014 at 12:05 PM, Brian Tarbox
> wrote:
>
>> Thank you! We are not usi
On Wed, Jun 18, 2014 at 12:05 PM, Brian Tarbox
wrote:
> Thank you! We are not using TTL, we're manually deleting data more than
> 5 days old for this CF. We're running 1.2.13 and are using size tiered
> compaction (this cf is append-only, i.e. zero updates).
>
> Sounds like we can get away with
On Wed, Jun 18, 2014 at 5:36 AM, Alain RODRIGUEZ wrote:
> We stop the node using : nodetool disablegossip && nodetool disablethrift
> && nodetool disablebinary && sleep 10 && nodetool drain && sleep 30 &&
> service cassandra stop
>
The stuff before "nodetool drain" here is redundant and doesn't
Rob,
Thank you! We are not using TTL, we're manually deleting data more than 5
days old for this CF. We're running 1.2.13 and are using size tiered
compaction (this cf is append-only, i.e. zero updates).
Sounds like we can get away with doing a (stop, delete old-data-file,
restart) process on a r
On Wed, Jun 18, 2014 at 4:56 AM, Jonathan Lacefield wrote:
> What Artur is alluding to is that seed nodes do not bootstrap.
> Replacing seed nodes requires a slightly different approach for node
> replacement compared to non seed nodes. See here for more details:
> http://www.datastax.com/doc
On Wed, Jun 18, 2014 at 10:56 AM, Brian Tarbox
wrote:
> I have a column family that only stores the last 5 days worth of some
> data...and yet I have files in the data directory for this CF that are 3
> weeks old.
>
Are you using TTL? If so:
https://issues.apache.org/jira/browse/CASSANDRA-6654
On Tue, Jun 17, 2014 at 11:08 PM, Prabath Abeysekara <
prabathabeysek...@gmail.com> wrote:
> First of all, apologies if the $subject has been discussed on this
> list before. I've already gone through quite a few email threads on this but
> still couldn't find a convincing answer which really
Another nice resource...
http://www.ecyrd.com/cassandracalculator/
On Wed, Jun 18, 2014 at 9:10 AM, Brian Tarbox
wrote:
> We do a repair -pr on each node once a week on a rolling basis.
>
https://issues.apache.org/jira/browse/CASSANDRA-5850?focusedCommentId=14036057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036057
> Shoul
I have a column family that only stores the last 5 days worth of some
data...and yet I have files in the data directory for this CF that are 3
weeks old. They take the form:
keyspace-CFName-ic--Filter.db
keyspace-CFName-ic--Index.db
keyspace-CFName-ic--Data.db
keyspace-CFName-ic--
repair only creates snapshots if you use the “-snapshot” option.
On June 18, 2014 at 12:28:58 PM, Marcelo Elias Del Valle
(marc...@s1mbi0se.com.br) wrote:
AFAIK, when you run a repair a snapshot is created.
After the repair, I run "nodetool clearsnapshot" to save disk space.
Not sure it's you
For snapshots, yes. For incremental backups you need to delete the files
yourself.
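A sketch of both cleanups, assuming the default data directory layout (adjust the path and retention to your setup, and only delete backup files once they have been copied off the node):
# snapshots: Cassandra can remove these itself
nodetool clearsnapshot
# incremental backups: hard links accumulate under each table's backups/ directory
# and are never removed automatically
find /var/lib/cassandra/data/*/*/backups -type f -mtime +5 -delete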
On Wed, Jun 18, 2014 at 6:28 AM, Marcelo Elias Del Valle <
marc...@s1mbi0se.com.br> wrote:
> Wouldn't be better to use "nodetool clearsnapshot"?
> []s
>
>
> 2014-06-14 17:38 GMT-03:00 S C :
>
> I am thinking of "
AFAIK, when you run a repair a snapshot is created.
After the repair, I run "nodetool clearsnapshot" to save disk space.
Not sure if it's your case or not.
[]s
2014-06-18 13:10 GMT-03:00 Brian Tarbox :
> We do a repair -pr on each node once a week on a rolling basis.
> Should we be running cleanup a
We do a repair -pr on each node once a week on a rolling basis.
Should we be running cleanup as well? My understanding is that it's only
needed after adding/removing nodes?
We'd like to avoid adding nodes if possible (which might not be). Still
curious if we can get C* to do the maintenance task on a
One option is to add new nodes, and do a node repair/cleanup on everything.
That will at least reduce your per-node data size.
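Roughly, once the new nodes have finished bootstrapping, running this on each pre-existing node (one at a time) drops data for ranges the node no longer owns; just a sketch, and worth checking disk headroom first since cleanup rewrites SSTables:
nodetool cleanup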
On Wed, Jun 18, 2014 at 11:01 AM, Brian Tarbox
wrote:
> I'm running on AWS m2.2xlarge instances using the ~800 gig
> ephemeral/attached disk for my data directory. My
I'm running on AWS m2.2xlarge instances using the ~800 gig
ephemeral/attached disk for my data directory. My data size per node is
nearing 400 gig.
Sometimes during maintenance operations (repairs mostly I think) I run out
of disk space as my understanding is that some of these operations require
I have a 10 node cluster with cassandra 2.0.8.
I am getting these exceptions in the log when I run my code. What my code
does is just read data from a CF and in some cases write new data.
WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391
BatchStatement.java (line 228) Batch of pr
Wouldn't be better to use "nodetool clearsnapshot"?
[]s
2014-06-14 17:38 GMT-03:00 S C :
> I am thinking of "rm " once the backup is complete. Any special
> cases to be careful about?
>
> -Kumar
> --
> Date: Sat, 14 Jun 2014 13:13:10 -0700
> Subject: Re: incremental b
This last command was supposed to be a best practice a few years ago, hope
it is still the case. I just added the recent "nodetool disablebinary"
part...
2014-06-18 14:36 GMT+02:00 Alain RODRIGUEZ :
> Thanks a lot for taking time to check the log.
>
> We just switched from 400M to 1600M NEW size
Thanks a lot for taking time to check the log.
We just switched from 400M to 1600M NEW size in the cassandra-env.sh. It
reduced our latency and the PARNEW GC time / second significantly...
(described here
http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads
)
E
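For reference, the young-gen size being discussed is one of two heap knobs near the top of cassandra-env.sh; a sketch with the 1600M value mentioned above (the MAX_HEAP_SIZE figure is only a placeholder, not a value from this thread):
# cassandra-env.sh
MAX_HEAP_SIZE="8G"     # placeholder; size to your hardware
HEAP_NEWSIZE="1600M"   # the NEW size change described above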
There are several long Parnew pauses that were recorded during startup.
The young gen size looks large too, if I am reading that line correctly.
Did you happen to overwrite the default settings for MAX_HEAP and/or NEW
size in the cassandra-env.sh? The large young gen size, set via the env.sh
file,
Well then you'd better provide your schema and query, as I select ranges like
this all the time using CQL, so I (at least) must be misunderstanding your
problem from the description so far.
On Wed, Jun 18, 2014 at 2:54 AM, DuyHai Doan wrote:
> Hello Jason
>
> If you want to check for presence / absenc
Hello,
What Artur is alluding to is that seed nodes do not bootstrap. Replacing
seed nodes requires a slightly different approach for node replacement
compared to non seed nodes. See here for more details:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_see
Hello
Have you checked the log file to see what's happening during startup?
What caused the rolling restart? Did you perform an upgrade or
change a config?
> On Jun 18, 2014, at 5:40 AM, Alain RODRIGUEZ wrote:
>
> Hi guys
>
> Using 1.2.11, when I try to rolling restart the cluster, any nod
Hi,
I was wondering if there are any possible problems we may face if we use
completely fabricated values as TIMESTAMP when doing INSERTs and
UPDATEs. Because I can imagine a couple of examples where exploiting
column timestamps could simplify things.
Because Cassandra is LWW (last write win
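For concreteness, a sketch of what client-supplied timestamps look like in CQL (keyspace, table and values are made up for illustration; timestamps are microseconds since the epoch, and the larger one wins under last-write-wins):
cqlsh <<'CQL'
-- hypothetical table, for illustration only
INSERT INTO my_ks.events (id, payload) VALUES (1, 'first')
USING TIMESTAMP 1403121600000000;
UPDATE my_ks.events USING TIMESTAMP 1403121600000001
SET payload = 'second' WHERE id = 1;
CQL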
Hi guys
Using 1.2.11, when I try to rolling restart the cluster, any node I restart
makes the whole cluster CPU load increase, reaching a "red" state in
opscenter (load from 3-4 to 20+). This happens once the node is back online.
The restarted node uses 100% CPU for 5-10 min and sometimes d
While they guarantee IOPS, they don't really make any guarantees about
latency. Since EBS goes over the network, there are so many things in the
path of getting at your data that I would be concerned about random latency
spikes, unless proven otherwise.
Thanks,
Daniel
On Wed, Jun 18, 2014 at 1:58 AM, A
Hi,
pretty sure we started out like that and haven't seen any problems doing
that. On a side note, that config may become inconsistent anyway after
adding new nodes, because I think you'll need a restart of all your
nodes if you add new seeds to the yaml file. (Though that's just an assumption.)
In this document it is said:
- Provisioned IOPS (SSD) - Volumes of this type are ideal for the most
demanding I/O intensive, transactional workloads and large relational or
NoSQL databases. This volume type provides the most consistent performance
and allows you to provision the exac
Hi,
I just saw this :
http://aws.amazon.com/fr/blogs/aws/new-ssd-backed-elastic-block-storage/
Since the problem with EBS was the network, there is no chance that this
hardware architecture might be useful alongside Cassandra, right?
Alain
My intended Cassandra cluster will have 15 nodes per DC, with 2 DCs.
I am considering using all the nodes as seed nodes.
It looks like having all the nodes as seeds should actually reduce the Gossip
overhead (See "Gossiper implementation" in
http://wiki.apache.org/cassandra/ArchitectureGossip)
Is
I am trying to follow an example given on "
http://www.datastax.com/dev/blog/big-analytics-with-r-cassandra-and-hive"
to connect R with Cassandra. Following is my code:
library(RJDBC)
#Load in the Cassandra-JDBC driver
cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
list.