Hi Malay,
You can do the following. The cassandra-stress tool is located at
tools/bin/cassandra-stress; it performs inserts and reads against a test
keyspace so you can measure performance.
cassandra-stress [options] [-o <operation>]
-o (--operation) : INSERT, READ, etc. (default INSERT)
-t (--threads) :
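For example, a basic run with this (pre-2.1) stress tool might look like the
sketch below; -d (--nodes) and -n (--num-keys) are further flags of the same
tool, and the node address and counts are placeholders:

    # write 1,000,000 rows with 50 threads against a node at 10.0.0.1
    tools/bin/cassandra-stress -d 10.0.0.1 -o INSERT -n 1000000 -t 50
    # then read the same rows back to measure read performance
    tools/bin/cassandra-stress -d 10.0.0.1 -o READ -n 1000000 -t 50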
Hi,
Can anyone please let me know the steps for performance testing in
Cassandra using the stress tool?
Regards,
Malay Nilabh
BIDW BU/ Big Data CoE
L&T Infotech Ltd, Hinjewadi, Pune
+91-20-66571746
+91-73-879-00727
Email: m
Rob Coli strikes again: you're Doing It Wrong, and he's right :D
Using Cassandra as a distributed cache is a bad idea, seriously. Putting
6GB into the row cache is another one.
On Tue, Sep 9, 2014 at 9:21 PM, Robert Coli wrote:
> On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan wrote:
>
>> Is there a
I use jmxterm (http://wiki.cyclopsgroup.org/jmxterm/). Attach it to your C*
process, then use the org.apache.cassandra.db:HintedHandoffManager bean
and run deleteHintsForEndpoint to drop the hints for each IP.
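A session might look roughly like this (a sketch: the jar name, port, and IP
are placeholders, and on some versions the bean needs the type= key as shown):

    $ java -jar jmxterm-1.0-uber.jar
    $> open localhost:7199
    $> bean org.apache.cassandra.db:type=HintedHandoffManager
    $> run deleteHintsForEndpoint 10.0.0.1
    $> close

Repeat the run command for each endpoint that has hints piled up.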
On Wed, Sep 10, 2014 at 3:37 AM, Rahul Neelakantan wrote:
> RF=3, two DCs. (Going to
Regarding what Netflix does, the last time I checked:
1) sure, they use AWS VMs, but they take the whole machine.
So is that really using a VM? :)
2) they use SSD mainly to reduce compaction time. "We don't
even notice it with SSD any more."
When sizing nodes and clusters, the main factors I've
What you're describing depends on the load (data size) and latency.
Doing a bootstrap or backup would require a fair amount of bandwidth if
you want it done quickly with a lot of data. Also, latency would
be very high going over some kind of office VPN. But
there's no reason you can't do what you'
Hi everyone,
I am at a loss locating use cases/examples/documentation/books/etc. for
deploying Cassandra where the multi-DC nodes of a single cluster are on your
own network at points around the world.
In my example a Cassandra DC equates to a building.
Of interest to me is how installations are in
On Tue, Sep 9, 2014 at 2:36 PM, Eugene Voytitsky wrote:
> As I understand, atomic batch for counters can't work correctly
> (atomically) prior to 2.1 because of counters implementation.
> [Link: http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2]
>
> Cassandra 2.1. reimplements the
RF=3, two DCs. (Going to 1.2.x in a few weeks)
What's the procedure to drop via JMX?
- Rahul
1-678-451-4545 (US)
+91 99018-06625 (India)
On Sep 9, 2014, at 9:23 AM, Rahul Menon wrote:
> Yep, the hinted handoff in 1.0.8 is abysmal at best. What is your replication
> factor? I have had huge hi
I've used JBOD before, and here are the operational problems I noticed:
1) Each volume/disk fills at a different rate, so the minimum might be 100 GB
of data and the maximum might be 200 GB. That means you cannot use anywhere
near your real hard disk capacity. (Then on top of that, compaction requires space.)
2
What is the recommended read/write consistency level (CL) for counters?
Yes, I know that write_CL + read_CL > RF is recommended.
But I got strange results when I ran my JUnit tests with different CLs
against a 3-node cluster.
I checked 9 combinations: (write=ONE,QUORUM,ALL) x (read=ONE,QUORUM,ALL)
Ea
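For anyone reproducing this by hand, the same combinations can be exercised
from cqlsh; a minimal sketch, with a made-up counter table:

    -- set the write CL, bump a counter, then read it back at another CL
    CONSISTENCY QUORUM;
    UPDATE counts SET hits = hits + 1 WHERE id = 'k1';
    CONSISTENCY ONE;
    SELECT hits FROM counts WHERE id = 'k1';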
On Tue, Sep 9, 2014 at 2:16 PM, Russell Bradberry
wrote:
> Because RAM is expensive and the JVM heap is limited to 8gb. While you do
>> get benefit out of using extra RAM as page cache, it's often not cost
>> efficient to do so
>
>
> Again, this is so use-case dependent. I have met several people
As I understand it, an atomic batch of counters can't work correctly
(atomically) prior to 2.1 because of the counters implementation.
[Link: http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2]
Cassandra 2.1 reimplements the counters.
Will an atomic batch of counters work as expected (atomic
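For context, this is the statement shape in question; counter updates must go
in a dedicated COUNTER BATCH (the table and column names here are made up):

    BEGIN COUNTER BATCH
      UPDATE page_views SET hits = hits + 1 WHERE page = '/home';
      UPDATE page_views SET hits = hits + 1 WHERE page = '/about';
    APPLY BATCH;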
Thanks, good article.
But some of my questions are still unanswered.
I will reformulate and post them as short separate emails.
On 05.09.14 01:01, Ken Hancock wrote:
Counters are way more complicated than what you're illustrating.
Datastax did a good blog post on this:
http://www.datastax.com/
>
> Because RAM is expensive and the JVM heap is limited to 8gb. While you do
> get benefit out of using extra RAM as page cache, it's often not cost
> efficient to do so
Again, this is so use-case dependent. I have met several people who run
small nodes with fat RAM to get it all in memory to s
*TL;DR*
There is no one recommended setup for Cassandra, everyone's use-case is
different and it is up to you to figure out the best setup for your
use-case. There are a lot of questions that need to be asked before making
a decision on hardware layout.
There is just so
On Tue, Sep 9, 2014 at 1:07 PM, Rahul Neelakantan wrote:
> Why not more than 32gb of RAM/node?
>
Because RAM is expensive and the JVM heap is limited to 8gb. While you do
get benefit out of using extra RAM as page cache, it's often not cost
efficient to do so.
=Rob
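For reference, that heap cap is typically set in conf/cassandra-env.sh; a
sketch (the values are illustrative, not a recommendation):

    # conf/cassandra-env.sh -- pin the JVM heap at 8 GB and leave the
    # rest of the box's RAM to the OS page cache
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"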
From my experience, and what I've read, the more RAM the better. Any excess
memory can be used as disk cache, which should help a lot with your reads.
-Arindam
-----Original Message-----
From: Paolo Crosato [mailto:paolo.cros...@targaubiest.com]
Sent: Tuesday, September 09, 2014 12:53 PM
To:
Why not more than 32gb of RAM/node?
Rahul Neelakantan
> On Sep 9, 2014, at 3:52 PM, Paolo Crosato
> wrote:
>
> Every node should have at least 4 cores, with a maximum of 8. Memory
> shouldn't be higher than 32GB; 16GB is good for a start. Every node should be
> a physical machine, not a virtu
Every node should have at least 4 cores, with a maximum of 8. Memory shouldn't
be higher than 32GB; 16GB is good for a start. Every node should be a physical
machine, not a virtual one, or at least a virtual machine with an SSD disk
subsystem. The disk subsystem should be directly connected to the
On 9 Sep 2014, at 7:33 am, Nate McCall wrote:
> Other thoughts:
> - Go slowly and verify that clients and gossip are talking to the new nodes
> after each lift and shift
> - Don't forget to change seeds afterwards
> - This is not the time to upgrade/change *anything* else - match the version
>
It depends. Ultimately your load is low enough that a single node could
probably handle it, so you kinda want a "minimum" cluster. Different people
have different thoughts on what this means - I would recommend 5-6 nodes with
a replication factor of 3. (say m1.xlarge, or c3.2xlarge striped ephemerals, I
On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan wrote:
> Is there a method to quickly load a large dataset into the row cache?
> I use row caching as I want the entire dataset to be in memory.
>
You're doing it wrong. Use a memory store.
=Rob
It can get really unbalanced with STCS. What's more, even if there were a disk
that could fit the 600GB SSTable, STCS doesn't pay attention to space (first),
so it may pick the 75% full disk over the 10% one. It's a better idea to use
LCS with JBOD, unless the data model really needs STCS, in which case monitor
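Switching a table to LCS is a one-liner in CQL; a sketch against a
hypothetical table (the sstable_size_in_mb value is just an example):

    ALTER TABLE myks.mytable
      WITH compaction = {'class': 'LeveledCompactionStrategy',
                         'sstable_size_in_mb': 160};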
Hello all,
Is there a method to quickly load a large dataset into the row cache?
I use row caching as I want the entire dataset to be in memory.
I'm running a Cassandra-1.2 database server with a dataset of 555
records (6GB size) and a row cache of 6GB. Key caching is disabled and
I am using
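For reference, row caching in 1.2 is sized globally in cassandra.yaml and
enabled per table; a sketch with made-up keyspace/table names:

    # cassandra.yaml -- global row cache capacity
    row_cache_size_in_mb: 6144

    -- CQL (per table): keep whole rows in the cache
    ALTER TABLE myks.mytable WITH caching = 'rows_only';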
I would also love to see any shared resources describing best practices. If
you find something, Oleg, or if others have useful resources beyond what I
have found by searching online, I would be very grateful if they were shared
in my direction. Cheers.
Nate
On Tue, Sep 9, 2014
Hi,
Where can I find a document with best practices for sizing a Cassandra
deployment?
We have 1000 writes/reads per second, with a record size of 1k.
Questions:
1) How many machines do we need?
2) How much RAM, and what disk size/type?
3) What should the network be?
I understand that hardware
Alain Rodriguez outlined this procedure that he was going to try, but failed to
mention whether this actually worked :-)
https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201406.mbox/%3cca+vsrlopop7th8nx20aoz3as75g2jrjm3ryx119deklynhq...@mail.gmail.com%3E
/Janne
On 8 Sep 2014,
We've done this several times with clients - Ben's response will work and
is pretty close to the approaches we took:
>
> Use the gossiping property file snitch in the VPC data centre.
>
Agree. I don't think you could even do this effectively with the EC2Snitch.
Use a public elastic IP for each
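For anyone following along, the snitch is selected in cassandra.yaml and each
node declares its DC/rack in cassandra-rackdc.properties; a sketch with
made-up names:

    # cassandra.yaml
    endpoint_snitch: GossipingPropertyFileSnitch

    # conf/cassandra-rackdc.properties (per node; values are examples)
    dc=vpc-us-east
    rack=rack1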
Yep, the hinted handoff in 1.0.8 is abysmal at best. What is your
replication factor? I have had huge hints pile up, where I had to drop the
entire column family and then run a repair. Either that, or you can use the
JMX HintedHandoffManager and delete hints per endpoint. Also it may be
worthwhile t