Re: Opscenter usage in Development server

2014-04-07 Thread Hari Rajendhran
Hi, Thanks for your info :) Best Regards, Hari Krishnan Rajendhran

Re: Opscenter usage in Development server

2014-04-07 Thread Michael Shuler
On 04/07/2014 11:59 PM, Hari Rajendhran wrote: I need a clarification on Opscenter community version usage for Testing, Development and Production servers. Can the community version be used without any license for Production servers? When you are using it with a subscription to DSE, there a

Opscenter usage in Development server

2014-04-07 Thread Hari Rajendhran
Hi Team, I need a clarification on Opscenter community version usage for Testing, Development and Production servers. Can the community version be used without any license for Production servers? Best Regards, Hari Krishnan Rajendhran

Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-04-07 Thread Robert Coli
On Mon, Apr 7, 2014 at 6:17 PM, Greg Bone wrote: > > > If seed nodes do not auto bootstrap, what is the procedure for replacing a > node > in a three node cluster, with all of them identified as seed nodes? > No one seems to know the answer to your question [1], but the current workaround is to r

Re: Fwd: using hadoop + cassandra for CF mutations (delete)

2014-04-07 Thread Suraj Nayak
Good way of experimenting Will. Share your observation :) Adding cassandra user group for the input of the community on num_tokens settings in cassandra.yaml. Thanks Suraj On 07-Apr-2014 6:20 PM, "William Oberman" wrote: > If that works, it's a neat/fancy trick. But, after looking into the doc

Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-04-07 Thread Paul Charles Leddy
Trick the seed node by removing itself from the yaml file, then start it up. On Mon, Apr 7, 2014 at 7:22 PM, Jonathan Lacefield wrote: > Hello > > Not sure I follow the auto bootstrap question, but seeds are only > used on startup. Also, what do you mean by convert the node to a > seed node

Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-04-07 Thread Jonathan Lacefield
Hello Not sure I follow the auto bootstrap question, but seeds are only used on startup. Also, what do you mean by convert the node to a seed node? You could simply add the 4th node IP address to the seed list of the other nodes in the .yaml file. Hope that helps Jonathan > On Apr 7, 201

Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-04-07 Thread Greg Bone
If seed nodes do not auto bootstrap, what is the procedure for replacing a node in a three node cluster, with all of them identified as seed nodes? Here's what I am thinking: 1) Add a 4th node to the cluster which is not a seed node 2) Decommission one of the seed nodes when data finished

Re: Migrating to new datacenter

2014-04-07 Thread Robert Coli
On Mon, Apr 7, 2014 at 1:41 PM, Brandon McCauslin wrote: > If I read your response in that 1st URL correctly, it seems changing both > the snitch and replication strategy at the same time is not advisable and > could lead to partial data loss. Is your suggestion of dumping and > reloading the dat

Re: Migrating to new datacenter

2014-04-07 Thread Brandon McCauslin
Rob, If I read your response in that 1st URL correctly, it seems changing both the snitch and replication strategy at the same time is not advisable and could lead to partial data loss. Is your suggestion of dumping and reloading the data into the new cluster still recommended for these situations

Re: Setting gc_grace_seconds to zero and skipping "nodetool repair (was RE: Timeseries with TTL)

2014-04-07 Thread Laing, Michael
Perhaps following this recent thread would help clarify things: http://mail-archives.apache.org/mod_mbox/cassandra-user/201401.mbox/%3ccakgmdnfk3pa-w+ltusm88a15jdg275o31p4ujwol1b7bkaj...@mail.gmail.com%3E Cheers, Michael On Mon, Apr 7, 2014 at 2:00 PM, Donald Smith < donald.sm...@audiencescien

Re: Migrating to new datacenter

2014-04-07 Thread Robert Coli
On Mon, Apr 7, 2014 at 10:48 AM, Brandon McCauslin wrote: > Thanks for the confirmation on the approach. The new dc is not yet ready, > but while I'm waiting I was thinking about updating the existing dc's > replication strategy from "SimpleStrategy" to "NetworkTopologyStrategy". I > also assum

Setting gc_grace_seconds to zero and skipping "nodetool repair (was RE: Timeseries with TTL)

2014-04-07 Thread Donald Smith
This statement is significant: “BTW if you never delete and only ttl your values at a constant value, you can set gc=0 and forget about periodic repair of the table, saving some space, IO, CPU, and an operational step.” Setting gc_grace_seconds to zero has the effect of not storing hinted handof

Re: Migrating to new datacenter

2014-04-07 Thread Brandon McCauslin
Thanks for the confirmation on the approach. The new dc is not yet ready, but while I'm waiting I was thinking about updating the existing dc's replication strategy from "SimpleStrategy" to "NetworkTopologyStrategy". I also assume I'll need to update my snitch from the current SimpleSnitch to Pro

Re: Transaction Timeout on get_count

2014-04-07 Thread Yulian Oifa
Thanks for your replies. 1) I cannot create a row every X time, since that would not allow me to get a complete list of currently active records (this is the only reason I keep this row in the first place). 2) As for compaction, I thought that only row ids are cached and not the columns themselves. I have completed comp

Re: Migrating to new datacenter

2014-04-07 Thread Mark Reddy
I would go with option 1. I think it is the safer of the two options, involves less work and if something were to go wrong mid migration you can remove the second DC from your keyspace replication and have a clean break. SimpleStrategy will work across DCs. It is generally advised to not use it ac

Migrating to new datacenter

2014-04-07 Thread Brandon McCauslin
We're currently running a small 5 node 2.0.5 cluster in a single datacenter using the SimpleStrategy replication strategy with replication factor of 3. We want to migrate our data from our current datacenter to a new datacenter, without incurring any downtime or data loss. There is no plan to mai

Re: Transaction Timeout on get_count

2014-04-07 Thread Tupshin Harper
Constant deletes and rewrites are a very poor pattern to use with Cassandra. It would be better to write to a new row and partition every minute and use a TTL to auto expire the old data. -Tupshin On Apr 6, 2014 2:55 PM, "Yulian Oifa" wrote: > Hello > I have a row in which approximately 100 v
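A minimal sketch of the pattern suggested above: derive a per-minute bucket for the partition key and let a TTL expire old data instead of deleting it. The table name `active_records`, its columns, and the 120-second TTL are hypothetical, chosen only for illustration; the statement is built as a string and is not executed against any cluster here.

```python
import time


def minute_bucket(epoch_seconds):
    """Return the minute-granularity bucket used as the partition key."""
    return int(epoch_seconds) // 60


def build_insert(table, ttl_seconds):
    """Build a CQL INSERT whose written cells expire after ttl_seconds.

    The table and column names are hypothetical placeholders.
    """
    return (
        f"INSERT INTO {table} (bucket, id, value) "
        f"VALUES (?, ?, ?) USING TTL {int(ttl_seconds)}"
    )


# Each write lands in the partition for the current minute; readers query
# the current (and possibly previous) bucket instead of one hot row.
current_bucket = minute_bucket(time.time())
statement = build_insert("active_records", ttl_seconds=120)
```

Because nothing is ever deleted explicitly, reads of the current bucket do not have to scan over tombstones left by earlier delete-and-rewrite cycles.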

Re: Drop in node replacements.

2014-04-07 Thread Robert Coli
On Sat, Apr 5, 2014 at 5:10 PM, Anand Somani wrote: > Have you tried nodetool rebuild for that node? I have seen that work when > repair failed. > While rebuild may work in cases when repair doesn't, they do different things and are not mutually substitutable. "rebuild" is essentially bootstrap

Re: Transaction Timeout on get_count

2014-04-07 Thread Lukas Steiblys
Deleting a column simply produces a tombstone for that column, as far as I know. It’s probably going through all the columns with tombstones and timing out. Compacting more often should help, but maybe Cassandra isn’t the best choice overall for what you’re trying to do. Lukas From: Yulian Oif

Re: Why is my cluster imbalanced ?

2014-04-07 Thread Tupshin Harper
I recommend rf=3 for most situations, and it would certainly be appropriate here. Just remember to add a third rack, and maintain the same number of nodes in each rack. -Tupshin On Apr 7, 2014 9:49 AM, "Oleg Dulin" wrote: > Tupshin: > > For EC2, 3 us-east, would you recommend RF=3? That would

Re: Why is my cluster imbalanced ?

2014-04-07 Thread Oleg Dulin
Tupshin: For EC2, 3 us-east, would you recommend RF=3? That would make sense, wouldn't it... That's what I'll do for production. Oleg On 2014-04-07 12:23:51 +, Tupshin Harper said: Your us-east datacenter has RF=2, and 2 racks, which is the right way to do it (I would rarely recommen

Re: Inserting with large number of column

2014-04-07 Thread Fasika Daksa
Thanks for your response. Currently we are inserting the data line by line, and soon we will implement bulk insertion. The metadata used to generate the data is: No. of Boolean cols: 20,000; No. of Int cols: 0; No. of Rows: 100,000 (we use only bool or integer variables). Attached you can find the

Re: Why is my cluster imbalanced ?

2014-04-07 Thread Oleg Dulin
Excellent, thanks. On 2014-04-07 12:23:51 +, Tupshin Harper said: Your us-east datacenter has RF=2, and 2 racks, which is the right way to do it (I would rarely recommend using a different number of racks than your RF). But by having three nodes on one rack (1b) and only one on the other (1

Re: Inserting with large number of column

2014-04-07 Thread Tupshin Harper
More details would be helpful (exact schema, method of inserting data, etc.), but you can try just dropping the indices and recreating them after the import is finished. -Tupshin On Apr 7, 2014 8:53 AM, "Fasika Daksa" wrote: > We are running different workload tests on Cassandra and Redis fo

Inserting with large number of column

2014-04-07 Thread Fasika Daksa
We are running different workload tests on Cassandra and Redis for benchmarking. We wrote a Java client to read, write and evaluate the elapsed time of different test cases. Cassandra was doing great until we introduced 20,000 columns. The insertion ran for a day and then I stoppe

Re: Why is my cluster imbalanced ?

2014-04-07 Thread Tupshin Harper
Your us-east datacenter has RF=2, and 2 racks, which is the right way to do it (I would rarely recommend using a different number of racks than your RF). But by having three nodes on one rack (1b) and only one on the other (1a), you are telling Cassandra to distribute the data so that no two copies
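A rough sketch of why this layout is imbalanced, under a simplified model: it assumes the number of racks equals the RF, so each rack ends up holding exactly one full copy of the data, and that tokens are spread evenly within a rack. The function name is hypothetical and this is not how Cassandra itself computes ownership.

```python
from fractions import Fraction


def copy_share_per_node(nodes_per_rack, rf):
    """Fraction of one full data copy stored by each node in every rack.

    Simplified model: racks == rf, one replica per rack, even spread
    of that replica across the nodes of the rack.
    """
    if len(nodes_per_rack) != rf:
        raise ValueError("this sketch assumes the number of racks equals RF")
    return {rack: Fraction(1, count) for rack, count in nodes_per_rack.items()}


# The layout from the message: one node in rack 1a, three nodes in rack 1b.
shares = copy_share_per_node({"1a": 1, "1b": 3}, rf=2)
for rack, share in sorted(shares.items()):
    print(f"each node in rack {rack} stores {share} of a full copy")
```

In this model the single node in 1a carries an entire copy of the data while each node in 1b carries a third of one, which is why keeping the same number of nodes in each rack evens out the load.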

Why is my cluster imbalanced ?

2014-04-07 Thread Oleg Dulin
I added two more nodes on Friday, and moved tokens around. For four nodes, the tokens should be: Node #1: 0 Node #2: 42535295865117307932921825928971026432 Node #3: 85070591730234615865843651857942052864 Node #4: 12760588759535192379876547778691307
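The token values quoted above are consistent with dividing a ring of size 2^127 into equal parts, as is done for RandomPartitioner. A small sketch of that calculation, assuming RandomPartitioner and no vnodes (the function name is illustrative):

```python
RING_SIZE = 2 ** 127  # token space assumed for RandomPartitioner


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


tokens = balanced_tokens(4)
for index, token in enumerate(tokens, start=1):
    print(f"Node #{index}: {token}")
```

Python integers are arbitrary precision, so the division is exact and no rounding error creeps into the large token values.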

Re: Cassandra Disk storage capacity

2014-04-07 Thread Bèrto ëd Sèra
I guess there is a misunderstanding here: >I am confused why cassandra uses the entire disk space ( / Directory) even when we specify /var/lib/cassandra/data as the directory in Cassandra.yaml file C* will use the entire MOUNTPOINT, which is not necessarily your entire total disk space. If you hav

Re: Cassandra Disk storage capacity

2014-04-07 Thread Jan Kesten
On 07.04.2014 13:24, Hari Rajendhran wrote: 1) I am confused why Cassandra uses the entire disk space (/ directory) even when we specify /var/lib/cassandra/data as the directory in the cassandra.yaml file. 2) Is it only during compaction that Cassandra will use the entire disk space? 3) What is the

Re: Cassandra Disk storage capacity

2014-04-07 Thread Hari Rajendhran
Hi, Thanks for the update. I still have a few queries which need to be clarified: 1) I am confused why Cassandra uses the entire disk space (/ directory) even when we specify /var/lib/cassandra/data as the directory in the cassandra.yaml file. 2) Is it only during compaction that Cassandra will use the

RE: Cassandra Disk storage capacity

2014-04-07 Thread Romain HARDOUIN
Hi, See data_file_directories and commitlog_directory in the settings file cassandra.yaml. Cheers, Romain Hari Rajendhran wrote on 07/04/2014 12:56:37: > From: Hari Rajendhran > To: user@cassandra.apache.org, > Date: 07/04/2014 12:58 > Subject: Cassandra Disk storage capacity > > Hi T

Re: Cassandra Disk storage capacity

2014-04-07 Thread Prem Yadav
you can specify multiple data directories in cassandra.yaml. ex: data_file_directories: - /var/lib.cass1 - /var/lib/cass2 -/ On Mon, Apr 7, 2014 at 12:10 PM, Jan Kesten wrote: > Hi Hari, > > C* will use your entire space - that is something one should monitor. > Depending on your choose

Re: Cassandra Disk storage capacity

2014-04-07 Thread Jan Kesten
Hi Hari, C* will use your entire space - that is something one should monitor. Depending on your choice of compaction strategy, your data_dir should not be filled up entirely - in the worst case compaction will need space as large as the sstables on disk, therefore 50% should be free space. T
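A minimal monitoring sketch for the 50% rule of thumb mentioned above. The function names and the default threshold are illustrative assumptions; `shutil.disk_usage` reports on the filesystem (mount point) that contains the given path, not on the directory alone.

```python
import shutil

GIB = 1024 ** 3


def compaction_headroom_ok(total_bytes, free_bytes, min_free_ratio=0.5):
    """True if at least min_free_ratio of the filesystem is still free."""
    return free_bytes / total_bytes >= min_free_ratio


def check_data_dir(path="/var/lib/cassandra/data"):
    """Check the filesystem holding the data directory (path is an assumption)."""
    usage = shutil.disk_usage(path)
    return compaction_headroom_ok(usage.total, usage.free)


# Worked example with the 40GB /var partition from the original question:
print(compaction_headroom_ok(40 * GIB, 25 * GIB))  # 25 of 40 GiB free
print(compaction_headroom_ok(40 * GIB, 10 * GIB))  # 10 of 40 GiB free
```

Calling `check_data_dir()` on a node would apply the same check to the live filesystem; it is left uncalled here because the path may not exist on the machine running the sketch.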

Cassandra Disk storage capacity

2014-04-07 Thread Hari Rajendhran
Hi Team, We have a 3 node Apache Cassandra 2.0.4 setup installed in our lab. We have set the data directory to /var/lib/cassandra/data. What would be the maximum disk storage that will be used for Cassandra data storage? Note: the /var partition has a storage capacity of 40GB. My question is whet