Case Study for Learning Cassandra

2015-02-04 Thread Krish Donald
Hi, I am new to Cassandra and have set up a 4-node Cassandra cluster using VMs. I am looking for a case study I can work through to understand Cassandra administration and put on my resume as well. Any help is appreciated. Thanks, Krish

Re: Error while starting Cassandra for the first time

2015-02-04 Thread Krish Donald
I used the YAML validator and then tried to fix the file based on the error messages. I had to comment out data_directories, commitlog_directory and saved_caches directory, and after that it worked. Thanks a lot for the help ... On Wed, Feb 4, 2015 at 2:32 PM, Mark Reddy wrote: > INFO 22:17:19 Loading setti

Newly added column not visible

2015-02-04 Thread Saurabh Sethi
I have a 3 node cluster running Cassandra version 2.1.2. Through my unit test, I am creating a column family with 3 columns, inserting a row, asserting that the values got inserted and then truncating the column family. After that I am adding a fourth column to the column family and inserting a

Re: Error while starting Cassandra for the first time

2015-02-04 Thread Mark Reddy
> > INFO 22:17:19 Loading settings from file:/home/csduser/cassandra/ > conf/cassandra.yaml > ERROR 22:17:20 Fatal configuration error > org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml You have a malformed cassandra.yaml config file that is resulting in Cassandra not being

Re: Error while starting Cassandra for the first time

2015-02-04 Thread Michael Dykman
I would start looking in /home/csduser/cassandra/conf/cassandra.yaml. Perhaps you could validate the YAML format of that file with an independent tool such as http://yaml-online-parser.appspot.com/ On Wed, Feb 4, 2015 at 5:23 PM, Krish Donald wrote: > Hi, > > I am getting below error: > Not abl

Error while starting Cassandra for the first time

2015-02-04 Thread Krish Donald
Hi, I am getting the error below and am not able to understand why: [csduser@master bin]$ ./cassandra -f CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCe

Re: A connection attempt failed

2015-02-04 Thread Robert Coli
On Wed, Feb 4, 2015 at 10:45 AM, Rene Kochen wrote: > This large batch_mutate took 5 seconds in Cassandra 1.0.11 > The same batch_mutate takes three minutes in Cassandra 1.2.18 > Doesn't seem normal to me. But large batch_mutates are generally considered an anti-pattern... =Rob

Re: A connection attempt failed

2015-02-04 Thread Rene Kochen
After a little testing I see that the client times out because the server takes longer. This large batch_mutate took 5 seconds in Cassandra 1.0.11. The same batch_mutate takes three minutes in Cassandra 1.2.18. Is that normal!? Thanks, Rene 2015-02-04 18:29 GMT+01:00 Rene Kochen : > Hi

Re: to normalize or not to normalize - read penalty vs write penalty

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Perfect, Tyler. My feeling was leading me to this, but I wasn't able to put it into words as you did. Thanks a lot for the message. From: user@cassandra.apache.org Subject: Re: to normalize or not to normalize - read penalty vs write penalty Okay. Let's assume with denormalization you h

Re: to normalize or not to normalize - read penalty vs write penalty

2015-02-04 Thread Tyler Hobbs
Okay. Let's assume with denormalization you have to do 1000 writes (and one read per user) and with normalization you have to do 1 write (and maybe 1000 reads for each user). If you execute the writes in the most optimal way (batched by partition, if applicable, and separate, concurrent requests
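
The trade-off in this message can be put into rough numbers. The sketch below only counts operations, using the 1000-write / 1000-read figures from the message; the ratio of 100 reads per update is an assumption that echoes the estimate given elsewhere in this thread, and the function name is hypothetical. Raw operation counts ignore latency, concurrency and batching, so treat this as back-of-the-envelope arithmetic, not a benchmark.

```python
def total_operations(updates, reads, writes_per_update, reads_per_read):
    """Count storage operations for a workload (illustrative helper, not a Cassandra API)."""
    return updates * writes_per_update + reads * reads_per_read


# Assumed workload: 1 alert update and 100 user reads.
updates, reads = 1, 100

# Denormalized: each update fans out to 1000 writes, each user read is 1 read.
denormalized = total_operations(updates, reads, writes_per_update=1000, reads_per_read=1)

# Normalized: each update is 1 write, each user read may need 1000 reads.
normalized = total_operations(updates, reads, writes_per_update=1, reads_per_read=1000)

print("denormalized:", denormalized)  # 1100
print("normalized:", normalized)      # 100001
```

Under these assumed numbers the denormalized model does far fewer operations overall, because the read path dominates the workload.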

Re: to normalize or not to normalize - read penalty vs write penalty

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
I don't want to optimize for reads or writes, I want to optimize for having the smallest gap possible between the time I write and the time I read. []s From: user@cassandra.apache.org Subject: Re: to normalize or not to normalize - read penalty vs write penalty Roughly how often do you expect t

Re: to normalize or not to normalize - read penalty vs write penalty

2015-02-04 Thread Tyler Hobbs
Roughly how often do you expect to update alerts? How often do you expect to read the alerts? I suspect you'll be doing 100x more reads (or more), in which case optimizing for reads is definitely the right choice. On Wed, Feb 4, 2015 at 9:50 AM, Marcelo Valle (BLOOMBERG/ LONDON) < mvallemil...@b

A connection attempt failed

2015-02-04 Thread Rene Kochen
Hi all, I have a problem with my client on Cassandra 1.2.18 which I did not have on Cassandra 1.0.11. I create a big row with a lot of super-columns. When writing that row using batch_mutate, I receive the following error in my client: "A connection attempt failed because the connected party did

Re: Smart column searching for a particular rowKey

2015-02-04 Thread Eric Stevens
If you're getting started with Cassandra, definitely prefer CQL over Thrift (Astyanax's default interface). New features will be coming to CQL in later versions, and they will not be backported to Thrift, it's a frozen interface, and will eventually be deprecated. But with Astyanax you want to lo

Re: data distribution along column family partitions

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
From: clohfin...@gmail.com Subject: Re: data distribution along column family partitions > not ok :) don't let a single partition get to 1gb, 100's of mb should be when > flares are going up. The main reasoning is compactions would be horrifically > slow and there will be a lot of gc pain. Bri

Re: data distribution along column family partitions

2015-02-04 Thread Chris Lohfink
> What about 15 gb? not ok :) don't let a single partition get to 1gb, 100's of mb should be when flares are going up. The main reasoning is compactions would be horrifically slow and there will be a lot of gc pain. Bringing the time bucket to be by day will probably be sufficient. It would take b

to normalize or not to normalize - read penalty vs write penalty

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Hello everyone, I am thinking about the architecture of my application using Cassandra and I am asking myself if I should or shouldn't normalize an entity. I have users and alerts in my application and for each user, several alerts. The first model which came into my mind was creating an "alert

Re: data distribution along column family partitions

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
> The data model lgtm. You may need to balance the size of the time buckets > with the amount of alarms to prevent partitions from getting too large. 1 month may be a little large, I would aim to keep the partitions below 25mb (can check with nodetool cfstats) or so in size to keep everything hap

RE: FW: How to use cqlsh to access Cassandra DB if the client_encryption_options is enabled

2015-02-04 Thread Lu, Boying
Thanks a lot. I think I need the '-alias' option. From: Adam Holmberg [mailto:adam.holmb...@datastax.com] Sent: February 4, 2015 23:17 To: user@cassandra.apache.org Subject: Re: FW: How to use cqlsh to access Cassandra DB if the client_encryption_options is enabled Since I don't know what's in you

Re: data distribution along column family partitions

2015-02-04 Thread Chris Lohfink
The data model lgtm. You may need to balance the size of the time buckets with the amount of alarms to prevent partitions from getting too large. 1 month may be a little large, I would aim to keep the partitions below 25mb (can check with nodetool cfstats) or so in size to keep everything happy.
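
To see how bucket size interacts with alert volume, a rough estimate can be made before loading any data. The sketch below multiplies assumed figures (5000 alerts per day per user and 512 bytes per alert, both hypothetical) and compares the result with the 25 MB guideline from the message. It ignores storage overhead and compression, so the sizes reported by nodetool cfstats on a real cluster remain the authoritative check.

```python
TARGET_MB = 25  # guideline quoted in the message above


def estimated_partition_mb(alerts_per_day, bytes_per_alert, days_per_bucket):
    """Rough partition size in MB: rows per bucket times average row size."""
    return alerts_per_day * bytes_per_alert * days_per_bucket / (1024 * 1024)


# Hypothetical workload: 5000 alerts per day, about 512 bytes each.
for label, days in (("month", 30), ("week", 7), ("day", 1)):
    size = estimated_partition_mb(5000, 512, days)
    verdict = "ok" if size <= TARGET_MB else "too large"
    print(f"{label:>5}: {size:6.1f} MB ({verdict})")
```

With these assumed figures a monthly bucket exceeds the guideline while weekly and daily buckets stay under it; different volumes would shift the break-even point.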

Re: FW: How to use cqlsh to access Cassandra DB if the client_encryption_options is enabled

2015-02-04 Thread Adam Holmberg
Since I don't know what's in your keystore, or how it was generated, I don't know how much help I can be. You probably need "-alias " on the command line, and make sure a cert by the name "" exists in your keystore. You can use "keytool -list ..." to examine the contents. Adam Holmberg On Mon, F

data distribution along column family partitions

2015-02-04 Thread Marcelo Elias Del Valle
Hello, I am designing a model to store alerts users receive over time. I will want to store probably the last two years of alerts for each user. The first thought I had was having a column family partitioned by user + timebucket, where time bucket could be something like year + month. For instanc
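
The user + time bucket idea can be sketched as a small helper that derives the partition key for an alert. This is a minimal illustration, assuming string user ids and UTC timestamps; the function names and the bucket formats are hypothetical and not taken from the thread.

```python
from datetime import datetime, timezone


def month_bucket(ts):
    """Year + month bucket, e.g. '201502'."""
    return ts.strftime("%Y%m")


def day_bucket(ts):
    """Year + month + day bucket, e.g. '20150204'."""
    return ts.strftime("%Y%m%d")


def alert_partition_key(user_id, ts, by_day=False):
    """Return the (user, time bucket) pair that would identify the partition."""
    bucket = day_bucket(ts) if by_day else month_bucket(ts)
    return (user_id, bucket)


alert_time = datetime(2015, 2, 4, 14, 32, tzinfo=timezone.utc)
print(alert_partition_key("user42", alert_time))               # ('user42', '201502')
print(alert_partition_key("user42", alert_time, by_day=True))  # ('user42', '20150204')
```

A finer bucket produces more, smaller partitions per user, at the cost of touching more partitions when reading a long time range.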

Re: Suggestion Date as a Partition key

2015-02-04 Thread Srinivasa T N
I would not suggest using only the date as the partition key. That makes all the records for a single day go into a single partition and puts the load on one partition while the other partitions are idle. Try to add some other field to the partition key as well so that the load is distributed. Check thi
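
One way to add a second field is to pair the date with a small shard number derived from a record identifier. The sketch below assumes 8 shards and string record ids; the shard count and the function names are hypothetical choices for illustration, not something the message prescribes.

```python
import zlib
from datetime import date

NUM_SHARDS = 8  # hypothetical; tune to the expected daily volume


def shard_for(record_id, num_shards=NUM_SHARDS):
    """Deterministically map a record id to a shard in [0, num_shards)."""
    return zlib.crc32(record_id.encode("utf-8")) % num_shards


def sharded_partition_key(day, record_id):
    """Composite key (date, shard) so one day's records spread over several partitions."""
    return (day.isoformat(), shard_for(record_id))


day = date(2015, 2, 4)
keys = {sharded_partition_key(day, f"record-{i}") for i in range(1000)}
print(len(keys), "distinct partitions for one day")
```

The trade-off is on the read side: fetching everything for one day then means querying every shard for that date and merging the results.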