I will have just 6 columns in my CF, but about a billion writes per hour. From what you are saying, I think Cassandra fits this case. This answer helped a lot too, thanks!
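
For a rough sense of scale, a billion writes per hour can be converted into a per-second rate as in the sketch below. The node count and replication factor are hypothetical values chosen only for illustration; they are not figures from this thread.

```python
# Back-of-the-envelope write rate for "a billion writes per hour".
WRITES_PER_HOUR = 1_000_000_000
SECONDS_PER_HOUR = 3600


def writes_per_second(writes_per_hour: int) -> float:
    """Average cluster-wide client writes per second."""
    return writes_per_hour / SECONDS_PER_HOUR


def writes_per_node(writes_per_hour: int, nodes: int, replication_factor: int) -> float:
    """Average writes per second each node absorbs, counting replica writes."""
    if nodes <= 0 or replication_factor <= 0:
        raise ValueError("nodes and replication_factor must be positive")
    return writes_per_second(writes_per_hour) * replication_factor / nodes


# Hypothetical cluster: 20 nodes, replication factor 3 (assumptions).
print(f"{writes_per_second(WRITES_PER_HOUR):,.0f} writes/s across the cluster")
print(f"{writes_per_node(WRITES_PER_HOUR, nodes=20, replication_factor=3):,.0f} writes/s per node")
```

These are averages only; peak rates would be higher if the writes are not spread evenly over the hour.
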

2012/9/18 Hiller, Dean <dean.hil...@nrel.gov>

> I wanted to clarify where that statement on wide rows comes from…
>
> Realize some people make the claim that if you don't have 1000's of
> columns in "some" rows in Cassandra you are doing something wrong. This is
> not true, BUT it comes from the fact that people are setting up indexes.
> This is what leads to the very wide row effect. playOrm is one such
> library using wide rows like this BUT it is NOT necessary for all
> applications.
>
> You can easily use map/reduce on a Cassandra cluster. You can also
> map/reduce your dataset into a new model if you make a mistake and don't
> get it right the first time. This wide row effect is used for indexing
> 80% of the time. I draw on playOrm examples a lot, but one table may be
> partitioned by time so each month of data is in a partition; you can then
> have indexes on each partition, allowing you to do quick queries into
> partitions.
>
> Later,
> Dean
>
> From: Marcelo Elias Del Valle <mvall...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, September 17, 2012 4:28 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Is Cassandra right for me?
>
> Hello,
>
> I am new to Cassandra and I am unsure whether Cassandra is the right
> technology to use in the architecture I am defining. Also, I saw a
> presentation which said that if I don't have rows with more than a hundred
> columns in Cassandra, either I am doing something wrong or I shouldn't be
> using Cassandra. Therefore, it might be the case that I am doing something
> wrong. If you could help me find the answer to these questions by
> giving any feedback, it would be highly appreciated.
> Here is my need and what I am thinking of using Cassandra for:
>
> * I need to support a high volume of writes per second. I might have a
>   billion writes per hour.
> * I need to write unstructured data that will be processed later by
>   Hadoop processes to generate structured data from it. Later, I index the
>   structured data using Solr or Solandra, so the data can be queried by my
>   end user application. Is Cassandra recommended for that, or should I be
>   thinking of writing directly to HDFS files, for instance? What's the main
>   advantage I get from storing data in a NoSQL service like Cassandra,
>   compared to storing files in HDFS?
> * Usually I will write JSON data associated with an ID, and my Hadoop
>   processes will process this data to write data to a database. I have two
>   questions here:
>     * If I don't need to perform complicated queries in Cassandra,
>       should I store the JSON-like data just as a column value? I am afraid
>       of doing something wrong here, as I would just need to store the JSON
>       file and 5 or 6 more fields to query the files later.
>     * Does it make sense to you to use Hadoop to process data from
>       Cassandra and store the results in a database, like HBase? Once I
>       have structured data, is there any reason I should use Cassandra
>       instead of HBase?
>
> I am sorry if the questions are too basic. I have been watching a lot
> of videos and reading a lot of documentation about Cassandra, but honestly,
> the more I read, the more questions I have.
>
> Thanks in advance.
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
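
The time-partitioned index Dean describes in the quoted message (one partition per month, with an index per partition) can be sketched in plain Python as below. This is an in-memory model only, not playOrm's or Cassandra's API; the class name `MonthlyIndex`, its methods, and the sample keys are all hypothetical names chosen for illustration. Each month maps to one "wide row" whose sorted columns are (indexed value, row key) pairs.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class MonthlyIndex:
    """In-memory stand-in for one wide index row per month partition."""

    def __init__(self):
        # Partition key "YYYY-MM" -> sorted list of (indexed_value, row_key),
        # mimicking the sorted columns of a wide row.
        self._rows = defaultdict(list)

    @staticmethod
    def partition_for(ts: datetime) -> str:
        """Derive the month partition key from a timestamp."""
        return ts.strftime("%Y-%m")

    def add(self, ts: datetime, value: str, row_key: str) -> None:
        """Insert an index entry, keeping the partition's columns sorted."""
        bisect.insort(self._rows[self.partition_for(ts)], (value, row_key))

    def lookup(self, partition: str, value: str) -> list:
        """Return row keys in one partition whose indexed value matches."""
        cols = self._rows.get(partition, [])
        # (value,) sorts before every (value, row_key), so this finds the
        # first column for that value.
        start = bisect.bisect_left(cols, (value,))
        keys = []
        for v, k in cols[start:]:
            if v != value:
                break
            keys.append(k)
        return keys


idx = MonthlyIndex()
idx.add(datetime(2012, 8, 1), "error", "event-0")
idx.add(datetime(2012, 9, 17), "error", "event-1")
idx.add(datetime(2012, 9, 18), "error", "event-2")
idx.add(datetime(2012, 9, 18), "info", "event-3")
print(idx.lookup("2012-09", "error"))  # ['event-1', 'event-2']
```

A query only touches the one month it names, which is why the index row can grow wide without every lookup scanning all the data.
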