I would suggest you build one cluster, using all your nodes, and create one 
keyspace for all users.

There are lots of reasons; here are a few:

* many nodes in a single cluster spread the load and give you fault 
tolerance.
* read and write requests can be distributed across a many-node cluster.
* Cassandra's caches and the OS-level file caches will be shared.
* Cassandra does not suffer from locking and contention during reads and writes.
* you can prefix row keys with a user identifier to create "virtual keyspaces".
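
To illustrate the last point, here is a minimal sketch of row key prefixing in plain Python. It is only an illustration of the idea, not a client API: the ":" separator and the helper names (make_row_key, split_row_key, belongs_to) are assumptions made up for this example, and it assumes user ids never contain the separator.

```python
from typing import Tuple

# Hypothetical separator between the user prefix and the rest of the key.
# Assumption: user ids never contain this character.
SEPARATOR = ":"


def make_row_key(user_id: str, record_key: str) -> str:
    """Build a row key that is namespaced by the owning user."""
    if not user_id or SEPARATOR in user_id:
        raise ValueError("user_id must be non-empty and must not contain %r" % SEPARATOR)
    return user_id + SEPARATOR + record_key


def split_row_key(row_key: str) -> Tuple[str, str]:
    """Recover (user_id, record_key) from a prefixed row key."""
    user_id, sep, record_key = row_key.partition(SEPARATOR)
    if not sep:
        raise ValueError("row key has no user prefix: %r" % row_key)
    return user_id, record_key


def belongs_to(row_key: str, user_id: str) -> bool:
    """True if the row key lives in the given user's 'virtual keyspace'."""
    # Including the separator stops "user4" from matching "user42:...".
    return row_key.startswith(user_id + SEPARATOR)
```

For example, make_row_key("user42", "2012-04-19") gives a key that only that user's queries would ever ask for, so every user shares one keyspace and one schema while their rows stay separate.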

Hope that helps. 

Aaron

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/04/2012, at 4:33 AM, Trevor Francis wrote:

> We are launching a data-intensive application that will store upwards of 
> 50 million 150-byte records per day per user. We have identified Cassandra as 
> our database technology and Flume as what we will use to seed the data from 
> log files into the database. 
> 
> Each user is given their own server instance, but the schema of the data for 
> each user will be the same.
> 
> We will be performing realtime analysis on this information as part of our 
> application and were considering the advantages/disadvantages of all users 
> using the same keyspace. All data will be treated the same as far as 
> replication factor and the only difference is we won't be displaying one 
> user's info to another user. They will be compartmentalized and one user's 
> data will not affect or ever be compared against another user.
> 
> Conceptualize this as each user having their own Apache server; that server 
> spits out 50 million records per day and each user will only be analyzing the 
> data for their particular server, not anyone else's. The log formats are 
> exactly the same.
> 
> My experience lies in relational databases and not key-value stores, like 
> Cassandra. So, in the MySQL world we would put each user in their own 
> database to avoid the locking contention and to make queries faster. 
> 
> If we don't post info into different keyspaces, I assume we will have to add 
> an additional field to our records to identify the user that owns that 
> particular record. How does a single large keyspace affect query speed, etc.?
> 
> 
> 
> Trevor Francis
> 
> 
