You should be OK, depending on the partitioner strategy you use. With the random partitioner, whatever your key is gets run through MD5, and the resulting hash determines which node your data will live on. (That is also why, when you're setting up your nodes, you can give each of them a specific token.)
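To make the hashing step a bit more concrete, here is a rough Python sketch of the idea. It is not Cassandra's actual code, just a simplified model of what the random partitioner does: MD5 the key, treat the result as a position on a ring, and hand the key to the node that owns that part of the ring. The function names (`token_for`, `node_for`, `distribution`), the 4-node cluster, and the evenly spaced node tokens are all assumptions made up for the illustration.

```python
import hashlib
from bisect import bisect_left
from collections import Counter
from datetime import datetime, timedelta

RING_SIZE = 2 ** 127  # size of the token space assumed in this sketch


def token_for(key: str) -> int:
    """Map a row key to a ring position via MD5 (simplified model)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Read the 16-byte digest as a signed 128-bit integer and take its
    # absolute value, giving a token in the range 0..2**127.
    return abs(int.from_bytes(digest, "big", signed=True))


def node_for(key: str, node_tokens: list) -> int:
    """Return the index of the first node whose token is >= the key's token."""
    idx = bisect_left(node_tokens, token_for(key))
    # Tokens past the last node wrap around to the first node.
    return idx % len(node_tokens)


def distribution(keys, num_nodes: int) -> Counter:
    """Count how many keys land on each node, assuming evenly spaced tokens."""
    node_tokens = [i * RING_SIZE // num_nodes for i in range(num_nodes)]
    return Counter(node_for(k, node_tokens) for k in keys)


# Hypothetical test data: 100,000 consecutive one-second timestamps
# in the format from the question (e.g. 2011/10/10-16:07:53).
start = datetime(2011, 10, 10, 16, 7, 53)
keys = [
    (start + timedelta(seconds=i)).strftime("%Y/%m/%d-%H:%M:%S")
    for i in range(100_000)
]

counts = distribution(keys, 4)
for node in sorted(counts):
    print(f"node {node}: {counts[node]} keys")
```

Even though neighbouring keys differ only in the last digit or two, their MD5 hashes have nothing in common, so the per-node counts should come out roughly equal rather than piling up on one node.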
So while your distribution won't necessarily be completely balanced, it should at least be in the right ballpark. To give you an idea of this in practice, we've got consecutive integer values as our keys and we're using the random partitioner, and we have VERY close to the same number of keys on each of our nodes.

Then the bigger question about balancing your load is how big each record is: whether they are consistent in size, vary widely, etc., as that is just as likely to impact how balanced your loads are.

On Mon, Oct 10, 2011 at 9:09 AM, Laurent Aufrechter <laurent.aufrech...@yahoo.fr> wrote:

> Hi,
>
> I am planning to run tests on Cassandra with a few nodes. I want to create
> a column family where the key will be the date down to the second (like
> 2011/10/10-16:07:53). Doing so, my keys will be very similar to each
> other. Is it OK to use such keys if I want my data to be evenly distributed
> across my nodes, or do I have to "do something"?
>
> Thanks in advance.
>
> L. Aufrechter

--
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.*