It was my first thought. Then I hashed the UUID with MD5 and used the digest as the key:

    MessageDigest md = MessageDigest.getInstance("MD5");

    // in the loop
    UUID uuid = UUID.randomUUID();
    byte[] bytes = md.digest(asByteArray(uuid));

The result is exactly the same: the first node takes 66%, the second 33%, and the third one is empty. For some reason, rows which should be placed on the third node moved to the first one.

Address     DC           Rack   Status  State   Load       Effective-Ownership  Token
                                                                                Token(bytes[56713727820156410577229101238628035242])
127.0.0.1   datacenter1  rack1  Up      Normal  7.68 MB    33.33%               Token(bytes[00])
127.0.0.3   datacenter1  rack1  Up      Normal  79.17 KB   33.33%               Token(bytes[0113427455640312821154458202477256070485])
127.0.0.2   datacenter1  rack1  Up      Normal  3.81 MB    33.33%               Token(bytes[56713727820156410577229101238628035242])

On Thu, Oct 4, 2012 at 12:33 AM, Tom <fivemile...@gmail.com> wrote:
> Hi Andrey,
>
> while the data values you generated might be following a true random
> distribution, your row key, UUID, is not (because it is created on the same
> machines by the same software with a certain window of time)
>
> For example, if you were using the UUID class in Java, these would be
> composed from several components (related to dimensions such as time and
> version), so you can not expect a random distribution over the whole space.
>
> Cheers
> Tom
>
> On Wed, Oct 3, 2012 at 5:39 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>>
>> Hello, everybody!
>>
>> I'm observing very strange behavior. I have 3 node cluster with
>> ByteOrderPartitioner. (I run 1.1.5)
>> I created a key space with replication factor of 1.
>> Then I created one column family and populated it with random data.
>> I use UUID as a row key, and Integer as a column name.
>> Row keys were generated as
>>
>> UUID uuid = UUID.randomUUID();
>>
>> I populated about 100000 rows with 100 column each.
>>
>> I would expect equal load on each node, but the result is totally
>> different.
>> This is what nodetool gives me:
>>
>> Address     DC           Rack   Status  State   Load       Effective-Ownership  Token
>>                                                                                 Token(bytes[56713727820156410577229101238628035242])
>> 127.0.0.1   datacenter1  rack1  Up      Normal  27.61 MB   33.33%               Token(bytes[00])
>> 127.0.0.3   datacenter1  rack1  Up      Normal  206.47 KB  33.33%               Token(bytes[0113427455640312821154458202477256070485])
>> 127.0.0.2   datacenter1  rack1  Up      Normal  13.86 MB   33.33%               Token(bytes[56713727820156410577229101238628035242])
>>
>> one node (127.0.0.3) is almost empty.
>> Any ideas what is wrong?
>>
>> Thank you,
>> Andrey
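One way to sanity-check the ring output above is to estimate, outside of Cassandra, how MD5-hashed UUID keys would split across the three tokens. The sketch below is only that, and rests on assumptions not confirmed in the thread: that the digits inside bytes[...] are hex-encoded bytes, that keys and tokens are compared as unsigned byte strings, and that each node owns the range (previous token, its own token]. The class RingSketch, the body of asByteArray, and the other helpers are hypothetical stand-ins; only the token strings and node addresses are copied from the nodetool output.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class RingSketch {
    // Tokens and nodes copied from the nodetool output, in ring order.
    // Assumption: the token strings are hex-encoded bytes.
    static final String[] TOKENS = {
        "00",
        "0113427455640312821154458202477256070485",
        "56713727820156410577229101238628035242"
    };
    static final String[] NODES = {"127.0.0.1", "127.0.0.3", "127.0.0.2"};

    // Hypothetical stand-in for the asByteArray helper used in the thread:
    // the 16 bytes of the UUID, most significant long first.
    static byte[] asByteArray(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Decode an even-length hex string into bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Index of the first token >= key; keys above the last token wrap to index 0.
    static int ownerIndex(byte[] key) {
        for (int i = 0; i < TOKENS.length; i++) {
            if (compareUnsigned(key, hexToBytes(TOKENS[i])) <= 0) return i;
        }
        return 0;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        int total = 100000;
        int[] counts = new int[NODES.length];
        for (int i = 0; i < total; i++) {
            byte[] key = md.digest(asByteArray(UUID.randomUUID()));
            counts[ownerIndex(key)]++;
        }
        for (int i = 0; i < NODES.length; i++) {
            System.out.printf("%s: %.1f%%%n", NODES[i], 100.0 * counts[i] / total);
        }
    }
}
```

If those assumptions hold, the first bytes of the three tokens are 0x00, 0x01 and 0x56, so uniformly distributed keys would land roughly two thirds on 127.0.0.1, about one third on 127.0.0.2, and well under one percent on 127.0.0.3. That is close to the load figures reported in both nodetool outputs, which hints that the token values, not the key distribution, may be worth checking.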