We ran nodetool repair on all nodes for all keyspaces/CFs, restarted Cassandra, and this is what we get for nodetool status:
bin/nodetool -h localhost status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.238.133.174  885.36 MB  256     8.4%              e41d8863-ce37-4d5c-a428-bfacea432a35  1a
UN  10.238.133.97   468.66 MB  256     7.7%              1bf42b5e-4aed-4b06-bdb3-65a78823b547  1a
UN  10.151.86.146   1.08 GB    256     8.0%              8952645d-4a27-4670-afb2-65061c205734  1a
UN  10.138.10.9     941.44 MB  256     8.6%              25ccea82-49d2-43d9-830c-b9c9cee026ec  1a
UN  10.87.87.240    99.69 MB   256     8.6%              ea066827-83bc-458c-83e8-bd15b7fc783c  1b
UN  10.93.5.157     87.44 MB   256     7.6%              4ab9111c-39b4-4d15-9401-359d9d853c16  1b
UN  10.238.137.250  561.42 MB  256     7.8%              84301648-afff-4f06-aa0b-4be421e0d08f  1a
UN  10.92.231.170   893.75 MB  256     9.3%              a18ce761-88a0-4407-bbd1-c867c4fecd1f  1b
UN  10.138.2.20     31.89 MB   256     7.9%              a6d4672a-0915-4c64-ba47-9f190abbf951  1a
UN  10.93.31.44     312.52 MB  256     7.8%              67a6c0a6-e89f-4f3e-b996-cdded1b94faf  1b
UN  10.93.91.139    30.46 MB   256     8.1%              682dd848-7c7f-4ddb-a960-119cf6491aa1  1b
UN  10.236.138.169  260.15 MB  256     9.1%              cbbf27b0-b53a-4530-bfdf-3764730b89d8  1a
UN  10.137.7.90     38.45 MB   256     7.4%              17b79aa7-64fc-4e16-b96a-955b0aae9bb4  1a
UN  10.93.77.166    867.15 MB  256     8.8%              9a821d1e-40e5-445d-b6b7-3cdd58bdb8cb  1b
UN  10.120.249.140  863.98 MB  256     9.4%              e1fb69b0-8e66-4deb-9e72-f901d7a14e8a  1b
UN  10.90.246.128   242.63 MB  256     8.4%              054911ec-969d-43d9-aea1-db445706e4d2  1b
UN  10.123.95.248   171.51 MB  256     7.2%              a17deca1-9644-4520-9e62-ac66fc6fef60  1b
UN  10.136.11.40    33.8 MB    256     8.5%              66be1173-b822-40b5-b650-cb38ae3c7a51  1a
UN  10.87.90.42     38.01 MB   256     8.0%              dac0c6ea-56c6-44da-a4ec-6388f39ecba1  1b
UN  10.87.75.147    579.29 MB  256     8.3%              ac060edf-dc48-44cf-a1b5-83c7a465f3c8  1b
UN  10.151.49.88    151.06 MB  256     8.9%              57043573-ab1b-4e3c-8044-58376f7ce08f  1a
UN  10.87.83.107    512.91 MB  256     8.3%              0019439b-9f8a-4965-91b8-7108bbb55593  1b
UN  10.238.170.159  85.04 MB   256     9.4%              32ce322e-4f7c-46c7-a8ce-bd73cdd54684  1a
UN  10.137.20.183   167.41 MB  256     8.4%              15951592-8ab2-473d-920a-da6e9d99507d  1a

It doesn't seem to have changed by much. The loads are still highly uneven. As for the number of keys in each node's CFs: the largest node now has 5589120 keys for the column family that had 6527744 keys before (load is now 1.08 GB compared to 1.05 GB before), while the smallest node now has 71808 keys compared to 3840 before (load is now 31.89 MB compared to 1.12 MB before).

On Thu, Sep 19, 2013 at 5:18 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> Can you run nodetool repair on all the nodes first and look at the keys?
>
> On Thu, Sep 19, 2013 at 1:22 PM, Suruchi Deodhar <
> suruchi.deod...@generalsentiment.com> wrote:
>
>> Yes, the key distribution does vary across the nodes. For example, on the
>> node with the highest data, Number of Keys (estimate) is 6527744 for a
>> particular column family, whereas for the same column family on the node
>> with the least data, Number of Keys (estimate) = 3840.
>>
>> Is there a way to control this distribution by setting some parameter of
>> Cassandra?
>>
>> I am using the Murmur3 partitioner with NetworkTopologyStrategy.
>>
>> Thanks,
>> Suruchi
>>
>> On Thu, Sep 19, 2013 at 3:59 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>
>>> Can you check cfstats to see the number of keys per node?
>>>
>>> On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar <
>>> suruchi.deod...@generalsentiment.com> wrote:
>>>
>>>> Thanks for your replies. I wiped out my data from the cluster and also
>>>> cleared the commitlog before restarting it with num_tokens=256. I then
>>>> uploaded data using sstableloader.
>>>>
>>>> However, I am still not able to see a uniform distribution of data
>>>> across the nodes of the cluster.
>>>>
>>>> The output of the bin/nodetool -h localhost status command looks as
>>>> follows. Some nodes have data as low as 1.12 MB while some have as high as
>>>> 912.57 MB.
>>>>
>>>> Datacenter: us-east
>>>> ===================
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
>>>> UN  10.238.133.174  856.66 MB  256     8.4%              e41d8863-ce37-4d5c-a428-bfacea432a35  1a
>>>> UN  10.238.133.97   439.02 MB  256     7.7%              1bf42b5e-4aed-4b06-bdb3-65a78823b547  1a
>>>> UN  10.151.86.146   1.05 GB    256     8.0%              8952645d-4a27-4670-afb2-65061c205734  1a
>>>> UN  10.138.10.9     912.57 MB  256     8.6%              25ccea82-49d2-43d9-830c-b9c9cee026ec  1a
>>>> UN  10.87.87.240    70.85 MB   256     8.6%              ea066827-83bc-458c-83e8-bd15b7fc783c  1b
>>>> UN  10.93.5.157     60.56 MB   256     7.6%              4ab9111c-39b4-4d15-9401-359d9d853c16  1b
>>>> UN  10.92.231.170   866.73 MB  256     9.3%              a18ce761-88a0-4407-bbd1-c867c4fecd1f  1b
>>>> UN  10.238.137.250  533.77 MB  256     7.8%              84301648-afff-4f06-aa0b-4be421e0d08f  1a
>>>> UN  10.93.91.139    478.45 KB  256     8.1%              682dd848-7c7f-4ddb-a960-119cf6491aa1  1b
>>>> UN  10.138.2.20     1.12 MB    256     7.9%              a6d4672a-0915-4c64-ba47-9f190abbf951  1a
>>>> UN  10.93.31.44     282.65 MB  256     7.8%              67a6c0a6-e89f-4f3e-b996-cdded1b94faf  1b
>>>> UN  10.236.138.169  223.66 MB  256     9.1%              cbbf27b0-b53a-4530-bfdf-3764730b89d8  1a
>>>> UN  10.137.7.90     11.36 MB   256     7.4%              17b79aa7-64fc-4e16-b96a-955b0aae9bb4  1a
>>>> UN  10.93.77.166    837.64 MB  256     8.8%              9a821d1e-40e5-445d-b6b7-3cdd58bdb8cb  1b
>>>> UN  10.120.249.140  838.59 MB  256     9.4%              e1fb69b0-8e66-4deb-9e72-f901d7a14e8a  1b
>>>> UN  10.90.246.128   216.75 MB  256     8.4%              054911ec-969d-43d9-aea1-db445706e4d2  1b
>>>> UN  10.123.95.248   147.1 MB   256     7.2%              a17deca1-9644-4520-9e62-ac66fc6fef60  1b
>>>> UN  10.136.11.40    4.24 MB    256     8.5%              66be1173-b822-40b5-b650-cb38ae3c7a51  1a
>>>> UN  10.87.90.42     11.56 MB   256     8.0%              dac0c6ea-56c6-44da-a4ec-6388f39ecba1  1b
>>>> UN  10.87.75.147    549 MB     256     8.3%              ac060edf-dc48-44cf-a1b5-83c7a465f3c8  1b
>>>> UN  10.151.49.88    119.86 MB  256     8.9%              57043573-ab1b-4e3c-8044-58376f7ce08f  1a
>>>> UN  10.87.83.107    484.3 MB   256     8.3%              0019439b-9f8a-4965-91b8-7108bbb55593  1b
>>>> UN  10.137.20.183   137.67 MB  256     8.4%              15951592-8ab2-473d-920a-da6e9d99507d  1a
>>>> UN  10.238.170.159  49.17 MB   256     9.4%              32ce322e-4f7c-46c7-a8ce-bd73cdd54684  1a
>>>>
>>>> Is there something else that I should be doing differently?
>>>>
>>>> Thanks for your help!
>>>>
>>>> Suruchi
>>>>
>>>> On Thu, Sep 19, 2013 at 3:20 PM, Richard Low <rich...@wentnet.com> wrote:
>>>>
>>>>> The only thing you need to guarantee is that Cassandra doesn't start
>>>>> with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all
>>>>> the data before starting it with higher num_tokens.
>>>>>
>>>>> On 19 September 2013 19:07, Robert Coli <rc...@eventbrite.com> wrote:
>>>>>
>>>>>> On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar <
>>>>>> suruchi.deod...@generalsentiment.com> wrote:
>>>>>>
>>>>>>> Do you suggest I should try some other installation mechanism? Are
>>>>>>> there any known problems with the tar installation of Cassandra 1.2.9
>>>>>>> that I should be aware of?
>>>>>>
>>>>>> I was asking in the context of this JIRA:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-2356
>>>>>>
>>>>>> Which does not seem to apply in your case!
>>>>>>
>>>>>> =Rob
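For what it's worth, the near-uniform Owns (effective) column in both outputs is exactly what num_tokens=256 should produce: token ownership evens out statistically, even when the on-disk load does not (e.g. if a few partition keys carry far more data than others). A quick simulation sketch illustrates this; it places random tokens on a unit ring, which is only an approximation of Cassandra's Murmur3 token assignment, so the exact percentages are illustrative, not authoritative:

```python
import random

random.seed(0)
NODES, VNODES = 24, 256  # matches the 24-node cluster above, num_tokens=256

# Place NODES * VNODES random tokens on a unit ring, tagged with their node.
ring = sorted((random.random(), node)
              for node in range(NODES) for _ in range(VNODES))

# Each token owns the arc back to the previous token (wrapping at the ends).
owns = [0.0] * NODES
for i, (token, node) in enumerate(ring):
    prev = ring[i - 1][0] - (1.0 if i == 0 else 0.0)
    owns[node] += token - prev

# With 256 vnodes per node, every node ends up owning close to 1/24 ~ 4.2%.
print(f"min ownership: {min(owns):.2%}, max ownership: {max(owns):.2%}")
```

At replication factor 2, each node's ~4.2% raw share would be reported as roughly 8.3% "effective" ownership, which lines up with the 7-9% figures in the status output. So the ring itself looks healthy; the load skew has to come from the data (row sizes or key distribution), not from token placement.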