Hi Luke,

I've never found the Load reported by nodetool status to be useful beyond a general indicator.
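As a rough cross-check of what a node actually holds, comparing the reported load against the on-disk footprint is usually more telling; something like the following (the keyspace name and data directory here are only examples, adjust for your install):

    # effective ownership for a specific keyspace
    nodetool status my_keyspace
    # actual on-disk footprint of that keyspace's SSTables
    du -sh /var/lib/cassandra/data/my_keyspace/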
You should expect some small skew, as this will depend on your current compaction status, tombstones, etc. IIRC, repair will not provide consistency of intermediate states, nor will it remove tombstones; it only guarantees consistency in the final state. This means that, in the case of dropped hints or mutations, you will see differences in intermediate states, and therefore in storage footprint, even on fully repaired nodes. This includes intermediate UPDATE operations as well.

Your one node with under 1 GB sticks out like a sore thumb, though. Which node did you run the nodetool repair from? Remember that repair will only ensure consistency for ranges held by the node you're running it on. I am not sure whether missing ranges are included in this, but if you ran nodetool repair only on a machine with partial ownership, you will need to complete repairs across the ring before the data returns to full consistency.

I would query some older data using consistency = ONE on the affected machine to determine whether you are actually missing data.

There are a few outstanding bugs in the 2.1.x and older release families that may result in tombstone creation even without deletes, for example CASSANDRA-10547, which impacts updates on collections in pre-2.1.13 Cassandra.

You can also try examining the output of nodetool ring, which will give you a breakdown of tokens and their associations within your cluster.

--Bryan

On Tue, May 24, 2016 at 3:49 PM, kurt Greaves <k...@instaclustr.com> wrote:

> Not necessarily, considering RF is 2, so both nodes should have all partitions. Luke, are you sure the repair is succeeding? You don't have other keyspaces/duplicate data/extra data in your cassandra data directory? Also, you could try querying on the node with less data to confirm whether it has the same dataset.
>
> On 24 May 2016 at 22:03, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>
>> For the other DC, it can be acceptable because each partition resides on one node, so if you have a large partition, it may skew things a bit.
>>
>> On May 25, 2016 2:41 AM, "Luke Jolly" <l...@getadmiral.com> wrote:
>>
>>> So I guess the problem may have been with the initial addition of the 10.128.0.20 node, because when I added it in it never synced data, I guess? It was at around 50 MB when it first came up and transitioned to "UN". After it was in, I did the 1->2 replication change and tried repair, but it didn't fix it. From what I can tell, all the data on it is stuff that has been written since it came up. We never delete data, ever, so we should have zero tombstones.
>>>
>>> If I am not mistaken, only two of my nodes actually have all the data, 10.128.0.3 and 10.142.0.14, since they agree on the data amount. 10.142.0.13 is almost a GB lower, and then of course there is 10.128.0.20, which is missing over 5 GB of data. I tried running nodetool repair -local on both DCs and it didn't fix either one.
>>>
>>> Am I running into a bug of some kind?
>>>
>>> On Tue, May 24, 2016 at 4:06 PM Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>>>
>>>> Hi Luke,
>>>>
>>>> You mentioned that the replication factor was increased from 1 to 2. In that case, was the node bearing IP 10.128.0.20 carrying around 3 GB of data earlier?
>>>>
>>>> You can run nodetool repair with the -local option to initiate a repair of the local datacenter for gce-us-central1.
>>>>
>>>> Also, you may suspect that if a lot of data was deleted while the node was down, it may be holding a lot of tombstones which do not need to be replicated to the other node.
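>>>> As a quick, rough check on whether tombstones are piling up, nodetool cfstats reports per-table tombstone statistics (the keyspace/table name below is only a placeholder):
>>>>
>>>>     nodetool cfstats my_keyspace.my_table
>>>>
>>>> The tombstones-per-slice lines in that output should give a first hint before running anything heavier.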
>>>> To verify this, you can issue a select count(*) query on the column families (with the amount of data you have, it should not be an issue) with tracing on and with consistency local_all, by connecting to either 10.128.0.3 or 10.128.0.20, and store the output in a file. It will give you a fair idea of how many deleted cells the nodes have. I tried searching for a reference on whether tombstones are moved around during repair, but I didn't find evidence of it. However, I see no reason they would be, because if the node didn't have the data, then streaming tombstones does not make a lot of sense.
>>>>
>>>> Regards,
>>>> Bhuvan
>>>>
>>>> On Tue, May 24, 2016 at 11:06 PM, Luke Jolly <l...@getadmiral.com> wrote:
>>>>
>>>>> Here's my setup:
>>>>>
>>>>> Datacenter: gce-us-central1
>>>>> ===========================
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>> UN  10.128.0.3   6.4 GB     256     100.0%            3317a3de-9113-48e2-9a85-bbf756d7a4a6  default
>>>>> UN  10.128.0.20  943.08 MB  256     100.0%            958348cb-8205-4630-8b96-0951bf33f3d3  default
>>>>>
>>>>> Datacenter: gce-us-east1
>>>>> ========================
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>> UN  10.142.0.14  6.4 GB     256     100.0%            c3a5c39d-e1c9-4116-903d-b6d1b23fb652  default
>>>>> UN  10.142.0.13  5.55 GB    256     100.0%            d0d9c30e-1506-4b95-be64-3dd4d78f0583  default
>>>>>
>>>>> And my replication settings are:
>>>>>
>>>>> {'class': 'NetworkTopologyStrategy', 'aws-us-west': '2', 'gce-us-central1': '2', 'gce-us-east1': '2'}
>>>>>
>>>>> As you can see, 10.128.0.20 in the gce-us-central1 DC only has a load of 943 MB even though it's supposed to own 100% and should have 6.4 GB. Also, 10.142.0.13 seems not to have everything either, as it only has a load of 5.55 GB.
>>>>>
>>>>> On Mon, May 23, 2016 at 7:28 PM, kurt Greaves <k...@instaclustr.com> wrote:
>>>>>
>>>>>> Do you have 1 node in each DC or 2? If you're saying you have 1 node in each DC, then an RF of 2 doesn't make sense. Can you clarify what your setup is?
>>>>>>
>>>>>> On 23 May 2016 at 19:31, Luke Jolly <l...@getadmiral.com> wrote:
>>>>>>
>>>>>>> I am running 3.0.5 with 2 nodes in two DCs, gce-us-central1 and gce-us-east1. I increased the replication factor of gce-us-central1 from 1 to 2. Then I ran 'nodetool repair -dc gce-us-central1'. The "Owns" for the node switched to 100% as it should, but the Load showed that it didn't actually sync the data. I then ran a full 'nodetool repair' and it still didn't fix it. This scares me, as I thought 'nodetool repair' was a way to assure consistency and that all the nodes were synced, but it doesn't seem to be. Outside of that command, I have no idea how I would assure all the data was synced or how to get the data correctly synced without decommissioning the node and re-adding it.
>>>>>>
>>>>>> --
>>>>>> Kurt Greaves
>>>>>> k...@instaclustr.com
>>>>>> www.instaclustr.com
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com