Hi Luke,

You mentioned that the replication factor was increased from 1 to 2. In that case, was the node with IP 10.128.0.20 carrying around 3 GB of data earlier?
You can run nodetool repair with the -local option to restrict the repair to the local datacenter, gce-us-central1. Also, if a lot of data was deleted while the node was down, it may have a lot of tombstones that don't need to be replicated to the other node. To verify this, connect to either 10.128.0.3 or 10.128.0.20, turn tracing on, and run a SELECT count(*) on the column families with a consistency level that reads from both local replicas (e.g. LOCAL_QUORUM; with RF 2 a local quorum is both local nodes, and there is no LOCAL_ALL level), saving the output to a file. With the amount of data you have, the count itself should not be an issue, and the trace will give you a fair idea of how many deleted cells the nodes have. (Example commands are at the end of this mail.)

I tried to find a reference on whether tombstones are moved around during repair, but I didn't find evidence of it. I also see no reason why they would be: if the node didn't have the data, streaming tombstones for it doesn't make much sense.

Regards,
Bhuvan

On Tue, May 24, 2016 at 11:06 PM, Luke Jolly <l...@getadmiral.com> wrote:

> Here's my setup:
>
> Datacenter: gce-us-central1
> ===========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.128.0.3   6.4 GB     256     100.0%            3317a3de-9113-48e2-9a85-bbf756d7a4a6  default
> UN  10.128.0.20  943.08 MB  256     100.0%            958348cb-8205-4630-8b96-0951bf33f3d3  default
>
> Datacenter: gce-us-east1
> ========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.142.0.14  6.4 GB     256     100.0%            c3a5c39d-e1c9-4116-903d-b6d1b23fb652  default
> UN  10.142.0.13  5.55 GB    256     100.0%            d0d9c30e-1506-4b95-be64-3dd4d78f0583  default
>
> And my replication settings are:
>
> {'class': 'NetworkTopologyStrategy', 'aws-us-west': '2', 'gce-us-central1': '2', 'gce-us-east1': '2'}
>
> As you can see, 10.128.0.20 in the gce-us-central1 DC only has a load of 943 MB even though it's supposed to own 100% and should have 6.4 GB. 10.142.0.13 also seems not to have everything, as it only has a load of 5.55 GB.
>
> On Mon, May 23, 2016 at 7:28 PM, kurt Greaves <k...@instaclustr.com> wrote:
>
>> Do you have 1 node in each DC or 2? If you're saying you have 1 node in each DC, then an RF of 2 doesn't make sense. Can you clarify what your setup is?
>>
>> On 23 May 2016 at 19:31, Luke Jolly <l...@getadmiral.com> wrote:
>>
>>> I am running 3.0.5 with 2 nodes in two DCs, gce-us-central1 and gce-us-east1. I increased the replication factor of gce-us-central1 from 1 to 2 and then ran 'nodetool repair -dc gce-us-central1'. The "Owns" for the node switched to 100% as it should, but the Load showed that it didn't actually sync the data. I then ran a full 'nodetool repair' and it still didn't fix it. This scares me, as I thought 'nodetool repair' was a way to assure consistency and that all the nodes were synced, but that doesn't seem to be the case. Outside of that command, I have no idea how I would verify that all the data is synced, or how to get it correctly synced without decommissioning the node and re-adding it.
>>>
>>
>> --
>> Kurt Greaves
>> k...@instaclustr.com
>> www.instaclustr.com
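For reference, a minimal sketch of the local-DC repair suggested above. The keyspace name is a placeholder, and the -full option is only an assumption about what may be needed, since repairs in 3.0.x default to incremental:

    # Run on a node in gce-us-central1; -local (--in-local-dc) limits the
    # repair to replicas in the local datacenter. "my_keyspace" is a placeholder.
    nodetool repair -local my_keyspace

    # If the data was never streamed, a full (non-incremental) repair may be needed:
    nodetool repair -full -local my_keyspace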
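And a sketch of the tombstone check in cqlsh. Keyspace and table names are placeholders, and LOCAL_QUORUM is my assumption for the consistency level (with RF 2 it reads both local replicas):

    $ cqlsh 10.128.0.3
    cqlsh> CONSISTENCY LOCAL_QUORUM;
    cqlsh> TRACING ON;
    cqlsh> SELECT count(*) FROM my_keyspace.my_table;

The request trace printed after the result reports how many live rows and tombstone cells were read; comparing those numbers when connected to 10.128.0.3 versus 10.128.0.20 gives a rough idea of how many deleted cells each node is holding.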