hi list,

AntiEntropyService has started syncing entire node ranges (?!) across my data centres and I'd like to understand why.
I see log lines like this on all my nodes in my two (east/west) data centres:

    INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666) [repair #7c7665c0-3eab-11e2-0000-dae6667065ff] new session: will sync /X.X.1.113, /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )

(This is around 80-100 GB of data for a single node.)

My setup:

- I did not observe any network failures or nodes falling off the ring
- good distribution of data (load is equal on all nodes)
- hinted handoff is on
- read repair chance is 0.1 on the CF
- 2 replicas in each data centre (which is also the number of nodes in each) with NetworkTopologyStrategy (a rough sketch of the schema is in PS1 below)
- repair -pr is scheduled to run daily during off-peak hours (cron entry in PS2 below)
- leveled compaction with sstable max size 256MB (I have found this to trigger compaction at acceptable intervals while still keeping the sstable count down)
- I am on 1.1.6
- java heap 10G
- max memtables 2G
- 1G row cache
- 256M key cache

my nodes' tokens are:

    DC west: 0, 85070591730234615865843651857942052864
    DC east: 100, 85070591730234615865843651857942052964

symptoms are:

- logs show sstables being streamed over to other nodes
- 140k files in the data directory of the CF on all nodes
- cfstats reports 20k sstables, up from 6k, on all nodes
- compaction is running continuously with no results whatsoever (the number of sstables keeps growing)

I tried the following:

- offline scrub (it went OOM; I noticed the script in the debian package specifies a 256MB heap?)
- online scrub (no effect)
- repair (no effect)
- cleanup (no effect)

my questions are:

- how do I stop the repair before I run out of storage? (I can't let this finish; PS3 below shows what I'm planning to try)
- how do I clean up my sstables? (they grew from 6k to 20k since this started, even though I shut writes off completely)

thanks,
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84
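PS1: in case the schema details matter, the keyspace/CF definition looks roughly like this (cassandra-cli syntax; keyspace, CF and data centre names are typed from memory and changed, so treat this as a sketch rather than the exact definition):

    create keyspace MyKeyspace
      with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
      and strategy_options = {west : 2, east : 2};

    use MyKeyspace;

    create column family MyCF
      with compaction_strategy = 'LeveledCompactionStrategy'
      and compaction_strategy_options = {sstable_size_in_mb : 256}
      and read_repair_chance = 0.1;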
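PS2: the scheduled repair is just a cron entry along these lines on each node (the time, keyspace name and log path here are placeholders, the real ones differ):

    # daily primary-range repair during off-peak hours
    0 3 * * * nodetool -h localhost repair -pr MyKeyspace >> /var/log/cassandra/repair-cron.log 2>&1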
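PS3: this is what I'm planning to try in order to stop the repair, based on my (possibly wrong) understanding that nodetool gained a "stop" command somewhere in the 1.1 line; I have not verified any of this is safe on 1.1.6, so corrections are welcome:

    # cancel the validation compactions feeding the repair, on every node
    nodetool -h <node> stop VALIDATION

    # also cancel regular compactions if they are just churning
    nodetool -h <node> stop COMPACTION

    # comment out the repair cron entry so it does not kick off again;
    # if streams are already in flight, a rolling restart of the affected
    # nodes seems to be the only way to kill the sessions

Does that sound right, or is there a cleaner way?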