Jeepers creepers that's it Jeeves!!! Arrrrrghhhhh.

Basically, once the repair hit a big column family, the db size exploded
until the node ran out of disk space.

Firstly, any ideas for a quick fix? This is giving me big production
problems. Reads/writes at QUORUM are reportedly producing unpredictable
results (people have called support about monsters in my MMO appearing
and disappearing magically), and many operations are simply failing with
SocketTimeoutException, I guess because of the continuing compactions over
huge sstables. I'm going to try adjusting client timeout settings etc., but
that feels like using a hanky to shelter from a downpour.
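(By "client timeout settings" I mean something like the sketch below. It
assumes a raw Thrift connection; the host, port 9160 and the 60s figure are
just placeholders, and higher-level clients expose an equivalent knob. It
only papers over the slow reads while the compactions grind on, of course.)

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class TimeoutSketch {
        // Open a Thrift connection with a generous socket timeout so reads
        // against the bloated sstables don't die with SocketTimeoutException.
        public static Cassandra.Client open(String host) throws Exception {
            TSocket socket = new TSocket(host, 9160, 60000); // third arg = timeout in ms
            TFramedTransport transport = new TFramedTransport(socket);
            transport.open();
            return new Cassandra.Client(new TBinaryProtocol(transport));
        }
    }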

Secondly, does anyone know whether this is just a waiting game - will the
node eventually correct itself and shrink back down?
I'm down to 204G now from 270G.

Thirdly, does anyone know whether the problem is contagious, i.e. should I
consider decommissioning the whole node and trying to rebuild from replicas?
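(If it does come to that, my understanding is that it boils down to the
StorageService "decommission" JMX operation - the same thing nodetool
decommission calls - which streams the node's ranges to the remaining
replicas before it leaves the ring. Rough sketch below; the host argument is
a placeholder and 8080 is the 0.7 default JMX port.)

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class DecommissionSketch {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + args[0] + ":8080/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // Invoke the no-arg decommission operation on StorageService.
                mbs.invoke(new ObjectName("org.apache.cassandra.db:type=StorageService"),
                           "decommission", null, null);
            } finally {
                connector.close();
            }
        }
    }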

Thanks, Dominic

On 25 May 2011 17:16, Daniel Doubleday <daniel.double...@gmx.net> wrote:

> We are having problems with repair too.
>
> It sounds like yours are the same. From today:
> http://permalink.gmane.org/gmane.comp.db.cassandra.user/16619
> On May 25, 2011, at 4:52 PM, Dominic Williams wrote:
>
> Hi,
>
> I've got a strange problem, where the database on a node has inflated 10X
> after running repair. This is not the result of receiving missed data.
>
> I didn't perform repair within my usual 10-day cycle, so I followed the
> recommended practice:
>
> http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
>
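> (i.e. temporarily raise gc_grace_seconds on the affected column families
> before repairing, then restore it afterwards. One way to do that
> programmatically is via the Thrift schema API - a rough sketch, with the
> keyspace/CF names and client setup as placeholders:)
>
>     import org.apache.cassandra.thrift.Cassandra;
>     import org.apache.cassandra.thrift.CfDef;
>     import org.apache.cassandra.thrift.KsDef;
>
>     public class GcGraceSketch {
>         // Fetch the existing CfDef so its other attributes are preserved,
>         // then push it back with a (temporarily) huge gc_grace_seconds.
>         static void setGcGrace(Cassandra.Client client, String keyspace,
>                                String cfName, int gcGraceSeconds) throws Exception {
>             client.set_keyspace(keyspace);
>             KsDef ksDef = client.describe_keyspace(keyspace);
>             for (CfDef cfDef : ksDef.getCf_defs()) {
>                 if (cfDef.getName().equals(cfName)) {
>                     cfDef.setGc_grace_seconds(gcGraceSeconds);
>                     client.system_update_column_family(cfDef);
>                 }
>             }
>         }
>     }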
> The sequence of events was like this:
>
> 1) set GCGraceSeconds to some huge value
> 2) perform rolling upgrade from 0.7.4 to 0.7.6-2
> 3) run nodetool repair on the first node in the cluster at ~10pm. It has a
> ~30G database
> 4) at 2.30am, decide to leave it running overnight; wake up at 9am to find
> it still running
> 5) late-morning investigation shows the db size has increased to 370G, of
> which the snapshot folder accounts for only 30G
> 6) node starts to run out of disk space: http://pastebin.com/Sm0B7nfR
> 7) decide to bail! Reset GCGraceSeconds to 864000 and restart the node to
> stop the repair
> 8) as the node restarts it deletes a bunch of tmp files, reducing db size
> from 370G to 270G
> 9) node now constantly performing minor compactions, with du rising
> slightly and then falling by a greater amount after each minor compaction
> deletes an sstable
> 10) disk usage is gradually coming down. Currently at 254G (3pm)
> 11) performance of the node is obviously not great!
>
> Investigation of the database reveals that the main problem has occurred in
> a single column family, UserFights. This contains millions of fight records
> from our MMO - in fact exactly the same number as the MonsterFights cf -
> yet the comparative sizes are:
>
> Column Family: MonsterFights
>   SSTable count: 38
>   Space used (live): 13867454647
>   Space used (total): 13867454647 (13G)
>   Memtable Columns Count: 516
>   Memtable Data Size: 598770
>   Memtable Switch Count: 4
>   Read Count: 514
>   Read Latency: 157.649 ms.
>   Write Count: 4059
>   Write Latency: 0.025 ms.
>   Pending Tasks: 0
>   Key cache capacity: 200000
>   Key cache size: 183004
>   Key cache hit rate: 0.0023566218452145135
>   Row cache: disabled
>   Compacted row minimum size: 771
>   Compacted row maximum size: 943127
>   Compacted row mean size: 3208
>
> Column Family: UserFights
>   SSTable count: 549
>   Space used (live): 185355019679
>   Space used (total): 219489031691 (219G)
>   Memtable Columns Count: 483
>   Memtable Data Size: 560569
>   Memtable Switch Count: 8
>   Read Count: 2159
>   Read Latency: 2589.150 ms.
>   Write Count: 4080
>   Write Latency: 0.018 ms.
>   Pending Tasks: 0
>   Key cache capacity: 200000
>   Key cache size: 200000
>   Key cache hit rate: 0.03357770764288416
>   Row cache: disabled
>   Compacted row minimum size: 925
>   Compacted row maximum size: 12108970
>   Compacted row mean size: 503069
>
> These stats were taken at 3pm, and at 1pm UserFights was using 224G total,
> so overall size is gradually coming down.
>
> Another observation is the following appearing in the logs during the minor
> compactions:
> Compacting large row 536c69636b5061756c (121235810 bytes) incrementally
>
> The largest number of fights any user has performed on our MMO that I can
> find is just short of 10,000, and each fight record is smaller than 1K, so
> even the biggest rows should be under ~10MB. Yet the compaction log above
> shows a ~121MB row, so it looks like these rows have grown 10X+ somehow.
>
> The size of UserFights on another replica node, which actually owns a
> slightly higher proportion of the ring, is:
>
> Column Family: UserFights
>   SSTable count: 14
>   Space used (live): 17844982744
>   Space used (total): 17936528583 (18G)
>   Memtable Columns Count: 767
>   Memtable Data Size: 891153
>   Memtable Switch Count: 6
>   Read Count: 2298
>   Read Latency: 61.020 ms.
>   Write Count: 4261
>   Write Latency: 0.104 ms.
>   Pending Tasks: 0
>   Key cache capacity: 200000
>   Key cache size: 55172
>   Key cache hit rate: 0.8079570484581498
>   Row cache: disabled
>   Compacted row minimum size: 925
>   Compacted row maximum size: 12108970
>   Compacted row mean size: 846477
> ...
>
> All ideas and suggestions greatly appreciated as always!
>
> Dominic
> ria101.wordpress.com
>
>
>
