Ended up switching the biggest-offending column families back to size-tiered 
compaction, and pending compactions across the cluster dropped to 0 very quickly.
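
For reference, the switch was of this form in the cassandra-cli (mirroring 
the UPDATE syntax quoted below in the thread; the column family name is a 
placeholder):

[default@xxxx] UPDATE COLUMN FAMILY <cf> WITH
compaction_strategy=SizeTieredCompactionStrategy;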

On Sep 19, 2012, at 10:55 PM, "Michael Kjellman" <mkjell...@barracuda.com> 
wrote:

> After changing sstable_size_in_mb as recommended, pending compactions across 
> the cluster have leveled off at 34,808, but the count hasn't moved in the 24 
> hours since.
> 
> As I've already changed the worst-offending column families, I think the only 
> option I have left is to remove the .json manifest files from all of the 
> column families and do another rolling restart (roughly as sketched below)...
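> 
> Per node, that would be roughly the following (a sketch; it assumes the 
> default /var/lib/cassandra/data layout, where each CF keeps a <cf>.json 
> manifest in its keyspace directory):
> 
> nodetool -h localhost drain   # flush memtables before stopping the node
> # stop the cassandra process
> rm /var/lib/cassandra/data/<keyspace>/*.json   # remove the LCS manifests
> # start the node again; its SSTables come back at level 0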
> 
> Developing... Thanks for the help so far
> 
> On Sep 19, 2012, at 10:35 PM, "Віталій Тимчишин" <tiv...@gmail.com> wrote:
> 
> I did see schema agreement problems on 1.1.4, but they went away after a 
> rolling restart (by the way, it would still be worth checking schema 
> agreement for unreachable nodes, e.g. as sketched below). The same rolling 
> restart helped force compactions after moving to leveled compaction. If 
> your compactions still aren't making progress, you can try removing the 
> *.json manifest files from the data directory of a stopped node, which 
> forces all SSTables back to level 0.
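> 
> A minimal agreement check from the cassandra-cli (a standard command; the 
> schema-version output has the same shape as the paste further down this 
> thread):
> 
> [default@unknown] describe cluster;
> 
> All live nodes should report the same schema version; nodes that cannot 
> be reached are listed as UNREACHABLE.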
> 
> Best regards, Vitalii Tymchyshyn
> 
> 2012/9/19 Michael Kjellman <mkjell...@barracuda.com>
> The pending compactions are potentially a symptom and not the root
> cause/problem.
> 
> When updating a third column family with a larger sstable_size_in_mb, it
> looks like the schema may not be in a good state:
> 
> [default@xxxx] UPDATE COLUMN FAMILY screenshots WITH
> compaction_strategy=LeveledCompactionStrategy AND
> compaction_strategy_options={sstable_size_in_mb: 200};
> 290cf619-57b0-3ad1-9ae3-e313290de9c9
> Waiting for schema agreement...
> Warning: unreachable nodes 10.8.30.102
> The schema has not settled in 10 seconds; further migrations are
> ill-advised until it does.
> Versions are UNREACHABLE:[10.8.30.102],
> 290cf619-57b0-3ad1-9ae3-e313290de9c9:[10.8.30.15, 10.8.30.14, 10.8.30.13,
> 10.8.30.103, 10.8.30.104, 10.8.30.105, 10.8.30.106],
> f1de54f5-8830-31a6-9cdd-aaa6220cccd1:[10.8.30.101]
> 
> 
> However, tpstats looks good, and the schema changes do eventually get
> applied on *all* the nodes (even the ones that seem to have different
> schema versions). There are no communication issues between the nodes, and
> they are all in the same rack:
> 
> root@xxxx:~# nodetool tpstats
> Pool Name                    Active   Pending   Completed   Blocked   All time blocked
> ReadStage                         0         0     1254592         0                  0
> RequestResponseStage              0         0     9480827         0                  0
> MutationStage                     0         0     8662263         0                  0
> ReadRepairStage                   0         0      339158         0                  0
> ReplicateOnWriteStage             0         0           0         0                  0
> GossipStage                       0         0     1469197         0                  0
> AntiEntropyStage                  0         0           0         0                  0
> MigrationStage                    0         0        1808         0                  0
> MemtablePostFlusher               0         0         248         0                  0
> StreamStage                       0         0           0         0                  0
> FlushWriter                       0         0         248         0                  4
> MiscStage                         0         0           0         0                  0
> commitlog_archiver                0         0           0         0                  0
> InternalResponseStage             0         0        5286         0                  0
> HintedHandoff                     0         0          21         0                  0
> 
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> BINARY                       0
> READ                         0
> MUTATION                     0
> REQUEST_RESPONSE             0
> 
> So I'm guessing the differing schema versions may be what's stopping
> compactions? Will compactions still happen while nodes hold different
> versions of the schema?
> 
>
> On 9/18/12 11:38 AM, "Michael Kjellman" <mkjell...@barracuda.com> wrote:
> 
>> Thanks, I just modified the schema on the worst-offending column family
>> (as determined by its .json manifest), raising sstable_size_in_mb from
>> 10MB to 200MB.
>> 
>> Should I kick off a compaction on this CF now? Or a repair? Or a scrub?
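>> 
>> For reference, the candidates would take this form (syntax sketch only; 
>> the keyspace and CF names are placeholders):
>> 
>> nodetool -h <host> compact <keyspace> <cf>
>> nodetool -h <host> repair <keyspace> <cf>
>> nodetool -h <host> scrub <keyspace> <cf>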
>> 
>> Thanks
>> 
>> -michael
>> 
>> From: Віталій Тимчишин <tiv...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)
>> 
>> I started using LeveledCompaction some time ago, and in my experience
>> this indicates SSTables sitting at lower levels than they should be.
>> Compaction is running, moving them up level by level, but the total
>> count does not change because new data keeps coming in.
>> The numbers are pretty high to my eye. Counts like that mean a huge
>> number of files (over 100K in a single directory) and a lot of work for
>> the compaction executor just to decide what to compact next. Even
>> 5K-10K strikes me as high. If I were you, I'd increase
>> sstable_size_in_mb to 10-20 times its current value, along the lines of
>> the sketch below.
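>> 
>> Something like this in the cassandra-cli (same form as the UPDATE quoted
>> above in this thread; 200 would be 20x the 10MB default, and the CF name
>> is a placeholder):
>> 
>> UPDATE COLUMN FAMILY <cf> WITH
>> compaction_strategy=LeveledCompactionStrategy AND
>> compaction_strategy_options={sstable_size_in_mb: 200};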
>> 
>> 2012/9/17 Michael Kjellman <mkjell...@barracuda.com>
>> Hi All,
>> 
>> I have an issue where every one of my nodes (currently all running 1.1.5)
>> is reporting around 30,000 pending compactions. I understand that a
>> pending compaction doesn't necessarily mean a scheduled task, but I'm
>> confused about why this behavior is occurring. It is the same on all
>> nodes: the count occasionally drops by ~5K pending compaction tasks and
>> then returns to 25,000-35,000.
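>> 
>> (For reference, the usual way to read this count per node:
>> 
>> nodetool -h localhost compactionstats
>> 
>> which prints the "pending tasks" figure along with any compactions
>> currently in flight.)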
>> 
>> I have tried a repair operation and a scrub operation on two of the
>> nodes, and while compactions initially run, the number of pending
>> compactions does not decrease.
>> 
>> Any ideas? Thanks for your time.
>> 
>> Best,
>> michael
>> 
>> 
>> --
>> Best regards,
>> Vitalii Tymchyshyn
>> 
> 
> --
> Best regards,
> Vitalii Tymchyshyn
> 
