Hi,
We are experiencing periodic slow responses. We investigated and found that the problem is related to HFile compaction: the slowdowns happen when the log shows a lot of compaction activity. So we tuned some compaction parameters and also started monitoring the compactionQueueLength metric. When the slow responses happen, we can see compactionQueueLength keep increasing, and the log shows one major compaction completing every few minutes. One interesting finding is that compactionQueueLength keeps growing to more than 1000, or even 3000 on some servers, until at some point it suddenly drops to 0, as if someone had cleared it. There is nothing special in the log at that time, and after that there is not much compaction activity. I searched the docs and the web but couldn't find any explanation for this. Can anyone explain what happened? Thanks in advance. By the way, our HBase version is 1.1.3.