Increasing the queue size would increase the number of requests waiting. It could 
make GCs worse if the requests are large INSERTs, but for lots of very small 
queries it does help to increase the queue size (up to a point). It might be worth 
looking into what queries are being made and how, since there are options that can 
help with that (i.e. prepared statements, what the queries actually are, limiting 
the number of async in-flight queries).
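
On that last point, here is a rough sketch of one way to limit async in-flight 
queries on the client side. It assumes the DataStax Java driver 3.x, where 
executeAsync returns a Guava ListenableFuture; the class name and the maxInFlight 
value are just illustrative:

import java.util.concurrent.Semaphore;

import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.MoreExecutors;

// Caps the number of outstanding executeAsync() calls so a burst of requests
// cannot flood the coordinator's native transport queue.
public class ThrottledAsyncExecutor
{
    private final Session session;
    private final Semaphore inFlight;

    public ThrottledAsyncExecutor(Session session, int maxInFlight)
    {
        this.session = session;
        this.inFlight = new Semaphore(maxInFlight);
    }

    public ResultSetFuture execute(String cql) throws InterruptedException
    {
        inFlight.acquire(); // blocks the caller once maxInFlight queries are outstanding
        ResultSetFuture future = session.executeAsync(cql);
        // release the permit when the response (or failure) comes back
        future.addListener(inFlight::release, MoreExecutors.directExecutor());
        return future;
    }
}

Prepared statements would slot into the same wrapper; the point is just that the 
client, rather than the server's bounded queue, is what applies the back pressure.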

Chris

> On Mar 23, 2018, at 11:42 AM, John Sanda <john.sa...@gmail.com> wrote:
> 
> Thanks for the explanation. In the past when I have run into problems related 
> to CASSANDRA-11363, I have increased the queue size via the 
> cassandra.max_queued_native_transport_requests system property. If I find 
> that the queue is frequently at capacity, would that be an indicator that the 
> node is having trouble keeping up with the load? And if so, will increasing 
> the queue size just exacerbate the problem?
> 
> On Fri, Mar 23, 2018 at 11:51 AM, Chris Lohfink <clohf...@apple.com> wrote:
> It blocks the caller attempting to add the task until there's room in the queue, 
> applying back pressure. It does not reject the task. It mimics the behavior of the 
> blocking RejectedExecutionHandler from the pre-SEP DebuggableThreadPoolExecutor 
> that the other thread pools use (except for sampling/tracing, which just throw 
> tasks away on rejection).
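> 
> As a sketch only (not the actual Cassandra code), that blocking behavior looks 
> roughly like a rejection handler that waits for space instead of throwing:
> 
> import java.util.concurrent.RejectedExecutionHandler;
> import java.util.concurrent.ThreadPoolExecutor;
> 
> // Sketch: when the executor's queue is full, block the submitting thread until
> // there is room, rather than throwing RejectedExecutionException.
> public class BlockingRejectionPolicy implements RejectedExecutionHandler
> {
>     public void rejectedExecution(Runnable task, ThreadPoolExecutor executor)
>     {
>         try
>         {
>             executor.getQueue().put(task); // put() waits for space; this is the back pressure
>         }
>         catch (InterruptedException e)
>         {
>             Thread.currentThread().interrupt();
>             throw new RuntimeException(e);
>         }
>     }
> }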
> 
> Worth noting that, last I checked, this blocking is only really possible in the 
> native transport pool (SEP pool). That has been the case since at least 2.1; 
> before that there were a few others, and it changes from version to version. For 
> (basically) all other thread pools the queue is bounded only by memory.
> 
> Chris
> 
> 
>> On Mar 22, 2018, at 10:44 PM, John Sanda <john.sa...@gmail.com> wrote:
>> 
>> I have been doing some work on a cluster that is impacted by 
>> https://issues.apache.org/jira/browse/CASSANDRA-11363. Reading through the 
>> ticket prompted me to take a closer look at 
>> org.apache.cassandra.concurrent.SEPExecutor. I am looking at the 3.0.14 
>> code. I am a little confused about the Blocked and All Time Blocked columns 
>> reported in nodetool tpstats and reported by StatusLogger. I understand that 
>> there is a queue for tasks. In the case of RequestThreadPoolExecutor, the 
>> size of that queue can be controlled via the 
>> cassandra.max_queued_native_transport_requests system property.
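>> 
>> (For example, I believe it can be raised with a JVM flag at startup, e.g. in 
>> cassandra-env.sh; the 4096 below is just an illustrative value, not a 
>> recommendation:)
>> 
>> JVM_OPTS="$JVM_OPTS -Dcassandra.max_queued_native_transport_requests=4096"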
>> 
>> I have been looking at SEPExecutor.addTask(FutureTask<?> task), and here is 
>> my question. If the queue is full, as defined by SEPExecutor.maxTasksQueued, 
>> are tasks rejected? I do not fully grok the code, but it looks like it is 
>> possible for tasks to be rejected here (some code and comments omitted for 
>> brevity):
>> 
>> public void addTask(FutureTask<?> task)
>> {
>>     tasks.add(task);
>>     ...
>>     else if (taskPermits >= maxTasksQueued) 
>>     {
>>         WaitQueue.Signal s = hasRoom.register();
>>         
>>         if (taskPermits(permits.get()) > maxTasksQueued)
>>         {
>>             if (takeWorkPermit(true))
>>                 pool.schedule(new Work(this));
>> 
>>             metrics.totalBlocked.inc();
>>             metrics.currentBlocked.inc();
>>             s.awaitUninterruptibly();
>>             metrics.currentBlocked.dec();
>>         }
>>         else
>>             s.cancel();
>>     }   
>> }
>> 
>> The first thing that happens is that the task is added to the tasks queue. 
>> pool.schedule() only gets called if takeWorkPermit() returns true. I am 
>> still studying the code, but can someone explain what exactly happens when 
>> the queue is full?
>> 
>> 
>> - John
> 
> 
> 
> 
> -- 
> 
> - John
