[jira] [Created] (KAFKA-747) RequestChannel re-design

Jay Kreps (JIRA) Fri, 01 Feb 2013 09:40:13 -0800

Jay Kreps created KAFKA-747:
-------------------------------

             Summary: RequestChannel re-design
                 Key: KAFKA-747
                 URL: https://issues.apache.org/jira/browse/KAFKA-747
             Project: Kafka
          Issue Type: New Feature
            Reporter: Jay Kreps
             Fix For: 0.8.1



We have had some discussion around how to handle queuing requests. There are 
two competing concerns:
1. We need to maintain request order on a per-socket basis.
2. We want to be able to balance load flexibly over a pool of threads so that 
if one thread blocks on I/O, request processing continues.

Two Approaches We Have Considered

1. Have a global queue of unprocessed requests. All I/O threads read requests 
off this global queue and process them. To avoid re-ordering, have the network 
layer read only one request at a time from each socket.
2. Have a queue per I/O thread and have the network threads statically map 
sockets to I/O thread request queues.
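
As a rough illustration of the second approach, the static mapping could look 
like the sketch below. This is not Kafka code: make_static_router, route, and 
the modulo mapping are hypothetical names and choices made only for this 
example, and socket ids are assumed to be integers.

```python
from collections import deque


def make_static_router(num_io_threads):
    """Build one request queue per I/O thread plus a routing function.

    Hypothetical helper for illustration; the modulo mapping is an assumption.
    """
    queues = [deque() for _ in range(num_io_threads)]

    def route(socket_id, request):
        # A socket always maps to the same queue, so its requests stay in
        # order, but they can never move to a different I/O thread.
        index = socket_id % num_io_threads
        queues[index].append((socket_id, request))
        return index

    return queues, route


queues, route = make_static_router(10)
first = route(3, "produce-1")
second = route(13, "produce-2")  # socket 13 shares a queue with socket 3
```

Sockets 3 and 13 land on the same queue, so anything that stalls that queue's 
I/O thread delays both of them.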

Problems With These Approaches

In the first case you are not able to get any per-producer parallelism. That 
is, you can't read the next request from a socket while the current one is 
being handled. This seems like it would not be a big deal, but preliminary 
benchmarks show that it might be.

In the second case there are two problems. The first is that when an I/O thread 
gets blocked, all request processing for sockets attached to that I/O thread 
will grind to a halt. If you have 10,000 connections and 10 I/O threads, then 
each blockage will stop 1,000 producers. If there is one topic that has long 
synchronous flush times enabled (or is experiencing fsync locking), this will 
cause big latency blips for all producers using that I/O thread. The second 
problem is around backpressure and memory management. Say we use BlockingQueues 
to feed the I/O threads, and say that one I/O thread stalls. Its request queue 
will fill up and it will then block ALL network threads, since they will block 
on inserting into that queue, even though the other I/O threads are unused and 
have empty queues.
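
The backpressure problem can be shown with a small sketch using bounded queues. 
This is illustrative only: the sizes, the would_block helper, and the modulo 
socket-to-queue mapping are assumptions for the example, not anything in Kafka.

```python
import queue

NUM_IO_THREADS = 3   # assumed pool size for the example
QUEUE_CAPACITY = 2   # assumed per-thread queue bound

request_queues = [queue.Queue(maxsize=QUEUE_CAPACITY)
                  for _ in range(NUM_IO_THREADS)]


def would_block(socket_id):
    """True if a blocking put() for this socket's request would stall."""
    return request_queues[socket_id % NUM_IO_THREADS].full()


# Suppose I/O thread 0 has stalled and is no longer draining its queue.
# Requests from sockets 0 and 3 fill it to capacity.
request_queues[0].put_nowait((0, "request"))
request_queues[0].put_nowait((3, "request"))

stalled = would_block(6)      # socket 6 also maps to queue 0
unaffected = would_block(1)   # socket 1 maps to the empty queue 1
```

A network thread that does a blocking put for socket 6 stops here, and while it 
is stuck it cannot deliver requests for socket 1 either, even though queue 1 is 
empty.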

A Proposed Better Solution

The problem with the first solution is that we are not pipelining requests. The 
problem with the second approach is that we are too constrained in moving work 
from one I/O thread to another.

Instead we should have a single request queue-like structure, but internally 
enforce the condition that requests are not re-ordered.

Here are the details. We retain RequestChannel but refactor its internals. 
Internally we replace the blocking queue with a linked list. We also keep an 
in-flight-keys array with one entry per I/O thread. When removing a work item 
from the list we can't just take the first thing. Instead we need to walk the 
list and look for the first request whose key is not in the in-flight-keys 
array. The I/O thread that takes the request records its key in the array. 
When the response is sent, we remove that key from the in-flight array.

This guarantees that requests for a socket with key K are ordered, but that 
processing for K can only block requests made by K.
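
A minimal single-process sketch of this structure follows. It is not the 
actual RequestChannel: the class and method names are hypothetical, a Python 
list stands in for the linked list, polling is non-blocking for simplicity, 
and it assumes that keys are never None and that an I/O thread calls 
response_sent before it polls again.

```python
import threading


class OrderedRequestChannel:
    """Single shared request list that never re-orders requests per key."""

    def __init__(self, num_io_threads):
        self._pending = []                         # stands in for the linked list
        self._in_flight = [None] * num_io_threads  # one key slot per I/O thread
        self._lock = threading.Lock()

    def send_request(self, key, request):
        # Called by network threads; arrival order is preserved in the list.
        with self._lock:
            self._pending.append((key, request))

    def poll_request(self, io_thread_id):
        # Walk the list and take the first request whose key is not in flight.
        with self._lock:
            for i, (key, request) in enumerate(self._pending):
                if key not in self._in_flight:
                    self._pending.pop(i)
                    self._in_flight[io_thread_id] = key
                    return key, request
            return None  # every pending request is behind an in-flight key

    def response_sent(self, io_thread_id):
        # Clearing the slot lets the next request for that key be picked up.
        with self._lock:
            self._in_flight[io_thread_id] = None


channel = OrderedRequestChannel(num_io_threads=2)
channel.send_request("A", "a1")
channel.send_request("A", "a2")
channel.send_request("B", "b1")

took_0 = channel.poll_request(0)      # ("A", "a1")
took_1 = channel.poll_request(1)      # ("B", "b1"): "a2" is skipped, A is in flight
channel.response_sent(1)
blocked = channel.poll_request(1)     # None: only "a2" is left and A is still in flight
channel.response_sent(0)
took_next = channel.poll_request(1)   # ("A", "a2"), now on a different thread
```

In this sketch the second request for A waits only on the first request for A, 
and once that completes, any free I/O thread can pick it up. The linear walk 
over pending requests and the array lookup are the simplest possible choices 
and may not be what a production version would use.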

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
