Hello Warren,

It seems your C# client is both a producer and a consumer. Given the
broker's behavior, your suspicion is correct: a long-polling fetch will
block subsequent produce / metadata requests sent on the same TCP
connection.
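
To make the blocking concrete, here is a small toy model (not broker code)
of a server that handles one in-flight request per connection, in order.
The function name, connection ids, and all timings below are made-up
assumptions for illustration only:

```python
from collections import defaultdict


def completion_times(requests):
    """Toy model of in-order, one-at-a-time processing per connection.

    requests: list of (connection_id, name, send_ms, service_ms).
    Returns {name: completion_ms}.
    """
    free_at = defaultdict(int)  # time at which each connection is idle again
    done = {}
    for conn, name, sent, service in sorted(requests, key=lambda r: r[2]):
        # A request cannot start until the previous one on the same
        # connection has been answered.
        start = max(sent, free_at[conn])
        free_at[conn] = start + service
        done[name] = free_at[conn]
    return done


# Hypothetical numbers: a fetch that long-polls for 5000 ms, and a produce
# sent 100 ms later that needs 10 ms of broker time.
shared = completion_times([
    ("conn-1", "fetch", 0, 5000),
    ("conn-1", "produce", 100, 10),
])
separate = completion_times([
    ("conn-1", "fetch", 0, 5000),
    ("conn-2", "produce", 100, 10),
])

print("produce done (shared connection):   ", shared["produce"], "ms")
print("produce done (separate connections):", separate["produce"], "ms")
```

In this model the produce on the shared connection has to wait for the
whole long poll, while on its own connection it completes right away.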

I think the statement that "it should not generally be necessary to
maintain multiple connections ..." is no longer valid if your client acts
as both a producer and a consumer. In fact, the 0.9 Java clients
(producer and consumer) may already maintain multiple connections to a
single broker even when the client only sends produce or fetch requests,
because we need a separate channel for the consumer coordinator, etc. We
have also discussed using a separate channel for metadata refresh.
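
As a rough sketch of what "a separate channel per purpose" could look like
on the client side: the class and method names below are hypothetical and
are not the kafka-net or Java client API, and the connection factory is a
stub so the example runs without a broker.

```python
class BrokerConnections:
    """Hypothetical helper: one connection per (host, port, purpose)."""

    def __init__(self, connect):
        # connect: callable taking (host, port) and returning a connection.
        self._connect = connect
        self._conns = {}

    def get(self, host, port, purpose):
        # Requests with different purposes never share a connection, so a
        # long-polling fetch cannot delay a produce or metadata request.
        key = (host, port, purpose)
        if key not in self._conns:
            self._conns[key] = self._connect(host, port)
        return self._conns[key]


# Stub connection factory (assumption: a real client would open a TCP
# socket here instead).
opened = []


def fake_connect(host, port):
    opened.append((host, port))
    return object()


pool = BrokerConnections(fake_connect)
fetch_conn = pool.get("broker1", 9092, "fetch")
produce_conn = pool.get("broker1", 9092, "produce")
fetch_again = pool.get("broker1", 9092, "fetch")

print("connections opened:", len(opened))
print("fetch connection reused:", fetch_again is fetch_conn)
```

The point of keying on purpose rather than pooling blindly is that each
kind of request still sees the per-connection ordering guarantee.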

So I think we should modify the above statement in the wiki. Thanks for
pointing this out.

Guozhang

On Wed, May 13, 2015 at 7:44 AM, Warren Falk <war...@warrenfalk.com> wrote:

> I'm working on the C# client.  The current Kafka Protocol page says this:
>
> "it should not generally be necessary to maintain multiple connections to a
> single broker from a single client instance (i.e. connection pooling)"
>
> But then says this:
>
> "The server guarantees that on a single TCP connection, requests will be
> processed in the order they are sent and responses will return in that
> order as well".
>
> (
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> )
>
> Given that fetch requests can be long-polling, these two statements are
> mutually exclusive, are they not?  E.g. if I issue a long polling fetch to
> one broker for a topic partition and then need to issue another request to
> that same broker for any other reason (fetch/produce/metadata), my second
> request will hang until my long poll times out.  (I either need to use an
> unacceptably low poll timeout for the first request or I have to accept an
> unacceptably high latency for any second request to that broker, or I have
> to implement connection pooling and/or multiple connections to a single
> broker).
>
> Three things:
>
> 1. Am I just missing something obvious?
> 2. Is this changing in 0.9?  I know the consumer is getting a redesign, but
> is this broker issue addressed in some way?
> 3. Is this ordering over all requests even useful?
>
> On #3 the documentation goes on to say:  "The broker's request processing
> allows only a single in-flight request per connection in order to guarantee
> this ordering"
>
> As far as I can tell, order preservation is valuable only for produce
> requests and only per topic-partition. What else?  But especially once a
> fetch request goes to purgatory, why would the broker not continue
> processing other incoming requests?  (and of what actual use is a
> "correlation id" when all responses the server sends are always for the
> oldest in-flight request?)
>
> This problem affects the C# client (kafka-net) among other things.
> Brilliantly, after the C# consumer returns fetched messages to its caller,
> it immediately reissues another fetch request in the background, but this
> brilliance ends up backfiring because of the broker's behavior mentioned
> above.  Any attempt to publish to another topic while processing the
> consumed messages will have a mysterious sudden 95% drop in performance if
> the two topic partitions happen to be on the same broker.  The only
> solution seems to be to implement connection pooling.  This seems wrong.
>
> Despite the note that "it should not generally be necessary to maintain
> multiple connections to a single broker", the Java (Scala) SimpleConsumer
> appears to make a separate connection for each instance.
>
> So is the correct solution to have the C# client try to transparently
> manage multiple connections to the broker, or is it to have the broker more
> intelligently use a single connection?
>
> Thanks in advance and my apologies if this has been discussed elsewhere on
> the list.  I searched but couldn't find anything.
>



