Hello Warren,

It seems your C# client acts as both a producer and a consumer. Given the broker's behavior, your suspicion is correct: a long-polling fetch on the same TCP connection will block subsequent produce / metadata requests.
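To illustrate the workaround being discussed (this is only a rough sketch, not code from any existing client; the BrokerChannels class and its names are made up for the example), a client could keep one connection per broker dedicated to long-polling fetches and a second one for produce / metadata requests, so a blocked fetch never delays the other traffic:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch: two plain TCP connections to the same broker, one
// reserved for long-polling fetch requests and one for produce / metadata
// requests. The Kafka wire protocol itself is not shown here.
public class BrokerChannels implements AutoCloseable {
    private final Socket fetchChannel;    // only long-poll fetch requests go here
    private final Socket produceChannel;  // produce and metadata requests go here

    public BrokerChannels(String host, int port, int connectTimeoutMs) throws IOException {
        this.fetchChannel = connect(host, port, connectTimeoutMs);
        this.produceChannel = connect(host, port, connectTimeoutMs);
    }

    private static Socket connect(String host, int port, int timeoutMs) throws IOException {
        Socket socket = new Socket();
        socket.setTcpNoDelay(true);
        socket.connect(new InetSocketAddress(host, port), timeoutMs);
        return socket;
    }

    public Socket fetchChannel()   { return fetchChannel; }
    public Socket produceChannel() { return produceChannel; }

    @Override
    public void close() throws IOException {
        try {
            fetchChannel.close();
        } finally {
            produceChannel.close();
        }
    }
}

With something like this, a fetch waiting in purgatory only ties up fetchChannel, while produce and metadata requests keep flowing on produceChannel.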
I think the statement that "it should not generally be necessary to maintain multiple connections ..." is no longer valid once a client acts as both a producer and a consumer. In fact, the 0.9 Java clients (producer and consumer) may already maintain multiple connections to a single broker even when the client only sends produce or fetch requests, because we need a separate channel for the consumer coordinator, etc., and we have also discussed using a separate channel for metadata refresh. So I think we should modify that statement in the wiki. Thanks for pointing this out.

Guozhang

On Wed, May 13, 2015 at 7:44 AM, Warren Falk <war...@warrenfalk.com> wrote:
> I'm working on the C# client. The current Kafka Protocol page says this:
>
> "it should not generally be necessary to maintain multiple connections to a single broker from a single client instance (i.e. connection pooling)"
>
> But then says this:
>
> "The server guarantees that on a single TCP connection, requests will be processed in the order they are sent and responses will return in that order as well".
>
> (https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol)
>
> Given that fetch requests can be long-polling, these two statements are mutually exclusive, are they not? E.g. if I issue a long-polling fetch to one broker for a topic partition and then need to issue another request to that same broker for any other reason (fetch/produce/metadata), my second request will hang until my long poll times out. (I either need to use an unacceptably low poll timeout for the first request, or I have to accept an unacceptably high latency for any second request to that broker, or I have to implement connection pooling and/or multiple connections to a single broker.)
>
> Three things:
>
> 1. Am I just missing something obvious?
> 2. Is this changing in 0.9? I know the consumer is getting a redesign, but is this broker issue addressed in some way?
> 3. Is this ordering over all requests even useful?
>
> On #3 the documentation goes on to say: "The broker's request processing allows only a single in-flight request per connection in order to guarantee this ordering"
>
> As far as I can tell, order preservation is valuable only for produce requests and only per topic-partition. What else? But especially once a fetch request goes to purgatory, why would the broker not continue processing other incoming requests? (And of what actual use is a "correlation id" when all responses the server sends are always for the oldest in-flight request?)
>
> This problem affects the C# client (kafka-net) among other things. Brilliantly, after the C# consumer returns fetched messages to its caller, it immediately reissues another fetch request in the background, but this brilliance ends up backfiring because of the broker's behavior mentioned above. Any attempt to publish to another topic while processing the consumed messages will see a mysterious, sudden 95% drop in performance if the two topic partitions happen to be on the same broker. The only solution seems to be to implement connection pooling. This seems wrong.
>
> Despite the note that "it should not generally be necessary to maintain multiple connections to a single broker", the Java (Scala) SimpleConsumer appears to make a separate connection for each instance.
>
> So is the correct solution to have the C# client try to transparently manage multiple connections to the broker, or is it to have the broker more intelligently use a single connection?
>
> Thanks in advance, and my apologies if this has been discussed elsewhere on the list. I searched but couldn't find anything.

--
-- Guozhang