Protocol documentation draft

Jay Kreps Thu, 29 Nov 2012 15:17:30 -0800

I started trying to document the 0.8 protocol from the code and write a
guide to client implementation. This is meant to be a more user-friendly
and up-to-date version of the proposal wiki we had on the protocol changes.


Here is what I wrote up so far:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

I would love feedback on this document. It would be great if we could
document everything you need to know to write a client so people don't need
to reverse engineer our code.

In doing this I found a number of things, some of which I feel should be
fixed in 0.8 and some of which can maybe wait.

1. Correlation id is not used across all the requests. I don't think it can
work as intended because of this.
2. On reflection I am not sure that we need a correlation id field. I think
that since we need to guarantee that processing is sequential on any
particular socket we can correlate with a simple queue. (e.g. as the client
sends messages it adds them to a queue and as it receives responses it just
correlates to whatever is at the head of the queue).
3. The metadata response seems to have a number of problems. Among them is
that it repeats all the broker information many times. The response
includes the ISR, the leader (maybe), and the replicas, and each of these
repeats the full broker information. I think what we should be doing here
is including the broker information once for all brokers and then just
listing the appropriate broker ids for the ISR, leader, and replicas.
4. For topic discovery I think we need to support the case where no topics
are specified in the metadata request and for this return information about
all topics. I don't think we do this now.
5. I don't understand what the creator id is.
6. The offset request and response are not fully thought through and should
be generalized.
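
To make point 2 concrete, here is a minimal sketch of queue-based
correlation. It assumes the broker answers requests on a given socket
strictly in the order they were sent. The class and method names
(PendingRequests, sent, received) are hypothetical and not part of any
Kafka client; the strings stand in for real request and response objects.

```python
from collections import deque


class PendingRequests:
    """FIFO of in-flight requests for ONE connection (hypothetical helper)."""

    def __init__(self):
        self._in_flight = deque()

    def sent(self, request):
        # Call right after the request is written to the socket.
        self._in_flight.append(request)

    def received(self, response):
        # Responses arrive in send order (assumption), so the oldest
        # in-flight request is the one this response belongs to.
        if not self._in_flight:
            raise RuntimeError("response received with no request in flight")
        return self._in_flight.popleft(), response


pending = PendingRequests()
pending.sent("metadata-request")
pending.sent("produce-request")
first = pending.received("metadata-response")
second = pending.received("produce-response")
```

This only works if there is one such queue per socket and nothing reorders
responses on that socket. If either assumption fails, an explicit
correlation id is still needed.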

-Jay
