Are you referring to the same message class as https://github.com/apache/kafka/blob/0.7/core/src/main/scala/kafka/message/Message.scala, or are you talking about a wrapper around this message class that has its own magic byte followed by the SHA of the schema? If it's the former, I'm confused.
FYI, it looks like Camus gets a 4 byte identifier from a schema registry.

https://github.com/linkedin/camus/blob/master/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/coders/KafkaAvroMessageEncoder.java

On Aug 22, 2013, at 9:37 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> The point of the magic byte is to indicate the current version of the
> message format. One part of the format is the fact that it is Avro encoded.
> I'm not sure how Camus gets a 4 byte id, but at LinkedIn we use the 16 byte
> MD5 hash of the schema. Since AVRO-1124 is not resolved yet, I'm not sure
> if I can comment on the compatibility just yet.
>
> Thanks,
> Neha
>
>
> On Wed, Aug 21, 2013 at 9:00 PM, Mark <static.void....@gmail.com> wrote:
>
>> Neha, thanks for the response.
>>
>> So the only point of the magic byte is to indicate that the rest of the
>> message is Avro encoded? I noticed that in Camus a 4 byte int id of the
>> schema is written instead of the 16 byte SHA. Is this the new preferred
>> way? Which is compatible with
>> https://issues.apache.org/jira/browse/AVRO-1124?
>>
>> Thanks again
>>
>> On Aug 21, 2013, at 8:38 PM, Neha Narkhede <neha.narkh...@gmail.com>
>> wrote:
>>
>>> We define the LinkedIn Kafka message to have a magic byte (indicating
>>> Avro serialization), MD5 header followed by the payload. The Hadoop
>>> consumer reads the MD5, looks up the schema in the repository and
>>> deserializes the message.
>>>
>>> Thanks,
>>> Neha
>>>
>>>
>>> On Wed, Aug 21, 2013 at 8:15 PM, Mark <static.void....@gmail.com> wrote:
>>>
>>>> Does LinkedIn include the SHA of the schema into the header of each Avro
>>>> message they write or do they wrap the avro message and prepend the SHA?
>>>>
>>>> In either case, how does the Hadoop consumer know what schema to read?
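
For reference, the two framings discussed in this thread (magic byte + 16 byte MD5 + payload, and magic byte + 4 byte schema id + payload) might be sketched roughly as below. This is only an illustration in Python, not LinkedIn's or Camus's actual code. The magic byte value, hashing the raw schema text rather than some canonical form, and the big-endian encoding of the 4 byte id are all assumptions, and every function name is hypothetical.

```python
import hashlib
import struct

# Assumption: the real magic byte value is not stated in this thread.
MAGIC_BYTE = 0
MD5_LEN = 16   # an MD5 digest is always 16 bytes
ID_LEN = 4     # the 4 byte int id mentioned for Camus


def encode_md5_framed(schema_text: str, avro_payload: bytes) -> bytes:
    """Frame a payload as: magic byte, MD5 of the schema, payload."""
    # Assumption: the hash is taken over the schema text as UTF-8 bytes.
    fingerprint = hashlib.md5(schema_text.encode("utf-8")).digest()
    return bytes([MAGIC_BYTE]) + fingerprint + avro_payload


def decode_md5_framed(message: bytes):
    """Split a framed message into (schema MD5, Avro payload).

    A consumer would then use the MD5 to look the schema up in a
    repository before deserializing the payload.
    """
    if len(message) < 1 + MD5_LEN:
        raise ValueError("message too short for magic byte + MD5 header")
    if message[0] != MAGIC_BYTE:
        raise ValueError("unknown magic byte: %d" % message[0])
    return message[1:1 + MD5_LEN], message[1 + MD5_LEN:]


def encode_id_framed(schema_id: int, avro_payload: bytes) -> bytes:
    """Frame a payload as: magic byte, 4 byte schema id, payload."""
    # Assumption: the id is a signed big-endian 32-bit integer.
    return bytes([MAGIC_BYTE]) + struct.pack(">i", schema_id) + avro_payload


def decode_id_framed(message: bytes):
    """Split a framed message into (schema id, Avro payload)."""
    if len(message) < 1 + ID_LEN:
        raise ValueError("message too short for magic byte + schema id")
    if message[0] != MAGIC_BYTE:
        raise ValueError("unknown magic byte: %d" % message[0])
    (schema_id,) = struct.unpack(">i", message[1:1 + ID_LEN])
    return schema_id, message[1 + ID_LEN:]
```

In both cases the header only identifies the schema. The consumer still has to resolve the MD5 or the id against a schema repository or registry, so the two layouts differ mainly in header size (17 bytes versus 5 bytes here) and in whether the identifier can be computed from the schema alone or has to be assigned by a registry.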