We define the LinkedIn Kafka message to have a magic byte (indicating Avro serialization), followed by an MD5 header and then the payload. The Hadoop consumer reads the MD5, looks up the matching schema in the repository, and uses it to deserialize the message.
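To make the layout concrete, here is a rough Python sketch of that framing. It is only an illustration, not our actual code: the magic byte value (0x00), the choice to compute the MD5 over the schema's JSON text, the function names, and the in-memory dict standing in for the schema repository are all assumptions made for the example. Avro decoding of the payload itself is left out.

```python
import hashlib

MAGIC_BYTE = 0x00   # assumed value; the real magic byte is not stated here
MD5_LEN = 16        # an MD5 digest is always 16 bytes


def schema_id(schema_json: str) -> bytes:
    """Return the MD5 digest used as the schema key (assumed: hash of the JSON text)."""
    return hashlib.md5(schema_json.encode("utf-8")).digest()


def encode_message(schema_json: str, payload: bytes) -> bytes:
    """Frame a payload as: magic byte + schema MD5 + serialized payload."""
    return bytes([MAGIC_BYTE]) + schema_id(schema_json) + payload


def decode_message(message: bytes, repository: dict) -> tuple:
    """Split a framed message and look up its schema.

    `repository` is a hypothetical stand-in for the schema repository:
    a dict mapping MD5 digest -> schema JSON. Returns (schema_json, payload);
    the caller would then decode `payload` with an Avro reader for that schema.
    """
    if len(message) < 1 + MD5_LEN:
        raise ValueError("message too short to contain magic byte and MD5")
    if message[0] != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %r" % message[0])
    digest = message[1:1 + MD5_LEN]
    if digest not in repository:
        raise KeyError("no schema registered for MD5 " + digest.hex())
    return repository[digest], message[1 + MD5_LEN:]
```

A consumer would call `decode_message(raw_bytes, repository)` for each message it reads and hand the returned schema and payload to its Avro decoder. Keeping only a fixed 16-byte key in each message keeps the per-message overhead small, since the full schema lives in the repository rather than in the message.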
Thanks,
Neha

On Wed, Aug 21, 2013 at 8:15 PM, Mark <static.void....@gmail.com> wrote:
> Does LinkedIn include the SHA of the schema into the header of each Avro
> message they write or do they wrap the avro message and prepend the SHA?
>
> In either case, how does the Hadoop consumer know what schema to read?