This paper has more information on what we are doing at LinkedIn: http://sites.computer.org/debull/A12june/pipeline.pdf
This Avro JIRA has a schema repository implementation similar to the one LinkedIn uses: https://issues.apache.org/jira/browse/AVRO-1124

-Jay

On Tue, Aug 20, 2013 at 7:08 AM, Mark <static.void....@gmail.com> wrote:

> Can someone break down how message serialization would work with Avro?
> I've read that instead of adding a schema to every single event, it would
> be wise to add some sort of fingerprint to each message to identify which
> schema it should use. What I'm having trouble understanding is, how do we
> read the fingerprint without a schema? Don't we need the schema to
> deserialize? The same question goes for working with Hadoop: how does the
> input format know which schema to use?
>
> Thanks
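
For readers of the archive, here is a rough sketch of why the fingerprint can be read without a schema: it is a fixed-length prefix of raw bytes, not an Avro-encoded field, so the reader slices it off first and only then looks up the schema. Everything named here is an assumption for illustration and not LinkedIn's or AVRO-1124's actual implementation: the in-memory `SCHEMA_REGISTRY` dict stands in for a schema repository, the SHA-256 of key-sorted JSON stands in for a real Avro schema fingerprint, and the JSON payload stands in for Avro binary encoding.

```python
import hashlib
import json

# Hypothetical stand-in for a schema repository: fingerprint -> schema JSON text.
SCHEMA_REGISTRY = {}

# SHA-256 digests are always 32 bytes, so the prefix length is known up front.
FINGERPRINT_SIZE = 32


def fingerprint(schema_json: str) -> bytes:
    """Return a fixed-length ID for a schema.

    Assumption: key-sorted, whitespace-free JSON is used as a simplified
    normal form; it is not Avro's own canonical form.
    """
    normalized = json.dumps(
        json.loads(schema_json), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(normalized.encode("utf-8")).digest()


def register(schema_json: str) -> bytes:
    """Store the schema under its fingerprint and return the fingerprint."""
    fp = fingerprint(schema_json)
    SCHEMA_REGISTRY[fp] = schema_json
    return fp


def encode_message(schema_json: str, record: dict) -> bytes:
    """Frame a message as: fingerprint bytes followed by the encoded payload."""
    fp = register(schema_json)
    # Stand-in for Avro binary encoding of the record.
    payload = json.dumps(record).encode("utf-8")
    return fp + payload


def decode_message(message: bytes):
    """Split off the fixed-length prefix, look up the schema, decode the rest."""
    fp = message[:FINGERPRINT_SIZE]          # no schema needed for this step
    payload = message[FINGERPRINT_SIZE:]
    schema = json.loads(SCHEMA_REGISTRY[fp])  # KeyError if the schema is unknown
    # A real reader would hand `schema` to an Avro decoder here.
    record = json.loads(payload.decode("utf-8"))
    return schema, record
```

The producer pays the cost of shipping the full schema once, when it registers it; every message after that carries only the 32-byte prefix, and a consumer can cache looked-up schemas by fingerprint.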