Hi, Roger, Good question on that. I am actually not aware of any "automatic" way of doing this in Avro. I have tried to add generic Schema and Data interface in samza-sql branch to address the morphing of the schemas from input streams to the output streams. The basic idea is to have wrapper Schema and Data classes on-top-of the deserialized objects to access the data fields according to the schema w/o changing and copying the actual data fields. Hence, when there is a need to morph the input data schemas into a new output data schema, we just need an implementation of the new output data Schema class that can read the corresponding data fields from the input data and write them out in the output schema. An interface function transform() is added in the Schema class for this exact purpose. Currently, it only takes one input data and one example of "projection" transformation can be found in the implementation of AvroSchema class. A join case as you presented may well be a reason to have an implementation of "join" with multiple input data.
All the above solution is still experimental and please feel free to provide your feedback and comments on that. If we agree that this solution is good and suit for a broader use case, it can be considered to be used outside the "SQL" context as well. Best regards! -Yi On Thu, Apr 9, 2015 at 8:55 AM, Roger Hoover <roger.hoo...@gmail.com> wrote: > Hi Milinda and others, > > This is an Avro question but since you guys are working on Avro support for > stream SQL, I thought I'd ask you for help. > > If I have a two records of type A and B as below and want to join them > similar to "SELECT *" in SQL to produce a record of type AB, is there an > simple way to do this with Avro without writing code to copy each field > individually? > > I appreciate any help. > > Thanks, > > Roger > > { > "name": "A", > "type": "record", > "namespace": "fubar", > "fields": [{"name": "a", "type" : "int"}] > } > > { > "name": "B", > "type": "record", > "namespace": "fubar", > "fields": [{"name": "b", "type" : "int"}] > } > > { > "name": "AB", > "type": "record", > "namespace": "fubar", > "fields": [{"name": "a", "type" : "int"}, {"name": "b", "type" : "int"}] > } >