Cool thanks!

On 4/30/13 9:10 PM, "Cheolsoo Park" <[email protected]> wrote:

>Hi Steven,
>
>The new AvroStorage will let you specify the input schema:
>https://issues.apache.org/jira/browse/PIG-3015
>
>In fact, somebody made the same request in a comment of the jira that I am
>copying and pasting below:
>
>> Furthermore, we occasionally have issues with pig jobs picking the old
>> schema when we have a schema update. Manually specifying the schema would
>> fix this and give us more flexibility in defining the data we want pig to
>> pull from a file.
>
>
>This jira is work in progress, but hopefully it will be in the next major
>release.
>
>Thanks,
>Cheolsoo
>
>
>
>On Sat, Apr 27, 2013 at 3:24 PM, Enns, Steven <[email protected]> wrote:
>
>> Resending now that I am subscribed :)
>>
>> On 4/25/13 4:01 PM, "Enns, Steven" <[email protected]> wrote:
>>
>> >Hi everyone,
>> >
>> >I would like to override the input schema in AvroStorage to make a pig
>> >script robust to schema evolution. For example, suppose a new field is
>> >added to an avro schema with a default value of null. If the input to a
>> >pig script using this field includes both old and new data, AvroStorage
>> >will merge the input schemas from the old and new data. However, if the
>> >input includes only old data, the new schema will not be available to
>> >AvroStorage and pig will fail to interpret the script with an error such
>> >as "projected field [newField] does not exist in schema". If AvroStorage
>> >accepted an input schema, the script would be valid for both the new and
>> >old data. Is there any plan to implement this?
>> >
>> >Thanks,
>> >Steve
>> >
>>
>>
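For anyone reading this thread later, here is a rough sketch of why an explicit input (reader) schema fixes the "projected field [newField] does not exist in schema" case. It is plain Python that only mimics the default-filling part of Avro schema resolution; it is not AvroStorage or the Avro library, and the record name "Event", the "id" field, and the resolve_record helper are made up for illustration. Only newField and its null default come from the thread.

```python
# Simplified illustration of Avro-style schema resolution.
# Assumption: records are already decoded into dicts; real Avro does this
# while decoding, using the writer and reader schemas together.

def resolve_record(record, reader_schema):
    """Project a decoded record onto the reader schema, filling defaults."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            # Field was present in the data as written.
            resolved[name] = record[name]
        elif "default" in field:
            # Field is missing in old data: fall back to the declared default.
            resolved[name] = field["default"]
        else:
            raise KeyError(f"field {name!r} is missing and has no default")
    return resolved


# Hypothetical reader schema: the new schema, with newField defaulting to null.
READER_SCHEMA = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "newField", "type": ["null", "string"], "default": None},
    ],
}

old_record = {"id": 1}                    # written before newField existed
new_record = {"id": 2, "newField": "x"}   # written with the new schema

resolved = [resolve_record(r, READER_SCHEMA) for r in (old_record, new_record)]
```

With the reader schema supplied up front, both records come out with a newField key, so a script that projects newField stays valid even when the input contains only old data.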
