Cool thanks!

On 4/30/13 9:10 PM, "Cheolsoo Park" <[email protected]> wrote:

>Hi Steven,
>
>The new AvroStorage will let you specify the input schema:
>https://issues.apache.org/jira/browse/PIG-3015
>
>In fact, somebody made the same request in a comment of the jira that I am
>copying and pasting below:
>
>> Furthermore, we occasionally have issues with pig jobs picking the old
>> schema when we have a schema update. Manually specifying the schema would
>> fix this and give us more flexibility in defining the data we want pig to
>> pull from a file.
>
>
>This jira is a work in progress, but hopefully it will be in the next major
>release.
>
>Thanks,
>Cheolsoo
>
>
>
>On Sat, Apr 27, 2013 at 3:24 PM, Enns, Steven <[email protected]> wrote:
>
>> Resending now that I am subscribed :)
>>
>> On 4/25/13 4:01 PM, "Enns, Steven" <[email protected]> wrote:
>>
>> >Hi everyone,
>> >
>> >I would like to override the input schema in AvroStorage to make a pig
>> >script robust to schema evolution. For example, suppose a new field is
>> >added to an avro schema with a default value of null. If the input to a
>> >pig script using this field includes both old and new data, AvroStorage
>> >will merge the input schemas from the old and new data. However, if the
>> >input includes only old data, the new schema will not be available to
>> >AvroStorage and pig will fail to interpret the script with an error such
>> >as "projected field [newField] does not exist in schema". If AvroStorage
>> >accepted an input schema, the script would be valid for both the new and
>> >old data. Is there any plan to implement this?
>> >
>> >Thanks,
>> >Steve
>> >
>>
>>
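
For anyone reading this thread later, the schema-evolution case described in the original question can be sketched in plain Python. This is only an illustration of the general Avro resolution rule (a field that is in the reader schema but missing from the data takes the reader's default), not AvroStorage's or the Avro library's actual code. The "Event" record, the "id" field, and the resolve() helper are made up for the example; only "newField" comes from the error message quoted above.

```python
import json

# Hypothetical reader (input) schema: the newer schema, which adds
# "newField" with a default of null. Names other than "newField" are
# assumptions made for this sketch.
READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "newField", "type": ["null", "string"], "default": null}
  ]
}
""")


def resolve(record, reader_schema):
    """Project a decoded record onto the reader schema.

    Fields present in the record are copied, fields missing from the
    record take the reader's default, and fields the reader does not
    declare are dropped.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return resolved


old_record = {"id": 1}                   # written before newField existed
new_record = {"id": 2, "newField": "x"}  # written with the new schema

print(resolve(old_record, READER_SCHEMA))
print(resolve(new_record, READER_SCHEMA))
```

With an explicit reader schema, both the old and the new record expose newField, so a script that projects that field stays valid even when the input contains only old data.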
