So no one knows about this?
I was hoping to use some knowledge already acquired on this subject :(


On Tue, Apr 11, 2017 at 2:09 AM, S G <sg.online.em...@gmail.com> wrote:

> Hi,
>
> There is a concept of JsonSerDe where you need to specify a structure for
> your tables in order to query them.
>
> However, since the schema for an object is prone to change (once every few
> months is not unexpected), how do you handle that change in your Hive/Pig
> queries?
>
> Moreover, since JSON files are not demarcated according to schema, a
> single JSON file can contain JSON data for multiple evolutions of a
> schema (for example, 10 objects of ClassAnimal1, 20 of ClassAnimal2 and
> 100 of ClassAnimal3, where ClassAnimal1, ClassAnimal2 and ClassAnimal3
> represent the schema for ClassAnimal at different times).
>
> For such a JSON file, what is the recommended way of querying?
>
> I know that Avro solves this problem by keeping a single schema per
> file. So it would have 3 files for the above case (1 each for
> ClassAnimal1, ClassAnimal2 and ClassAnimal3).
>
> But since Avro is binary, hard to debug and requires a schema repository
> (for non-Hive use cases), we were hoping to solve this problem in JSON.
>
> Related questions:
> 1) Is it even a problem worth solving?
> 2) How many people use AvroSerDe as compared to JsonSerDe?
>
> Thanks
> SG
>
>
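
To make the mixed-file case above concrete, here is a minimal sketch of one
way to handle it outside Hive: bucket the records by the set of fields they
carry, then project every record onto the newest schema. It assumes one JSON
object per line. The field names (name, legs, habitat) and the defaults are
made up for illustration and are not taken from any real ClassAnimal schema;
fingerprint, group_by_schema and normalize are hypothetical helper names.

```python
import json
from collections import defaultdict

# Hypothetical defaults for the newest ClassAnimal schema (illustrative only).
LATEST_DEFAULTS = {"name": None, "legs": None, "habitat": "unknown"}


def fingerprint(record):
    """Use the sorted top-level field names as a crude schema identifier."""
    return tuple(sorted(record))


def group_by_schema(lines):
    """Bucket JSON-lines records by their field-name fingerprint."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        groups[fingerprint(record)].append(record)
    return dict(groups)


def normalize(record, defaults=LATEST_DEFAULTS):
    """Project a record onto the latest schema: fill gaps, drop unknown fields."""
    return {key: record.get(key, default) for key, default in defaults.items()}


# Three schema generations mixed in one "file".
sample = [
    '{"name": "cat"}',
    '{"name": "dog", "legs": 4}',
    '{"name": "eel", "legs": 0, "habitat": "water"}',
    '{"name": "ant", "legs": 6}',
]
groups = group_by_schema(sample)
normalized = [normalize(json.loads(line)) for line in sample]
```

Fingerprinting by field names cannot tell apart two versions that only
changed a field's type, so an explicit version field written with each
record would be more robust if the producers can be changed.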
