Hi Narayanan,
We have had some success with a similar use case, using a custom input
format / record reader to recursively split arbitrary JSON into a set of
discrete records at runtime. No schema is needed. Doing something similar
might give you the functionality you are looking for.
https://githu
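The core of it is a recursive walk over the parsed JSON tree. Purely as a sketch of the idea (class and method names here are made up, and it assumes jackson-databind is on the classpath; the real code is at the link above):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Illustrative only: recursively walk arbitrary JSON and collect every
// "flat" object (one with no nested arrays or objects) as a discrete record.
// No schema is needed; the shape is discovered while walking.
public class JsonRecordSplitter {

    private final ObjectMapper mapper = new ObjectMapper();

    public List<JsonNode> split(String json) throws IOException {
        List<JsonNode> records = new ArrayList<JsonNode>();
        collect(mapper.readTree(json), records);
        return records;
    }

    private void collect(JsonNode node, List<JsonNode> records) {
        if (node.isArray()) {
            for (JsonNode element : node) {          // descend into every array element
                collect(element, records);
            }
        } else if (node.isObject()) {
            if (isFlat(node)) {
                records.add(node);                   // a leaf object becomes one record
            } else {
                Iterator<JsonNode> children = node.elements();
                while (children.hasNext()) {         // keep descending through nested containers
                    collect(children.next(), records);
                }
            }
        }
        // bare scalars outside an object are ignored in this sketch
    }

    // An object is "flat" when none of its values are containers.
    private boolean isFlat(JsonNode node) {
        Iterator<JsonNode> children = node.elements();
        while (children.hasNext()) {
            if (children.next().isContainerNode()) {
                return false;
            }
        }
        return true;
    }
}

A custom record reader can then hand each element of that list to Hive as one row.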
> > I am glad that I could help.
> >
> > Br,
> > Petter
> >
> >
> > 2014-04-04 6:02 GMT+02:00 David Quigley :
> >>
> >> Thanks again Petter, the custom input format was exactly what I needed.
> >> Here is an example of my code in case anyone is interested:
> ...d.myserde'
> STORED AS INPUTFORMAT 'quigley.david.myinputformat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION 'mylocation';
>
>
> Hope this helps.
>
> Br,
> Petter
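For anyone reading along: the class named in the STORED AS INPUTFORMAT clause above has to implement Hadoop's old mapred API, because that is what Hive's execution engine calls into. A bare-bones skeleton of such a class, with made-up names and the actual JSON splitting stubbed out, would look roughly like this:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Illustrative skeleton: each input file is read whole, split into discrete
// JSON records, and handed to Hive one record (one Text value) per row.
public class MyJsonInputFormat extends FileInputFormat<LongWritable, Text> {

    // A whole document must stay together, so disable splitting of files.
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false;
    }

    @Override
    public RecordReader<LongWritable, Text> getRecordReader(InputSplit split,
            JobConf job, Reporter reporter) throws IOException {
        return new WholeDocumentReader((FileSplit) split, job);
    }

    static class WholeDocumentReader implements RecordReader<LongWritable, Text> {

        private final List<String> records;
        private int pos = 0;

        WholeDocumentReader(FileSplit split, JobConf job) throws IOException {
            Path path = split.getPath();
            FileSystem fs = path.getFileSystem(job);
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            FSDataInputStream in = fs.open(path);
            IOUtils.copyBytes(in, buffer, 4096, true);   // read the whole document
            records = splitIntoRecords(buffer.toString("UTF-8"));
        }

        // Stub: plug the recursive JSON splitting in here.
        private static List<String> splitIntoRecords(String document) {
            List<String> out = new ArrayList<String>();
            out.add(document);                           // placeholder: one record per file
            return out;
        }

        @Override
        public boolean next(LongWritable key, Text value) {
            if (pos >= records.size()) {
                return false;                            // no more records in this file
            }
            key.set(pos);
            value.set(records.get(pos));
            pos++;
            return true;
        }

        @Override public LongWritable createKey() { return new LongWritable(); }
        @Override public Text createValue() { return new Text(); }
        @Override public long getPos() { return pos; }
        @Override public float getProgress() {
            return records.isEmpty() ? 1.0f : (float) pos / records.size();
        }
        @Override public void close() { }
    }
}

Hive calls next() once per row, so each discrete record comes back as its own row and the SerDe named in the DDL only ever sees one record at a time.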
>
>
>
>
> 2014-04-02 5:45 GMT+02:00 David Quigley :
> We are currently streaming complex documents to HDFS with the hope of being
> able to query them. Each single document logically breaks down into a set of
> individual records. In order to use Hive, we preprocess each input document
> into a set of discrete records, which we save on HDFS and create an
> external table over them.