Thanks Yong!

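For the archive, here is a minimal sketch of the string-splitting core that the "write your own UDF" suggestion below would need. It is plain Java with no Hive dependency, and it assumes the value returned by get_json_object is the text of a JSON array. The class and method names (JsonArraySplitter, splitTopLevel) are made up for illustration. Wrapping this in an actual Hive UDF (for example a class extending org.apache.hadoop.hive.ql.exec.UDF with an evaluate method) is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper: splits the text of a JSON array into the text of
 * its top-level elements, without needing a schema. It does not validate
 * the JSON; it only tracks nesting depth and string literals.
 */
public class JsonArraySplitter {

    public static List<String> splitTopLevel(String json) {
        List<String> out = new ArrayList<>();
        if (json == null) {
            return out;
        }
        String s = json.trim();
        // Anything that is not wrapped in [ ... ] yields an empty list.
        if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            return out;
        }
        int depth = 0;            // nesting depth inside the outer array
        boolean inString = false; // true while inside a "..." literal
        boolean escaped = false;  // true right after a backslash in a string
        StringBuilder cur = new StringBuilder();
        for (int i = 1; i < s.length() - 1; i++) {
            char c = s.charAt(i);
            if (inString) {
                cur.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
                cur.append(c);
            } else if (c == '{' || c == '[') {
                depth++;
                cur.append(c);
            } else if (c == '}' || c == ']') {
                depth--;
                cur.append(c);
            } else if (c == ',' && depth == 0) {
                // A comma at depth 0 separates two top-level elements.
                addTrimmed(out, cur);
            } else {
                cur.append(c);
            }
        }
        addTrimmed(out, cur);
        return out;
    }

    // Adds the accumulated element (if non-empty) and resets the buffer.
    private static void addTrimmed(List<String> out, StringBuilder cur) {
        String element = cur.toString().trim();
        if (!element.isEmpty()) {
            out.add(element);
        }
        cur.setLength(0);
    }
}
```

Each element comes back as a JSON string, so get_json_object can be applied to it again after explode. If this were registered in Hive under some name, the result of get_json_object(json, '$.field') could be passed through it and then through LATERAL VIEW explode; that wiring is an assumption on my part and has not been tested against a cluster.
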
On Mon, Apr 7, 2014 at 5:07 PM, java8964 <java8...@hotmail.com> wrote:
> Hi, Narayanan:
>
> The problem is that a generic solution has no way of knowing that a given
> element in the JSON is an array. Keep in mind that any element of a JSON
> document can hold any valid structure: an array, another structure, a
> map, and so on.
>
> You know your data, so you can say that at this level it is an array. But
> the computer doesn't know that, which is why you need to provide a schema.
>
> Think about it: in a program we could simply cast the value to an array,
> but that is normally NOT a good solution. So a generic solution, like any
> Hadoop JSON UDF, will and should ask for a schema.
>
> In your case, if you know the data and it is always an array, then write
> your own UDF to cast it to an array without any schema. But I don't think
> any good, generic JSON UDF will support that for your case.
>
> Yong
>
>> Date: Mon, 7 Apr 2014 16:47:44 -0700
>> Subject: Re: get_json_object for nested field returning a String instead
>> of an Array
>> From: knarayana...@gmail.com
>> To: user@hive.apache.org
>
>>
>> Thanks Peyman.
>>
>> Actually the problem with Hive-Json-Serde is that we need to provide
>> the entire schema upfront while creating the table.
>>
>> My requirement is that we just project/aggregate on the fields using
>> get_json_object after creating the external table without schema. This
>> way the external table is agnostic to any new schema changes.
>>
>> I would love to get a solution for making get_json_object return an
>> Array instead of a string. Can we use any Hive UDFs to convert a
>> string into an explodable Array object?
>>
>> Thanks
>> Narayanan
>>
>> On Mon, Apr 7, 2014 at 4:14 PM, Peyman Mohajerian <mohaj...@gmail.com>
>> wrote:
>> > perhaps: https://github.com/rcongiu/Hive-JSON-Serde
>> >
>> >
>> > On Mon, Apr 7, 2014 at 6:52 PM, Narayanan K <knarayana...@gmail.com>
>> > wrote:
>> >>
>> >> Hi all
>> >>
>> >> I am using get_json_object to read a JSON text file. I have created
>> >> the external table as below:
>> >>
>> >> CREATE EXTERNAL TABLE EXT_TABLE ( json string)
>> >> PARTITIONED BY (dt string)
>> >> LOCATION '/users/abc/';
>> >>
>> >>
>> >> The JSON data has some fields that are not simple fields but nested
>> >> fields, like "field" : [{"id":1},{"id":2}.. ].
>> >>
>> >> When I use get_json_object to retrieve that field, it returns a
>> >> string instead of an Array. Hence I am not able to explode the
>> >> array, as it is a string.
>> >>
>> >> Is there some way to get an array from get_json_object instead of a
>> >> string, so that we can perform explode on this nested field? Or is
>> >> there any way to convert the string into an array so that I can use
>> >> explode?
>> >>
>> >> Thanks in advance,
>> >> Narayanan
>> >
>> >
