Zoltán Borók-Nagy created HIVE-20448:
----------------------------------------

             Summary: get_json_object() unnecessarily unnests array values
                 Key: HIVE-20448
                 URL: https://issues.apache.org/jira/browse/HIVE-20448
             Project: Hive
          Issue Type: Bug
            Reporter: Zoltán Borók-Nagy


Hive's implementation of get_json_object() unnecessarily unnests array values, 
e.g.:
{noformat}
hive> select get_json_object('[{"key": [1,2,3]}, {"key": [4,5]}]', '$[*].key');
OK
[1,2,3,4,5]{noformat}
I think for the above query the following output is more reasonable:
{noformat}
[[1,2,3],[4,5]]{noformat}
This is also the output that one gets from
 * [https://jsonpath.curiousconcept.com/] 
 * [http://jsonpath.com/]
 * [https://jsonpath.herokuapp.com/]

Python's jsonpath package also produces this output.

I tested it with Hive 2.1.1, but it is likely that it affects other versions as 
well.

If we'd really want to unnest the arrays, then '$[*].key[*]' is the way to go. 
It produces the same output everywhere.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to