I have a json document that has several sub documents. I can query this doc and 
pull out numbers from a repetitive subdocument. The result is a list within 
brackets. I want to sum these numbers, and cannot figure out how.

Here is a sample document. 
{    "_id": "005f2c46-4a70-4fed-a8e9-a445011d2aba",    "Revision": 7,    
"ActivityCountedCollection": [        {            "Lesson": 
"98d66ab9-1ef4-4b61-a05d-857b3e07e0f8",            "DateTime": 
"2015-03-02T14:11:05.5407801+00:00",            "ElapsedSeconds": 66.80226      
  },        {            "Lesson": "30698aac-5a3d-4464-935c-16de4ba9db70",
            "DateTime": "2015-03-02T14:18:29.7132608+00:00",
            "ElapsedSeconds": 435.24593        },        {            "Lesson": 
"30698aac-5a3d-4464-935c-16de4ba9db70",
            "DateTime": "2015-03-02T14:21:38.4381108+00:00",
            "ElapsedSeconds": 182.47075        }    ]}


I can use the following hive code from within HDInsight to pull out the values. 
DROP TABLE IF EXISTS HDStudents;CREATE EXTERNAL TABLE HDStudents(  
json_response STRING)STORED AS TEXTFILE LOCATION 
'wasb://b...@tjaystorage.blob.core.windows.net/HDStudentData';
DROP TABLE IF EXISTS HDStudent_Usage;CREATE TABLE HDStudent_Usage(  id STRING,  
sec STRING); FROM HDStudentsINSERT OVERWRITE TABLE HDStudent_UsageSELECT    
CAST(get_json_object(json_response, '$.id') as STRING),    
get_json_object(json_response, '$.ActivityCountedCollection.ElapsedSeconds') as 
STRING;
and the result ends up being something like the following
    \N[66.80226,435.24593,182.47075]

I dont know whats happening to my ID, but that is not important now. But the 
numbers from the above example come out as a list, within brackets. Its not 
even formed as JSon anymore, so the previous tricks to pull them out of their 
key value pair do not apply. Since it is reading it as a string, with commas 
and numbers, i cannot perform aggregate functions on it.Online, i get 
suggestions to create some maven compiled code function that is referenced, and 
i get lost attempting this too. I thought about pushing these results into yet 
another table, and doing some processing on it. But do not know how to extract 
the items from the list, or array, or whatever it is in now. 

Help?



   

Reply via email to