Thanks for the reply, Matt. This is exactly what I'm looking for. I'll look 
into the explode function more deeply.

Another question: For that example table, your query generates a slightly 
different different result. The values for id 3 are not aggregated. So the 
result is as follows. Any idea what's happening?

id   value cnt
1    10    1
1    20    1
1    30    1
2    20    2
2    30    1
3    20    1
3    30    1
3    40    1
3    20    1
3    30    1
3    40    1


> From: matt.tuc...@disney.com
> To: user@hive.apache.org
> Date: Wed, 28 Mar 2012 14:25:43 -0400
> Subject: RE: Help in aggregating comma separated values
> 
> Hi,
> 
> The values column needs to be split across multiple rows, so you can then 
> aggregate them.  Try this:
> 
> SELECT id, value, COUNT(*) cnt
> FROM table1 LATERAL VIEW explode(split(values, ",")) values_tbl as value
> GROUP BY id, value
> 
> 
> 
> Matt Tucker
> 
> -----Original Message-----
> From: Saurabh S [mailto:saurab...@live.com] 
> Sent: Wednesday, March 28, 2012 2:21 PM
> To: user@hive.apache.org
> Subject: Help in aggregating comma separated values
> 
> 
> Hello,
> 
> How do I get count from a list of comma separated values? For the lack of 
> better wording, here is an example:
> 
> Suppose there is a table with two columns, id (integers) and values (string) 
> in the following fashion.
> 
> id   values
> 1    10,20,30
> 2    20,30
> 2    20
> 3    20,30,40
> 3    40,20,30
> 
> I want an output like this:
> 
> id   value count
> 1    10    1
> 1    20    1
> 1    30    1
> 2    20    2
> 2    30    1
> 3    20    2
> 3    30    2
> 3    40    2
> 
> So, it's a list containing the values those appear against each id, along 
> with their count.
> 
> I understand this is more of a SQL kind of question rather than one specific 
> to Hive, but I'm at a roadblock here.
> 
> Thanks,
> Saurabh
>                                         
                                          

Reply via email to