Thanks for the reply, Matt. This is exactly what I'm looking for. I'll look into the explode function more deeply.
Another question: For that example table, your query generates a slightly different different result. The values for id 3 are not aggregated. So the result is as follows. Any idea what's happening? id value cnt 1 10 1 1 20 1 1 30 1 2 20 2 2 30 1 3 20 1 3 30 1 3 40 1 3 20 1 3 30 1 3 40 1 > From: matt.tuc...@disney.com > To: user@hive.apache.org > Date: Wed, 28 Mar 2012 14:25:43 -0400 > Subject: RE: Help in aggregating comma separated values > > Hi, > > The values column needs to be split across multiple rows, so you can then > aggregate them. Try this: > > SELECT id, value, COUNT(*) cnt > FROM table1 LATERAL VIEW explode(split(values, ",")) values_tbl as value > GROUP BY id, value > > > > Matt Tucker > > -----Original Message----- > From: Saurabh S [mailto:saurab...@live.com] > Sent: Wednesday, March 28, 2012 2:21 PM > To: user@hive.apache.org > Subject: Help in aggregating comma separated values > > > Hello, > > How do I get count from a list of comma separated values? For the lack of > better wording, here is an example: > > Suppose there is a table with two columns, id (integers) and values (string) > in the following fashion. > > id values > 1 10,20,30 > 2 20,30 > 2 20 > 3 20,30,40 > 3 40,20,30 > > I want an output like this: > > id value count > 1 10 1 > 1 20 1 > 1 30 1 > 2 20 2 > 2 30 1 > 3 20 2 > 3 30 2 > 3 40 2 > > So, it's a list containing the values those appear against each id, along > with their count. > > I understand this is more of a SQL kind of question rather than one specific > to Hive, but I'm at a roadblock here. > > Thanks, > Saurabh >