Is collect_set what you are looking for? I haven't used it myself, but it appears to remove duplicates.
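
A minimal sketch of how that might look against the t1 table from the question quoted below. I have not run it, so treat it as a starting point: it assumes id is an integer column, and the second query assumes a Hive version whose concat_ws accepts an array<string>.

```sql
-- One row per name, with the ids gathered into an array.
-- collect_set drops duplicate ids and does not guarantee element order.
SELECT name, collect_set(id) AS ids
FROM t1
GROUP BY name;

-- For a comma-separated string instead of an array, cast the ids to
-- strings first, since concat_ws joins string values.
SELECT name, concat_ws(',', collect_set(CAST(id AS STRING))) AS ids
FROM t1
GROUP BY name;
```

The first form is the closest match to Oracle's collect(id); the second is only needed if the result has to be a single delimited string.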
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Built-in_Aggregate_Functions_.28UDAF.29

Thanks and Regards,
Sonal
<https://github.com/sonalgoyal/hiho>
Connect Hadoop with databases, Salesforce, FTP servers and others
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Fri, Feb 11, 2011 at 9:43 AM, Tim Robertson <timrobertson...@gmail.com> wrote:

> Hi all,
>
> Sorry if I am missing something obvious but is there an inverse of an
> explode?
>
> E.g. given t1
>
> ID Name
> 1 Tim
> 2 Tim
> 3 Tom
> 4 Frank
> 5 Tim
>
> Can you create t2:
>
> Name ID
> Tim 1,2,5
> Tom 3
> Frank 4
>
> In Oracle it would be a
> select name,collect(id) from t1 group by name
>
> I suspect in Hive it is related to an Array but can't find the syntax
>
> Thanks for any pointers,
> Tim