Sorry to ask what is probably a very naïve Hive question but here goes: I have a table as follows: Col1 Col2 K1 V1 K1 V1 K2 V1 K3 V1 K1 V2 K1 V3 K2 V2
Now I have managed to SELECT Col1,COUNT(DISTINCT Col2) FROM ... BY COL1; to obtain K1 3 K2 2 K3 1 But what I want is a concatenated list of all the distinct Col2 values for each Col1 key. K1 V1 V2 V3 K2 V1 V2 K3 V1 This is something absolutely trivial in MR but I cannot seem to find anything in Hive that will do this for me. Do I have to write a UDF to accomplish this? Alan