Hi everyone,
We have a data set in the following format:
user1    item1    value
user2    item1    value
user3    item1    value
…
user1    item2    value
user20   item2    value
user35   item2    value
…
user2    item3    value
user25   item3    value
…
We have around 20 items and millions of users, and not all users have entries 
for all the items. We would like to transform this into:
user1    item1 value, item2 value, item3 value, …
user2    item4 value, item18 value, item19 value, …
I can think of a couple of ways of doing this in Pig Latin. For example, one 
way would be to create a map (where the key is the item name and the value is 
the associated value), fill out that map as the data is read, and then write 
it out to a file. I am not sure how efficient that would be. I would love to 
get suggestions for doing this in Pig Latin.
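
For comparison, a grouping-based alternative to the hand-built map could look roughly like the sketch below. It is only a sketch under assumptions: the input is taken to be tab-delimited, the paths 'input_data' and 'output_data' are placeholders, and the field names (user, item, value) and the chararray type for value are guesses at the schema.

```pig
-- Assumed: tab-delimited input with three fields per line.
-- 'input_data' is a placeholder path, not the real location.
raw = LOAD 'input_data' USING PigStorage('\t')
      AS (user:chararray, item:chararray, value:chararray);

-- Collect all rows that belong to the same user into one bag.
by_user = GROUP raw BY user;

-- Keep only the (item, value) pairs inside each user's bag.
per_user = FOREACH by_user GENERATE group AS user, raw.(item, value) AS items;

-- 'output_data' is a placeholder path.
STORE per_user INTO 'output_data';
```

Each output record would then hold one user followed by a bag of (item, value) tuples, containing only the items that user actually has. The order of tuples inside the bag is not guaranteed, so a nested ORDER inside the FOREACH would be needed if item order matters.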

                                          
