Re: Dynamic columns in Hive Table - Best Design for the problem

2013-12-29 Thread Edward Capriolo
Basically when you have data like this, it is best to treat all the columns as a single string and write a tool to break the entire row apart. You could use a UDF or a UDTF actually. Look at something like parseUrl... select myRow(row) as id string, events List A UDTF allows you to produ

Re: Dynamic columns in Hive Table - Best Design for the problem

2013-12-29 Thread Raj Hadoop
Matt, Thanks for the suggestion. Can you please provide more details on what type of UDAF I should develop? I have never worked on a UDAF before, but I would like to explore it. Any tips on how to proceed? Thanks, Raj On Saturday, December 28, 2013 2:47 PM, Matt Tucker wrote: It looks li

Re: Dynamic columns in Hive Table - Best Design for the problem

2013-12-28 Thread Matt Tucker
It looks like you're essentially doing a pivot function. Your best bet is to write a custom UDAF or look at the windowing functions available in recent releases. Matt On Dec 28, 2013 12:57 PM, "Raj Hadoop" wrote: > Dear All Hive Group Members, > > I have the following requirement. > > Input: > >

Dynamic columns in Hive Table - Best Design for the problem

2013-12-28 Thread Raj Hadoop
Dear All Hive Group Members, I have the following requirement.

Input:
Ticket#|Date of booking|Price
100|20-Oct-13|54
100|21-Oct-13|56
100|22-Oct-13|54
100|23-Oct-13|55
100|27-Oct-13|60
100|30-Oct-13|47
101|10-Sep-13|12
101|13-Sep-13|14
101|20-Oct-13|6

Expected Output:
Ticket#|Initial|Delta1
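
For readers following the thread, here is a rough Python sketch of the pivot being asked for, run on the sample input from the original post. The expected output is cut off in this listing, so the meaning of the Delta columns is an assumption: each delta is taken to be the price minus the previous price for the same ticket, in booking-date order. The name `pivot_prices` is hypothetical and is not part of Hive; the sketch only illustrates the logic that a custom UDAF or UDTF, as suggested in the replies, might implement.

```python
from collections import defaultdict
from datetime import datetime

# Sample input copied from the original post (Ticket#|Date of booking|Price).
ROWS = """100|20-Oct-13|54
100|21-Oct-13|56
100|22-Oct-13|54
100|23-Oct-13|55
100|27-Oct-13|60
100|30-Oct-13|47
101|10-Sep-13|12
101|13-Sep-13|14
101|20-Oct-13|6""".splitlines()


def pivot_prices(lines):
    """Return {ticket: [initial_price, delta1, delta2, ...]}.

    ASSUMPTION: a delta is the difference between consecutive prices
    for one ticket, ordered by booking date. The thread's expected
    output is truncated, so this definition is a guess.
    """
    by_ticket = defaultdict(list)
    for line in lines:
        ticket, booked, price = line.split("|")
        # Dates look like 20-Oct-13 (day, abbreviated English month, 2-digit year).
        booked_on = datetime.strptime(booked, "%d-%b-%y")
        by_ticket[ticket].append((booked_on, int(price)))

    result = {}
    for ticket, bookings in by_ticket.items():
        bookings.sort(key=lambda b: b[0])  # order by booking date
        prices = [price for _, price in bookings]
        deltas = [cur - prev for prev, cur in zip(prices, prices[1:])]
        result[ticket] = [prices[0]] + deltas
    return result


if __name__ == "__main__":
    for ticket, values in sorted(pivot_prices(ROWS).items()):
        # Each ticket yields a different number of delta columns.
        print("|".join([ticket] + [str(v) for v in values]))
```

Note that the number of values per ticket varies with the number of bookings, which is the "dynamic columns" difficulty in the subject line: a fixed Hive schema cannot hold a variable number of delta columns directly, so returning them as a single array or string column is one possible workaround.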