Option 1) Use Pig or Oozie: write a workflow that joins the files into a single file.

Option 2) Create a temp table for each of the different files, join them into a single table, and then drop the temp tables.

Option 3) Don't do anything; change your queries to look at the three different files whenever they need fields that live in different files.
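To make option 2 a bit more concrete, here is a rough sketch of the staging-table-and-join idea. It is only an illustration: it uses Python's built-in sqlite3 as a stand-in for Hive so that it runs anywhere, and the table names (feed1, feed2, feed3, final_table), the column names, and the sample values are all made up. The join itself is plain SQL and would need adapting to HiveQL and to your real feed layouts. Because it is an inner join on EMP ID, only rows present in all three feeds reach the final table.

```python
import sqlite3


def build_staging(conn):
    """Create one hypothetical staging table per feed file and add sample rows."""
    conn.executescript("""
        CREATE TABLE feed1 (emp_id INTEGER, field1 TEXT, field2 TEXT, field6 TEXT, field9 TEXT);
        CREATE TABLE feed2 (emp_id INTEGER, field5 TEXT, field7 TEXT, field10 TEXT);
        CREATE TABLE feed3 (emp_id INTEGER, field3 TEXT, field4 TEXT, field8 TEXT);
        CREATE TABLE final_table (
            emp_id INTEGER,
            field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT, field5 TEXT,
            field6 TEXT, field7 TEXT, field8 TEXT, field9 TEXT, field10 TEXT
        );
    """)
    # emp_id 1 appears in all three feeds; emp_id 2 is still missing feed 3.
    conn.executemany("INSERT INTO feed1 VALUES (?, ?, ?, ?, ?)",
                     [(1, "a1", "a2", "a6", "a9"), (2, "x1", "x2", "x6", "x9")])
    conn.executemany("INSERT INTO feed2 VALUES (?, ?, ?, ?)",
                     [(1, "b5", "b7", "b10"), (2, "y5", "y7", "y10")])
    conn.executemany("INSERT INTO feed3 VALUES (?, ?, ?, ?)",
                     [(1, "c3", "c4", "c8")])


def load_complete_rows(conn):
    """Inner-join the three staging tables into final_table, then drop them."""
    conn.execute("""
        INSERT INTO final_table
        SELECT f1.emp_id,
               f1.field1, f1.field2, f3.field3, f3.field4, f2.field5,
               f1.field6, f2.field7, f3.field8, f1.field9, f2.field10
        FROM feed1 f1
        JOIN feed2 f2 ON f1.emp_id = f2.emp_id
        JOIN feed3 f3 ON f1.emp_id = f3.emp_id
    """)
    # Option 2 ends by deleting the temp tables.
    conn.executescript("DROP TABLE feed1; DROP TABLE feed2; DROP TABLE feed3;")
    return conn.execute("SELECT * FROM final_table ORDER BY emp_id").fetchall()


conn = sqlite3.connect(":memory:")
build_staging(conn)
rows = load_complete_rows(conn)
print(rows)
```

Note that dropping the staging tables right after the join throws away rows that are still waiting for a missing feed (emp_id 2 in the sample data). A real job would probably keep those rows around until the remaining file arrives.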
Wait for others to give better suggestions :)

On Fri, Jul 26, 2013 at 4:22 PM, Ramasubramanian Narayanan <ramasubramanian.naraya...@gmail.com> wrote:

> Hi,
>
> Please help in providing solution for the below problem... this scenario
> is applicable in Banking atleast...
>
> I have a HIVE table with the below structure...
>
> Hive Table:
> Field1
> ...
> Field 10
>
> For the above table, I will get the values for each feed in different
> file. You can imagine that these files belongs to same branch and will get
> at any time interval. I have to load into table only if I get all 3 files
> for the same branch. (assume that we have a common field in all the files
> to join)
>
> *Feed file 1 :*
> EMP ID
> Field 1
> Field 2
> Field 6
> Field 9
>
> *Feed File2 :*
> EMP ID
> Field 5
> Field 7
> Field 10
>
> *Feed File3 :*
> EMP ID
> Field 3
> Field 4
> Field 8
>
> Now the question is,
> what is the best way to make all these files to make it as a single file
> so that it can be placed under the HIVE structure.
>
> regards,
> Rams

--
Nitin Pawar