You may want to look at partitioned tables and load the data into partitions. For me, that seems like the easiest way.
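A rough sketch of what that could look like is below. The table name `events`, its columns, the HDFS path, and the date-valued partition column `data_load_date` are all made up for illustration, and the files are assumed to already match the table's storage format:

```sql
-- Hypothetical table; names and columns are placeholders, not from your schema.
CREATE TABLE IF NOT EXISTS events (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (data_load_date STRING);

-- Load a new batch into its own partition.
-- Without the OVERWRITE keyword, existing data in the table is kept,
-- and partitions loaded on earlier days are not touched.
LOAD DATA INPATH '/tmp/incoming/2014-03-06'
INTO TABLE events
PARTITION (data_load_date = '2014-03-06');
```

Each new batch lands in a new partition, so the existing records stay as they are and the new ones are simply added next to them.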
If you do not have a defined partition column in your data, another approach is to load the data into a temporary staging table and from there into the partitioned table. The catch with this approach is that the new data you receive is assumed not to contain data for older partitions. I normally add an extra column to my tables, something like data_load_date, which is my partition column. Then I load the data from the staging table into this table, with the partition set to the date on which I am loading the new data.

On Thu, Mar 6, 2014 at 2:30 PM, Raj hadoop <raj.had...@gmail.com> wrote:

> Hi Nitin,
>
> existing records should remain same and the new records should get
> inserted into the table
>
> On Thu, Mar 6, 2014 at 2:11 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> are you talking about adding new records to tables or updating records in
>> already existing table?
>>
>> On Thu, Mar 6, 2014 at 1:59 PM, Raj hadoop <raj.had...@gmail.com> wrote:
>>
>>> Query in HIVE
>>>
>>> I tried merge kind of operation in Hive to retain the existing records
>>> and append the new records instead of dropping the table and populating it
>>> again.
>>>
>>> If anyone can come help with any other approach other than this or the
>>> approach to perform merge operation
>>>
>>> will be great help
>>
>> --
>> Nitin Pawar

--
Nitin Pawar