Sending this again, as I haven't received any reply. Any personal experience would be appreciated.
*Raihan Jamal*

On Mon, Jul 9, 2012 at 3:37 PM, Raihan Jamal <jamalrai...@gmail.com> wrote:

> *Problem Statement:-*
>
> I need to compare two tables, Table1 and Table2, which both store the same
> thing. Table1 is the main table, so Table2 has to be compared against it.
> After the comparison I need to produce a report of any discrepancies found
> in Table2. Both tables hold a lot of data, on the order of TB. Currently I
> have written HiveQL to do the comparison and get the data back.
>
> So my question is: which is better in terms of PERFORMANCE, writing a CUSTOM
> MAPPER and REDUCER to do this kind of job, or will the HiveQL that I wrote
> be fine, given that I will be joining these two tables on millions of
> records? As far as I know, HiveQL internally (behind the scenes) generates
> optimized map-reduce jobs, submits them for execution, and gets back the
> results.
>
> *Raihan Jamal*
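
To make the comparison concrete, here is a minimal sketch of what a hand-written mapper and reducer for this job might do. It is a reduce-side join simulated in memory with plain Python, not actual Hadoop or Hive code. The record layout (join key in the first column), the source tags `table1`/`table2`, and all function names are assumptions made for illustration; none of them come from the original message. It says nothing about performance and only shows the logic that a custom job and a HiveQL join would both have to carry out.

```python
from collections import defaultdict


def map_record(source, record):
    """Map step: emit (join key, (source tag, remaining columns)).

    Assumes (hypothetically) that the first column is the join key.
    """
    key, *values = record
    return key, (source, tuple(values))


def reduce_key(key, tagged_values):
    """Reduce step: compare Table2's rows with Table1's rows for one key."""
    rows = defaultdict(list)
    for source, values in tagged_values:
        rows[source].append(values)

    # .get() does not create missing entries, so absent tables yield None.
    t1, t2 = rows.get("table1"), rows.get("table2")
    if t1 is None:
        return (key, "extra_in_table2")
    if t2 is None:
        return (key, "missing_in_table2")
    if sorted(t1) != sorted(t2):
        return (key, "mismatch")
    return None  # rows agree, nothing to report


def compare_tables(table1, table2):
    """Run map, an in-memory stand-in for the shuffle, then reduce."""
    groups = defaultdict(list)
    for source, table in (("table1", table1), ("table2", table2)):
        for record in table:
            key, tagged = map_record(source, record)
            groups[key].append(tagged)

    report = []
    for key in sorted(groups):
        result = reduce_key(key, groups[key])
        if result is not None:
            report.append(result)
    return report


if __name__ == "__main__":
    table1 = [(1, "a"), (2, "b"), (3, "c")]
    table2 = [(1, "a"), (2, "x"), (4, "d")]
    for key, status in compare_tables(table1, table2):
        print(key, status)
```

In a real cluster the grouping by key is done by the shuffle phase rather than by a dictionary, so only the map and reduce functions would carry over.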