Hi,

I have uploaded a few CSV files from Windows into Hive and configured a few
external tables on top of them. When I run a join on two of the tables, one
of the int columns gets changed to 0. The structure of the tables is as
follows:


Table-1        Table-2
-------        -------

Id(int)        id(int)   datetime     eid(int)
-------        -------   ----------   --------
1              1         2011-02-01   3
2              1         2011-03-01   4
3              2         2011-04-01   5
               4         2011-05-01   6
               6         2011-06-01   7


The join query is:

select a.* from Table-2 a join Table-1 b on (a.id = b.id);

The output is (note that eid comes back as 0 in every row):

1  2011-02-01   0
1  2011-03-01   0
2  2011-04-01   0


I checked the logs and noticed the following warning:

WARN org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct: Extra bytes
detected at the end of the row! Ignoring similar problems.

Could this be causing it?

When I set hive.auto.convert.join=true, the problem goes away, as there
is no reduce phase. The output is:

1  2011-02-01   3
1  2011-03-01   4
2  2011-04-01   5
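
In case it helps anyone reproduce this, the two runs look roughly like the
sketch below in a single session. Note that table_1 and table_2 are only
stand-in names for my two external tables, and setting the property to false
explicitly is an assumption on my part to force the reduce-side join.

```sql
-- table_1 and table_2 are stand-in names for the two external tables above.

-- Run 1: common (reduce-side) join; this is where eid comes back as 0.
set hive.auto.convert.join=false;
select a.* from table_2 a join table_1 b on (a.id = b.id);

-- Run 2: let Hive convert the join to a map-side join; eid is correct here.
set hive.auto.convert.join=true;
select a.* from table_2 a join table_1 b on (a.id = b.id);
```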

Could somebody please help me figure out why we get the wrong results when
the join runs through the reducer?
-- 
Thanks
