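This is a bug, or rather the result of an unexpected usage (two tables sharing one location). I suspect count(*) is being answered from table statistics rather than from a scan of the files, which would explain why it disagrees with SELECT *. Can you file a JIRA?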
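One way to check that suspicion is sketched below. It assumes the test1/test2 tables from the report and that hive.compute.query.using.stats is enabled in the session; I have not run it against this setup.

```sql
-- Show the statistics recorded in the metastore for each table.
-- If stats were gathered on insert, numRows is listed under "Table Parameters".
DESCRIBE FORMATTED test1;
DESCRIBE FORMATTED test2;

-- For this session only, stop Hive from answering simple aggregates
-- such as COUNT(*) directly from metastore statistics.
SET hive.compute.query.using.stats=false;

-- This now has to read the files under the shared location.
SELECT COUNT(*) FROM test2;
```

If the count comes from statistics, the last query should agree with what SELECT * returns, not with the per-table numRows value.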
On 16/8/29, 00:51, "naveen mahadevuni" <nmahadev...@gmail.com> wrote:

>Hi,
>
>Is the following behavior a bug? I believe at least one part of it is a
>bug. I created two Hive tables at the same location and inserted rows in
>two tables. count(*) returns the correct count for each individual table,
>but SELECT * on one tables reads the rows from other table files too.
>
>CREATE TABLE test1 (col1 INT, col2 INT)
>stored as orc
>LOCATION '/apps/hive/warehouse/test1';
>
>insert into test1 values(1,2);
>insert into test1 values(3,4);
>
>hive> select count(*) from test1;
>OK
>2
>Time taken: 0.177 seconds, Fetched: 1 row(s)
>
>
>CREATE TABLE test2 (col1 INT, col2 INT)
>stored as orc
>LOCATION '/apps/hive/warehouse/test1';
>
>insert into test2 values(1,2);
>insert into test2 values(3,4);
>
>hive> select count(*) from test2;
>OK
>2
>Time taken: 2.683 seconds, Fetched: 1 row(s)
>
>-- SELECT * fetches 4 records where as COUNT(*) above returns count of 2.
>
>hive> select * from test2;
>OK
>1 2
>3 4
>1 2
>3 4
>Time taken: 0.107 seconds, Fetched: 4 row(s)
>hive> select * from test1;
>OK
>1 2
>3 4
>1 2
>3 4
>Time taken: 0.054 seconds, Fetched: 4 row(s)
>
>Thanks,
>Naveen