This is a bug, or rather an unexpected usage (two tables pointing at the
same location). I suspect the correct count(*) value is being answered
from table statistics rather than by scanning the files.
Can you file a JIRA?
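
One way to check the statistics hypothesis is sketched below. It assumes
that hive.compute.query.using.stats is enabled in this setup, which the
thread does not confirm. If the assumption holds, the count with the
setting turned off should come from a scan of the files in the shared
directory and would be expected to line up with what SELECT * returns.

```sql
-- Show the table parameters; numRows is the row count stored in the metastore.
DESCRIBE FORMATTED test2;

-- Assumption: count(*) is currently answered from metastore statistics.
-- Turn that off for this session so the query has to read the files.
SET hive.compute.query.using.stats=false;
SELECT count(*) FROM test2;

-- Optionally recompute the stored statistics from the files on disk.
ANALYZE TABLE test2 COMPUTE STATISTICS;
```

The same statements can be run against test1 for comparison.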

On 16/8/29, 00:51, "naveen mahadevuni" <nmahadev...@gmail.com> wrote:

>Hi,
>
>Is the following behavior a bug? I believe at least one part of it is a
>bug. I created two Hive tables at the same location and inserted rows into
>both tables. count(*) returns the correct count for each individual table,
>but SELECT * on one table also reads the rows from the other table's files.
>
>CREATE TABLE test1 (col1 INT, col2 INT)
>stored as orc
>LOCATION '/apps/hive/warehouse/test1';
>
>insert into test1 values(1,2);
>insert into test1 values(3,4);
>
>hive> select count(*) from test1;
>OK
>2
>Time taken: 0.177 seconds, Fetched: 1 row(s)
>
>
>CREATE TABLE test2 (col1 INT, col2 INT)
>stored as orc
>LOCATION '/apps/hive/warehouse/test1';
>
>insert into test2 values(1,2);
>insert into test2 values(3,4);
>
>hive> select count(*) from test2;
>OK
>2
>Time taken: 2.683 seconds, Fetched: 1 row(s)
>
>-- SELECT * fetches 4 records, whereas COUNT(*) above returns a count of 2.
>
>hive> select * from test2;
>OK
>1       2
>3       4
>1       2
>3       4
>Time taken: 0.107 seconds, Fetched: 4 row(s)
>hive> select * from test1;
>OK
>1       2
>3       4
>1       2
>3       4
>Time taken: 0.054 seconds, Fetched: 4 row(s)
>
>Thanks,
>Naveen
