[ https://issues.apache.org/jira/browse/HIVE-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957193#comment-16957193 ]
David Mollitor commented on HIVE-16792: --------------------------------------- Any updates on this watchers? > Estimate Rows When Joining BIGINT to INT Column > ----------------------------------------------- > > Key: HIVE-16792 > URL: https://issues.apache.org/jira/browse/HIVE-16792 > Project: Hive > Issue Type: Improvement > Affects Versions: 2.1.1 > Reporter: David Mollitor > Priority: Minor > > {code:sql} > create table test1 > (a int); > create table test2 > (z bigint); > INSERT INTO test1 VALUES (1); > INSERT INTO test2 VALUES (2147483648); > analyze table test1 compute statistics for columns; > analyze table test2 compute statistics for columns; > EXPLAIN SELECT * FROM test1 t1 INNER JOIN test2 t2 ON t1.a=t2.z; > {code} > {code} > Explain > STAGE DEPENDENCIES: > Stage-4 is a root stage > Stage-3 depends on stages: Stage-4 > Stage-0 depends on stages: Stage-3 > "" > STAGE PLANS: > Stage: Stage-4 > Map Reduce Local Work > Alias -> Map Local Tables: > t2 > Fetch Operator > limit: -1 > Alias -> Map Local Operator Tree: > t2 > TableScan > alias: t2 > filterExpr: z is not null (type: boolean) > Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column > stats: COMPLETE > Filter Operator > predicate: z is not null (type: boolean) > Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: COMPLETE > HashTable Sink Operator > keys: > 0 UDFToLong(a) (type: bigint) > 1 z (type: bigint) > Stage: Stage-3 > Map Reduce > Map Operator Tree: > TableScan > alias: t1 > filterExpr: UDFToLong(a) is not null (type: boolean) > Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: UDFToLong(a) is not null (type: boolean) > Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE > Column stats: NONE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 UDFToLong(a) (type: bigint) > 1 z (type: bigint) > outputColumnNames: _col0, _col4" > Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: _col0 (type: int), _col4 (type: bigint)" > outputColumnNames: _col0, _col1" > Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE > Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 1 Data size: 1 Basic stats: > COMPLETE Column stats: NONE > table: > input format: org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Local Work: > Map Reduce Local Work > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} > I would expect that perhaps Hive would be smart enough to know that this join > is not going to produce any rows because the MIN VALUE of table test2 is more > than INTEGER.MAX_VALUE. -- This message was sent by Atlassian Jira (v8.3.4#803005)