Wenlei Xie created HIVE-12228:
---------------------------------

             Summary: Hive Error When query nested query with UDF returns 
Struct type
                 Key: HIVE-12228
                 URL: https://issues.apache.org/jira/browse/HIVE-12228
             Project: Hive
          Issue Type: Bug
          Components: Hive, Query Planning, UDF
    Affects Versions: 0.13.1
            Reporter: Wenlei Xie


The following simple nested query with UDF returns Struct would fail on Hive 
0.13.1 . The UDF java code is attached.

{noformat}
ADD JAR simplestruct.jar;
CREATE TEMPORARY FUNCTION simplestruct AS 'test.SimpleStruct';

SELECT *
  FROM (
    SELECT *
    from mytest
 ) subquery
WHERE simplestruct(subquery.testStr).first
{noformat}

The error message is 
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {"testint":1,"testname":"haha","teststr":"hehe"}
        at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
        ... 8 more
Caused by: java.lang.RuntimeException: cannot find field teststr from [0:_col0, 
1:_col1, 2:_col2]
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
        at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
..............................
{noformat}

The query works fine if we replace the UDF returns Boolean. By comparing the 
query plan, we note when using the {{SimpleStruct}} UDF, the query plan is 
{noformat}
          TableScan
            Select Operator
              Filter Operator
                Select Operator
{noformat}
The first Select Operator would rename the columns to {{col_k}}, which cause 
this trouble. If we use some UDF returns Boolean, the query plan becomes 
{noformat}
          TableScan
            Filter Operator
              Select Operator
{noformat}

It looks like the Query Planner failed to push down the Filter Operator when 
the predicate is based on a UDF returns Struct. 

This bug was fixed in Hive 1.2.1, but we cannot find the ticket to fix it.


Appendix: 
The table {{mytest}} is created in the following way
{noformat}
CREATE TABLE mytest(testInt INT, testName STRING, testStr STRING) ROW FORMAT 
DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE mytest;
{noformat}
The file {{test.txt}} is a simple CSV file.
{noformat}
1,haha,hehe
2,my,test
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to