Wenlei Xie created HIVE-12228:
---------------------------------

             Summary: Hive Error When query nested query with UDF returns Struct type
                 Key: HIVE-12228
                 URL: https://issues.apache.org/jira/browse/HIVE-12228
             Project: Hive
          Issue Type: Bug
          Components: Hive, Query Planning, UDF
    Affects Versions: 0.13.1
            Reporter: Wenlei Xie

The following simple nested query, which filters on a UDF that returns a Struct, fails on Hive 0.13.1. The Java code for the UDF is attached.

{noformat}
ADD JAR simplestruct.jar;
CREATE TEMPORARY FUNCTION simplestruct AS 'test.SimpleStruct';

SELECT *
  FROM (
    SELECT * from mytest
  ) subquery
  WHERE simplestruct(subquery.testStr).first
{noformat}

The error message is

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"testint":1,"testname":"haha","teststr":"hehe"}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
	... 8 more
Caused by: java.lang.RuntimeException: cannot find field teststr from [0:_col0, 1:_col1, 2:_col2]
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
	at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
	..............................
{noformat}

The query works fine if we replace the UDF with one that returns Boolean. Comparing the query plans, we see that with the {{SimpleStruct}} UDF the plan is

{noformat}
TableScan
  Select Operator
    Filter Operator
      Select Operator
{noformat}

The first Select Operator renames the columns to {{_col0}}, {{_col1}}, {{_col2}}, so the Filter Operator that follows can no longer find {{teststr}} by name. This appears to be what causes the failure. If we use a UDF that returns Boolean, the query plan becomes

{noformat}
TableScan
  Filter Operator
    Select Operator
{noformat}

It looks like the query planner fails to push down the Filter Operator when the predicate is based on a UDF that returns a Struct. The bug is fixed in Hive 1.2.1, but we cannot find the ticket that fixed it.

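The plan comparison described above can be reproduced with {{EXPLAIN}}. The following is only a sketch: it assumes the {{mytest}} table and the attached {{simplestruct.jar}} are available, and the Boolean-returning UDF ({{simpleboolean}}, class {{test.SimpleBoolean}}) is a hypothetical name used for illustration, since the report does not name the Boolean UDF that was used.

{noformat}
ADD JAR simplestruct.jar;
CREATE TEMPORARY FUNCTION simplestruct AS 'test.SimpleStruct';
-- Hypothetical Boolean-returning UDF; name and class are assumptions, not part of the report.
CREATE TEMPORARY FUNCTION simpleboolean AS 'test.SimpleBoolean';

-- Failing case: the predicate reads a field of the Struct returned by the UDF.
-- On 0.13.1 the reported plan is TableScan > Select > Filter > Select.
EXPLAIN
SELECT *
  FROM (
    SELECT * FROM mytest
  ) subquery
  WHERE simplestruct(subquery.testStr).first;

-- Control case: the predicate is a UDF that returns Boolean directly.
-- The reported plan is TableScan > Filter > Select.
EXPLAIN
SELECT *
  FROM (
    SELECT * FROM mytest
  ) subquery
  WHERE simpleboolean(subquery.testStr);
{noformat}

If the first plan shows a Select Operator between the TableScan and the Filter Operator, the filter was not pushed below the column renaming, which matches the behavior reported here.
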
Appendix: The table {{mytest}} is created in the following way

{noformat}
CREATE TABLE mytest(testInt INT, testName STRING, testStr STRING)
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE mytest;
{noformat}

The file {{test.txt}} is a simple CSV file.

{noformat}
1,haha,hehe
2,my,test
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)