[ https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382839#comment-14382839 ]
Hive QA commented on HIVE-10086:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707516/HIVE-10086.2.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8678 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_table_with_subschema
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3171/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3171/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3171/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12707516 - PreCommit-HIVE-TRUNK-Build

> Hive throws error when accessing Parquet file schema using field name match
> ---------------------------------------------------------------------------
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Sergio Peña
> Assignee: Sergio Peña
> Attachments: HIVE-10086.3.patch, HiveGroup.parquet
>
>
> When Hive table schema contains a portion of the schema of a Parquet file,
> then the access to the values should work if the field names match the
> schema. This does not work when a struct<> data type is in the schema, and
> the Hive schema contains just a portion of the struct elements. Hive throws
> an error instead.
> This is the example and how to reproduce:
> First, create a parquet table, and add some values on it:
> {code}
> CREATE TABLE test1 (id int, name string, address
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger',
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
>     optional int32 number;
>     optional binary street (UTF8);
>     optional binary zip (UTF8);
>   }
> }
> {code}
> Afterwards, I create a table that contains just a portion of the schema, and
> load the Parquet file generated above, a query will fail on that table:
> {code}
> CREATE TABLE test1 (name string, address struct<street:string>) STORED AS
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.UnsupportedOperationException: Cannot inspect
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an
> error instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
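
The behaviour the report asks for, matching struct fields by name rather than by position, can be illustrated with a small Python sketch. This is not Hive's or Parquet's code: the schema model (lists of name/type pairs) and the helper names `project_by_name` and `project_group_by_position` are hypothetical, and the positional variant is only an assumption about how a string reader could end up receiving an `IntWritable`, since the report does not state the root cause.

```python
# Hypothetical model of the Parquet file schema from the report:
# each group is a list of (field_name, type) pairs, in file order.
FILE_SCHEMA = [
    ("id", "int32"),
    ("name", "binary"),
    ("address", [("number", "int32"), ("street", "binary"), ("zip", "binary")]),
]

# The narrower Hive table schema: only name and address.street.
HIVE_SCHEMA = [
    ("name", "binary"),
    ("address", [("street", "binary")]),
]


def project_by_name(file_fields, requested_fields):
    """Keep the file fields named in requested_fields, recursing into groups."""
    by_name = dict(file_fields)
    projected = []
    for name, requested_type in requested_fields:
        if name not in by_name:
            continue  # field absent from the file; this sketch simply skips it
        file_type = by_name[name]
        if isinstance(requested_type, list) and isinstance(file_type, list):
            # Nested group: apply the same name matching to its members.
            projected.append((name, project_by_name(file_type, requested_type)))
        else:
            projected.append((name, file_type))
    return projected


def project_group_by_position(file_fields, requested_fields):
    """Assumed faulty alternative: take the first N members of a group by position."""
    return file_fields[: len(requested_fields)]


file_address = dict(FILE_SCHEMA)["address"]
hive_address = dict(HIVE_SCHEMA)["address"]

# Name matching selects address.street, as the reporter expects.
print(project_by_name(FILE_SCHEMA, HIVE_SCHEMA))
# Positional matching selects address.number (an int32) where a string is expected.
print(project_group_by_position(file_address, hive_address))
```

Under these assumptions, name matching yields `street` for the one-field struct, while positional matching yields the `int32` field `number`, which would be consistent with the `Cannot inspect org.apache.hadoop.io.IntWritable` message in the report.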