[ https://issues.apache.org/jira/browse/HIVE-22495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974753#comment-16974753 ]
Hive QA commented on HIVE-22495:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985879/HIVE-22495.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 17706 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar.org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar (batchId=248)
org.apache.hadoop.hive.metastore.TestPartitionManagement.testPartitionDiscoveryTransactionalTable (batchId=224)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=284)
org.apache.hive.service.server.TestHS2HttpServerPamConfiguration.testPamCorrectConfiguration (batchId=240)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19430/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19430/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19430/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12985879 - PreCommit-HIVE-Build

> Parquet count(*) read in all data
> ---------------------------------
>
>                 Key: HIVE-22495
>                 URL: https://issues.apache.org/jira/browse/HIVE-22495
>             Project: Hive
>          Issue Type: Bug
>          Components: Reader
>            Reporter: Jason Xu
>            Assignee: Jason Xu
>            Priority: Major
>         Attachments: HIVE-22495.patch, HIVE-22495.patch
>
>
> Running a hive query on a Parquet table
> select count ( * ) from test_table
> The query read in all data (all columns) instead of just metadata.
> For comparison, Hive 0.13 and Spark read in much less data with my test table.
>
> ||engine||HDFS data read||
> |Hive 2.3.4| 452.9 MB|
> |Hive 0.13| 22.5 KB|
> |Spark| 41.6 KB|
>
> The cause seems to be that the Parquet read support falls back to the file schema if
> indexColumnsWanted is empty; this logic still exists in the master branch.
> I don't know why this empty-list check was added. Please advise if changing it would
> have any other impact.
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
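
The fallback described in the quoted report can be illustrated with a minimal, self-contained sketch. This is not Hive's actual read-support code: the class `ProjectionSketch`, the method `requestedColumns`, and the `fallBackWhenEmpty` flag are hypothetical names invented for illustration; only `indexColumnsWanted` comes from the report. It shows why an empty projection (as produced by count(*)) can end up requesting every column when an empty list is treated as "no projection".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ProjectionSketch {

    // Hypothetical helper, NOT Hive's real API: chooses which columns a
    // reader would request from a file with the given column names.
    static List<String> requestedColumns(List<String> fileColumns,
                                         List<Integer> indexColumnsWanted,
                                         boolean fallBackWhenEmpty) {
        if (indexColumnsWanted.isEmpty() && fallBackWhenEmpty) {
            // Behaviour described in the report: an empty projection falls
            // back to the full file schema, so every column is read.
            return new ArrayList<>(fileColumns);
        }
        // Otherwise request only the columns whose indexes were asked for.
        List<String> requested = new ArrayList<>();
        for (int index : indexColumnsWanted) {
            if (index >= 0 && index < fileColumns.size()) {
                requested.add(fileColumns.get(index));
            }
        }
        return requested;
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name", "payload");
        // count(*) needs no column values, so the wanted list is empty.
        List<Integer> countStarProjection = Collections.emptyList();

        // With the fallback: prints [id, name, payload]
        System.out.println(requestedColumns(fileColumns, countStarProjection, true));
        // Without the fallback: prints []
        System.out.println(requestedColumns(fileColumns, countStarProjection, false));
    }
}
```

Whether removing the empty-list check is safe in Hive itself depends on callers that may rely on the fallback, which is the open question raised in the report.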