[ https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544385#comment-16544385 ]
Hive QA commented on HIVE-20079:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931668/HIVE-20079.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 14645 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=191)
	[druidmini_dynamic_partition.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test1.q,druidmini_test_insert.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization] (batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_11] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_12] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_1] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection] (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types] (batchId=70)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz] (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins] (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking] (batchId=192)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12619/testReport
Console output:
https://builds.apache.org/job/PreCommit-HIVE-Build/12619/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12619/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931668 - PreCommit-HIVE-Build

> Populate more accurate rawDataSize for parquet format
> -----------------------------------------------------
>
>                 Key: HIVE-20079
>                 URL: https://issues.apache.org/jira/browse/HIVE-20079
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch
>
>
> Run the following queries and you will see the raw data for the table is 4
> (that is the number of fields) incorrectly. We need to populate correct data
> size so data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
> 	COLUMN_STATS_ACCURATE	true
> 	numFiles            	1
> 	numRows             	2
> 	rawDataSize         	4
> 	totalSize           	373
> 	transient_lastDdlTime	1530660523
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)