[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364974#comment-16364974 ]
Hive QA commented on HIVE-18553:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910631/HIVE-18553.91.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 13103 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=179)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestFunctions.testGetFunctionNullDatabase[Embedded] (batchId=205)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded] (batchId=205)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded] (batchId=205)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=224)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testInsertFromUnion (batchId=280)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=187)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.alterIndex (batchId=242)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.createIndex (batchId=242)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropIndex (batchId=242)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9218/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9218/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910631 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> ---------------------------------------------------------
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> Schema evolution covers the following cases:
> 1. Column changes
>    - column reordering
>    - column addition and deletion
>    - column renaming
> 2. Type conversion
>    - low precision to high precision
>    - type to String
> For the first category, the current code does not support the column addition operation. The detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:10000/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | tinyint    |          |
> | t2        | tinyint    |          |
> | i1        | int        |          |
> | i2        | int        |          |
> +-----------+------------+----------+
> 0: jdbc:hive2://localhost:10000/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:10000/default> set hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:10000/default> alter table test_p add columns (ts timestamp);
> 0: jdbc:hive2://localhost:10000/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> The following exception is seen in the logs:
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
>   at org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) ~[hadoop-mapreduce-client-common-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_121]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_121]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_121]
>   at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_121]
> {code}
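> As the trace shows, the vectorized reader asks the Parquet page store for every column in the table schema, including the newly added "ts" column, which files written before the ALTER TABLE do not contain. Below is a minimal sketch of one way to handle that case, assuming plain-Java stand-ins: SimpleColumnVector, fillWithNulls() and readBatch() are hypothetical names, not Hive's actual vectorization classes or the attached patch. Columns missing from the file schema are materialized as all-NULL vectors instead of being requested from the page store.
> {code}
> // Sketch only: all names below are hypothetical stand-ins, not Hive APIs.
> import java.util.Arrays;
> import java.util.HashSet;
> import java.util.List;
> import java.util.Set;
>
> public class MissingColumnSketch {
>
>   /** Simplified stand-in for a vectorized column: values plus a null mask. */
>   static class SimpleColumnVector {
>     final long[] values;
>     final boolean[] isNull;
>     boolean noNulls = true;
>
>     SimpleColumnVector(int batchSize) {
>       values = new long[batchSize];
>       isNull = new boolean[batchSize];
>     }
>
>     /** Mark every row as NULL, the value a column added by ALTER TABLE should expose. */
>     void fillWithNulls() {
>       Arrays.fill(isNull, true);
>       noNulls = false;
>     }
>   }
>
>   /**
>    * Builds one batch: columns present in the file schema are "read" (faked here),
>    * while columns missing from the file are returned as all-NULL vectors instead
>    * of being requested from the Parquet page store.
>    */
>   static SimpleColumnVector[] readBatch(List<String> tableColumns,
>                                         Set<String> fileColumns,
>                                         int batchSize) {
>     SimpleColumnVector[] batch = new SimpleColumnVector[tableColumns.size()];
>     for (int i = 0; i < tableColumns.size(); i++) {
>       SimpleColumnVector cv = new SimpleColumnVector(batchSize);
>       if (fileColumns.contains(tableColumns.get(i))) {
>         Arrays.fill(cv.values, 42L); // placeholder for real column data
>       } else {
>         cv.fillWithNulls();          // e.g. the newly added "ts" column
>       }
>       batch[i] = cv;
>     }
>     return batch;
>   }
>
>   public static void main(String[] args) {
>     List<String> tableSchema = Arrays.asList("t1", "t2", "i1", "i2", "ts");
>     Set<String> fileSchema = new HashSet<>(Arrays.asList("t1", "t2", "i1", "i2"));
>     SimpleColumnVector[] batch = readBatch(tableSchema, fileSchema, 1024);
>     System.out.println("ts is all NULL: " + !batch[4].noNulls);
>   }
> }
> {code}
> A real implementation would work with Hive's ColumnVector classes, but the decision point is the same idea: check the file schema before asking the page store for a column.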
> For the second category (type conversion), the non-vectorized Parquet reader leverages the existing Parquet String inspector to perform the conversion, while the vectorized path does not. To support it, this JIRA provides an abstraction layer that reads the underlying data and converts it to what Hive requires for further computation.
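> A minimal sketch of what such a conversion layer could look like, assuming hypothetical names (ValueSource, TypeConverter) rather than the interfaces actually introduced by the patch: one converter per (file type, Hive type) pair reads the underlying value and writes it in the representation Hive expects.
> {code}
> // Sketch only: ValueSource, TypeConverter and the array "vectors" are hypothetical
> // stand-ins for this illustration, not the interfaces added by HIVE-18553.
> import java.nio.charset.StandardCharsets;
>
> public class ConversionSketch {
>
>   /** Hypothetical source of decoded values for one Parquet column chunk. */
>   interface ValueSource {
>     int readInt(); // e.g. a Parquet INT32 value
>   }
>
>   /** Abstraction layer: read one underlying value, write it in the type Hive expects. */
>   interface TypeConverter {
>     void readAndConvert(ValueSource in, Object targetVector, int row);
>   }
>
>   /** Low precision to high precision: INT32 widened into a long[] vector. */
>   static final TypeConverter INT32_TO_LONG =
>       (in, target, row) -> ((long[]) target)[row] = in.readInt();
>
>   /** Type to String: INT32 rendered as UTF-8 bytes for a byte[][] "string" vector. */
>   static final TypeConverter INT32_TO_STRING =
>       (in, target, row) -> ((byte[][]) target)[row] =
>           Integer.toString(in.readInt()).getBytes(StandardCharsets.UTF_8);
>
>   public static void main(String[] args) {
>     ValueSource source = () -> 123; // pretend the file stores the INT32 value 123
>     long[] longs = new long[1];
>     byte[][] strings = new byte[1][];
>     INT32_TO_LONG.readAndConvert(source, longs, 0);
>     INT32_TO_STRING.readAndConvert(source, strings, 0);
>     System.out.println(longs[0] + " / " + new String(strings[0], StandardCharsets.UTF_8));
>   }
> }
> {code}
> The non-vectorized path already gets this behavior from the Parquet String inspector; the sketch just shows the same read-then-convert split applied to column vectors.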