0x574C opened a new issue #8685: URL: https://github.com/apache/incubator-doris/issues/8685
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

0.15.0-rc04

### What's Wrong?

Create a Hive table:

```
create table target_table(id int,code string,p_day string) stored as parquet;
insert into target_table values(1,'code1','2022-03-28'),(2,'code2','2022-03-28');
alter table target_table add columns (content string);
insert into target_table values(4,'code1','2022-03-28','content4'),(5,'code2','2022-03-28','content5');
```

After the two inserts, there are two Parquet files in HDFS:

```
[root@dev-master2 ~]# hdfs dfs -ls -h hdfs://dev-master2:8020/user/hive/warehouse/testdb.db/target_table
Found 2 items
-rwxrwx--x 3 hive hive 640 2022-03-28 09:49 hdfs://dev-master2:8020/user/hive/warehouse/testdb.db/target_table/000000_0
-rwxrwx--x 3 hive hive 803 2022-03-28 09:56 hdfs://dev-master2:8020/user/hive/warehouse/testdb.db/target_table/000000_0_copy_1
```

Create a Doris table:

```
create table doris_table (id int,content string,code string,p_day date)
partition by range(p_day)
(
    partition p20220328 values less than ("2022-03-29")
)
DISTRIBUTED BY HASH(code)
PROPERTIES("replication_num" = "1");
```

Load data from the Hive table:

```
LOAD LABEL test.target_table_label
(
    DATA INFILE("hdfs://dev-master2:8020/user/hive/warehouse/testdb.db/target_table/*")
    INTO TABLE `doris_table`
    FORMAT AS "parquet"
)
WITH BROKER broker_name ("username"="hdfs", "password"="hdfs")
```

<b>The load job failed with the following error message because the `content` column is not present in the Parquet file `000000_0`.</b>

```
type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = file: hdfs://dev-master2:8020/user/hive/warehouse/testdb.db/target_table/000000_0 error:Invalid Column Name:content
```

Merge the two Parquet files:

```
insert overwrite table target_table select * from target_table;
```

After rerunning the load task, it finished successfully.

### What You Expected?
Before the two Parquet files are merged, the load task should already run successfully, and any column missing from a Parquet file should automatically be set to null.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
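
For readers of this report, the behavior requested under "What You Expected?" can be illustrated with a small sketch. This is not Doris code and says nothing about how the loader is implemented. It is a plain-Python model, under the assumption that each Parquet row can be treated as a dictionary keyed by column name. The helper `project_rows` is hypothetical; the column names and values are taken from the report above.

```python
# Target column order, taken from the doris_table definition in the report.
TARGET_COLUMNS = ["id", "content", "code", "p_day"]


def project_rows(rows, target_columns):
    """Map each file row onto the target columns.

    Hypothetical helper: a column absent from the file row becomes None
    (i.e. NULL) instead of raising an "Invalid Column Name" style error.
    """
    return [{col: row.get(col) for col in target_columns} for row in rows]


# Rows as written before `content` was added (file 000000_0 in the report).
old_file_rows = [
    {"id": 1, "code": "code1", "p_day": "2022-03-28"},
    {"id": 2, "code": "code2", "p_day": "2022-03-28"},
]

# Rows as written after `content` was added (file 000000_0_copy_1).
new_file_rows = [
    {"id": 4, "code": "code1", "p_day": "2022-03-28", "content": "content4"},
    {"id": 5, "code": "code2", "p_day": "2022-03-28", "content": "content5"},
]

# Both files load; rows from the older file carry content = None.
loaded = project_rows(old_file_rows, TARGET_COLUMNS) + project_rows(
    new_file_rows, TARGET_COLUMNS
)

for row in loaded:
    print(row)
```

In this model the rows from the older file are kept with `content` set to `None`, which matches the outcome the reporter expected without first merging the files in Hive.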
