imay opened a new issue #1155: Support format in LoadStmt
URL: https://github.com/apache/incubator-doris/issues/1155

**Is your feature request related to a problem? Please describe.**
The current load statement in Doris supports only the CSV file format. However, we need to support more source file formats, such as Parquet and JSON, for load and for external tables. In this issue, I want to discuss adding format support to our `Load` statement.

**Describe the solution you'd like**
Currently, our Load statement syntax is

```
LOAD LABEL load_label
(
    data_desc1[, data_desc2, ...]
)
[opt_properties];

data_desc :=
    DATA INFILE
    (
        "file_path1"[, file_path2, ...]
    )
    [NEGATIVE]
    INTO TABLE `table_name`
    [PARTITION (p1, p2)]
    [COLUMNS TERMINATED BY "column_separator"]
    [(column_list)]
    [SET (k1 = func(k2))]
```

I want to add `FORMAT` to the `data_desc` clause. If `FORMAT` is not specified, the format is decided by the file name's suffix; if Doris does not recognize the suffix, we treat the file as CSV.

After supporting `FORMAT`, the syntax of `data_desc` will change to

```
data_desc :=
    DATA INFILE
    (
        "file_path1"[, file_path2, ...]
    )
    [NEGATIVE]
    INTO TABLE `table_name`
    [PARTITION (p1, p2)]
    [COLUMNS TERMINATED BY "column_separator"]
    [FORMAT AS format]
    [(column_list)]
    [SET (k1 = func(k2))]
```

The `SET` clause currently supports only a few functions. I want it to also support references to the columns named in `column_list`. For example, in the following statement

```
DATA INFILE ("file_path")
INTO TABLE testTable
(c1_tmp, c2_tmp)
SET (c1=c1_tmp, c2=c2_tmp)
```

`c1_tmp` and `c2_tmp` are the field names in the source file, because some formats, such as Parquet, store the field names in the file itself. So we use `column_list` to express which fields we need from the source file, and we use `SET` to convert those fields into the content needed in the Doris table. That is why I would like to support column references in the `SET` clause.
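
To make the proposed fallback rule concrete, here is a minimal Python sketch of how the format could be resolved for one source file: an explicit `FORMAT AS` value wins, otherwise the file name's suffix decides, and an unrecognized suffix falls back to CSV. The function name `resolve_format` and the suffix table are assumptions for illustration only. The issue does not say which suffixes Doris would recognize, and this is not the Doris implementation.

```python
import os

# Hypothetical suffix table: the issue names Parquet and JSON as desired
# formats but does not specify which suffixes Doris would recognize.
KNOWN_SUFFIXES = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".json": "json",
}


def resolve_format(file_path, explicit_format=None):
    """Return the format name to use for one source file.

    explicit_format stands in for the value of the proposed
    `FORMAT AS format` clause, or None when the clause is absent.
    """
    # 1. An explicit FORMAT clause always takes precedence.
    if explicit_format is not None:
        return explicit_format.lower()
    # 2. Otherwise look at the file name's suffix (case-insensitive).
    suffix = os.path.splitext(file_path)[1].lower()
    # 3. Unknown or missing suffix: treat the file as CSV.
    return KNOWN_SUFFIXES.get(suffix, "csv")
```

With this sketch, a path ending in `.parquet` resolves to `parquet`, a path ending in `.txt` resolves to `csv`, and passing an explicit format overrides whatever the suffix suggests.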