milton4code opened a new issue, #20168: URL: https://github.com/apache/doris/issues/20168
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Description

Version: doris-1.2.4-1

While recently testing writing data from Spark into Doris, I ran into a problem: when a field in the DataFrame does not match the target table, the job keeps failing and the error message is not detailed enough.

### Use case

For example, the DataFrame contains a field `vmId`, while the corresponding column in the target table is `vm_id`. (A minimal workaround sketch, renaming the column before writing, is appended after the template below.)

### Related issues

```
23/05/29 16:19:28 INFO spark.DorisStreamLoad: Streamload Response:status: 200, resp msg: OK, resp content: { "TxnId": 3346, "Label": "spark_streamload_20230529_161928_a36161a3a95444df97959b613f2564e8", "TwoPhaseCommit": "false", "Status": "Fail", "Message": "[ANALYSIS_ERROR]errCode = 2, detailMessage = Column has no default value. column: vm_id", "NumberTotalRows": 0, "NumberLoadedRows": 0, "NumberFilteredRows": 0, "NumberUnselectedRows": 0, "LoadBytes": 0, "LoadTimeMs": 0, "BeginTxnTimeMs": 0, "StreamLoadPutTimeMs": 1, "ReadDataTimeMs": 0, "WriteDataTimeMs": 0, "CommitAndPublishTimeMs": 0}
23/05/29 16:19:28 INFO sql.DorisSourceProvider: Send request to Doris FE 'http://star01:18030/api/backends?is_alive=true' with user 'root'.
23/05/29 16:19:28 INFO sql.DorisSourceProvider: Backend Info:{"backends":[{"ip":"192.125.50.11","http_port":18040,"is_alive":true},{"ip":"192.125.50.10","http_port":18040,"is_alive":true},{"ip":"192.125.50.12","http_port":18040,"is_alive":true}]}
23/05/29 16:19:28 INFO spark.DorisStreamLoad: Streamload Response:status: 200, resp msg: OK, resp content: { "TxnId": 3347, "Label": "spark_streamload_20230529_161928_79e20740ed2f4904a6e831cb3bdb2770", "TwoPhaseCommit": "false", "Status": "Fail", "Message": "[ANALYSIS_ERROR]errCode = 2, detailMessage = Column has no default value. column: vm_id", "NumberTotalRows": 0, "NumberLoadedRows": 0, "NumberFilteredRows": 0, "NumberUnselectedRows": 0, "LoadBytes": 0, "LoadTimeMs": 0, "BeginTxnTimeMs": 1, "StreamLoadPutTimeMs": 1, "ReadDataTimeMs": 0, "WriteDataTimeMs": 0, "CommitAndPublishTimeMs": 0}
23/05/29 16:19:28 INFO sql.DorisSourceProvider: Send request to Doris FE 'http://star01:18030/api/backends?is_alive=true' with user 'root'.
23/05/29 16:19:28 INFO sql.DorisSourceProvider: Backend Info:{"backends":[{"ip":"192.125.50.11","http_port":18040,"is_alive":true},{"ip":"192.125.50.10","http_port":18040,"is_alive":true},{"ip":"192.125.50.12","http_port":18040,"is_alive":true}]}
23/05/29 16:19:29 WARN sql.DorisSourceProvider: Data that failed to load : 419450 f1d42d65-de03-4dd4-a6bd-31ded40f5ca8 2023-05-29 15:51:05.19
23/05/29 16:19:29 ERROR executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.IOException: Failed to load data on BE: http://192.125.50.10:18040/api/mydb/dwd_virtual_meter/_stream_load? node and exceeded the max retry times.
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1$1.apply$mcV$sp(DorisSourceProvider.scala:118)
    at scala.util.control.Breaks.breakable(Breaks.scala:38)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1(DorisSourceProvider.scala:92)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:83)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:68)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
23/05/29 16:19:29 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.io.IOException: Failed to load data on BE: http://192.125.50.10:18040/api/mydb/dwd_virtual_meter/_stream_load? node and exceeded the max retry times.
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1$1.apply$mcV$sp(DorisSourceProvider.scala:118)
    at scala.util.control.Breaks.breakable(Breaks.scala:38)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1(DorisSourceProvider.scala:92)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:83)
    at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:68)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
```

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
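A minimal workaround sketch, assuming the data is written through the spark-doris-connector DataFrame API: the connector maps DataFrame fields to table columns by name, so renaming the mismatched field before the write avoids the "Column has no default value" failure. The FE address, table identifier, and user are taken from the logs above; the source path and password are placeholders, and the option names follow the connector documentation (adjust them for the connector version in use).

```scala
import org.apache.spark.sql.SparkSession

object DorisWriteWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("doris-write-workaround").getOrCreate()

    // Placeholder source; any DataFrame whose schema contains `vmId` works here.
    val df = spark.read.parquet("/path/to/source")

    // Align DataFrame field names with the target table columns (vmId -> vm_id)
    // so the generated Stream Load request matches the table schema.
    val aligned = df.withColumnRenamed("vmId", "vm_id")
    aligned.printSchema() // compare against `DESC mydb.dwd_virtual_meter` on the Doris side

    aligned.write
      .format("doris")
      .option("doris.table.identifier", "mydb.dwd_virtual_meter") // from the stream load URL above
      .option("doris.fenodes", "star01:18030")                    // FE host:http_port from the logs
      .option("user", "root")
      .option("password", "******")                               // placeholder
      .save()

    spark.stop()
  }
}
```

Even with this workaround, the original request stands: the Stream Load response already contains the ANALYSIS_ERROR detail ("Column has no default value. column: vm_id"), so the connector could surface that message in the thrown exception instead of only reporting that it exceeded the max retry times.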