wang-zhiang opened a new issue, #4455: URL: https://github.com/apache/incubator-seatunnel/issues/4455
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

I want to import test data from MongoDB into HBase. I have already created the table in HBase, but the job fails during execution and the error message does not make the cause clear. I suspect this is a bug and would appreciate any guidance.

### SeaTunnel Version

2.1.2

### SeaTunnel Config

```conf
#!/bin/bash
env {
  execution.parallelism = 20
  spark.executor.cores = 1
  spark.executor.memory = "6g"
}

source {
  mongodb {
    readconfig.uri = "mongodb://smartpath:[email protected]:27017,192.168.5.102:27017,192.168.5.103:27017/admin"
    readconfig.database = "test2"
    readconfig.collection = ${sqlserver_table}
    readconfig.spark.mongodb.input.partitioner = "MongoPaginateBySizePartitioner"
    schema="{\"_id\": \"string\",\"name\": \"string\"}"
    result_table_name = "mongodb_result_table"
  }
}

transform {
}

sink {
  hbase {
    source_table_name = "mongodb_result_table"
    hbase.zookeeper.quorum = "hadoop104:2181,hadoop105:2181,hadoop106:2181,hadoop107:2181,hadoop108:2181,hadoop109:2181,hadoop110:2181"
    catalog ="{\"table\":{ \"namespace\":\"test1\", \"name\":\"test66\"},\"rowkey\":\"_id\",\"columns\":{\"_id\":{\"cf\":\"rowkey\", \"col\":\"_id\", \"type\":\"string\"},\"name\":{\"cf\":\"info\", \"col\":\"name\", \"type\":\"string\"}}}"
    staging_dir = "/hbase/test1/test66/"
    save_mode = "overwrite"
    hbase.bulkload.retries.number = "0"
  }
}
```

### Running Command

```shell
/opt/module/seatunnel-2.1.2/bin/start-seatunnel-spark.sh \
  --master spark://192.168.5.104:7077 \
  --deploy-mode client \
  --config /opt/module/seatunnel-2.1.2/script_spark/test/mongo-hbase-test.conf \
  --variable sqlserver_table="copy1"
```

### Error Exception

```log
2023-03-30 05:48:44,701 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on hadoop104:41444 in memory (size: 2.8 KB, free: 366.3 MB)
2023-03-30 05:48:44,724 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on 192.168.5.107:44381 in memory (size: 2.8 KB, free: 3.0 GB)
2023-03-30 05:48:44,770 WARN tool.LoadIncrementalHFiles: Attempt to bulk load region containing into table test1:test88 with files [family:info path:hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2] failed. This is recoverable and they will be retried.
2023-03-30 05:48:44,777 INFO tool.LoadIncrementalHFiles: Split occurred while grouping HFiles, retry attempt 1 with 1 files remaining to group or split
2023-03-30 05:48:44,778 INFO hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
2023-03-30 05:48:44,786 INFO tool.LoadIncrementalHFiles: Trying to load hfile=hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2 first=Optional[62e8df0cb7020000830054b2] last=Optional[ewfefw]
2023-03-30 05:48:44,801 WARN tool.LoadIncrementalHFiles: Attempt to bulk load region containing into table test1:test88 with files [family:info path:hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2] failed. This is recoverable and they will be retried.
2023-03-30 05:48:44,806 INFO tool.LoadIncrementalHFiles: Split occurred while grouping HFiles, retry attempt 2 with 1 files remaining to group or split
2023-03-30 05:48:44,835 ERROR tool.LoadIncrementalHFiles:
-------------------------------------------------
Bulk load aborted with some files not yet loaded:
-------------------------------------------------
  hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2
2023-03-30 05:48:44,836 INFO client.ConnectionImplementation: Closing master protocol: MasterService
2023-03-30 05:48:44,838 INFO zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x44f23927 to hadoop104:2181,hadoop105:2181,hadoop106:2181,hadoop107:2181,hadoop108:2181,hadoop109:2181,hadoop110:2181
2023-03-30 05:48:44,842 INFO zookeeper.ZooKeeper: Session: 0x70052749b6f004c closed
2023-03-30 05:48:44,842 INFO zookeeper.ClientCnxn: EventThread shut down
2023-03-30 05:48:44,955 ERROR base.Seatunnel: ===============================================================================
2023-03-30 05:48:44,956 ERROR base.Seatunnel: Fatal Error,
2023-03-30 05:48:44,956 ERROR base.Seatunnel: Please submit bug report in https://github.com/apache/incubator-seatunnel/issues
2023-03-30 05:48:44,956 ERROR base.Seatunnel: Reason:Execute Spark task error
2023-03-30 05:48:44,960 ERROR base.Seatunnel: Exception StackTrace:java.lang.RuntimeException: Execute Spark task error
    at org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:79)
    at org.apache.seatunnel.core.base.Seatunnel.run(Seatunnel.java:39)
    at org.apache.seatunnel.core.spark.SeatunnelSpark.main(SeatunnelSpark.java:32)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Retry attempted 2 times without completing, bailing out
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.performBulkLoad(LoadIncrementalHFiles.java:420)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:343)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:256)
    at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:132)
    at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:41)
    at org.apache.seatunnel.spark.SparkEnvironment.sinkProcess(SparkEnvironment.java:179)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:54)
    at org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:76)
    ... 14 more
2023-03-30 05:48:44,960 ERROR base.Seatunnel:
```

### Flink or Spark Version

Spark 2.4

### Java or Scala Version

1.8

### Screenshots

The import fails.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
