Bingz2 opened a new issue, #5042: URL: https://github.com/apache/seatunnel/issues/5042
### Search before asking - [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues. ### What happened There is only one degree of parallelism when synchronizing data from hive to clickhouse using Spark2 ### SeaTunnel Version 2.3.2 ### SeaTunnel Config ```conf env { # You can set SeaTunnel environment configuration here job.mode = "BATCH" job.name = "seatunnel" checkpoint.interval = 10000 spark.executor.instances = 2 spark.executor.cores = 10 spark.executor.memory = "20g" spark.driver.memory = "2g" spark.dynamicAllocation.enabled = false } source { Hive { table_name = "ads.ads_dvblive_user_personas_vertical_stat_dd" metastore_uri = "thrift://slave6.test.we:9083" result_table_name = "test" } } transform { Sql { source_table_name="test" query = "select user_id,label_id,label_code,label_name,label_value,day from test where day='2023-06-11' " result_table_name = "fake1" } } sink { Clickhouse{ host="10.5.13.23:8123" database="ads" table="ads_dvblive_user_personas_vertical_stat_dd" username="test" password="123456" bulk_size=50000 clickhouse.confg={"socket_timeout": "50000"} } } ``` ### Running Command ```shell sh bin/start-seatunnel-spark-2-connector-v2.sh -m yarn -e client -c config/hive2ck.conf ``` ### Error Exception ```log 23/07/07 17:50:34 INFO v2.DataSourceV2Strategy: Pushing operators to class org.apache.seatunnel.translation.spark.source.SeaTunnelSourceSupport Pushed Filters: Post-Scan Filters: Output: user_id#0, mac#1, reserve_column#2, label_id#3, label_code#4, label_name#5, label_value#6, raw_partner_code#7, day#8, partner_code#9, product_line#10 23/07/07 17:50:34 INFO codegen.CodeGenerator: Code generated in 334.942834 ms 23/07/07 17:50:34 INFO codegen.CodeGenerator: Code generated in 24.618955 ms 23/07/07 17:50:35 INFO v2.WriteToDataSourceV2Exec: Start processing data source writer: org.apache.seatunnel.translation.spark.sink.writer.SparkDataSourceWriter@318353. The input RDD has 1 partitions. 23/07/07 17:50:35 INFO spark.SparkContext: Starting job: save at SinkExecuteProcessor.java:123 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Got job 0 (save at SinkExecuteProcessor.java:123) with 1 output partitions 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (save at SinkExecuteProcessor.java:123) 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Parents of final stage: List() 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Missing parents: List() 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at save at SinkExecuteProcessor.java:123), which has no missing parents 23/07/07 17:50:35 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 55.2 KB, free 4.1 GB) 23/07/07 17:50:35 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 18.9 KB, free 4.1 GB) 23/07/07 17:50:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slave5.test.gitv.we:20280 (size: 18.9 KB, free: 4.1 GB) 23/07/07 17:50:35 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161 23/07/07 17:50:35 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at save at SinkExecuteProcessor.java:123) (first 15 tasks are for partitions Vector(0)) 23/07/07 17:50:35 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks 23/07/07 17:50:35 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all 23/07/07 17:50:35 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, slave6.test.gitv.we, executor 2, partition 0, PROCESS_LOCAL, 15260 bytes) 23/07/07 17:50:36 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slave6.test.gitv.we:36895 (size: 18.9 KB, free: 10.5 GB) ``` ### Flink or Spark Version Spark Version:2.4.0 ### Java or Scala Version 1.8 ### Screenshots   ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
