antony-do opened a new issue, #7536: URL: https://github.com/apache/seatunnel/issues/7536
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

### Description:

I am encountering an issue while trying to run SeaTunnel on a GCP Dataproc cluster with the following setup:

- Cluster Configuration:
  - GCP Dataproc cluster with Spark installed using YARN in cluster mode.
  - SeaTunnel plugins installed on every machine within the Dataproc cluster.
  - Airflow workers are located on separate machines, distinct from the Dataproc instances.
- Running SeaTunnel from any node of the cluster works fine, but I want to call SeaTunnel from the Airflow workers, which are separate machines, so I set things up as follows.

### Steps Taken:

1. Copied the Hadoop configuration files (yarn-site.xml, hdfs-site.xml, and core-site.xml) from the Dataproc cluster to the Airflow worker machines.
2. Updated these configuration files to point to the master node of the Dataproc cluster.
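A minimal sketch of how steps 1 and 2 might be sanity-checked on an Airflow worker, assuming the copied files live in a single directory that Spark reads through `HADOOP_CONF_DIR`. The function name `check_hadoop_conf` and the fallback path `/etc/hadoop/conf` are hypothetical and not taken from the setup described above.

```shell
#!/usr/bin/env bash
# Sketch: check that a client-side Hadoop config directory contains the three
# files copied from the Dataproc cluster. The function name and the default
# path used below are hypothetical.

check_hadoop_conf() {
  local conf_dir="$1"
  local missing=0
  local f
  for f in yarn-site.xml hdfs-site.xml core-site.xml; do
    if [ ! -f "${conf_dir}/${f}" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: ${conf_dir}/${f}" >&2
      missing=1
    fi
  done
  # Returns 0 when all three files are present, 1 otherwise.
  return "${missing}"
}

# Usage: fall back to a hypothetical directory when HADOOP_CONF_DIR is unset.
conf_dir="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
if check_hadoop_conf "${conf_dir}"; then
  export HADOOP_CONF_DIR="${conf_dir}"
  echo "Hadoop client config looks complete in ${conf_dir}"
else
  echo "Hadoop client config is incomplete in ${conf_dir}" >&2
fi
```

This only confirms that the files exist on the worker. It does not verify that their contents point to the Dataproc master node.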
### Problem:

When attempting to run SeaTunnel from the Airflow workers using the following command, I encounter an error:

### SeaTunnel Version

2.3.4

### SeaTunnel Config

```conf
env {
  parallelism = 4
  job.mode = "BATCH"
  spark.executor.cores = 6
}

source {
  Jdbc {
    url = "jdbc:mysql://mysql-host:3306/dbname"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "user_xxx"
    password = "pass_xxx"
    query = "SELECT *, DATE(created_time) as dt FROM TableX sh WHERE 1=1 AND `updated_time` >= STR_TO_DATE('2024-08-19_08:20:00.000','%Y-%m-%d_%H:%i:%S') AND `updated_time` < STR_TO_DATE('2024-08-19_08:25:00.000','%Y-%m-%d_%H:%i:%S') AND status IN (0,1,2,3,4,5,6);"
  }
}

transform {
}

sink {
  Hive {
    table_name = "table_xxx"
    metastore_uri = "thrift://metastore-host:9083"
  }
}
```

### Running Command

```shell
${SEATUNNEL_HOME}/bin/start-seatunnel-spark-3-connector-v2.sh --config test-seatunnel.conf --master yarn --deploy-mode cluster --name test-seatunnel
```

### Error Exception

```log
24/08/30 09:41:43 ERROR SeaTunnel: Fatal Error,
24/08/30 09:41:43 ERROR SeaTunnel: Please submit bug report in https://github.com/apache/seatunnel/issues
24/08/30 09:41:43 ERROR SeaTunnel: Reason:Run SeaTunnel on spark failed
24/08/30 09:41:43 ERROR SeaTunnel: Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: Run SeaTunnel on spark failed
	at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:62)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:757)
Caused by: org.apache.seatunnel.api.table.catalog.exception.CatalogException: ErrorCode:[API-03], ErrorDescription:[Catalog initialize failed] - Failed connecting to jdbc:mysql://mysql-host:3306/dbname via JDBC.
	at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.getConnection(AbstractJdbcCatalog.java:121)
	at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.open(AbstractJdbcCatalog.java:127)
	at org.apache.seatunnel.connectors.seatunnel.jdbc.utils.JdbcCatalogUtils.getTables(JdbcCatalogUtils.java:78)
	at org.apache.seatunnel.connectors.seatunnel.jdbc.source.JdbcSource.<init>(JdbcSource.java:57)
	at org.apache.seatunnel.connectors.seatunnel.jdbc.source.JdbcSourceFactory.lambda$createSource$0(JdbcSourceFactory.java:78)
	at org.apache.seatunnel.core.starter.execution.PluginUtil.createSource(PluginUtil.java:85)
	at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.initializePlugins(SourceExecuteProcessor.java:130)
	at org.apache.seatunnel.core.starter.spark.execution.SparkAbstractPluginExecuteProcessor.<init>(SparkAbstractPluginExecuteProcessor.java:50)
	at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.<init>(SourceExecuteProcessor.java:62)
	at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.<init>(SparkExecution.java:54)
	at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:59)
	... 7 more
Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://mysql-host:3306/dbname
	at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:702)
	at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:228)
	at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.getConnection(AbstractJdbcCatalog.java:117)
	... 17 more
```

### Zeta or Flink or Spark Version

_No response_

### Java or Scala Version

_No response_

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@seatunnel.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org