antony-do opened a new issue, #7536:
URL: https://github.com/apache/seatunnel/issues/7536

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
   
   ### Description:
   
   I am encountering an issue while trying to run SeaTunnel on a GCP Dataproc 
cluster with the following setup:
   
   - Cluster Configuration:
        • GCP Dataproc cluster with Spark installed using YARN in cluster mode.
        • SeaTunnel plugins installed on every machine within the Dataproc 
cluster.
        • Airflow workers are located on separate machines, distinct from the 
Dataproc instances.
   
   - Running SeaTunnel from any node of the cluster works fine. However, I want to invoke SeaTunnel from the Airflow workers, which are separate machines, so I set them up as follows.
   
   ### Steps Taken:
   
        1. Copied the Hadoop configuration files (yarn-site.xml, hdfs-site.xml, 
and core-site.xml) from the Dataproc cluster to the Airflow worker machines.
        2. Updated these configuration files to point to the master node of the 
Dataproc cluster.
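
The copy step above can be sanity-checked with a small script. This is a minimal sketch, not part of the original setup: `check_hadoop_conf` is a hypothetical helper name, and the directory in the usage comment is an assumed location for the copied files. It only confirms that the three files named in step 1 are present; it does not validate their contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a directory holds the three Hadoop client
# config files copied from the Dataproc cluster.
# Prints each missing file and returns non-zero if any is absent.
check_hadoop_conf() {
  local conf_dir="$1"
  local missing=0
  local f
  for f in yarn-site.xml hdfs-site.xml core-site.xml; do
    if [ ! -f "${conf_dir}/${f}" ]; then
      echo "missing: ${conf_dir}/${f}"
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage on an Airflow worker (the path is an assumption):
#   export HADOOP_CONF_DIR=/etc/hadoop/conf
#   export YARN_CONF_DIR="${HADOOP_CONF_DIR}"
#   check_hadoop_conf "${HADOOP_CONF_DIR}" && echo "Hadoop client config is complete"
```

`HADOOP_CONF_DIR` and `YARN_CONF_DIR` are the environment variables Spark reads to locate the YARN client configuration, so they should point at the directory that holds the copied files.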
   
   ### Problem:
   When I run SeaTunnel from the Airflow workers with the command listed under "Running Command" below, it fails with the error shown under "Error Exception".
   
   ### SeaTunnel Version
   
   2.3.4
   
   ### SeaTunnel Config
   
   ```conf
   env {  
     parallelism = 4
     job.mode = "BATCH"
     spark.executor.cores = 6
   }
   
   source {
     Jdbc {
       url = "jdbc:mysql://mysql-host:3306/dbname"
       driver = "com.mysql.cj.jdbc.Driver"
       connection_check_timeout_sec = 100
       user = "user_xxx"
       password = "pass_xxx"
       query = "SELECT *, DATE(created_time) as dt FROM TableX sh WHERE 1=1 AND `updated_time` >= STR_TO_DATE('2024-08-19_08:20:00.000','%Y-%m-%d_%H:%i:%S') AND `updated_time` < STR_TO_DATE('2024-08-19_08:25:00.000','%Y-%m-%d_%H:%i:%S') AND status IN (0,1,2,3,4,5,6);"
     }
   }
   
   transform {
   }
   
   sink {
     Hive {
       table_name = "table_xxx"
       metastore_uri = "thrift://metastore-host:9083"
     }
   }
   ```
   
   
   ### Running Command
   
   ```shell
   ${SEATUNNEL_HOME}/bin/start-seatunnel-spark-3-connector-v2.sh --config test-seatunnel.conf --master yarn --deploy-mode cluster --name test-seatunnel
   ```
   
   
   ### Error Exception
   
   ```log
   24/08/30 09:41:43 ERROR SeaTunnel: Fatal Error,
   
   24/08/30 09:41:43 ERROR SeaTunnel: Please submit bug report in https://github.com/apache/seatunnel/issues
   
   24/08/30 09:41:43 ERROR SeaTunnel: Reason:Run SeaTunnel on spark failed
   
   24/08/30 09:41:43 ERROR SeaTunnel: Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: Run SeaTunnel on spark failed
        at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:62)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:757)
   Caused by: org.apache.seatunnel.api.table.catalog.exception.CatalogException: ErrorCode:[API-03], ErrorDescription:[Catalog initialize failed] - Failed connecting to jdbc:mysql://mysql-host:3306/dbname via JDBC.
        at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.getConnection(AbstractJdbcCatalog.java:121)
        at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.open(AbstractJdbcCatalog.java:127)
        at org.apache.seatunnel.connectors.seatunnel.jdbc.utils.JdbcCatalogUtils.getTables(JdbcCatalogUtils.java:78)
        at org.apache.seatunnel.connectors.seatunnel.jdbc.source.JdbcSource.<init>(JdbcSource.java:57)
        at org.apache.seatunnel.connectors.seatunnel.jdbc.source.JdbcSourceFactory.lambda$createSource$0(JdbcSourceFactory.java:78)
        at org.apache.seatunnel.core.starter.execution.PluginUtil.createSource(PluginUtil.java:85)
        at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.initializePlugins(SourceExecuteProcessor.java:130)
        at org.apache.seatunnel.core.starter.spark.execution.SparkAbstractPluginExecuteProcessor.<init>(SparkAbstractPluginExecuteProcessor.java:50)
        at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.<init>(SourceExecuteProcessor.java:62)
        at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.<init>(SparkExecution.java:54)
        at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:59)
        ... 7 more
   Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://mysql-host:3306/dbname
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:702)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:228)
        at org.apache.seatunnel.connectors.seatunnel.jdbc.catalog.AbstractJdbcCatalog.getConnection(AbstractJdbcCatalog.java:117)
        ... 17 more
   ```
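
The innermost `Caused by` in the log above is `java.sql.SQLException: No suitable driver found`, which `DriverManager` raises when no registered JDBC driver accepts the URL, and the trace runs inside the YARN `ApplicationMaster`. One thing worth checking, then, is whether a MySQL connector jar exists on the submitting machine at all. The following is a minimal sketch under assumptions: `find_mysql_driver_jars` is a hypothetical helper, the jar name pattern is a guess at common MySQL connector file names, and the directories in the usage comment are assumed locations, not confirmed ones. Finding a jar locally does not prove that it is shipped to the YARN containers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list MySQL JDBC driver jars under the given directories.
# Prints one path per line; returns 1 when none are found, 2 when called
# without any directory argument.
find_mysql_driver_jars() {
  if [ "$#" -eq 0 ]; then
    return 2
  fi
  local found
  # Errors from unreadable or missing directories are ignored on purpose.
  found="$(find "$@" -type f -name 'mysql-connector*.jar' 2>/dev/null || true)"
  if [ -z "${found}" ]; then
    return 1
  fi
  printf '%s\n' "${found}"
}

# Example usage (directory names are assumptions about the local install):
#   find_mysql_driver_jars "${SEATUNNEL_HOME}/lib" "${SEATUNNEL_HOME}/plugins" \
#     || echo "no MySQL driver jar found on this machine"
```

Running the same check on an Airflow worker and on a Dataproc node, where the job succeeds, would show whether the two environments differ in this respect.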
   
   
   ### Zeta or Flink or Spark Version
   
   _No response_
   
   ### Java or Scala Version
   
   _No response_
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   

