zengyangjie opened a new issue, #9974:
URL: https://github.com/apache/hudi/issues/9974

   I use spark-shell to run Hudi, launched with the following configuration 
arguments:
   `spark-shell --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0 
--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf 
'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'
 --conf 
'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' 
--conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar'`
   In spark-shell, I import the following libraries and run a test case:
   `import org.apache.spark.sql.SaveMode._
   import org.apache.hudi.DataSourceWriteOptions._
   import org.apache.hudi.config.HoodieWriteConfig._

   spark.range(1).write.format("org.apache.hudi").
     option(TABLE_NAME, "hudi_tab01").
     option(PRECOMBINE_FIELD_OPT_KEY, "id").
     option(RECORDKEY_FIELD_OPT_KEY, "id").
     mode(Overwrite).
     save("/tmp/hudi_tab01")`
   Then the following error occurred:
   `warning: one deprecation; for details, enable :setting -deprecation or 
:replay -deprecation
   23/11/01 20:25:03 WARN HoodieSparkSqlWriter$: hoodie table at 
/tmp/hudi_tab01 already exists. Deleting existing data & overwriting with new 
data.
   23/11/01 20:25:03 WARN HoodieBackedTableMetadata: Metadata table was not 
found at path /tmp/hudi_tab01/.hoodie/metadata
   23/11/01 20:25:04 ERROR TorrentBroadcast: Store broadcast broadcast_1 fail, 
remove all pieces of the broadcast
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 
serialization failed: org.apache.spark.SparkException: Failed to register 
classes with Kryo
   org.apache.spark.SparkException: Failed to register classes with Kryo
           at 
org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$5(KryoSerializer.scala:183)
           at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
           at 
org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:233)
           at 
org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:171)`
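   To help narrow this down, the downloaded bundle jar can be inspected for the registrar class named in `spark.kryo.registrator`. A minimal sketch; the helper name `check_class` is my own, and the Ivy cache path shown is an assumption about where `--packages` stores the jar on your machine:

   ```shell
   # check_class JAR CLASSNAME: print "found" if CLASSNAME appears in the
   # jar's file listing, "missing" otherwise. (Helper name is illustrative.)
   check_class() {
     if unzip -l "$1" 2>/dev/null | grep -q "$2"; then
       echo found
     else
       echo missing
     fi
   }

   # Run against the bundle cached by --packages (path is an assumption;
   # adjust to where Ivy placed the jar on your machine).
   check_class ~/.ivy2/jars/org.apache.hudi_hudi-spark3.3-bundle_2.12-0.12.0.jar \
     HoodieSparkKryoRegistrar
   ```

   If the class is missing from the bundle, that would explain the "Failed to register classes with Kryo" error, since Spark fails hard when a configured registrator class cannot be loaded.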
   Note that I did not compile the Hudi source code with Maven; I only 
downloaded the Hudi bundle via `--packages` in spark-shell. 
   Is the error caused by a missing jar dependency? Is it related to 
HUDI_CONF_DIR or the hudi-defaults.conf file? If so, how should I configure it? 
Has anyone encountered this error and found a solution?
   I would greatly appreciate any help in resolving it!
   Thanks!

