sunke38 opened a new issue, #5765:
URL: https://github.com/apache/hudi/issues/5765
I use Spark SQL to insert records into Hudi. It works for a short time, but after a while it throws `java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()`.

Steps to reproduce the behavior:

I wrote a Scala function that builds and runs the insert SQL:
```
private def write2Table(row: Row)(implicit sparkSession: SparkSession): Unit = {
  // Render each field as a "value as name" fragment of the SELECT list.
  val fields = row.schema.fields.map { field =>
    if (row.getString(row.fieldIndex(field.name)).isEmpty) {
      s"null as ${field.name}"
    } else {
      field.dataType match {
        case StringType    => s"'${row.getAs[String](field.name)}' as ${field.name}"
        case BooleanType   => s"${row.getAs[Boolean](field.name)} as ${field.name}"
        case ByteType      => s"${row.getAs[Byte](field.name)} as ${field.name}"
        case ShortType     => s"${row.getAs[Short](field.name)} as ${field.name}"
        case IntegerType   => s"${row.getAs[Int](field.name)} as ${field.name}"
        case LongType      => s"${row.getAs[Long](field.name)} as ${field.name}"
        case FloatType     => s"${row.getAs[Float](field.name)} as ${field.name}"
        case DoubleType    => s"${row.getAs[Double](field.name)} as ${field.name}"
        case DateType      => s"'${row.getAs[String](field.name)}' as ${field.name}"
        case TimestampType => s"'${row.getAs[String](field.name)}' as ${field.name}"
      }
    }
  }.mkString(",")

  val insertSql =
    s"""insert into ${row.getAs("database")}.${row.getAs("table")}_cow select $fields;"""
  try {
    println(s"""inserting into ${row.getAs("table")}_cow;""")
    sparkSession.sql(insertSql)
  } catch {
    case ex: Throwable =>
      println(row.prettyJson)
      println(insertSql)
      throw ex
  }
}
```
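One hazard in the SELECT list built above: string values are wrapped in single quotes without any escaping, so a value containing a quote breaks the generated SQL (or injects into it). A minimal sketch of a literal-escaping helper (the name `sqlStringLiteral` is hypothetical, not part of the reported code):

```scala
// Hypothetical helper: render a value as a SQL string literal,
// doubling embedded single quotes per ANSI SQL.
def sqlStringLiteral(value: String): String =
  "'" + value.replace("'", "''") + "'"

// It could replace the quoting in the StringType/DateType/TimestampType cases, e.g.:
// case StringType => s"${sqlStringLiteral(row.getAs[String](field.name))} as ${field.name}"
```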
Then I call it in foreachRDD() of a DStream:
```
saveRdd.foreachRDD { rdd =>
  rdd.collect().foreach { x =>
    val row = x._1
    chackAndCreateTable(row)
    if (x._2.equals("INSERT")) {
      write2Table(row)
    }
  }
}
```
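Note that this loop collects every row to the driver and issues one `INSERT ... SELECT` per row, so each row pays a full Hudi commit. A hedged sketch of batching the per-row SELECT fragments into a single statement per target table (`batchInsertSql` is hypothetical, assuming fragments like those produced by `write2Table`):

```scala
// Hypothetical sketch: combine per-row "select ... as col" fragments into one
// INSERT per target table via UNION ALL, so each micro-batch commits once
// per table instead of once per row.
def batchInsertSql(targetTable: String, selectFragments: Seq[String]): String =
  selectFragments.mkString(s"insert into $targetTable ", " union all ", "")
```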
**Expected behavior**

Inserts should keep succeeding; the `NoSuchMethodError` should not occur.
**Environment Description**

* Hudi version : 0.11
* Spark version : 3.2.1
* Hadoop version : 3.2.2
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
Here is my configuration code:
```
.appName("SparkHudi")
.master("spark://hadoop203:7077")
.config("spark.sql.warehouse.dir", "/user/hive/warehouse")
// note: the key must be "spark.serializer"; the original had the typo "spark.serialize"
.config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
.config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
.config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
.config("spark.sql.legacy.exponentLiteralAsDecimal.enabled", true)
.enableHiveSupport()
.config("hive.metastore.uris", "thrift://10.10.9.203:9083")
.getOrCreate()
```
spark-submit:
```
spark-submit \
  --jars /home/kadm/module/hudi-0.11/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.12-0.11.0.jar \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1,org.apache.spark:spark-avro_2.12:3.2.1,org.apache.kafka:kafka-clients:3.1.0 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
  --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
  --conf "spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5445" \
  --master spark://hadoop203:7077 \
  SparkHudi-1.0-SNAPSHOT-shaded.jar
```
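A `NoSuchMethodError` like the one below usually means two incompatible Hadoop/HBase builds are mixed on the classpath, so it can help to check which jar a class is actually loaded from. A hypothetical diagnostic snippet (not part of the reported job) that can be run on the driver or in `spark-shell`:

```scala
// Hypothetical diagnostic: report which jar (if any) a class is loaded from.
// e.g. locate("org.apache.hadoop.hdfs.client.HdfsDataInputStream")
def locate(className: String): String =
  try {
    val cls = Class.forName(className)
    Option(cls.getProtectionDomain.getCodeSource)
      .map(_.getLocation.toString)              // path of the jar or class dir
      .getOrElse("<bootstrap or unknown source>") // JDK classes have no code source
  } catch {
    case _: ClassNotFoundException => "<not on classpath>"
  }
```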
**Stacktrace**
```
22/06/06 09:47:13 ERROR Javalin: Exception occurred while servicing http-request
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
	at org.apache.hudi.io.storage.HoodieHFileReader.close(HoodieHFileReader.java:218)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.closeReader(HoodieBackedTableMetadata.java:574)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:567)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:554)
	at org.apache.hudi.metadata.HoodieMetadataFileSystemView.close(HoodieMetadataFileSystemView.java:83)
	at org.apache.hudi.common.table.view.FileSystemViewManager.clearFileSystemView(FileSystemViewManager.java:86)
	at org.apache.hudi.timeline.service.handlers.FileSliceHandler.refreshTable(FileSliceHandler.java:118)
	at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$19(RequestHandler.java:390)
	at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
	at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
	at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
	at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
	at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
	at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
	at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)
	at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
	at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
	at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
	at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
	at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
	at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
	at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
	at org.apache.hudi.org.apache.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)
	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)
	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)
	at java.lang.Thread.run(Thread.java:748)
```