[
https://issues.apache.org/jira/browse/HUDI-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-3096:
---------------------------------
Labels: pull-request-available (was: )
> Fix the bug that a COW table (containing DecimalType) written by Flink cannot
> be read by Spark
> ----------------------------------------------------------------------------------------------
>
> Key: HUDI-3096
> URL: https://issues.apache.org/jira/browse/HUDI-3096
> Project: Apache Hudi
> Issue Type: Bug
> Components: Flink Integration
> Affects Versions: 0.10.0
> Environment: flink 1.13.1
> spark 3.1.1
> Reporter: Tao Meng
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.11.0
>
>
> Currently, Flink writes DecimalType values as byte[] (Parquet BINARY).
> When Spark reads such a decimal column and finds that its precision is
> small, Spark expects it to be stored as an int/long, which causes the
> following error:
>
> Caused by: org.apache.spark.sql.execution.QueryExecutionException: Parquet
> column cannot be converted in file
> hdfs://xxxxx/tmp/hudi/hudi_xxxxx/46d44c57-aa43-41e2-a8aa-76dcc9dac7e4_0-4-0_20211221201230.parquet.
> Column: [c7], Expected: decimal(10,4), Found: BINARY
> at
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:179)
> at
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
> at
> org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:517)
> at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown
> Source)
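The mismatch above comes from the Parquet DECIMAL spec: small-precision decimals fit in INT32/INT64, while Flink wrote BINARY. A minimal sketch of the precision-to-physical-type mapping Spark's vectorized reader expects (the helper name is hypothetical, for illustration only):

```python
def expected_physical_type(precision: int) -> str:
    """Parquet physical type expected for DECIMAL(precision, scale).

    Per the Parquet format spec: precision <= 9 fits in INT32,
    precision <= 18 fits in INT64, larger precisions need a byte array.
    """
    if precision <= 9:
        return "INT32"
    if precision <= 18:
        return "INT64"
    return "FIXED_LEN_BYTE_ARRAY"

# For the column in the error, decimal(10,4): Spark expects an integer
# encoding, but the file written by Flink contains BINARY.
print(expected_physical_type(10))  # INT64
```

Because the reader derives the expected physical type from the precision alone, a file that stores a small-precision decimal as BINARY fails the type check at scan time.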
--
This message was sent by Atlassian Jira
(v8.20.1#820001)