tao meng created HUDI-2958:
------------------------------

             Summary: Automatically set spark.sql.parquet.writeLegacyFormat when using bulk insert to write data that contains DecimalType columns
                 Key: HUDI-2958
                 URL: https://issues.apache.org/jira/browse/HUDI-2958
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Spark Integration
            Reporter: tao meng
             Fix For: 0.11.0
By default, Spark's ParquetWriteSupport writes DecimalType values to Parquet as int32/int64 when the decimal's precision is no greater than Decimal.MAX_LONG_DIGITS (18). However, AvroParquetReader, which HoodieParquetReader uses, cannot read int32/int64 values back as DecimalType. This leads to the following error:

Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
	at org.apache.parquet.column.Dictionary.decodeToBinary(Dictionary.java:41)
	at org.apache.parquet.avro.AvroConverters$BinaryConverter.setDictionary(AvroConverters.java:75)
	......
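Until Hudi sets this automatically, a minimal workaround sketch (the app name, table name, path, and record key field below are illustrative, not from this issue):

{code:scala}
import org.apache.spark.sql.{SaveMode, SparkSession}

// Force Spark to write decimals as FIXED_LEN_BYTE_ARRAY instead of
// int32/int64 so that AvroParquetReader can read them back as DecimalType.
val spark = SparkSession.builder()
  .appName("hudi-bulk-insert-decimal")
  .config("spark.sql.parquet.writeLegacyFormat", "true")
  .getOrCreate()

val df = spark.sql("select 1 as id, cast(1.23 as decimal(10, 2)) as price")

df.write.format("hudi")
  .option("hoodie.datasource.write.operation", "bulk_insert")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.table.name", "decimal_tbl")
  .mode(SaveMode.Overwrite)
  .save("/tmp/hudi/decimal_tbl")
{code}

The improvement proposed here would let the bulk insert path detect such decimals in the write schema and set the config itself. A hypothetical check (name is illustrative; top-level fields only, nested types omitted for brevity):

{code:scala}
import org.apache.spark.sql.types.{Decimal, DecimalType, StructType}

// True when any field is a decimal that Spark would otherwise write
// as int32/int64 (precision <= Decimal.MAX_LONG_DIGITS, i.e. 18).
def needsLegacyFormat(schema: StructType): Boolean =
  schema.fields.exists { f =>
    f.dataType match {
      case d: DecimalType => d.precision <= Decimal.MAX_LONG_DIGITS
      case _ => false
    }
  }
{code}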