tao meng created HUDI-2958:
------------------------------

             Summary: Automatically set spark.sql.parquet.writeLegacyFormat 
when using bulk insert to write data that contains DecimalType
                 Key: HUDI-2958
                 URL: https://issues.apache.org/jira/browse/HUDI-2958
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Spark Integration
            Reporter: tao meng
             Fix For: 0.11.0


By default, ParquetWriteSupport writes DecimalType to Parquet as int32/int64 
when the precision of the DecimalType does not exceed 
Decimal.MAX_LONG_DIGITS (18). However, the AvroParquetReader used by 
HoodieParquetReader cannot read int32/int64 values back as DecimalType, 
which leads to the following error:

Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
    at org.apache.parquet.column.Dictionary.decodeToBinary(Dictionary.java:41)
    at org.apache.parquet.avro.AvroConverters$BinaryConverter.setDictionary(AvroConverters.java:75)
    ......
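
Until Hudi sets this automatically in the bulk insert path, a manual 
workaround is to enable the legacy Parquet decimal encoding before writing. 
A minimal sketch in Scala, assuming a SparkSession named spark and a 
DataFrame df with a decimal column; the table name, key fields, and path 
below are illustrative:

    // Force the legacy Parquet decimal layout (fixed-length binary) so that
    // AvroParquetReader can decode the values back into DecimalType.
    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

    // Bulk insert into a Hudi table; all names below are placeholders.
    df.write.format("hudi")
      .option("hoodie.table.name", "decimal_tbl")
      .option("hoodie.datasource.write.operation", "bulk_insert")
      .option("hoodie.datasource.write.recordkey.field", "id")
      .option("hoodie.datasource.write.precombine.field", "ts")
      .mode("append")
      .save("/tmp/decimal_tbl")

With the flag set, decimals are written as fixed-length byte arrays 
regardless of precision, matching what the Avro-based reader expects.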



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
