Quanlong Huang created IMPALA-13990:
---------------------------------------
Summary: Allow padding/truncating decimal values from Parquet/ORC files
Key: IMPALA-13990
URL: https://issues.apache.org/jira/browse/IMPALA-13990
Project: IMPALA
Issue Type: New Feature
Components: Backend
Reporter: Quanlong Huang

When a decimal column has a different precision or scale in the table schema than in the file schema, Impala refuses to read the file to avoid losing precision.

For Parquet files, the error is
{code:java}
File 'hdfs://localhost:20500/test-warehouse/parq_tbl/000000_0' column 'd38_18' has a precision that does not match the table metadata precision. File metadata precision: 38, table metadata precision: 22.
{code}
For ORC files, the error is
{code:java}
Type mismatch: table column DECIMAL(22,6) is map to column decimal(38,18) in ORC file 'hdfs://localhost:20500/test-warehouse/tbl/000000_0'{code}
Hive supports this scenario:
{code:sql}
create external table tbl (d22_6 decimal(22,6), d38_18 decimal(38,18)) stored as orc;
insert into tbl select pi(), pi();
select * from tbl;
+------------+-----------------------+
| tbl.d22_6  | tbl.d38_18            |
+------------+-----------------------+
| 3.141593   | 3.141592653589793000  |
+------------+-----------------------+

-- create a new table pointing to the above location, using decimal(22,6) for the second column
create external table tbl2 (d22_6 decimal(22,6), d38_18 decimal(22,6)) stored as orc location '/test-warehouse/tbl';
select * from tbl2;
+-------------+--------------+
| tbl2.d22_6  | tbl2.d38_18  |
+-------------+--------------+
| 3.141593    | 3.141593     |
+-------------+--------------+{code}
Although precision is lost, it is still helpful to show the truncated values. We can add a query option that allows this behavior, for consistency with Hive.
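
For illustration only, the conversion being requested can be sketched in Python with the standard decimal module. This is not Impala or Hive code: the helper name rescale_decimal is hypothetical, and both the HALF_UP rounding (chosen because it reproduces the 3.141593 value in the Hive output above) and the choice to return None (standing in for NULL) when the value does not fit the target precision are assumptions, not confirmed behavior of either engine.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext
from typing import Optional


def rescale_decimal(value: Decimal, precision: int, scale: int) -> Optional[Decimal]:
    """Hypothetical helper: fit `value` into DECIMAL(precision, scale).

    Pads with trailing zeros when the target scale is larger, and rounds
    (assumed HALF_UP) when it is smaller. Returns None, standing in for
    NULL, if the result needs more than `precision` digits (assumption).
    """
    with localcontext() as ctx:
        ctx.prec = 80  # working precision large enough for DECIMAL(38, s) inputs
        quantum = Decimal(1).scaleb(-scale)  # scale=6 gives Decimal('1E-6')
        rescaled = value.quantize(quantum, rounding=ROUND_HALF_UP)
    # After quantize the exponent is exactly -scale, so the digit tuple is
    # the unscaled integer; it must fit in `precision` digits.
    if len(rescaled.as_tuple().digits) > precision:
        return None
    return rescaled


# File value stored as decimal(38,18), table column declared as decimal(22,6).
print(rescale_decimal(Decimal("3.141592653589793000"), 22, 6))
# File value stored as decimal(22,6), table column declared as decimal(38,18).
print(rescale_decimal(Decimal("3.141593"), 38, 18))
```

The first call covers the lossy direction shown in the tbl2 query, and the second covers the padding direction, which loses no information. How overflow should be reported (NULL, error, or warning) is left open by this issue.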