Stamatis Zampetakis created HIVE-26877:
------------------------------------------

             Summary: Parquet CTAS with JOIN on decimals with different precision/scale fails
                 Key: HIVE-26877
                 URL: https://issues.apache.org/jira/browse/HIVE-26877
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 4.0.0-alpha-2
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis
         Attachments: ctas_parquet_join.q

Creating a Parquet table with the CREATE TABLE AS SELECT (CTAS) syntax leads to a runtime error when the SELECT statement joins decimal columns with different precision/scale.

Steps to reproduce:
{code:sql}
CREATE TABLE table_a (col_dec decimal(5,0));
CREATE TABLE table_b (col_dec decimal(38,10));

INSERT INTO table_a VALUES (1);
INSERT INTO table_b VALUES (1.0000000000);

SET hive.default.fileformat=parquet;

CREATE TABLE target AS
SELECT table_a.col_dec
FROM table_a
LEFT OUTER JOIN table_b
  ON table_a.col_dec = table_b.col_dec;
{code}

Stacktrace:

{noformat}
2022-12-20T07:02:52,237  INFO [2dfbd95a-7553-467b-b9d0-629100785502 Listener at 0.0.0.0/46609] reexec.ReExecuteLostAMQueryPlugin: Got exception message: Vertex failed, vertexName=Reducer 2, vertexId=vertex_1671548565336_0001_3_02, diagnostics=[Task failed, taskId=task_1671548565336_0001_3_02_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1671548565336_0001_3_02_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: Fixed Binary size 16 does not match field type length 3
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: Fixed Binary size 16 does not match field type length 3
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:379)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:310)
        ... 15 more
Caused by: java.lang.IllegalArgumentException: Fixed Binary size 16 does not match field type length 3
        at org.apache.parquet.column.values.plain.FixedLenByteArrayPlainValuesWriter.writeBytes(FixedLenByteArrayPlainValuesWriter.java:56)
        at org.apache.parquet.column.impl.ColumnWriterBase.write(ColumnWriterBase.java:174)
        at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.addBinary(MessageColumnIO.java:476)
        at org.apache.parquet.io.RecordConsumerLoggingWrapper.addBinary(RecordConsumerLoggingWrapper.java:116)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$DecimalDataWriter.write(DataWritableWriter.java:571)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$GroupDataWriter.write(DataWritableWriter.java:228)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$MessageDataWriter.write(DataWritableWriter.java:251)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:115)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:76)
        at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:35)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
        at org.apache.parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:182)
        at org.apache.parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:44)
        at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:161)
        at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:174)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1160)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:889)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:921)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:1013)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:419)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:382)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:372)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:584)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.close(CommonMergeJoinOperator.java:502)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:356)
        ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1671548565336_0001_3_02_000000_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: Fixed Binary size 16 does not match field type length 3
        ... (stack trace identical to TaskAttempt 0, omitted)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1671548565336_0001_3_02 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 retryPossible: false
{noformat}
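
For context, the two numbers in the error message line up with the fixed-length byte arrays Parquet uses for decimals: a decimal(5,0) value fits in a 3-byte FIXED_LEN_BYTE_ARRAY, while a decimal(38,10) value needs 16 bytes, so the writer appears to receive a value widened to the join's common decimal type while the target column was sized for the narrower one. The sketch below of the length calculation is illustrative only (it is not the Hive/Parquet source; the class and method names are invented):

{code:java}
import java.math.BigInteger;

// Illustrative sketch (not Hive/Parquet code): how many bytes a
// FIXED_LEN_BYTE_ARRAY needs to hold any unscaled decimal value of a
// given precision, using a two's-complement representation.
public class DecimalFixedLenSketch {

  // Smallest n such that a signed n-byte integer can represent 10^precision - 1.
  static int bytesForPrecision(int precision) {
    BigInteger maxUnscaled = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
    int n = 1;
    while (BigInteger.ONE.shiftLeft(8 * n - 1).subtract(BigInteger.ONE)
        .compareTo(maxUnscaled) < 0) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    System.out.println("decimal(5,0)   -> " + bytesForPrecision(5) + " bytes");  // prints 3
    System.out.println("decimal(38,10) -> " + bytesForPrecision(38) + " bytes"); // prints 16
  }
}
{code}

With those lengths, a 16-byte value being handed to a column whose Parquet schema was fixed at 3 bytes matches the IllegalArgumentException thrown by FixedLenByteArrayPlainValuesWriter above.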

The problem is reproducible on master (7c343471aa68d7d06209694d9d6b181bd58e0793) by running the attached ctas_parquet_join.q:

{noformat}
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=ctas_parquet_join.q -Dtest.output.overwrite
{noformat}



