[ https://issues.apache.org/jira/browse/HIVE-22877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mustafa Iman updated HIVE-22877:
--------------------------------
    Description:
During vectorization, decimal fields that are obtained via generic UDFs are cast to Decimal64 in some circumstances. For the decimal to Decimal64 cast, Hive compares the source column's `scale + precision` to 18 (the maximum number of digits that can be represented by a long). A decimal can fit in a long as long as its `precision` is smaller than or equal to 18. Scale is irrelevant.

Since the vectorized generic UDF expression takes scale into account, it computes the wrong output column vector: Decimal instead of Decimal64. This in turn causes a ClassCastException down the operator chain.

The query below fails with a ClassCastException:
{code:java}
create table mini_store
(
    s_store_sk int,
    s_store_id string
)
row format delimited fields terminated by '\t'
STORED AS ORC;

create table mini_sales
(
    ss_store_sk int,
    ss_quantity int,
    ss_sales_price decimal(7,2)
)
row format delimited fields terminated by '\t'
STORED AS ORC;

insert into mini_store values (1, 'store');
insert into mini_sales values (1, 2, 1.2);

select s_store_id, coalesce(ss_sales_price*ss_quantity,0) sumsales
from mini_sales, mini_store where ss_store_sk = s_store_sk
{code}

  was:
During vectorization, decimal fields that are obtained via generic UDFs are cast to Decimal64 in some circumstances. For the decimal to Decimal64 cast, Hive compares the source column's `scale + precision` to 18 (the maximum number of digits that can be represented by a long). A decimal can fit in a long as long as its `precision` is smaller than or equal to 18. Precision is irrelevant.

Since the vectorized generic UDF expression takes precision into account, it computes the wrong output column vector: Decimal instead of Decimal64. This in turn causes a ClassCastException down the operator chain.

The query below fails with a ClassCastException:
{code:java}
create table mini_store
(
    s_store_sk int,
    s_store_id string
)
row format delimited fields terminated by '\t'
STORED AS ORC;

create table mini_sales
(
    ss_store_sk int,
    ss_quantity int,
    ss_sales_price decimal(7,2)
)
row format delimited fields terminated by '\t'
STORED AS ORC;

insert into mini_store values (1, 'store');
insert into mini_sales values (1, 2, 1.2);

select s_store_id, coalesce(ss_sales_price*ss_quantity,0) sumsales
from mini_sales, mini_store where ss_store_sk = s_store_sk
{code}


> Fix decimal boundary check for casting to Decimal64
> ---------------------------------------------------
>
>                 Key: HIVE-22877
>                 URL: https://issues.apache.org/jira/browse/HIVE-22877
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>   Affects Versions: 4.0.0
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22877.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During vectorization, decimal fields that are obtained via generic UDFs are
> cast to Decimal64 in some circumstances. For the decimal to Decimal64 cast,
> Hive compares the source column's `scale + precision` to 18 (the maximum
> number of digits that can be represented by a long). A decimal can fit in a
> long as long as its `precision` is smaller than or equal to 18. Scale is
> irrelevant.
> Since the vectorized generic UDF expression takes scale into account, it
> computes the wrong output column vector: Decimal instead of Decimal64. This
> in turn causes a ClassCastException down the operator chain.
> The query below fails with a ClassCastException:
>
> {code:java}
> create table mini_store
> (
>     s_store_sk int,
>     s_store_id string
> )
> row format delimited fields terminated by '\t'
> STORED AS ORC;
> create table mini_sales
> (
>     ss_store_sk int,
>     ss_quantity int,
>     ss_sales_price decimal(7,2)
> )
> row format delimited fields terminated by '\t'
> STORED AS ORC;
> insert into mini_store values (1, 'store');
> insert into mini_sales values (1, 2, 1.2);
> select s_store_id, coalesce(ss_sales_price*ss_quantity,0) sumsales
> from mini_sales, mini_store where ss_store_sk = s_store_sk
> {code}
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
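
The boundary check described in the issue can be illustrated with a small standalone sketch. It is not Hive's code: the class name, method names and the constant below are hypothetical, and decimal(18,2) is picked only as an example of a type whose precision is within 18 while `scale + precision` is not. It contrasts the sum-based comparison with a precision-only comparison, and uses java.math.BigDecimal to show that the unscaled digits of such a value do fit in a long.

```java
import java.math.BigDecimal;

// Illustrative sketch only; class and method names are hypothetical, not Hive's API.
class Decimal64Check {
    // Every integer with at most 18 decimal digits fits in a signed 64-bit long.
    static final int MAX_LONG_DIGITS = 18;

    // The comparison the issue describes as wrong: scale is added to precision.
    static boolean fitsSumCheck(int precision, int scale) {
        return precision + scale <= MAX_LONG_DIGITS;
    }

    // The comparison the issue argues for: only precision is considered.
    static boolean fitsPrecisionCheck(int precision) {
        return precision <= MAX_LONG_DIGITS;
    }

    public static void main(String[] args) {
        // Largest decimal(18,2) value; the type is chosen purely for illustration.
        BigDecimal value = new BigDecimal("9999999999999999.99");
        int precision = value.precision();
        int scale = value.scale();

        System.out.println("sum check:       " + fitsSumCheck(precision, scale));
        System.out.println("precision check: " + fitsPrecisionCheck(precision));

        // longValueExact() would throw ArithmeticException if the unscaled
        // digits did not fit in a long; for this value it succeeds.
        long unscaled = value.unscaledValue().longValueExact();
        System.out.println("unscaled long:   " + unscaled);
    }
}
```

With the sum-based comparison, a decimal(18,2) column is rejected even though its digits fit in a long, which matches the mismatch between Decimal and Decimal64 that the description reports.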