[ https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357718#comment-16357718 ]
Vihang Karajgaonkar commented on HIVE-18421:
--------------------------------------------

I can enable this config by default, but that would trigger a q.out update of over 300 q files. Also, there were concerns raised by [~mmccline] and [~gopalv] above regarding performance. I didn't investigate the performance overhead of the fix.

The issue is that there is no well-defined policy within Hive on how to handle overflows. So it is arguable whether users would want this config enabled by default, since Hive itself doesn't handle overflows well overall. The config does not handle overflows in the "right" way. It only makes vectorized execution overflow handling similar to non-vectorized handling. That was the reason I disabled the config by default.

I can investigate the overhead and turn it on by default if there is no significant overhead. Any thoughts [~gopalv] [~aihuaxu] on this?

> Vectorized execution handles overflows in a different manner than non-vectorized execution
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18421
>                 URL: https://issues.apache.org/jira/browse/HIVE-18421
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-18421.01.patch, HIVE-18421.02.patch, HIVE-18421.03.patch, HIVE-18421.04.patch, HIVE-18421.05.patch, HIVE-18421.06.patch, HIVE-18421.07.patch
>
>
> In vectorized execution, arithmetic operations which cause integer overflows can give wrong results. The issue is reproducible in both ORC and Parquet.
> Simple test case to reproduce this issue
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by diff desc;
> +-------+-----+-------+
> | t1    | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off the same query produces only one row.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
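
The arithmetic behind the test case above can be sketched in plain Java. This is only an illustration, not Hive code: the class and method names are hypothetical, and it assumes (not confirmed in this thread) that the vectorized path evaluates tinyint subtraction in a wider long and applies the filter before narrowing, while the non-vectorized path wraps the result to tinyint first.

```java
// Hypothetical sketch, not Hive code: models the two assumed evaluation orders
// for the query "select ... (t1-t2) as diff ... where (t1-t2) < 50".
class TinyintOverflowSketch {

    // Assumed non-vectorized behaviour: the difference is narrowed to a
    // tinyint (Java byte) before the filter sees it, so it wraps around.
    static byte subtractAsTinyint(byte a, byte b) {
        return (byte) (a - b);
    }

    // Assumed vectorized behaviour: the difference is kept in a long,
    // so the filter sees the unwrapped value.
    static long subtractAsLong(byte a, byte b) {
        return (long) a - (long) b;
    }

    public static void main(String[] args) {
        // Same rows as the test case in the issue description.
        byte[][] rows = {{-104, 25}, {-112, 24}, {54, 9}};
        for (byte[] r : rows) {
            byte narrow = subtractAsTinyint(r[0], r[1]);
            long wide = subtractAsLong(r[0], r[1]);
            System.out.printf(
                "t1=%d t2=%d wrapped=%d passes(<50)=%b | wide=%d passes(<50)=%b shown as %d%n",
                r[0], r[1], narrow, narrow < 50, wide, wide < 50, (byte) wide);
        }
    }
}
```

Under these assumptions the wrapped values 127 and 120 fail the `< 50` filter, leaving one row, while the wide values -129 and -136 pass it and are then displayed as 127 and 120, which matches the three-row output reported above.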