[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jitendra Nath Pandey updated HIVE-6511:
---------------------------------------
    Description:
select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20
generates the following result when vectorization is enabled:
{code}
4619756289662.078125 -1628520834 -16770 126
1553532646710.316406 -1245514442 -2762 54
3367942487288.360352 688127224 -776 -8
4386447830839.337891 1286221623 12087 55
-3234165331139.458008 -54957251 27453 61
-488378613475.326172 1247658269 -16099 29
-493942492598.691406 -21253559 -19895 73
3101852523586.039062 886135874 23618 66
2544105595941.381836 1484956709 -23515 37
-3997512403067.0625 1102149509 30597 -123
-1183754978977.589355 1655994718 31070 94
1408783849655.676758 34576568 -26440 -72
-2993175106993.426758 417098319 27215 79
3004723551798.100586 -1753555402 -8650 54
1103792083527.786133 -14511544 -28088 72
469767055288.485352 1615620024 26552 -72
-1263700791098.294434 -980406074 12486 -58
-4244889766496.484375 -1462078048 30112 -96
-3962729491139.782715 1525323068 -27332 60
NULL NULL NULL NULL
{code}
When vectorization is disabled, the result looks like this:
{code}
4619756289662.078125 -1628520834 -16770 126
1553532646710.316406 -1245514442 -2762 54
3367942487288.360352 688127224 -776 -8
4386447830839.337891 1286221623 12087 55
-3234165331139.458008 -54957251 27453 61
-488378613475.326172 1247658269 -16099 29
-493942492598.691406 -21253558 -19894 74
3101852523586.039062 886135874 23618 66
2544105595941.381836 1484956709 -23515 37
-3997512403067.0625 1102149509 30597 -123
-1183754978977.589355 1655994719 31071 95
1408783849655.676758 34576567 -26441 -73
-2993175106993.426758 417098319 27215 79
3004723551798.100586 -1753555402 -8650 54
1103792083527.786133 -14511545 -28089 71
469767055288.485352 1615620024 26552 -72
-1263700791098.294434 -980406074 12486 -58
-4244889766496.484375 -1462078048 30112 -96
-3962729491139.782715 1525323069 -27331 61
NULL NULL NULL NULL
{code}
This issue is visible only for certain decimal values. In the example above, rows 7, 11, 12, 15, and 19 produce different results.

vectortab10korc table schema:
{code}
t tinyint from deserializer
si smallint from deserializer
i int from deserializer
b bigint from deserializer
f float from deserializer
d double from deserializer
dc decimal(38,18) from deserializer
bo boolean from deserializer
s string from deserializer
s2 string from deserializer
ts timestamp from deserializer

# Detailed Table Information
Database: default
Owner: xyz
CreateTime: Tue Feb 25 21:54:28 UTC 2014
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type: MANAGED_TABLE
Table Parameters:
    COLUMN_STATS_ACCURATE true
    numFiles 1
    numRows 10000
    rawDataSize 0
    totalSize 344748
    transient_lastDdlTime 1393365281

# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat

Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
    serialization.format 1
Time taken: 0.196 seconds, Fetched: 41 row(s)
{code}
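
The report does not name a cause. One pattern that can be checked against the two listings: every row that differs has a fractional part greater than 0.5, and each vectorized value matches the non-vectorized one after the whole-number part is moved one step away from zero. The sketch below reproduces both columns under the assumption (not confirmed by the report) that one path truncates toward zero and the other rounds half up before the value is narrowed to 32, 16, and 8 bits. The helper names `wrap_signed` and `cast_decimal` are hypothetical and are not Hive functions.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def wrap_signed(value, bits):
    """Keep the low `bits` bits of value and read them as two's complement."""
    low = value & ((1 << bits) - 1)
    return low - (1 << bits) if low >= 1 << (bits - 1) else low


def cast_decimal(text, rounding):
    """Return (int, smallint, tinyint) for a decimal literal.

    `rounding` is the assumed way the fraction is discarded:
    ROUND_DOWN truncates toward zero, ROUND_HALF_UP rounds to nearest.
    """
    whole = int(Decimal(text).quantize(Decimal(1), rounding=rounding))
    return wrap_signed(whole, 32), wrap_signed(whole, 16), wrap_signed(whole, 8)


if __name__ == "__main__":
    for literal in ("-493942492598.691406", "1408783849655.676758"):
        print(literal)
        print("  truncate:", cast_decimal(literal, ROUND_DOWN))
        print("  round:   ", cast_decimal(literal, ROUND_HALF_UP))
```

For row 7 and row 12 the truncating variant gives the values from the non-vectorized listing and the rounding variant gives the values from the vectorized listing. This only shows that the data is consistent with a rounding difference; it does not show where in Hive such a difference would arise.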
> casting from decimal to tinyint, smallint, int and bigint generates different
> result when vectorization is on
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6511
>                 URL: https://issues.apache.org/jira/browse/HIVE-6511
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)