[ https://issues.apache.org/jira/browse/HIVE-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mohit Sabharwal updated HIVE-9109:
----------------------------------
    Description:
Hash function differences between Java 7 and Java 8 lead to result order differences. While we have been able to fix a good number of these by converting hash maps to insertion-ordered hash maps, there are several cases where doing so is either not possible (because the changes originate in external APIs) or the change leads to even more out file differences.

For example:

(1) In TestCliDriver.testCliDriver_varchar_udf1, for the following query:
{code}
select
  str_to_map('a:1,b:2,c:3',',',':'),
  str_to_map(cast('a:1,b:2,c:3' as varchar(20)),',',':')
from varchar_udf_1 limit 1;
{code}
the {{StandardMapObjectInspector}} used in {{LazySimpleSerDe}} to serialize the final output uses a {{HashMap}}. Changing it to {{LinkedHashMap}} will lead to several other q-test output differences.

(2) In TestCliDriver.testCliDriver_parquet_map_null, data with a {{map}} column is read from an Avro table into a Parquet table. The Avro API, specifically {{GenericData.Record}}, uses {{HashMap}} and returns data in a different order.

This patch adds support for a hint called {{JAVA_VERSION_SPECIFIC_OUTPUT}}, which may be added to a q-test only if different outputs are expected for different Java versions. For example:
Under Java 7, the test output file has the ".java1.7.out" extension.
Under Java 8, the test output file has the ".java1.8.out" extension.

If the hint is not added, we continue to generate a single ".out" file for the test.

  was:
Hash function differences between Java 7 and Java 8 lead to result order differences. While we have been able to fix a good number by converting hash maps to insert order hash maps, there are several cases where doing so is either not possible (because changes originate in external APIs) or change leads to even more out file differences.
For example:

(1) In TestCliDriver.testCliDriver_varchar_udf1, for the following query:
{code}
select
  str_to_map('a:1,b:2,c:3',',',':'),
  str_to_map(cast('a:1,b:2,c:3' as varchar(20)),',',':')
from varchar_udf_1 limit 1;
{code}
the {{StandardMapObjectInspector}} used in {{LazySimpleSerDe}} to serialize the final output uses a {{HashMap}}. Changing it to {{LinkedHashMap}} will lead to several other q-test output differences.

(2) In TestCliDriver.testCliDriver_parquet_map_null, data with {{map}} column is read from an Avro table into a Parquet table. Avro API, specifically {{GenericData.Record}} uses {{HashMap}} and returns data in different order.

This patch adds support for a hint called {{JAVA_VERSION_SPECIFIC_OUTPUT}} which may be added to a q-test, only if different outputs are expected for Java 7 and Java 8.
Under Java 7, test output file has (original) ".out" extension.
Under Java 8, test output file has ".java8.out" extension.

If hint is not added, we continue to generate a single ".out" file for the test.


> Add support for Java 8 specific q-test out files
> ------------------------------------------------
>
>                 Key: HIVE-9109
>                 URL: https://issues.apache.org/jira/browse/HIVE-9109
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Testing Infrastructure
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>         Attachments: HIVE-9109.1.patch, HIVE-9109.patch
>
>
> Hash function differences between Java 7 and Java 8 lead to result order
> differences. While we have been able to fix a good number by converting hash
> maps to insert order hash maps, there are several cases where doing so is
> either not possible (because changes originate in external APIs) or change
> leads to even more out file differences.
> For example:
> (1) In TestCliDriver.testCliDriver_varchar_udf1, for the following query:
> {code}
> select
>   str_to_map('a:1,b:2,c:3',',',':'),
>   str_to_map(cast('a:1,b:2,c:3' as varchar(20)),',',':')
> from varchar_udf_1 limit 1;
> {code}
> the {{StandardMapObjectInspector}} used in {{LazySimpleSerDe}} to serialize
> the final output uses a {{HashMap}}. Changing it to {{LinkedHashMap}} will
> lead to several other q-test output differences.
> (2) In TestCliDriver.testCliDriver_parquet_map_null, data with {{map}} column
> is read from an Avro table into a Parquet table. Avro API, specifically
> {{GenericData.Record}} uses {{HashMap}} and returns data in different order.
> This patch adds support for a hint called
> {{JAVA_VERSION_SPECIFIC_OUTPUT}}, which may be added to a q-test only if
> different outputs are expected for different Java versions.
> For example:
> Under Java 7, test output file has ".java1.7.out" extension.
> Under Java 8, test output file has ".java1.8.out" extension.
> If hint is not added, we continue to generate a single ".out" file for the
> test.
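
The naming rule described above can be sketched roughly as follows. This is only an illustration, not the code in the attached patch: the class {{QTestOutFileName}} and the method {{outFileName}} are hypothetical names, and it assumes that the extension is appended to the q-file name and that the version string comes from the {{java.specification.version}} system property (which is "1.7" on Java 7 and "1.8" on Java 8).

```java
public class QTestOutFileName {

    /**
     * Hypothetical helper (not from the HIVE-9109 patch): picks the expected
     * output file name for a q-test.
     *
     * @param qFileName       q-file name, e.g. "varchar_udf1.q"
     * @param hasVersionHint  true if the q-file carries the
     *                        JAVA_VERSION_SPECIFIC_OUTPUT hint
     * @param javaSpecVersion Java version string, e.g. "1.7" or "1.8"
     */
    static String outFileName(String qFileName, boolean hasVersionHint,
                              String javaSpecVersion) {
        if (!hasVersionHint) {
            // No hint: a single ".out" file is shared by all Java versions.
            return qFileName + ".out";
        }
        // Hint present: one out file per Java version.
        return qFileName + ".java" + javaSpecVersion + ".out";
    }

    public static void main(String[] args) {
        // Assumption: the running JVM's spec version selects the file.
        String current = System.getProperty("java.specification.version");
        System.out.println(outFileName("varchar_udf1.q", true, current));

        // Fixed inputs, to show the three cases from the description.
        System.out.println(outFileName("varchar_udf1.q", true, "1.7"));  // varchar_udf1.q.java1.7.out
        System.out.println(outFileName("varchar_udf1.q", true, "1.8"));  // varchar_udf1.q.java1.8.out
        System.out.println(outFileName("varchar_udf1.q", false, "1.8")); // varchar_udf1.q.out
    }
}
```

Tests without the hint are unaffected under this scheme, so only the q-tests whose output genuinely depends on {{HashMap}} iteration order need two out files.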