[ 
https://issues.apache.org/jira/browse/HIVE-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-9109:
----------------------------------
    Description: 
Hash function differences between Java 7 and Java 8 lead to differences in 
result order. While we have been able to fix a good number of these by 
converting hash maps to insertion-ordered hash maps, there are several cases 
where doing so is either not possible (because the differences originate in 
external APIs) or leads to even more .out file differences.

For example:

(1) In TestCliDriver.testCliDriver_varchar_udf1, for the following query:
{code}
select
  str_to_map('a:1,b:2,c:3',',',':'),
  str_to_map(cast('a:1,b:2,c:3' as varchar(20)),',',':')
from varchar_udf_1 limit 1;
{code}
the {{StandardMapObjectInspector}} used in {{LazySimpleSerDe}} to serialize the 
final output uses a {{HashMap}}. Changing it to {{LinkedHashMap}} will lead to 
several other q-test output differences.

(2) In TestCliDriver.testCliDriver_parquet_map_null, data with a {{map}} column 
is read from an Avro table into a Parquet table. The Avro API, specifically 
{{GenericData.Record}}, uses {{HashMap}} and returns data in a different order.
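
Both cases come down to which map implementation is iterated. As a minimal illustration (the class and method names below are hypothetical and are not taken from Hive or Avro), a {{HashMap}} makes no guarantee about iteration order, so its order may differ between Java versions, while a {{LinkedHashMap}} always iterates in insertion order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Hive or Avro.
public class MapOrderSketch {

    // Inserts the keys in a fixed order (c, a, b) and returns the same map.
    static Map<String, String> fill(Map<String, String> map) {
        map.put("c", "3");
        map.put("a", "1");
        map.put("b", "2");
        return map;
    }

    // Returns the keys in the map's iteration order.
    static List<String> keysOf(Map<String, String> map) {
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        // Iteration order is unspecified and may vary across Java versions.
        System.out.println("HashMap:       "
            + keysOf(fill(new HashMap<String, String>())));
        // Iteration order is always the insertion order: c, a, b.
        System.out.println("LinkedHashMap: "
            + keysOf(fill(new LinkedHashMap<String, String>())));
    }
}
```

Only the {{LinkedHashMap}} line is stable enough to compare against a recorded output file; the {{HashMap}} line depends on the hashing details of the running JVM.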

This patch adds support for a hint called {{JAVA_VERSION_SPECIFIC_OUTPUT}}, 
which may be added to a q-test only if different outputs are expected for 
different Java versions.

For example:
  Under Java 7, the test output file has a ".java1.7.out" extension.
  Under Java 8, the test output file has a ".java1.8.out" extension.

If the hint is not added, we continue to generate a single ".out" file for the test.
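
The naming rule above could be sketched roughly as follows. This is not the code from the patch: the class name, the method, the way the hint is detected (a plain substring match on the q-file text), and the sample file name are all assumptions made for illustration. It relies on the {{java.specification.version}} system property, which is "1.7" on Java 7 and "1.8" on Java 8.

```java
// Hypothetical sketch; names and the hint-detection rule are assumptions,
// not the actual HIVE-9109 implementation.
public class QTestOutFileSketch {

    static final String HINT = "JAVA_VERSION_SPECIFIC_OUTPUT";

    // Returns the expected-output file name for a q-test.
    // With the hint present, the Java version is embedded in the extension;
    // without it, a single ".out" file is used for all Java versions.
    static String outFileName(String qFileName, String qFileContents,
                              String javaVersion) {
        if (qFileContents.contains(HINT)) {
            return qFileName + ".java" + javaVersion + ".out";
        }
        return qFileName + ".out";
    }

    public static void main(String[] args) {
        // "1.7" on Java 7 and "1.8" on Java 8.
        String javaVersion = System.getProperty("java.specification.version");

        // Assumed form of the hint: a comment line inside the q-file.
        String withHint = "-- JAVA_VERSION_SPECIFIC_OUTPUT\nselect 1;";
        String withoutHint = "select 1;";

        System.out.println(outFileName("varchar_udf1.q", withHint, javaVersion));
        System.out.println(outFileName("varchar_udf1.q", withoutHint, javaVersion));
    }
}
```

Keeping the unhinted path unchanged means existing tests keep their single ".out" file, and only tests that opt in need one output file per Java version.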

  was:
Hash function differences between Java 7 and Java 8 lead to differences in 
result order. While we have been able to fix a good number of these by 
converting hash maps to insertion-ordered hash maps, there are several cases 
where doing so is either not possible (because the differences originate in 
external APIs) or leads to even more .out file differences.

For example:

(1) In TestCliDriver.testCliDriver_varchar_udf1, for the following query:
{code}
select
  str_to_map('a:1,b:2,c:3',',',':'),
  str_to_map(cast('a:1,b:2,c:3' as varchar(20)),',',':')
from varchar_udf_1 limit 1;
{code}
the {{StandardMapObjectInspector}} used in {{LazySimpleSerDe}} to serialize the 
final output uses a {{HashMap}}. Changing it to {{LinkedHashMap}} will lead to 
several other q-test output differences.

(2) In TestCliDriver.testCliDriver_parquet_map_null, data with a {{map}} column 
is read from an Avro table into a Parquet table. The Avro API, specifically 
{{GenericData.Record}}, uses {{HashMap}} and returns data in a different order.

This patch adds support for a hint called {{JAVA_VERSION_SPECIFIC_OUTPUT}}, 
which may be added to a q-test only if different outputs are expected for 
Java 7 and Java 8.

Under Java 7, the test output file has the (original) ".out" extension.

Under Java 8, the test output file has a ".java8.out" extension.

If the hint is not added, we continue to generate a single ".out" file for the test.


> Add support for Java 8 specific q-test out files
> ------------------------------------------------
>
>                 Key: HIVE-9109
>                 URL: https://issues.apache.org/jira/browse/HIVE-9109
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Testing Infrastructure
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>         Attachments: HIVE-9109.1.patch, HIVE-9109.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
