[ https://issues.apache.org/jira/browse/HIVE-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163919#comment-13163919 ]
John Sichi commented on HIVE-1040:
----------------------------------
It's kind of bad that we have binary data in .q.out files. HIVE-2482 would be
the correct way to fix that (using a UDF to display the data as hex).
> use sed rather than diff for masking out noise in diff-based tests
> ------------------------------------------------------------------
>
> Key: HIVE-1040
> URL: https://issues.apache.org/jira/browse/HIVE-1040
> Project: Hive
> Issue Type: Improvement
> Components: Testing Infrastructure
> Affects Versions: 0.4.1
> Reporter: John Sichi
> Assignee: Marek Sapota
> Priority: Minor
> Attachments: HIVE-1040-code-patch.patch, HIVE-1040.1.patch,
> HIVE-1040.2.patch, HIVE-1040.D597.1.patch, HIVE-1040.D597.2.patch
>
>
> The current diff -I approach has two problems: (1) it does not allow
> resolution finer than line-level, so it's impossible to mask out pattern
> occurrences within a line, and (2) it produces unmasked files, so if you run
> diff on the command line to compare the result .q.out with the checked-in
> file, you see the noise.
> My suggestion is to first run sed to replace noise patterns with an
> unlikely-to-occur string like ZYZZYZVA, and then diff the pre-masked files
> without using any -I.
> This would require a one-time hit to update all existing .q.out files so that
> they would contain the pre-masked results.
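
For illustration, here is a minimal sketch of the mask-then-diff idea from the
description above, written in Python rather than sed so that it is
self-contained. The noise patterns, function names, and sample strings are
assumptions made up for this example; they are not taken from any of the
attached patches.

```python
import difflib
import re

# Placeholder suggested in the issue description.
MASK = "ZYZZYZVA"

# Hypothetical noise patterns; a real list would depend on what the
# test output actually contains.
NOISE_PATTERNS = [
    re.compile(r"file:/\S+"),   # local file paths that vary between runs
    re.compile(r"\b\d{10}\b"),  # 10-digit epoch timestamps
]


def mask_noise(text):
    """Replace every noise match with MASK, including matches inside a line."""
    for pattern in NOISE_PATTERNS:
        text = pattern.sub(MASK, text)
    return text


def diff_against_golden(golden_masked, actual_raw):
    """Mask the fresh output, then do a plain diff with no ignore rules."""
    actual_masked = mask_noise(actual_raw)
    return list(
        difflib.unified_diff(
            golden_masked.splitlines(),
            actual_masked.splitlines(),
            "expected",
            "actual",
            lineterm="",
        )
    )
```

Because the checked-in file is stored already masked, an ordinary diff of the
two masked files shows only real differences, and a change elsewhere on a line
that also contains noise is still reported.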