[ https://issues.apache.org/jira/browse/HIVE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790809#comment-13790809 ]
Prasanth J commented on HIVE-5502:
----------------------------------

Hi [~brocknoland], it seems the test case failure is not related to the size of the TestFileDump.testDump.orc file.

The TestFileDump unit test file contains two test cases, testDump() and testDictionaryThreshold(). Both create an ORC file with the same name (look for the testFilePath variable initialization in openFileSystem()). This should be fixed so that each test case writes to its own file, named after the test case function. I think you saw two different file sizes because in the passing run the file held the output of testDictionaryThreshold(), whereas in the failing run it held the output of testDump().

However, the size of TestFileDump.testDump.orc is not really important for these test cases. It's the contents of the orc-file-dump.out file that matter. A diff of the generated orc-file-dump.out against the golden file shows that the first stripe is expected to have 5000 rows but got only 4000 rows. This is the reason for the test case failure.

I faced similar non-determinism when running the test case from Eclipse versus from the console. From the console I always get the correct result, but from Eclipse it fails every time with the same issue (4000 rows vs. 5000 rows). The golden file in this case might have been generated by running "ant test -Dtestcase=TestFileDump". Since you are now testing with maven, there might be some difference between ANT_OPTS and MAVEN_OPTS. That's my guess.
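To illustrate the per-test file naming suggested above, here is a minimal sketch. It is not the actual TestFileDump code: the class PerTestOrcPath and the method forTest are hypothetical names, and the working directory is assumed to be the JVM temp directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hive: derives a distinct ORC file path
// for each test method so two test cases never write to the same file.
class PerTestOrcPath {

    // Builds <workDir>/<testClass>.<testMethod>.orc
    static Path forTest(Path workDir, String testClass, String testMethod) {
        return workDir.resolve(testClass + "." + testMethod + ".orc");
    }

    public static void main(String[] args) {
        // Assumption: the temp directory stands in for the test's work dir.
        Path workDir = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(forTest(workDir, "TestFileDump", "testDump"));
        System.out.println(forTest(workDir, "TestFileDump", "testDictionaryThreshold"));
    }
}
```

In a JUnit 4 test, the method name could be taken from the org.junit.rules.TestName rule (getMethodName()) instead of being passed in by hand.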

Moving forward, there are two ways this can be fixed:
1) Implement a deterministic memory manager for the ORC test cases that doesn't depend on the available memory
2) Overwrite the golden file when we move to maven

> ORC TestFileDump is flaky
> -------------------------
>
> Key: HIVE-5502
> URL: https://issues.apache.org/jira/browse/HIVE-5502
> Project: Hive
> Issue Type: Bug
> Reporter: Brock Noland
> Priority: Minor
> Attachments: TestFileDump.tar.gz
>
>
> I found in my maven work that TestFileDump is non-deterministic. For example
> sometimes the output ORC file is much larger
> {noformat}
> pass:
> -rwxrwxrwx 1 brock brock 290055 Oct 9 12:02 TestFileDump.testDump.orc
> fail:
> -rwxrwxrwx 1 brock brock 1938634 Oct 9 12:08 TestFileDump.testDump.orc
> {noformat}