[ 
https://issues.apache.org/jira/browse/HIVE-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701367#comment-13701367
 ] 

Edward Capriolo commented on HIVE-4804:
---------------------------------------

{quote}
What further worries me is that we were generating wrong results (the order by 
clause was being ignored before this fix). Are we now guaranteed never to give 
wrong results to end users?
{quote}

This is a new feature which is not on by default. When we decide to enable it 
by default (in a follow-on issue) we should get a good idea of how it works 
across the board.

I think because the input file is sorted it could be easy for this test to give 
a false positive. Possibly this is what happened with the initial patch. 
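
To illustrate the false-positive concern, here is a minimal sketch in plain Java. It is not Hive code and not the actual parallel_orderby.q test; the class and method names are hypothetical stand-ins. It only shows that a "is the output sorted?" check cannot tell a working order by from an ignored one when the input is already sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedInputFalsePositive {

    // Hypothetical stand-in for a query whose ORDER BY is silently ignored:
    // rows come back in whatever order they were read.
    static List<Integer> brokenOrderBy(List<Integer> rows) {
        return new ArrayList<>(rows);
    }

    // Hypothetical stand-in for a correct ORDER BY.
    static List<Integer> correctOrderBy(List<Integer> rows) {
        List<Integer> out = new ArrayList<>(rows);
        Collections.sort(out);
        return out;
    }

    // The check a test effectively makes: is the output non-decreasing?
    static boolean isSorted(List<Integer> rows) {
        for (int i = 1; i < rows.size(); i++) {
            if (rows.get(i - 1) > rows.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> sortedInput = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        List<Integer> unsortedInput = Arrays.asList(5, 3, 8, 1, 7, 2, 6, 4);

        // Sorted input: the broken implementation still passes the check.
        System.out.println("broken, sorted input:    "
                + isSorted(brokenOrderBy(sortedInput)));
        // Unsorted input: the broken implementation is exposed.
        System.out.println("broken, unsorted input:  "
                + isSorted(brokenOrderBy(unsortedInput)));
        // The correct implementation passes either way.
        System.out.println("correct, unsorted input: "
                + isSorted(correctOrderBy(unsortedInput)));
    }
}
```

If that is what happened here, running the test against unsorted input would be the way to tell the two cases apart.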

I cannot speak to the unpredictability of the build server. Since it takes my 
machine about 12 hours to run the tests, one passing test run is all I can do.

This is a feature Hive desperately needs. Probably the most important feature! 
Order by is a huge bottleneck in many of our processing pipelines. If I had a 
nickel for every single-reducer order by job I have seen since I started using 
Hive, I could start my own big data software company without any VC.

Adding a feature enhancement off by default and then turning it on later after 
we get familiar with it is something we have done quite often.

                
> parallel_orderby.q in trunk fails consistently
> ----------------------------------------------
>
>                 Key: HIVE-4804
>                 URL: https://issues.apache.org/jira/browse/HIVE-4804
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4804.D11571.1.patch
>
>
> {noformat}
> java.lang.RuntimeException: Error in configuring object
>       at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>       at org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:481)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:416)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>       at org.apache.hadoop.mapred.Child.main(Child.java:260)
> Caused by: java.lang.reflect.InvocationTargetException
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:616)
>       at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>       ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
>       at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:91)
>       at org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
>       ... 15 more
> Caused by: java.io.IOException: Split points are out of order
>       at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:78)
>       ... 16 more
> {noformat}
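
For readers unfamiliar with the "Split points are out of order" error in the trace above, here is a rough sketch of the idea behind total-order partitioning. It is a standalone illustration on int keys, not Hadoop's TotalOrderPartitioner source; the class and method names are hypothetical. The assumption is that the partitioner needs strictly increasing boundaries so it can binary-search each key into a reducer, and refuses to start otherwise.

```java
import java.util.Arrays;

public class SplitPointSketch {

    // Reject boundaries that are not strictly increasing. This mirrors the
    // kind of check implied by the error message; the real check works on
    // serialized keys with a configurable comparator.
    static void validate(int[] splitPoints) {
        for (int i = 0; i + 1 < splitPoints.length; i++) {
            if (splitPoints[i] >= splitPoints[i + 1]) {
                throw new IllegalArgumentException(
                        "Split points are out of order");
            }
        }
    }

    // n split points define n + 1 partitions. A key equal to split point i
    // goes to partition i + 1; otherwise it goes to its insertion point.
    static int partitionFor(int key, int[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        int[] good = {10, 20, 30};
        validate(good);
        for (int key : new int[] {5, 10, 15, 30, 35}) {
            System.out.println(key + " -> partition "
                    + partitionFor(key, good));
        }

        int[] bad = {10, 30, 20};
        try {
            validate(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With valid boundaries each reducer gets a contiguous key range, so concatenating the reducer outputs in partition order gives a globally sorted result. With out-of-order boundaries the binary search is meaningless, which is why failing fast is the safer behavior.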
