[jira] [Commented] (HIVE-2361) Add some UDFs which help to migrate Oracle to Hive

neelesh gadhia (JIRA) Wed, 13 Feb 2013 23:52:20 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578207#comment-13578207
 ]


neelesh gadhia commented on HIVE-2361:
--------------------------------------

New to using UDF's on hive. But implemented these on hive 0.9.0 and hadoop 
1.1.1. Was able to add jar to hive and kicked the sample sql
SELECT t.empno, t.deptno, t.sal, nexr_sum(hash(t.deptno),t.sal) as sal_sum
FROM (
select a.empno, a.deptno, a.sal from emp a
distribute by hash(a.deptno)
sort BY a.deptno, a.empno
) t;

The sql failed with errors 


2013-02-13 23:30:18,925 INFO org.apache.hadoop.mapred.JobTracker: Adding task 
(REDUCE) 'attempt_201302132324_0002_r_000000_3' to tip 
task_201302132324_0002_r_000000, for tracker 
'tracker_192.168.0.151:localhost/127.0.0.1:50099'
2013-02-13 23:30:18,925 INFO org.apache.hadoop.mapred.JobTracker: Removing task 
'attempt_201302132324_0002_r_000000_2'
2013-02-13 23:30:26,484 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_201302132324_0002_r_000000_3: java.lang.RuntimeException: Error in 
configuring object
        at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
        at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
        at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:486)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
        ... 9 more
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
        at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:157)
        ... 14 more
Caused by: java.lang.NullPointerException
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:137)
        at 
org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
        at 
org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
        at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
        at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
        at 
org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
        at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
        ... 14 more

2013-02-13 23:30:29,819 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201302132324_0002_r_000000 has failed 4 times.
2013-02-13 23:30:29,820 INFO org.apache.hadoop.mapred.JobInProgress: 
TaskTracker at '192.168.0.151' turned 'flaky'
.... 12 more lines..


Tried different function "GenericUDFMax".. same error.

Any pointers/advise, what could be wrong?
                
>  Add some UDFs which help to migrate Oracle to Hive
> ---------------------------------------------------
>
>                 Key: HIVE-2361
>                 URL: https://issues.apache.org/jira/browse/HIVE-2361
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 0.8.0
>            Reporter: JunHo Cho
>            Assignee: JunHo Cho
>            Priority: Minor
>              Labels: features
>         Attachments: nexr-udf.tar
>
>
> Here some UDFs which can be matched to oracle functions:
> There are two kinds of oracle functions. one is scalar function and another 
> is analytic function.
> Most scalar functions in Oracle can be converted to hive's udf directly.  
> Oracle Scalar Function
> GenericUDFDecode : Compares first argument to each other value one by one. 
> e.g., DECODE(x,0,'zero',1,'one') will return 'zero' if x is 0
> GenericUDFGreatest : Return the greatest of the list of one or more 
> expressions. e.g., GREATEST(2,5,12,3) will return 12
> GenericUDFInstr : Return the location of a substring in a string. e.g., 
> INSTR('next', 'e') will return 2
> GenericUDFLnnvl : Evaluate a condition when one or both operands of the 
> condition may be null. e.g., LNNVL(2 > 4) will return true
> GenericUDFNVL : Replace null with a string in the results of a query. e.g., 
> NVL(null,'hive') will return hive
> GenericUDFNVL2 : Determine the value returned by a query based on whether a 
> specified expression is null or not null. e.g., NVL2(null,'not null','null 
> value') will return 'null value'
> GenericUDFToNumber : Convert a string to a number. e.g., 
> TO_NUMBER('112','999') will return 112
> GenericUDFTrunc : Returns a date truncated to a specific unit of measure. 
> e.g., TRUNC('2002-11-02 01:01:01','YYYY') will return '2002-01-01 00:00:00'
> Oracle Analytic Function
> Most analytic functions in Oracle can't be converted to hive's query and udf 
> directly.
> Following udfs should be used with DISTRIBUTED, SORT BY and HASH of hive to 
> support analytic functions 
> e.q., SELECT _FUNC_(hash(col1), col2, ...) FROM SELECT ~ FROM table 
> DISTRIBUTED BY hash(col1) SORT BY col1, col2 ...
> GenericUDFSum : Calculate a cumulative sum.
> GenericUDFRank : Assign a sequential order, or rank within some group based 
> on key.
> GenericUDFDenseRank : Act like RANK function except that it assigns 
> consecutive ranks.
> GenericUDFRowNumber : Return sequence integer value within some group based 
> on key.
> GenericUDFMax : Determine the highest value within some group based on key.
> GenericUDFMin : Determine the lowest value within some group based on key.
> GenericUDFLag : Access data from a previous row.
> This udfs was developed with hive-pdk

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-2361) Add some UDFs which help to migrate Oracle to Hive

Reply via email to