You should try/catch and return NULL on bad data. The issue is that if
you have a single bad row, the UDF throws an exception up the chain. The
failed task gets retried, it fails again, and ultimately the whole job fails.
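
Something along these lines should do it (a rough sketch based on your
posted code; evaluate() is the method name Hive's UDF bridge looks up via
reflection, and the package/class names are only placeholders):

    package com.practice.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;

    public class IPtoINT extends UDF {
        // Returning null maps to SQL NULL in Hive instead of killing the task.
        public LongWritable evaluate(Text addr) {
            if (addr == null) {
                return null;
            }
            String[] parts = addr.toString().trim().split("\\.");
            if (parts.length != 4) {
                return null; // not a dotted-quad address, treat as bad data
            }
            long num = 0;
            try {
                for (int i = 0; i < parts.length; i++) {
                    num = num * 256 + (Long.parseLong(parts[i].trim()) % 256);
                }
            } catch (NumberFormatException e) {
                return null; // bad octet: whitespace, empty field, non-numeric text
            }
            return new LongWritable(num);
        }
    }

That way a malformed IP just becomes a NULL in the result set and the rest
of the rows keep flowing.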

On Wed, May 30, 2012 at 10:40 AM, praveenesh kumar <praveen...@gmail.com> wrote:
> Hello Hive Users,
>
> There is a strange situation I am facing.
>
> I have a string column in my Hive table (it's an IP address). I am creating a
> UDF that takes this string column and converts it into a long value.
> It's a simple UDF. Following is my code:
>
> package com.practice.hive.udf;
>
> public class IPtoINT extends UDF {
>     public static LongWritable execute(Text addr) {
>         String[] addrArray = addr.toString().split("\\.");
>         long num = 0;
>         for (int i = 0; i < addrArray.length; i++) {
>             int power = 3 - i;
>             num += (Integer.parseInt(addrArray[i]) % 256) * Math.pow(256, power);
>         }
>         return new LongWritable(num);
>     }
> }
>
> After creating the jar, I am running the following commands:
>
> $ hive
> hive > add jar /home/hadoop/Desktop/HiveData/IPtoINT.jar;
> hive > create temporary function ip2int as 'com.practice.hive.udf.IPtoINT';
> hive > select ip2int(ip1) from sample_data;
>
> But running the above gives me the following error:
>
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>         ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.musigma.hive.udf.ip2int.evaluate(org.apache.hadoop.io.Text) on object com.musigma.hive.udf.ip2int@19a4d79 of class com.musigma.hive.udf.ip2int with arguments {1.0.144.36:org.apache.hadoop.io.Text} of size 1
>         at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:848)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:181)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
>         ... 9 more
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:824)
>         ... 18 more
> Caused by: java.lang.NumberFormatException: For input string: "1"
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>         at java.lang.Long.parseLong(Long.java:441)
>         at java.lang.Long.<init>(Long.java:702)
>         at com.musigma.hive.udf.ip2int.evaluate(ip2int.java:11)
>         ... 23 more
>
>
> If I run the Hive UDF on a literal --- select ip2int("102.134.123.1") from
> sample_data; --- it does not give any error.
> The strange thing is that it's a NumberFormatException: I am not able to
> parse the string into an int in my Java code. Very strange issue.
>
> Can someone please tell me what stupid mistake I am making?
>
> Is there any other UDF that does string to int/long conversion?
>
> Regards,
> Praveenesh
