You should wrap the parsing in a try/catch and return NULL on bad data. The issue is that if you have a single bad row, the UDF throws an exception up the chain. The task will be retried, it will fail again, and ultimately the job will fail.
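A minimal sketch of the try/catch idea, written as plain Java so it compiles without the Hive jars. The class name IpToLong and the method parse are hypothetical, not part of Hive; in a real UDF, evaluate(Text) would call something like this and return null (or a new LongWritable) accordingly. The trim() calls and the 0-255 range check are my assumptions about what counts as bad data, not something confirmed by the stack trace.

```java
public final class IpToLong {
    private IpToLong() {}

    /**
     * Converts a dotted-quad IPv4 string to a long.
     * Returns null instead of throwing when the input is malformed,
     * so one bad row does not fail the whole job.
     */
    public static Long parse(String addr) {
        if (addr == null) {
            return null;
        }
        try {
            // limit -1 keeps trailing empty strings, so "1.2.3.4." is rejected
            String[] parts = addr.trim().split("\\.", -1);
            if (parts.length != 4) {
                return null;
            }
            long num = 0;
            for (String part : parts) {
                // assumption: stray whitespace around an octet is tolerated
                int octet = Integer.parseInt(part.trim());
                if (octet < 0 || octet > 255) {
                    return null;
                }
                num = (num << 8) | octet;
            }
            return num;
        } catch (NumberFormatException e) {
            // bad data: report NULL rather than propagate the exception
            return null;
        }
    }
}
```

With this, the failing row's value "1.0.144.36" maps to 16814116, which matches the ip2 column in the error message, and anything unparseable comes back as NULL in the query result.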
On Wed, May 30, 2012 at 10:40 AM, praveenesh kumar <praveen...@gmail.com> wrote:
> Hello Hive Users,
>
> There is a strange situation I am facing.
>
> I have a string column in my Hive table ( its IP address). I am creating a
> UDF where I am taking this string column and converting it into Long value.
> Its a simple UDF. Following is my code :
>
> package com.practice.hive.udf;
> public class IPtoINT extends UDF {
>     public static LongWritable execute(Text addr) {
>
>         String[] addrArray = addr.toString().split("\\.");
>
>         long num = 0;
>
>         for (int i=0;i<addrArray.length;i++) {
>             int power = 3-i;
>             num += ((Integer.parseInt(addrArray[i])%256 *
>                 Math.pow(256,power)));
>         }
>         return new LongWritable(num);
>     }
> }
>
> After creating jar, I am running the following commands:
>
> $ hive
> hive > add jar /home/hadoop/Desktop/HiveData/IPtoINT.jar;
> hive > create temporary function ip2int as 'com.practice.hive.udf.IPtoINT';
> hive > select ip2int(ip1) from sample_data;
>
> But running the above, is giving me the following error:
>
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing row
> {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
> Error while processing row
> {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>     ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to
> execute method public org.apache.hadoop.io.Text
> com.musigma.hive.udf.ip2int.evaluate(org.apache.hadoop.io.Text) on object
> com.musigma.hive.udf.ip2int@19a4d79 of class com.musigma.hive.udf.ip2int
> with arguments {1.0.144.36:org.apache.hadoop.io.Text} of size 1
>     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:848)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:181)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
>     ... 9 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:824)
>     ... 18 more
> Caused by: java.lang.NumberFormatException: For input string: "1"
>     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>     at java.lang.Long.parseLong(Long.java:441)
>     at java.lang.Long.<init>(Long.java:702)
>     at com.musigma.hive.udf.ip2int.evaluate(ip2int.java:11)
>     ... 23 more
>
>
> If I am running the HIVE UDF like --- select ip2int("102.134.123.1") from
> sample_data; Its not giving any error.
> Strange thing is, its numberformat exception. I am not able to use the
> string into int in my java. Very strange issue.
>
> Can someone please tell me what stupid mistake I am doing ?
>
> Is there any other UDF that does string to int/long coversion ?
>
> Regards,
> Praveenesh