So.....
this.getResourceAsStream(filename) is a very tricky method to get
right, especially in Hive, where you have the Hive classpath, the
Hadoop classpath, and the Hive JDBC classpath. It gets even trickier
when you consider that launched map/reduce tasks get their own
environment and classpath.
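To make that concrete, here is a rough sketch of the kind of lookup
that resolves under one classloader and comes back null under another
(the class and resource names are placeholders, not from either UDF):

        import java.io.InputStream;

        public class ResourceLookupSketch {
            // Illustrative only: the same resource name can resolve through
            // one classloader and return null through another, which is why
            // behavior differs between the CLI, HiveServer/JDBC, and
            // launched map/reduce tasks.
            public InputStream openLookup() {
                InputStream in = getClass().getResourceAsStream("/lookup.dat");
                if (in == null) {
                    in = Thread.currentThread().getContextClassLoader()
                            .getResourceAsStream("lookup.dat");
                }
                return in; // may still be null if the jar is not on the resolving classpath
            }
        }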

I had the same issues when I was writing my geo-ip-udf. See the comments.

https://github.com/edwardcapriolo/hive-geoip/blob/master/src/main/java/com/jointhegrid/udf/geoip/GenericUDFGeoIP.java

I came to the conclusion that if you add a file to the distributed
cache using 'ADD FILE', you can reliably assume it will be in the
current working directory, and this works:

        File f = new File(database);
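
For completeness, a minimal sketch of that pattern (file and class
names are illustrative, not from my UDF): register the file with
ADD FILE, pass its bare name to the function, and open it relative
to the task's working directory:

        import java.io.File;
        import java.io.FileInputStream;
        import java.io.IOException;
        import java.io.InputStream;

        public class AddFileSketch {
            // Assumes the file was registered in the session with something like:
            //   ADD FILE /local/path/lookup.dat;
            // so that Hive ships it to each task's current working directory.
            public InputStream openDatabase(String database) throws IOException {
                File f = new File(database); // e.g. "lookup.dat", resolved against the CWD
                if (!f.exists()) {
                    throw new IOException(database + " not found in the working directory;"
                            + " was it added with ADD FILE?");
                }
                return new FileInputStream(f);
            }
        }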

I hope this helps.
Edward

On Tue, May 29, 2012 at 8:35 AM, Maoz Gelbart <maoz.gelb...@pursway.com> wrote:
> Hi all,
>
>
>
> I am using Hive 0.7.1 over Cloudera’s Hadoop distribution 0.20.2 and MapR
> HDFS distribution 1.1.1.
>
> I wrote a GenericUDF packaged as a jar that attempts to open a local
> resource during initialization, in its initialize(ObjectInspector[]
> arguments) method.
>
>
>
> When I run with the CLI, everything is fine.
>
> When I run using Cloudera’s Hive JDBC driver, the UDF fails with a null
> pointer returned from the call this.getResourceAsStream(filename).
>
> Removing the line fixed the problem and the UDF ran on both CLI and JDBC, so
> I believe that “ADD JAR” and “CREATE TEMPORARY FUNCTION” were entered
> correctly.
>
>
>
> Did anyone observe such behavior? I have a demo jar to reproduce the
> problem if needed.
>
>
>
> Thanks,
>
> Maoz
