BTW, the same query, on the same cluster but in the Spark shell, returns the expected results.
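For reference, a minimal sketch of the sequence that works in the Spark shell (assuming a HiveContext-backed sqlContext and the existing filteredNc table from the thread):

// In spark-shell; sqlContext is assumed to be a HiveContext here,
// and the table filteredNc is assumed to already be registered.
def getNum(): Int = {
  100
}
sqlContext.udf.register("getNum", getNum _)
sqlContext.sql("select getNum() from filteredNc limit 1").show()

The same definition fails in a Zeppelin paragraph only on the executors, which suggests a classpath difference on the worker side rather than a problem with the UDF itself.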
On Mon, Jun 29, 2015 at 3:24 PM, Ophir Cohen <oph...@gmail.com> wrote:

> It looks like the Zeppelin jar is not distributed to the Spark nodes,
> though I can't understand why it is needed for the UDF.
>
> On Mon, Jun 29, 2015 at 3:23 PM, Ophir Cohen <oph...@gmail.com> wrote:
>
>> Thanks for the response,
>> I'm not sure what you mean; that is exactly what I tried, and it failed.
>> As I wrote above, 'hc' is just another name for sqlc (which in turn is
>> another name for z.sqlContext).
>>
>> I get the same results.
>>
>> On Mon, Jun 29, 2015 at 2:12 PM, Mina Lee <mina...@nflabs.com> wrote:
>>
>>> Hi Ophir,
>>>
>>> Can you try the below?
>>>
>>> def getNum(): Int = {
>>>   100
>>> }
>>> sqlc.udf.register("getNum", getNum _)
>>> sqlc.sql("select getNum() from filteredNc limit 1").show
>>>
>>> FYI, sqlContext (== sqlc) is created internally by Zeppelin and uses a
>>> HiveContext as the sqlContext by default (if you did not change
>>> useHiveContext to "false" in the interpreter menu).
>>>
>>> Hope it helps.
>>>
>>> On Mon, Jun 29, 2015 at 7:55 PM, Ophir Cohen <oph...@gmail.com> wrote:
>>>
>>>> Guys?
>>>> Somebody?
>>>> Can it be that Zeppelin does not support UDFs?
>>>>
>>>> On Sun, Jun 28, 2015 at 11:53 AM, Ophir Cohen <oph...@gmail.com> wrote:
>>>>
>>>>> Hi Guys,
>>>>> One more problem I have encountered using Zeppelin,
>>>>> using Spark 1.3.1 on YARN with Hadoop 2.4.
>>>>>
>>>>> I'm trying to create and use a UDF (hc == z.sqlContext == HiveContext):
>>>>> 1. Create and register the UDF:
>>>>> def getNum(): Int = {
>>>>>   100
>>>>> }
>>>>>
>>>>> hc.udf.register("getNum", getNum _)
>>>>> 2. Then I try to use it on an existing table:
>>>>> %sql select getNum() from filteredNc limit 1
>>>>>
>>>>> Or:
>>>>> 3. Try using hc directly:
>>>>> hc.sql("select getNum() from filteredNc limit 1").collect
>>>>>
>>>>> Both of them fail with
>>>>> *"java.lang.ClassNotFoundException:
>>>>> org.apache.zeppelin.spark.ZeppelinContext"*
>>>>> (see the full exception below).
>>>>>
>>>>> My questions are:
>>>>> 1. Can it be that ZeppelinContext is not available on the Spark nodes?
>>>>> 2. Why does it need ZeppelinContext anyway? Why is it relevant?
>>>>>
>>>>> The exception:
>>>>> WARN [2015-06-28 08:43:53,850] ({task-result-getter-0}
>>>>> Logging.scala[logWarning]:71) - Lost task 0.2 in stage 23.0 (TID 1626,
>>>>> ip-10-216-204-246.ec2.internal): java.lang.NoClassDefFoundError:
>>>>> Lorg/apache/zeppelin/spark/ZeppelinContext;
>>>>> at java.lang.Class.getDeclaredFields0(Native Method)
>>>>> at java.lang.Class.privateGetDeclaredFields(Class.java:2499)
>>>>> at java.lang.Class.getDeclaredField(Class.java:1951)
>>>>> at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
>>>>>
>>>>> <Many more ObjectStreamClass frames of the exception>
>>>>>
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.zeppelin.spark.ZeppelinContext
>>>>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:69)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>> ... 103 more
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.zeppelin.spark.ZeppelinContext
>>>>> at java.lang.ClassLoader.findClass(ClassLoader.java:531)
>>>>> at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>>>>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:64)
>>>>> ... 105 more
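A minimal sketch of one possible workaround, under the assumption that the UDF closure is dragging along the REPL-generated line object that references z (ZeppelinContext): move the function into a standalone serializable object so the closure carries no notebook state. The object name UdfHolder is hypothetical, hc and filteredNc are from the thread, and this is untested:

// Hypothetical workaround sketch, not a confirmed fix: keep the UDF body
// in its own serializable object so that registering it does not capture
// the interpreter line object that holds ZeppelinContext.
object UdfHolder extends Serializable {
  def getNum(): Int = 100
}

hc.udf.register("getNum", UdfHolder.getNum _)
hc.sql("select getNum() from filteredNc limit 1").show()

If the executors still cannot load org.apache.zeppelin.spark.ZeppelinContext, another avenue would be making the zeppelin-spark jar visible to them (for example via Spark's spark.jars property), which matches the "jar not distributed to the Spark nodes" hypothesis earlier in the thread.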