Hi Moon, I found another way to reproduce the problem:
// cell 1: does not work
val file = "hdfs://someclusterfile.json"
val s = z.input("Foo").toString
val textFile = sc.textFile(file)
textFile.filter(_.contains(s)).count
// org.apache.spark.SparkException: Job aborted due to stage failure:
// Task 41 in stage 5.0 failed 4 times, most recent failure: Lost task
// 41.3 in stage 5.0 (TID 2735, XXX.com): java.lang.NoClassDefFoundError:
// Lorg/apache/zeppelin/spark/ZeppelinContext;

// cell 2: works
val file = "hdfs://someclusterfile.json"
val s = "Y"
val textFile = sc.textFile(file)
textFile.filter(_.contains(s)).count
// res19: Long = 109

This kind of issue also happens often when using variables from other
cells, and when taking the closure for a transformation. Maybe you are
reading variables inside the transformation with something like
z.get("s"), which causes z to be sent to the slaves because one of its
members is used (although I sometimes also hit this issue without using
anything from other cells). A sketch of a possible workaround is below.
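For what it's worth, here is a minimal sketch of a workaround, assuming
the cause is that the closure captures the REPL line object that also
references z (this is only a guess, not a confirmed fix). Binding the
widget value to a block-local val should mean the filter closure
captures a plain String rather than anything that holds the
ZeppelinContext:

// cell 1 rewritten so the closure captures only a local String
val file = "hdfs://someclusterfile.json"
val textFile = sc.textFile(file)

val count = {
  // copy the widget value into a block-local val; the closure below
  // then closes over this String only, not over z or the line object
  val needle = z.input("Foo").toString
  textFile.filter(_.contains(needle)).count
}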
Best,

David


On Mon, Aug 24, 2015 at 10:34 AM, David Salinas
<david.salinas....@gmail.com> wrote:

> Sorry, I forgot to mention my environment:
> mesos 0.17, spark 1.4.1, scala 2.10.4, java 1.8
>
> On Mon, Aug 24, 2015 at 10:32 AM, David Salinas <
> david.salinas....@gmail.com> wrote:
>
>> Hi Moon,
>>
>> Today I cannot reproduce the bug with an elementary example either, but
>> it is still impacting all my notebooks. The weird thing is that when
>> calling a transformation with map, the Zeppelin Context is taken into
>> the closure, which gives these java.lang.NoClassDefFoundError:
>> Lorg/apache/zeppelin/spark/ZeppelinContext errors (the spark shell runs
>> the same command without any problem). I will try to find another
>> example that is more persistent (it is weird that this example was
>> failing yesterday). Do you have any idea of what could cause the
>> Zeppelin Context to be included in the closure?
>>
>> Best,
>>
>> David
>>
>> On Fri, Aug 21, 2015 at 6:29 PM, moon soo Lee <m...@apache.org> wrote:
>>
>>> I have tested your code and cannot reproduce the problem.
>>>
>>> Could you share your environment? How did you configure Zeppelin with
>>> Spark?
>>>
>>> Thanks,
>>> moon
>>>
>>> On Fri, Aug 21, 2015 at 2:25 AM David Salinas <
>>> david.salinas....@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a problem when using a spark closure. This error did not
>>>> appear with spark 1.2.1.
>>>>
>>>> I have included a reproducible example that fails when the closure
>>>> is taken (Zeppelin has been built from the head of master with this
>>>> command: mvn install -DskipTests -Pspark-1.4 -Dspark.version=1.4.1
>>>> -Dhadoop.version=2.2.0 -Dprotobuf.version=2.5.0). Has anyone ever
>>>> encountered this problem? All my previous notebooks are broken by
>>>> this :(
>>>>
>>>> ------------------------------
>>>> val textFile = sc.textFile("hdfs://somefile.txt")
>>>>
>>>> val f = (s: String) => s + s
>>>> textFile.map(f).count
>>>> // works fine
>>>> // res145: Long = 407
>>>>
>>>> def f(s: String) = {
>>>>   s + s
>>>> }
>>>> textFile.map(f).count
>>>>
>>>> // fails ->
>>>>
>>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>>> Task 566 in stage 87.0 failed 4 times, most recent failure: Lost task
>>>> 566.3 in stage 87.0 (TID 43396, XXX.com): java.lang.NoClassDefFoundError:
>>>> Lorg/apache/zeppelin/spark/ZeppelinContext;
>>>>   at java.lang.Class.getDeclaredFields0(Native Method)
>>>>   at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
>>>>   at java.lang.Class.getDeclaredField(Class.java:2068)
>>>>   ...
>>>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>>>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>>>>   at ...
>>>>
>>>> Best,
>>>>
>>>> David
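P.S. A guess about the val/def difference in the quoted example, based
on Scala semantics rather than anything confirmed: passing a def where a
function is expected makes the compiler eta-expand it into an anonymous
function that closes over the enclosing object, which in the REPL is the
line wrapper that may also reference the ZeppelinContext. A sketch of
the distinction ($lineObject is an illustrative name, not the real
generated one):

// def version: textFile.map(f) is eta-expanded to roughly
//   textFile.map(s => $lineObject.f(s))
// so the task closure drags the REPL line object, and whatever it
// references, onto the executors.

// val version: the function literal is its own serializable object,
// so only the function itself is shipped to the executors.
val g: String => String = s => s + s
textFile.map(g).count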