I spun up another EC2 cluster today with Spark 1.6.1 and I still get the error.
scala> case class Test(a: Int)
defined class Test

scala> Seq(1,2).toDS.map(t => Test(t)).show
16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 39.0 in stage 0.0 (TID 39, ip-10-2-2-203.us-west-2.compute.internal): java.lang.NoClassDefFoundError: Could not initialize class $line29.$read$
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/06/01 15:04:21 INFO scheduler.TaskSetManager: Starting task 39.1 in stage 0.0 (TID 40, ip-10-2-2-111.us-west-2.compute.internal, partition 39,PROCESS_LOCAL, 2386 bytes)
16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 19.0 in stage 0.0 (TID 19, ip-10-2-2-203.us-west-2.compute.internal): java.lang.ExceptionInInitializerError
    at $line29.$read$$iwC.<init>(<console>:7)
    at $line29.$read.<init>(<console>:24)
    at $line29.$read$.<init>(<console>:28)
    at $line29.$read$.<clinit>(<console>)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at $line3.$read$$iwC$$iwC.<init>(<console>:15)
    at $line3.$read$$iwC.<init>(<console>:24)
    at $line3.$read.<init>(<console>:26)
    at $line3.$read$.<init>(<console>:30)
    at $line3.$read$.<clinit>(<console>)
    ... 18 more

On Tue, May 31, 2016 at 8:48 PM Tim Gautier <tim.gaut...@gmail.com> wrote:

> That's really odd. I copied that code directly out of the shell and it
> errored out on me, several times. I wonder if something I did previously
> caused some instability. I'll see if it happens again tomorrow.
>
> On Tue, May 31, 2016, 8:37 PM Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Using spark-shell of 1.6.1 :
>>
>> scala> case class Test(a: Int)
>> defined class Test
>>
>> scala> Seq(1,2).toDS.map(t => Test(t)).show
>> +---+
>> |  a|
>> +---+
>> |  1|
>> |  2|
>> +---+
>>
>> FYI
>>
>> On Tue, May 31, 2016 at 7:35 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>
>>> 1.6.1 The exception is a null pointer exception. I'll paste the whole
>>> thing after I fire my cluster up again tomorrow.
>>>
>>> I take it by the responses that this is supposed to work?
>>>
>>> Anyone know when the next version is coming out? I keep running into
>>> bugs with 1.6.1 that are hindering my progress.
>>>
>>> On Tue, May 31, 2016, 8:21 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>
>>>> It works fine in my local test, I'm using latest master, maybe this bug
>>>> is already fixed.
>>>>
>>>> On Wed, Jun 1, 2016 at 7:29 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>
>>>>> Version of Spark? What is the exception?
>>>>>
>>>>> On Tue, May 31, 2016 at 4:17 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>>>>
>>>>>> How should I go about mapping from say a Dataset[(Int,Int)] to a
>>>>>> Dataset[<case class here>]?
>>>>>>
>>>>>> I tried to use a map, but it throws exceptions:
>>>>>>
>>>>>> case class Test(a: Int)
>>>>>> Seq(1,2).toDS.map(t => Test(t)).show
>>>>>>
>>>>>> Thanks,
>>>>>> Tim
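
For readers finding this thread later: the tuple-to-case-class mapping the original question asks about can be written as a standalone application rather than in spark-shell, which sidesteps the REPL classloading wrappers (the `$line29.$read$` / `$iwC` names in the stack trace). This is a minimal sketch, not the list's confirmed fix; it assumes the Spark 2.x `SparkSession` API rather than the 1.6 `SQLContext`, and the object and app names are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

// Define the case class at the top level of a compiled file. In spark-shell,
// interactively defined classes are wrapped in synthetic REPL classes, which
// is what the $line29.$read$ references in the NoClassDefFoundError point at.
case class Test(a: Int, b: Int)

object TupleToCaseClass {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("tuple-to-case-class")
      .getOrCreate()
    import spark.implicits._

    // Dataset[(Int, Int)] -> Dataset[Test] via a pattern-matching map,
    // as asked in the original question.
    val ds = Seq((1, 2), (3, 4)).toDS()
      .map { case (a, b) => Test(a, b) }

    ds.show()
    spark.stop()
  }
}
```

The same `map { case (a, b) => ... }` shape works for any tuple arity, as long as an implicit `Encoder` is in scope for the target case class (which `import spark.implicits._` provides for ordinary top-level case classes).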