Hi,
Is it possible to run lein repl against a standalone Spark cluster?
I launched a cluster on localhost with the master at spark://zhmyh-osx.local:7077
and tried to run the following commands:
(require '[sparkling.conf :as conf])
(require '[sparkling.core :as spark])

(def c (-> (conf/spark-conf)
           (conf/master "spark://zhmyh-osx.local:7077")
           (conf/app-name "sparkling-example")))

(def sc (spark/spark-context c))
(def data (spark/parallelize sc ["a" "b" "c" "d" "e"]))
(spark/first data)
and got this:
tf-idf.core=> (spark/first data)
2016-01-17 10:42:22,977 INFO spark.SparkContext:59 - Starting job: first at NativeMethodAccessorImpl.java:-2
2016-01-17 10:42:23,004 INFO cluster.SparkDeploySchedulerBackend:59 - Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:54884/user/Executor#797341151]) with ID 0
2016-01-17 10:42:23,014 INFO scheduler.DAGScheduler:59 - Got job 0 (first at NativeMethodAccessorImpl.java:-2) with 1 output partitions
2016-01-17 10:42:23,015 INFO scheduler.DAGScheduler:59 - Final stage: ResultStage 0(first at NativeMethodAccessorImpl.java:-2)
2016-01-17 10:42:23,016 INFO scheduler.DAGScheduler:59 - Parents of final stage: List()
2016-01-17 10:42:23,017 INFO scheduler.DAGScheduler:59 - Missing parents: List()
2016-01-17 10:42:23,031 INFO scheduler.DAGScheduler:59 - Submitting ResultStage 0 (#<JavaRDD: clojure.core/eval, core.clj, L.2927> ParallelCollectionRDD[0] at parallelize at NativeMethodAccessorImpl.java:-2), which has no missing parents
2016-01-17 10:42:23,247 INFO cluster.SparkDeploySchedulerBackend:59 - Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:54886/user/Executor#-391700515]) with ID 1
2016-01-17 10:42:23,265 INFO storage.MemoryStore:59 - ensureFreeSpace(1512) called with curMem=0, maxMem=1030823608
2016-01-17 10:42:23,268 INFO storage.MemoryStore:59 - Block broadcast_0 stored as values in memory (estimated size 1512.0 B, free 983.1 MB)
2016-01-17 10:42:23,349 INFO storage.BlockManagerMasterEndpoint:59 - Registering block manager 192.168.200.217:54890 with 530.0 MB RAM, BlockManagerId(0, 192.168.200.217, 54890)
2016-01-17 10:42:23,482 INFO storage.BlockManagerMasterEndpoint:59 - Registering block manager 192.168.200.217:54891 with 530.0 MB RAM, BlockManagerId(1, 192.168.200.217, 54891)
2016-01-17 10:42:23,639 INFO storage.MemoryStore:59 - ensureFreeSpace(976) called with curMem=1512, maxMem=1030823608
2016-01-17 10:42:23,640 INFO storage.MemoryStore:59 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 976.0 B, free 983.1 MB)
2016-01-17 10:42:23,643 INFO storage.BlockManagerInfo:59 - Added broadcast_0_piece0 in memory on 127.0.0.1:54879 (size: 976.0 B, free: 983.1 MB)
2016-01-17 10:42:23,646 INFO spark.SparkContext:59 - Created broadcast 0 from broadcast at DAGScheduler.scala:861
2016-01-17 10:42:23,652 INFO scheduler.DAGScheduler:59 - Submitting 1 missing tasks from ResultStage 0 (#<JavaRDD: clojure.core/eval, core.clj, L.2927> ParallelCollectionRDD[0] at parallelize at NativeMethodAccessorImpl.java:-2)
2016-01-17 10:42:23,654 INFO scheduler.TaskSchedulerImpl:59 - Adding task set 0.0 with 1 tasks
2016-01-17 10:42:23,720 INFO scheduler.TaskSetManager:59 - Starting task 0.0 in stage 0.0 (TID 0, 192.168.200.217, PROCESS_LOCAL, 1946 bytes)
2016-01-17 10:42:23,928 WARN scheduler.TaskSetManager:71 - Lost task 0.0 in stage 0.0 (TID 0, 192.168.200.217): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2431)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1383)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-01-17 10:42:23,942 INFO scheduler.TaskSetManager:59 - Starting task 0.1 in stage 0.0 (TID 1, 192.168.200.217, PROCESS_LOCAL, 1946 bytes)
2016-01-17 10:42:24,145 INFO scheduler.TaskSetManager:59 - Lost task 0.1 in stage 0.0 (TID 1) on executor 192.168.200.217: java.lang.IllegalStateException (unread block data) [duplicate 1]
2016-01-17 10:42:24,155 INFO scheduler.TaskSetManager:59 - Starting task 0.2 in stage 0.0 (TID 2, 192.168.200.217, PROCESS_LOCAL, 1946 bytes)
2016-01-17 10:42:24,168 INFO scheduler.TaskSetManager:59 - Lost task 0.2 in stage 0.0 (TID 2) on executor 192.168.200.217: java.lang.IllegalStateException (unread block data) [duplicate 2]
2016-01-17 10:42:24,176 INFO scheduler.TaskSetManager:59 - Starting task 0.3 in stage 0.0 (TID 3, 192.168.200.217, PROCESS_LOCAL, 1946 bytes)
2016-01-17 10:42:24,190 INFO scheduler.TaskSetManager:59 - Lost task 0.3 in stage 0.0 (TID 3) on executor 192.168.200.217: java.lang.IllegalStateException (unread block data) [duplicate 3]
2016-01-17 10:42:24,191 ERROR scheduler.TaskSetManager:75 - Task 0 in stage 0.0 failed 4 times; aborting job
2016-01-17 10:42:24,194 INFO scheduler.TaskSchedulerImpl:59 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2016-01-17 10:42:24,202 INFO scheduler.TaskSchedulerImpl:59 - Cancelling stage 0
2016-01-17 10:42:24,206 INFO scheduler.DAGScheduler:59 - ResultStage 0 (first at NativeMethodAccessorImpl.java:-2) failed in 0.536 s
2016-01-17 10:42:24,210 INFO scheduler.DAGScheduler:59 - Job 0 failed: first at NativeMethodAccessorImpl.java:-2, took 1.232848 s
IllegalStateException unread block data  java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode (ObjectInputStream.java:2431)
tf-idf.core=> 2016-01-17 10:42:41,780 INFO storage.BlockManagerInfo:59 - Removed broadcast_0_piece0 on 127.0.0.1:54879 in memory (size: 976.0 B, free: 983.1 MB)
2016-01-17 10:42:41,793 INFO spark.ContextCleaner:59 - Cleaned accumulator 1
What am I doing wrong?
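
My guess is that the executors never see my project's compiled classes, so they can't deserialize the task. If that's the cause, is shipping the uberjar to the cluster the right fix? Something like this sketch is what I have in mind (the jar path is just a placeholder for my project's uberjar, and .setJars is plain SparkConf Java interop rather than a sparkling function):

(require '[sparkling.conf :as conf])
(require '[sparkling.core :as spark])

;; Sketch only: point the SparkConf at the project uberjar so executors can
;; load the driver's classes. The jar path is a placeholder; I'd build it
;; first with `lein uberjar`.
(def c (-> (conf/spark-conf)
           (conf/master "spark://zhmyh-osx.local:7077")
           (conf/app-name "sparkling-example")
           ;; SparkConf.setJars via Java interop; SparkConf setters return
           ;; the conf itself, so the threading macro still works.
           (.setJars (into-array String
                                 ["target/tf-idf-0.1.0-SNAPSHOT-standalone.jar"]))))

(def sc (spark/spark-context c))

Or, if there's a simpler way (the spark.jars config property, for example), pointers are welcome.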
thanks!
Andrew
On Wednesday, January 7, 2015 at 3:23:42 PM UTC+3, chris_betz wrote:
>
> Hi,
>
> we just released Sparkling (https://gorillalabs.github.io/sparkling/),
> our take on an API to Apache Spark.
>
> With an eye on speed for very large amounts of data, we improved clj-spark
> and flambo to get the speed we need for our production environment.
>
> See https://gorillalabs.github.io/sparkling/articles/getting_started.html
> for a quickstart, or dive directly into our playground project via
> git clone https://github.com/gorillalabs/sparkling-getting-started.git
>
> Happy hacking
>
> Chris
> (@gorillalabs_de <https://twitter.com/gorillalabs_de>)