How to specify PATHS for user defined functions.

2015-07-09 Thread Dan Dong
Hi, All, I have a function and want to access it in my Spark programs, but I got "Exception in thread "main" java.lang.NoSuchMethodError" from spark-submit. I put the function under ./src/main/scala/com/aaa/MYFUNC/MYFUNC.scala: package com.aaa.MYFUNC object MYFUNC{ def FUNC1(input: List[S
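
For reference, a runnable sketch of what the truncated snippet appears to define. The body of FUNC1 is cut off in the archive, so the List[String] signature and implementation below are assumptions; in the real tree the object sits in src/main/scala/com/aaa/MYFUNC/MYFUNC.scala under package com.aaa.MYFUNC, and callers would import com.aaa.MYFUNC.MYFUNC. A NoSuchMethodError at spark-submit time despite a clean compile usually means the jar shipped to the cluster was built from older code or against a different Scala version.

```scala
// Sketch of the truncated MYFUNC.scala. The package line is dropped so the
// sketch runs standalone; FUNC1's signature and body are illustrative
// assumptions, not the author's actual code.
object MYFUNC {
  def FUNC1(input: List[String]): List[String] = input.map(_.toUpperCase)
}

println(MYFUNC.FUNC1(List("spark", "scala")))  // List(SPARK, SCALA)
```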

To access elements of an org.apache.spark.mllib.linalg.Vector

2015-07-14 Thread Dan Dong
Hi, I'm wondering how to access elements of a linalg.Vector, e.g: sparseVector: Seq[org.apache.spark.mllib.linalg.Vector] = List((3,[1,2],[1.0,2.0]), (3,[0,1,2],[3.0,4.0,5.0])) scala> sparseVector(1) res16: org.apache.spark.mllib.linalg.Vector = (3,[0,1,2],[3.0,4.0,5.0]) How to get the indices

Re: To access elements of an org.apache.spark.mllib.linalg.Vector

2015-07-14 Thread Dan Dong
.indices.zip(sVec.values).toMap Best, Burak On Tue, Jul 14, 2015 at 12:23 PM, Dan Dong wrote: Hi, I'm wondering how to access elements of a linalg.Vector, e.g: sparseVector: Seq[org.apache.spark.mllib.linalg.Vector] =
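
Burak's reply boils down to pairing a sparse vector's indices array with its values array. A minimal sketch of that pattern, using a stand-in case class (SparseVec is hypothetical) so it runs without a Spark dependency; the real class is org.apache.spark.mllib.linalg.SparseVector, which exposes the same two fields:

```scala
// Stand-in for mllib's SparseVector: the vector size plus parallel arrays
// holding the positions and values of the non-zero entries.
case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

// The vector printed as (3,[0,1,2],[3.0,4.0,5.0]) in the question.
val sVec = SparseVec(3, Array(0, 1, 2), Array(3.0, 4.0, 5.0))

// Pair each index with its value, as in the reply above.
val asMap: Map[Int, Double] = sVec.indices.zip(sVec.values).toMap

println(asMap(2))  // 5.0
```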

Could not access Spark webUI on OpenStack VMs

2016-04-28 Thread Dan Dong
Hi, all, I'm having a problem accessing the web UI of my Spark cluster. The cluster is composed of a few virtual machines running on an OpenStack platform. The VMs are launched from the CentOS 7.0 server image available from the official site. Spark itself runs well, and the master and worker processes are all

How to share a Map among RDDS?

2015-07-21 Thread Dan Dong
Hi, All, I am trying to access a Map from RDDs that are on different compute nodes, but without success. The Map is like: val map1 = Map("aa"->1,"bb"->2,"cc"->3,...) All RDDs will have to check against it to see whether the key is in the Map, so it seems I have to make the Map itself global, the

Re: How to share a Map among RDDS?

2015-07-22 Thread Dan Dong
Do rdd.collect and then broadcast, or you can do a join. On 22 Jul 2015 07:54, "Dan Dong" wrote: Hi, All, I am trying to access a Map from RDDs that are on different compute nodes, but without success. The

Re: How to share a Map among RDDS?

2015-07-22 Thread Dan Dong
Thanks Andrew, exactly. 2015-07-22 14:26 GMT-05:00 Andrew Or: Hi Dan, `map2` is a broadcast variable, not your map. To access the map on the executors you need to do `map2.value(a)`. -Andrew 2015-07-22 12:20 GMT-07:00 Dan Dong: Hi, An
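
The exchange above is the whole pattern: broadcast the map once, then dereference it with .value inside closures that run on the executors. A minimal sketch assuming a local-mode context; the app name and sample data are illustrative, and it needs a Spark installation to run:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setAppName("bcast-demo").setMaster("local[*]"))

val map1 = Map("aa" -> 1, "bb" -> 2, "cc" -> 3)
val map2 = sc.broadcast(map1)  // one read-only copy shipped per executor

val keys = sc.parallelize(Seq("aa", "zz", "cc"))
// Inside the closure, go through map2.value, not map1 itself.
val hits = keys.filter(k => map2.value.contains(k)).collect()
// hits: Array("aa", "cc")

sc.stop()
```

Broadcasting avoids re-serializing the map into every task closure, which matters once the map is large.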

spark-submit and spark-shell behaviors mismatch.

2015-07-22 Thread Dan Dong
Hi, I have a simple test Spark program as below. The strange thing is that it runs well under spark-shell, but gets a runtime java.lang.NoSuchMethodError from spark-submit, which points at the line: val maps2=maps.collect.toMap. But why does the compilation have no prob

Re: spark-submit and spark-shell behaviors mismatch.

2015-07-23 Thread Dan Dong
it runs with spark-shell, do you run spark-shell in local mode or with --master? I'd try with the --master you use for spark-submit. Also, if you're using standalone mode I believe the worker log contains the launch command for the executor -- you probably want to

java.lang.NoSuchMethodError for "list.toMap".

2015-07-23 Thread Dan Dong
Hi, When I ran the following simple Spark program with spark-submit: import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf import org.apache.spark.rdd.RDD import org.apache.spark.SparkContext import org.apache.spark._ import SparkContext._ object TEST2{ def main(args:Array[

Re: java.lang.NoSuchMethodError for "list.toMap".

2015-07-27 Thread Dan Dong
Best Regards On Fri, Jul 24, 2015 at 2:15 AM, Dan Dong wrote: Hi, When I ran with spark-submit the following simple Spark program of: import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf import org.apache.spark.rdd
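
A clean compile followed by NoSuchMethodError on a collection method like toMap at run time is usually a binary mismatch: the jar was built against a different Scala (or Spark) version than the cluster runs. One hedged fix is to pin both in build.sbt; the versions below are illustrative and must be matched to your cluster:

```scala
// build.sbt -- sketch only; pick the Scala version your Spark distribution
// was built with (e.g. Spark 1.4.x shipped against Scala 2.10).
name := "TEST2"
version := "0.1"
scalaVersion := "2.10.4"

// "provided": the cluster supplies spark-core at run time, so it is not
// bundled into the application jar.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
```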

Multiple Kafka topics processing in Spark 2.2

2017-09-06 Thread Dan Dong
Hi, All, I have one question about how to process multiple Kafka topics in a Spark 2.x program: how do I get the topic name from a message received from Kafka? E.g: .. val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder]( ssc, k

Re: Multiple Kafka topics processing in Spark 2.2

2017-09-08 Thread Dan Dong
directKafkaStream.transform { rdd => offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges; rdd }.map { ... }.foreachRDD { rdd => for (o <- offsetRanges) { println(s"${o.topic} ${o.partition} ${o.fromOffset} ${o.untilOffset
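
For Spark 2.2 the same idea is simpler with the spark-streaming-kafka-0-10 connector: each element of the direct stream is a ConsumerRecord that carries its own topic, so no offset-range bookkeeping is needed just to recover the topic name. A sketch assuming an existing StreamingContext ssc; the broker address, group id, and topic names are placeholder assumptions:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._

val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "localhost:9092",          // placeholder
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group"            // placeholder
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("topicA", "topicB"), kafkaParams)
)

// Each ConsumerRecord knows which topic it came from.
stream.foreachRDD { rdd =>
  rdd.foreach(record => println(s"${record.topic}: ${record.value}"))
}
```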

Error: no snappyjava in java.library.path

2015-02-26 Thread Dan Dong
Hi, All, When I run a small program in spark-shell, I got the following error: ... Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886) at java.lang.Runtime.loadLibrary0(Runtime.java:849) at java.lang

multiple programs compilation by sbt.

2015-04-29 Thread Dan Dong
Hi, Following the Quick Start guide: https://spark.apache.org/docs/latest/quick-start.html I could compile and run a Spark program successfully; now my question is how to compile multiple programs with sbt in one batch. E.g, two programs as: ./src ./src/main ./src/main/scala ./src/main/scala/Sim

Re: multiple programs compilation by sbt.

2015-04-29 Thread Dan Dong
Hi, Ted, I will have a look at it, thanks a lot. Cheers, Dan On Apr 29, 2015 at 5:00 PM, "Ted Yu" wrote: Have you looked at http://www.scala-sbt.org/0.13/tutorial/Multi-Project.html ? Cheers On Wed, Apr 29, 2015 at 2:45 PM, Dan Dong wrote: Hi,
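
The sbt guide Ted links boils down to declaring one project per program in the root build.sbt. A minimal multi-project sketch; the project names, directories, and versions are illustrative assumptions:

```scala
// build.sbt at the repository root -- sketch of a multi-project build.
lazy val commonSettings = Seq(
  scalaVersion := "2.11.8",
  libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
)

// One sub-project per program; each gets its own src/main/scala tree
// under the named directory.
lazy val simpleApp = (project in file("simple-app")).settings(commonSettings: _*)
lazy val otherApp  = (project in file("other-app")).settings(commonSettings: _*)
```

With this layout, sbt package builds one jar per sub-project, and sbt simpleApp/package builds just one of them.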

question about the TFIDF.

2015-05-06 Thread Dan Dong
Hi, All, When I try to follow the documentation about TF-IDF from: http://spark.apache.org/docs/latest/mllib-feature-extraction.html val conf = new SparkConf().setAppName("TFIDF") val sc=new SparkContext(conf) val documents=sc.textFile("hdfs://cluster-test-1:9000/user/ubuntu/textExampl
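
One frequent stumble with that page: HashingTF.transform expects each document as a sequence of terms, so the raw RDD[String] from sc.textFile must be tokenized first (otherwise a String is consumed as an Iterable[Char]). A sketch following the mllib feature-extraction docs, assuming an existing SparkContext sc; the input path and whitespace tokenizer are simplifying assumptions:

```scala
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Tokenize: one Seq[String] of terms per document.
val documents: RDD[Seq[String]] =
  sc.textFile("hdfs://host:9000/path/to/corpus")  // placeholder path
    .map(_.split(" ").toSeq)

val hashingTF = new HashingTF()
val tf: RDD[Vector] = hashingTF.transform(documents)
tf.cache()  // IDF makes two passes over the term-frequency vectors

val idf = new IDF().fit(tf)
val tfidf: RDD[Vector] = idf.transform(tf)
```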

no snappyjava in java.library.path

2015-01-12 Thread Dan Dong
Hi, My Spark job failed with "no snappyjava in java.library.path" as: Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1857) at java.lang.Runtime.loadLibrary0(Runtime.java:870) at java.lang.System.loadL
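
UnsatisfiedLinkError: no snappyjava in java.library.path often means snappy-java could not unpack its bundled native library into java.io.tmpdir (for example when /tmp is mounted noexec). A hedged workaround is to point the JVM temp dir at an executable location on both driver and executors; the paths and jar name below are illustrative assumptions:

```shell
# Give snappy-java an executable temp dir to unpack its native library into.
mkdir -p "$HOME/sparktmp"
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Djava.io.tmpdir=$HOME/sparktmp" \
  --conf "spark.executor.extraJavaOptions=-Djava.io.tmpdir=$HOME/sparktmp" \
  --class TEST2 target/scala-2.10/test2_2.10-0.1.jar
```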