RE: Error KafkaStream

2015-02-05 Thread jishnu.prathap
Hi,

If your message is a String you will have to change the encoder and decoder to StringEncoder and StringDecoder.

If your message is a byte[] you can use DefaultEncoder and DefaultDecoder.



Also, don't forget to add the import statements matching your encoder and decoder.

import kafka.serializer.StringEncoder;

import kafka.serializer.StringDecoder;
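
A minimal sketch of what the stream creation then looks like (my assumption of the surrounding setup — ssc, kafkaParams and topicMap are whatever you already have; Spark 1.x receiver-based Kafka API):

import kafka.serializer.StringDecoder
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils

// Both key and value are decoded as String, so no [B -> String cast is needed downstream.
val lines = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topicMap, StorageLevel.MEMORY_ONLY_SER).map(_._2)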


Regards
Jishnu Prathap

-Original Message-
From: Shao, Saisai [mailto:saisai.s...@intel.com]
Sent: Friday, February 06, 2015 6:41 AM
To: Eduardo Costa Alfaia; Sean Owen
Cc: user@spark.apache.org
Subject: RE: Error KafkaStream



Hi,



I think you should change the `DefaultDecoder` in your type parameters to 
`StringDecoder`, since you want to decode the message into a String. 
`DefaultDecoder` returns Array[Byte], not String, so the class cast fails here.



Thanks

Jerry



-Original Message-

From: Eduardo Costa Alfaia [mailto:e.costaalf...@unibs.it]

Sent: Friday, February 6, 2015 12:04 AM

To: Sean Owen

Cc: user@spark.apache.org

Subject: Re: Error KafkaStream



I don’t think so Sean.



> On Feb 5, 2015, at 16:57, Sean Owen <so...@cloudera.com> wrote:

>

> Is SPARK-4905 / https://github.com/apache/spark/pull/4371/files the same 
> issue?

>

> On Thu, Feb 5, 2015 at 7:03 AM, Eduardo Costa Alfaia <e.costaalf...@unibs.it> wrote:

>> Hi Guys,

>> I’m getting this error in KafkaWordCount;

>>

>> TaskSetManager: Lost task 0.0 in stage 4095.0 (TID 1281, 10.20.10.234):

>> java.lang.ClassCastException: [B cannot be cast to java.lang.String

>>at

>> org.apache.spark.examples.streaming.KafkaWordCount$$anonfun$4$$anonfun$apply$1.apply(KafkaWordCount.scala:7

>>

>>

>> Any idea what it could be?

>>

>>

>> Below is the piece of code

>>

>>

>>

>> val kafkaStream = {

>>val kafkaParams = Map[String, String](

>>"zookeeper.connect" -> "achab3:2181",

>>"group.id" -> "mygroup",

>>"zookeeper.connect.timeout.ms" -> "1",

>>"kafka.fetch.message.max.bytes" -> "400",

>>"auto.offset.reset" -> "largest")

>>

>>val topicpMap = topics.split(",").map((_,numThreads.toInt)).toMap

>>  //val lines = KafkaUtils.createStream[String, String, DefaultDecoder,

>> DefaultDecoder](ssc, kafkaParams, topicpMap, storageLevel =

>> StorageLevel.MEMORY_ONLY_SER).map(_._2)

>>val KafkaDStreams = (1 to numStreams).map {_ =>

>>KafkaUtils.createStream[String, String, DefaultDecoder,

>> DefaultDecoder](ssc, kafkaParams, topicpMap, storageLevel =

>> StorageLevel.MEMORY_ONLY_SER).map(_._2)

>>}

>>val unifiedStream = ssc.union(KafkaDStreams)

>>unifiedStream.repartition(sparkProcessingParallelism)

>> }

>>

>> Thanks Guys

>>

>> Informativa sulla Privacy: http://www.unibs.it/node/8155





--

Informativa sulla Privacy: http://www.unibs.it/node/8155







RE: Spark SQL Stackoverflow error

2015-03-10 Thread jishnu.prathap
import com.google.gson.{GsonBuilder, JsonParser}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
/**
* Examine the collected tweets and trains a model based on them.
*/
object ExamineAndTrain {
val jsonParser = new JsonParser()
val gson = new GsonBuilder().setPrettyPrinting().create()
def main(args: Array[String]) {
    val outputModelDir = "C:\\outputmode111"
    val tweetInput = "C:\\test"
    val numClusters = 10
    val numIterations = 20

    val conf = new SparkConf().setAppName(this.getClass.getSimpleName).setMaster("local[4]").set("spark.executor.memory", "1g")
val sc = new SparkContext(conf)
val tweets = sc.textFile(tweetInput)
val vectors = tweets.map(Utils.featurize).cache()
vectors.count() // Calls an action on the RDD to populate the vectors cache.
val model = KMeans.train(vectors, numClusters, numIterations)
sc.makeRDD(model.clusterCenters, numClusters).saveAsObjectFile(outputModelDir)
val some_tweets = tweets.take(2)
println("Example tweets from the clusters")
for (i <- 0 until numClusters) {
println(s"\nCLUSTER $i:")
some_tweets.foreach { t =>
if (model.predict(Utils.featurize(t)) == i) {
println(t)
}
}
}
}
}
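
(In case it is useful, a small sketch of how the persisted centers could be read back later — an assumption on my side, not part of the program above; Spark 1.x MLlib.)

import org.apache.spark.mllib.clustering.KMeansModel
import org.apache.spark.mllib.linalg.Vector

// Rebuild the model from the cluster centers written by saveAsObjectFile above.
val centers = sc.objectFile[Vector](outputModelDir).collect()
val restoredModel = new KMeansModel(centers)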

From: lovelylavs [via Apache Spark User List] 
[mailto:ml-node+s1001560n21956...@n3.nabble.com]
Sent: Sunday, March 08, 2015 2:34 AM
To: Jishnu Menath Prathap (WT01 - BAS)
Subject: Re: Spark SQL Stackoverflow error


​Thank you so much for your reply. If it is possible can you please provide me 
with the code?



Thank you so much.



Lavanya.


From: Jishnu Prathap [via Apache Spark User List]
Sent: Sunday, March 1, 2015 3:03 AM
To: Nadikuda, Lavanya
Subject: RE: Spark SQL Stackoverflow error

Hi
The issue was not fixed directly.
I removed the in-between SQL layer and created the features directly from the file.

Regards
Jishnu Prathap

From: lovelylavs [via Apache Spark User List] [mailto:ml-node+[hidden email]]
Sent: Sunday, March 01, 2015 4:44 AM
To: Jishnu Menath Prathap (WT01 - BAS)
Subject: Re: Spark SQL Stackoverflow error

Hi,

how was this issue fixed?

If you reply to this email, your message will be added to the discussion below:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Stackoverflow-error-tp12086p21862.html
To unsubscribe from Spark SQL Stackoverflow error, click here.
NAML
The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain proprietary, confidential or privileged information. If you are not the 
intended recipient, you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately and destroy all copies of this message and 
any attachments. WARNING: Computer viruses can be transmitted via email. The 
recipient should check this email and any attachments for the presence of 
viruses. The company accepts no liability for any damage caused by any virus 
transmitted by this email. www.wipro.com

If you reply to this email, your message will be added to the discussion below:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Stackoverflow-error-tp12086p21863.html
To unsubscribe from Spark SQL Stackoverflow error, click here.
NAML


If you reply to this email, your message will be added to the discussion below:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Stackoverflow-error-tp12086p21956.html
To unsubscribe from Spark SQL Stackoverflow error, click 
here.
NAML

RE: How to set Spark executor memory?

2015-03-16 Thread jishnu.prathap
Hi Xi Shen,

You could set spark.executor.memory in the code itself: new SparkConf().set("spark.executor.memory", "2g")
Or you can pass --executor-memory 2g (or --conf spark.executor.memory=2g) when submitting the jar.
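
A minimal sketch (assuming the property is set before the SparkContext is created; in local mode there are no separate executors, so only the driver JVM's memory really matters):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MyApp")                    // hypothetical application name
  .set("spark.executor.memory", "2g")     // must be set before the context is created
val sc = new SparkContext(conf)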

Regards
Jishnu Prathap

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Monday, March 16, 2015 2:06 PM
To: Xi Shen
Cc: user@spark.apache.org
Subject: Re: How to set Spark executor memory?

By default spark.executor.memory is set to 512m. I'm assuming that since you are 
submitting the job using spark-submit and running in local mode, it is not able to 
override the value. Can you try it without using spark-submit, as a standalone project?

Thanks
Best Regards

On Mon, Mar 16, 2015 at 1:52 PM, Xi Shen <davidshe...@gmail.com> wrote:

I set it in code, not by configuration. I submit my jar file to local. I am 
working in my developer environment.

On Mon, 16 Mar 2015 18:28 Akhil Das <ak...@sigmoidanalytics.com> wrote:
How are you setting it? and how are you submitting the job?

Thanks
Best Regards

On Mon, Mar 16, 2015 at 12:52 PM, Xi Shen <davidshe...@gmail.com> wrote:
Hi,

I have set spark.executor.memory to 2048m, and in the UI "Environment" page I can 
see the value has been set correctly. But in the "Executors" page I see there is 
only 1 executor and its memory is 265.4MB. A very strange value — why not 256MB, 
or just what I set?

What am I missing here?


Thanks,
David





[no subject]

2014-09-22 Thread jishnu.prathap




Unable to change the Ports

2014-09-22 Thread jishnu.prathap
Hi everyone, I am new to Spark. I am posting some basic doubts
I met while trying to create a standalone cluster for a small PoC.

1) My corporate firewall blocked port 7077, which is the default port of the master URL,
so I used start-master.sh --port 8080 (also tried several other ports and -p),
but the port used by the master was still unchanged.
I resolved this problem by adding export SPARK_MASTER_PORT="59468" in
spark-env.sh. My first doubt is why -p PORT / --port PORT didn't work.

2) The same result when tried with spark-class
org.apache.spark.deploy.worker.Worker spark://IP:PORT

3) Also, why is the worker port random by default?

Sorry if I am being stupid.



Re: java.lang.NumberFormatException while starting spark-worker

2014-09-24 Thread jishnu.prathap
Hi,
    I am getting this weird error while starting the worker.

-bash-4.1$ spark-class org.apache.spark.deploy.worker.Worker 
spark://osebi-UServer:59468
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/09/24 16:22:04 INFO worker.Worker: Registered signal handlers for [TERM, 
HUP, INT]
"xception in thread "main" java.lang.NumberFormatException: For input string: 
"61608
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:492)
        at java.lang.Integer.parseInt(Integer.java:527)
        at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
        at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
        at org.apache.spark.deploy.worker.WorkerArguments.<init>(WorkerArguments.scala:38)
        at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:376)
        at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
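
For what it's worth, StringOps.toInt throws exactly this exception whenever the port value is not a clean integer — for example if a stray quote or space ends up in the environment variable (only an assumption about the root cause here):

"61608".toInt     // fine: 61608
"\"61608".toInt   // java.lang.NumberFormatException: For input string: ""61608"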




Re: java.lang.NumberFormatException while starting spark-worker

2014-09-24 Thread jishnu.prathap
No, I am not passing any argument.
I am getting this error while starting the master.
I am able to run the same Spark binary on another machine (Ubuntu).





RE: java.lang.NumberFormatException while starting spark-worker

2014-09-24 Thread jishnu.prathap
Hi,

Sorry for the repeated mails. My post was not accepted by the mailing list due 
to some problem with postmas...@wipro.com, so I had to send it manually. It was 
still not visible after half an hour, so I retried. Later all the posts became 
visible; I deleted the extras from the page, but they had already been delivered 
to the mailing list. Sorry for the repeated mails.


-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Wednesday, September 24, 2014 6:17 PM
To: Jishnu Menath Prathap (WT01 - BAS)
Subject: Re: java.lang.NumberFormatException while starting spark-worker

Please stop emailing the same message repeatedly every half hour.

On Wed, Sep 24, 2014 at 12:21 PM,   wrote:
> Hi ,
>I am getting this weird error while starting Worker.
>
> -bash-4.1$ spark-class org.apache.spark.deploy.worker.Worker
> spark://osebi-UServer:59468
> Spark assembly has been built with Hive, including Datanucleus jars on 
> classpath
> 14/09/24 16:22:04 INFO worker.Worker: Registered signal handlers for [TERM, HUP, INT]
> Exception in thread "main" java.lang.NumberFormatException: For input string: "61608
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>         at java.lang.Integer.parseInt(Integer.java:492)
>         at java.lang.Integer.parseInt(Integer.java:527)
>         at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
>         at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
>         at org.apache.spark.deploy.worker.WorkerArguments.<init>(WorkerArguments.scala:38)
>         at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:376)
>         at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
>



basic twitter stream program not working.

2014-11-13 Thread jishnu.prathap
Hi
   I am trying to run a basic twitter stream program but getting blank 
output. Please correct me if I am missing something.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.twitter.TwitterUtils
import org.apache.spark.streaming.Seconds
import org.apache.log4j.LogManager
import org.apache.log4j.Level

object Sparktwiter1 {
  def main(args: Array[String]) {
LogManager.getRootLogger().setLevel(Level.ERROR);
System.setProperty("http.proxyHost", "proxy4.wipro.com");
System.setProperty("http.proxyPort", "8080");
System.setProperty("twitter4j.oauth.consumerKey", "")
System.setProperty("twitter4j.oauth.consumerSecret", "")
System.setProperty("twitter4j.oauth.accessToken", "")
System.setProperty("twitter4j.oauth.accessTokenSecret", "")
val sparkConf = new SparkConf().setAppName("TwitterPopularTags").setMaster("local").set("spark.eventLog.enabled", "true")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val stream = TwitterUtils.createStream(ssc, None)//, filters)
stream.print
val s1 = stream.flatMap(status => status.getText)
s1.print
val hashTags = stream.flatMap(status => status.getText.split(" ").filter(_.startsWith("#")))
hashTags.print
 ssc.start()
ssc.awaitTermination()
  }
}

Output

---
Time: 1415869348000 ms
---

---
Time: 1415869348000 ms
---

---
Time: 1415869348000 ms
---




runexample TwitterPopularTags showing Class Not found error

2014-11-13 Thread jishnu.prathap
Hi
I am getting the following error while running the TwitterPopularTags example. 
I am using spark-1.1.0-bin-hadoop2.4.

jishnu@getafix:~/spark/bin$ run-example TwitterPopularTags *** ** ** *** **

spark assembly has been built with Hive, including Datanucleus jars on classpath
java.lang.ClassNotFoundException: 
org.apache.spark.examples.org.apache.spark.streaming.examples.TwitterPopularTags
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:318)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I tried executing it on three different machines but all showed the same error. 
I am able to run other examples like SparkPi.


Thanks & Regards
Jishnu Menath Prathap
BAS EBI(Open Source)





RE: basic twitter stream program not working.

2014-11-13 Thread jishnu.prathap
Hi
Thanks Akhil, you saved the day… It's working perfectly…

Regards
Jishnu Menath Prathap
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Thursday, November 13, 2014 3:25 PM
To: Jishnu Menath Prathap (WT01 - BAS)
Cc: Akhil [via Apache Spark User List]; user@spark.apache.org
Subject: Re: basic twitter stream program not working.

Change this line

val sparkConf = new SparkConf().setAppName("TwitterPopularTags").setMaster("local").set("spark.eventLog.enabled","true")

to

val sparkConf = new SparkConf().setAppName("TwitterPopularTags").setMaster("local[4]").set("spark.eventLog.enabled","true")



Thanks
Best Regards

On Thu, Nov 13, 2014 at 2:58 PM, <jishnu.prat...@wipro.com> wrote:
Hi
   I am trying to run a basic twitter stream program but getting blank 
output. Please correct me if I am missing something.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.twitter.TwitterUtils
import org.apache.spark.streaming.Seconds
import org.apache.log4j.LogManager
import org.apache.log4j.Level

object Sparktwiter1 {
  def main(args: Array[String]) {
LogManager.getRootLogger().setLevel(Level.ERROR);
System.setProperty("http.proxyHost", 
"proxy4.wipro.com");
System.setProperty("http.proxyPort", "8080");
System.setProperty("twitter4j.oauth.consumerKey", "")
System.setProperty("twitter4j.oauth.consumerSecret", "")
System.setProperty("twitter4j.oauth.accessToken", "")
System.setProperty("twitter4j.oauth.accessTokenSecret", "")
val sparkConf = new SparkConf().setAppName("TwitterPopularTags").setMaster("local").set("spark.eventLog.enabled", "true")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val stream = TwitterUtils.createStream(ssc, None)//, filters)
stream.print
val s1 = stream.flatMap(status => status.getText)
s1.print
val hashTags = stream.flatMap(status => status.getText.split(" ").filter(_.startsWith("#")))
hashTags.print
 ssc.start()
ssc.awaitTermination()
  }
}

Output

---
Time: 1415869348000 ms
---

---
Time: 1415869348000 ms
---

---
Time: 1415869348000 ms
---






Is it possible to save the streams to one single file?

2014-11-20 Thread jishnu.prathap
Hi
My question is generic:

- Is it possible to save the streams to one single file? If yes, can you give me a link or a code sample?

- I tried using .saveAsTextFiles but it creates a different file for each stream. I need to update the same file instead of creating a different file for each stream.

My use case:

- Retrieve Twitter streams, then extract each tweet and perform sentiment analysis on it. Count the number of +ve and -ve sentiments.

- Save the count in a file. The file should get updated with each stream.

object sparkAnalytics {
  def main(args: Array[String]) {

org.apache.log4j.LogManager.getRootLogger().setLevel(org.apache.log4j.Level.ERROR);
val sentimentAnalyzer = new SentimentAnalyzer();
System.setProperty("twitter4j.oauth.consumerKey", "***")
System.setProperty("twitter4j.oauth.consumerSecret", "**")
System.setProperty("twitter4j.oauth.accessToken", " ")
System.setProperty("twitter4j.oauth.accessTokenSecret", "**")
val sparkConf = new SparkConf().setAppName("TwitterSentimentalAnalysis").setMaster("local[4]").set("spark.eventLog.enabled", "true")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val stream = TwitterUtils.createStream(ssc, None) //, filters)
val statuses = stream.map(
  status => sentimentAnalyzer.findSentiment(
 status.getText().replaceAll("[^A-Za-z0-9 \\#]", "")))
val line = statuses.map(tweetWithSentiment => tweetWithSentiment.getCssClass())
val pos = line.filter(s => s.contains("sentiment-positive"))
val k = pos.count
k.print //Instead of Printing it in the console i have to update a file
val neg = line.filter(s => s.contains("sentiment-negative"))
val n = neg.count
n.print //Instead of Printing it in the console i have to update a file
ssc.start()
ssc.awaitTermination()

  }
}
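
One rough sketch of how the per-batch counts could be appended to a single local file (my assumptions: the file lives on the driver machine, and each batch's count is a single value that is cheap to collect; the path is hypothetical). The same pattern applies to n:

import java.io.FileWriter

k.foreachRDD { rdd =>
  // Each batch's count RDD holds one Long; bring it to the driver and append it.
  val fw = new FileWriter("C:\\sentiment\\positiveCounts.txt", true) // append mode
  rdd.collect().foreach(c => fw.write(c + "\n"))
  fw.close()
}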

Thanks & Regards
Jishnu Menath Prathap




Re: Persist streams to text files

2014-11-20 Thread jishnu.prathap
Hi, I am also having a similar problem. Any fix suggested?

Originally Posted by GaganBM
Hi,

I am trying to persist the DStreams to text files. When I use the inbuilt API 
'saveAsTextFiles' as :

stream.saveAsTextFiles(resultDirectory)

this creates a number of subdirectories, one for each batch, and within each 
subdirectory it creates a bunch of text files, one for each RDD (I assume).

I am wondering if I can have single text files for each batch. Is there any API 
for that ? Or else, a single output file for the entire stream ?

I tried to manually write from each RDD stream to a text file as :

stream.foreachRDD(rdd =>{
  rdd.foreach(element => {
  fileWriter.write(element)
  })
  })

where 'fileWriter' simply makes use of a Java BufferedWriter to write strings 
to a file. However, this fails with exception :

DStreamCheckpointData.writeObject used
java.io.BufferedWriter
java.io.NotSerializableException: java.io.BufferedWriter
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
.

Any help on how to proceed with this ?



RE: Persist streams to text files

2014-11-20 Thread jishnu.prathap
Hi Akhil,
Thanks for the reply.
But it creates different directories. I tried using a FileWriter but it shows a 
non-serializable error.
val stream = TwitterUtils.createStream(ssc, None) //, filters)

val statuses = stream.map(
  status => sentimentAnalyzer.findSentiment({
status.getText().replaceAll("[^A-Za-z0-9 \\#]", "")

  })
  )

val line = statuses.foreachRDD(
  rdd => {
rdd.foreach(
  tweetWithSentiment => {
if(!tweetWithSentiment.getLine().isEmpty())
println(tweetWithSentiment.getCssClass() + " for line :=> " + tweetWithSentiment.getLine()) // Now I print in console but I need to update it to a file in local machine

  })
  })

Thanks & Regards
Jishnu Menath Prathap
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Friday, November 21, 2014 11:48 AM
To: Jishnu Menath Prathap (WT01 - BAS)
Cc: u...@spark.incubator.apache.org
Subject: Re: Persist streams to text files


To have a single text file output for each batch you can repartition it to 1 
and then call the saveAsTextFiles

stream.repartition(1).saveAsTextFiles(location)
On 21 Nov 2014 11:28, <jishnu.prat...@wipro.com> wrote:
Hi I am also having similar problem.. any fix suggested..

Originally Posted by GaganBM
Hi,

I am trying to persist the DStreams to text files. When I use the inbuilt API 
'saveAsTextFiles' as :

stream.saveAsTextFiles(resultDirectory)

this creates a number of subdirectories, for each batch, and within each sub 
directory, it creates bunch of text files for each RDD (I assume).

I am wondering if I can have single text files for each batch. Is there any API 
for that ? Or else, a single output file for the entire stream ?

I tried to manually write from each RDD stream to a text file as :

stream.foreachRDD(rdd =>{
  rdd.foreach(element => {
  fileWriter.write(element)
  })
  })

where 'fileWriter' simply makes use of a Java BufferedWriter to write strings 
to a file. However, this fails with exception :

DStreamCheckpointData.writeObject used
java.io.BufferedWriter
java.io.NotSerializableException: java.io.BufferedWriter
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
.

Any help on how to proceed with this ?




RE: Persist streams to text files

2014-11-21 Thread jishnu.prathap
Hi,
Thank you Akhil, it worked like a charm ☺
I had used the FileWriter outside rdd.foreach; that might be the reason for the 
non-serializable exception.

Thanks & Regards
Jishnu Menath Prathap
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Friday, November 21, 2014 1:15 PM
To: Jishnu Menath Prathap (WT01 - BAS)
Cc: u...@spark.incubator.apache.org
Subject: Re: Persist streams to text files

Here's a quick version to store (append) in your local machine

val tweets = TwitterUtils.createStream(ssc, None)

val hashTags = tweets.flatMap(status => status.getText.split(" ").filter(_.startsWith("#")))


hashTags.foreachRDD(rdds => {

  rdds.foreach(rdd => {
val fw = new FileWriter("/home/akhld/tags.txt", true)
println("HashTag => " + rdd)
fw.write(rdd + "\n")
fw.close()
  })

})
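
(A small variant, just a sketch on my side: when batches carry many elements, opening one writer per partition instead of one per element keeps the file-handle churn down — assuming the output file lives on the machines where the tasks actually run.)

hashTags.foreachRDD { rdd =>
  rdd.foreachPartition { tags =>
    // One writer per partition instead of one per element.
    val fw = new java.io.FileWriter("/home/akhld/tags.txt", true) // append mode
    tags.foreach(tag => fw.write(tag + "\n"))
    fw.close()
  }
}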

Thanks
Best Regards

On Fri, Nov 21, 2014 at 12:12 PM, <jishnu.prat...@wipro.com> wrote:
Hi Akhil
Thanks for reply
But it creates different directories ..I tried using filewriter  but it shows 
non serializable error..
val stream = TwitterUtils.createStream(ssc, None) //, filters)

val statuses = stream.map(
  status => sentimentAnalyzer.findSentiment({
status.getText().replaceAll("[^A-Za-z0-9 \\#]", "")

  })
  )

val line = statuses.foreachRDD(
  rdd => {
rdd.foreach(
  tweetWithSentiment => {
if(!tweetWithSentiment.getLine().isEmpty())
println(tweetWithSentiment.getCssClass() + " for line :=>  " + 
tweetWithSentiment.getLine())//Now I print in console but I need to update it 
to a file in local machine

  })
  })

Thanks & Regards
Jishnu Menath Prathap
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Friday, November 21, 2014 11:48 AM
To: Jishnu Menath Prathap (WT01 - BAS)
Cc: u...@spark.incubator.apache.org
Subject: Re: Persist streams to text files


To have a single text file output for each batch you can repartition it to 1 
and then call the saveAsTextFiles

stream.repartition(1).saveAsTextFiles(location)
On 21 Nov 2014 11:28, <jishnu.prat...@wipro.com> wrote:
Hi I am also having similar problem.. any fix suggested..

Originally Posted by GaganBM
Hi,

I am trying to persist the DStreams to text files. When I use the inbuilt API 
'saveAsTextFiles' as :

stream.saveAsTextFiles(resultDirectory)

this creates a number of subdirectories, for each batch, and within each sub 
directory, it creates bunch of text files for each RDD (I assume).

I am wondering if I can have single text files for each batch. Is there any API 
for that ? Or else, a single output file for the entire stream ?

I tried to manually write from each RDD stream to a text file as :

stream.foreachRDD(rdd =>{
  rdd.foreach(element => {
  fileWriter.write(element)
  })
  })

where 'fileWriter' simply makes use of a Java BufferedWriter to write strings 
to a file. However, this fails with exception :

DStreamCheckpointData.writeObject used
java.io.BufferedWriter
java.io.NotSerializableException: java.io.BufferedWriter
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
.

Any help on how to proceed with this ?





Stack overflow Error while executing spark SQL

2014-12-09 Thread jishnu.prathap
Hi

I am getting a stack overflow error:
Exception in thread "main" java.lang.StackOverflowError
scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)

while executing the following code
sqlContext.sql("SELECT text FROM tweetTable LIMIT 
10").collect().foreach(println)

The complete code is from github
https://github.com/databricks/reference-apps/blob/master/twitter_classifier/scala/src/main/scala/com/databricks/apps/twitter_classifier/ExamineAndTrain.scala

import com.google.gson.{GsonBuilder, JsonParser}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
/**
* Examine the collected tweets and trains a model based on them.
*/
object ExamineAndTrain {
val jsonParser = new JsonParser()
val gson = new GsonBuilder().setPrettyPrinting().create()
def main(args: Array[String]) {
// Process program arguments and set properties
/*if (args.length < 3) {
System.err.println("Usage: " + this.getClass.getSimpleName +
"")
System.exit(1)
}
*
*/
    val outputModelDir = "C:\\MLModel"
    val tweetInput = "C:\\MLInput"
    val numClusters = 10
    val numIterations = 20

//val Array(tweetInput, outputModelDir, Utils.IntParam(numClusters), Utils.IntParam(numIterations)) = args

val conf = new SparkConf().setAppName(this.getClass.getSimpleName).setMaster("local[4]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
// Pretty print some of the tweets.
val tweets = sc.textFile(tweetInput)
println("Sample JSON Tweets---")
for (tweet <- tweets.take(5)) {
println(gson.toJson(jsonParser.parse(tweet)))
}
val tweetTable = sqlContext.jsonFile(tweetInput).cache()
tweetTable.registerTempTable("tweetTable")
println("--Tweet table Schema---")
tweetTable.printSchema()
println("Sample Tweet Text-")

sqlContext.sql("SELECT text FROM tweetTable LIMIT 
10").collect().foreach(println)



println("--Sample Lang, Name, text---")
sqlContext.sql("SELECT user.lang, user.name, text FROM tweetTable LIMIT 
1000").collect().foreach(println)
println("--Total count by languages Lang, count(*)---")
sqlContext.sql("SELECT user.lang, COUNT(*) as cnt FROM tweetTable GROUP BY 
user.lang ORDER BY cnt DESC LIMIT 25").collect.foreach(println)
println("--- Training the model and persist it")
val texts = sqlContext.sql("SELECT text from tweetTable").map(_.head.toString)
// Cache the vectors RDD since it will be used for all the KMeans iterations.
val vectors = texts.map(Utils.featurize).cache()
vectors.count() // Calls an action on the RDD to populate the vectors cache.
val model = KMeans.train(vectors, numClusters, numIterations)
sc.makeRDD(model.clusterCenters, numClusters).saveAsObjectFile(outputModelDir)
val some_tweets = texts.take(100)
println("Example tweets from the clusters")
for (i <- 0 until numClusters) {
println(s"\nCLUSTER $i:")
some_tweets.foreach { t =>
if (model.predict(Utils.featurize(t)) == i) {
println(t)
}
}
}
}
}

Thanks & Regards
Jishnu Menath Prathap





RE: How to integrate Spark with OpenCV?

2015-01-14 Thread jishnu.prathap
Hi Akhil,
Thanks for the response.
Our use case is object detection in multiple videos. It's essentially searching 
for an image in a video by matching the image against all the frames of the 
video. I am able to do it in normal Java code using the OpenCV library now, but 
I don't think that is scalable to the extent of thousands of large videos. So I 
thought we could leverage the distributed computing and performance of Spark if 
possible.
I could see Jaonary Rabarisoa has tried to use OpenCV with Spark: 
http://apache-spark-user-list.1001560.n3.nabble.com/Getting-started-using-spark-for-computer-vision-and-video-analytics-td1551.html.
But I don't have any code reference on how to do it with OpenCV.
In case any image/video processing library works better with Spark, please let 
me know. Any help would be really appreciated.
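
For what it's worth, a very rough sketch of the shape I have in mind (only a sketch: it assumes the OpenCV native libraries and Java bindings are installed on every worker, that the videos sit on a path every worker can reach, and videoContainsObject is a hypothetical helper whose body would wrap the usual VideoCapture/Mat/template-matching calls):

import org.apache.spark.{SparkConf, SparkContext}

object VideoObjectSearch {

  // Hypothetical helper: open the video with OpenCV, walk its frames as Mat
  // objects and match them against the query image. Its body depends on the
  // OpenCV version, so it is only sketched here.
  def videoContainsObject(videoPath: String, queryImagePath: String): Boolean = ???

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("VideoObjectSearch"))

    // One video path per line, on shared storage (assumption).
    val videoPaths = sc.textFile("videoPaths.txt")

    val hits = videoPaths.mapPartitions { paths =>
      // Load the OpenCV native library inside the task, once per partition.
      System.loadLibrary(org.opencv.core.Core.NATIVE_LIBRARY_NAME)
      paths.filter(p => videoContainsObject(p, "/shared/query.png"))
    }

    hits.collect().foreach(println)
    sc.stop()
  }
}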
Thanks & Regards
Jishnu Menath Prathap
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Wednesday, January 14, 2015 12:35 PM
To: Jishnu Menath Prathap (WT01 - BAS)
Cc: user@spark.apache.org
Subject: Re: How to integrate Spark with OpenCV?

I didn't play with OpenCV yet, but I was just wondering about your use case. 
What exactly are you trying to do?

Thanks
Best Regards

Jishnu Prathap <jishnu.prat...@wipro.com> wrote:

Hi, can someone suggest any video + image processing library which works well with 
Spark? Currently I am trying to integrate OpenCV with Spark. I am relatively new 
to both Spark and OpenCV. It would really help me if someone could share some 
sample code on how to use Mat, IplImage and Spark RDDs together. Any help would 
be really appreciated. Thanks in advance!!
