Re: unserializable object in Spark Streaming context

2014-07-18 Thread Gino Bustelo
I get TD's recommendation of sharing a connection among tasks. Now, is there a good way to determine when to close connections? Gino B. > On Jul 17, 2014, at 7:05 PM, Yan Fang wrote: > > Hi Sean, > > Thank you. I see your point. What I was thinking is that, do computation in a > distributed
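
One possible answer to the "when to close" question is reference counting: open the connection lazily on first use and close it when the last holder lets go. The sketch below is only an illustration under that assumption. `SharedResource`, `acquire`, `release`, and `isOpen` are hypothetical names, not Spark APIs, and nothing here is taken from the thread itself. Because static state lives per JVM, each executor would end up with its own instance.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Spark API): one lazily created resource per JVM,
 * shared by every caller and closed when the last caller releases it.
 */
final class SharedResource<T extends AutoCloseable> {
    private final Supplier<T> factory; // how to open the connection
    private T instance;                // null while the resource is closed
    private int users;                 // callers currently holding the resource

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Opens the resource on first use, then hands out the same instance. */
    synchronized T acquire() {
        if (instance == null) {
            instance = factory.get();
        }
        users++;
        return instance;
    }

    /** Closes the resource once nobody holds it any more. */
    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without acquire()");
        }
        users--;
        if (users == 0) {
            T toClose = instance;
            instance = null;
            try {
                toClose.close();
            } catch (Exception e) {
                throw new IllegalStateException("close failed", e);
            }
        }
    }

    /** True while at least one caller holds the resource. */
    synchronized boolean isOpen() {
        return instance != null;
    }
}
```

A task would call `acquire()` before writing and `release()` in a `finally` block. The trade-off is that the connection is reopened whenever tasks do not overlap; closing after an idle timeout instead would avoid that at the cost of a background timer.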

Re: jackson-core-asl jar (1.8.8 vs 1.9.x) conflict with the spark-sql (version 1.x)

2014-06-27 Thread Gino Bustelo
Hit this same problem yesterday. My fix might not be ideal for you, but we were able to get rid of the error by turning off annotation deserialization in ObjectMapper. Gino B. > On Jun 27, 2014, at 2:58 PM, M Singh wrote: > > Hi: > > I am using spark to stream data to cassandra and it works fine in lo

Re: problem about cluster mode of spark 1.0.0

2014-06-24 Thread Gino Bustelo
nd > until we fix standalone-cluster mode through spark-submit. > > I have filed the relevant issues: > https://issues.apache.org/jira/browse/SPARK-2259 and > https://issues.apache.org/jira/browse/SPARK-2260. Thanks for pointing this > out, and we will get to fixing these shortly. > >

Re: problem about cluster mode of spark 1.0.0

2014-06-20 Thread Gino Bustelo
I've found that the jar will be copied to the worker from HDFS fine, but it is not added to the Spark context for you. You have to know that the jar will end up in the driver's working dir, and so you just add the file name of the jar to the context in your program. In your example below, ju

Re: spark streaming, kafka, SPARK_CLASSPATH

2014-06-17 Thread Gino Bustelo
> ) >>> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) => >>> { >>> case x if x.startsWith("META-INF/ECLIPSEF.RSA") => MergeStrategy.last >>>

Re: spark streaming, kafka, SPARK_CLASSPATH

2014-06-16 Thread Gino Bustelo
+1 for this issue. The documentation for spark-submit is misleading. Among many issues, the jar support is bad. HTTP URLs do not work. This is because Spark is using Hadoop's FileSystem class. You have to specify the jars twice to get things to work. Once for the DriverWrapper to load your classes

Re: Master not seeing recovered nodes("Got heartbeat from unregistered worker ....")

2014-06-13 Thread Gino Bustelo
I get the same problem, but I'm running in a dev environment based on docker scripts. The additional issue is that the worker processes do not die and so the docker container does not exit. So I end up with worker containers that are not participating in the cluster. On Fri, Jun 13, 2014 at 9:44

Re: Best practise for 'Streaming' dumps?

2014-06-08 Thread Gino Bustelo
>> I was almost going to use HBase or Hive, but they seem to have been >> deprecated in 1.0.0? Or just late to the party? >> >> Also, I've been having trouble deleting Hadoop directories.. the old "two >> line" examples don't seem to work anymore. I actually

Re: Best practise for 'Streaming' dumps?

2014-06-07 Thread Gino Bustelo
Have you thought of using window? Gino B. > On Jun 6, 2014, at 11:49 PM, Jeremy Lee > wrote: > > > It's going well enough that this is a "how should I in 1.0.0" rather than > "how do i" question. > > So I've got data coming in via Streaming (twitters) and I want to archive/log > it all. It

Re: New user streaming question

2014-06-07 Thread Gino Bustelo
I would make sure that your workers are running. It is very difficult to tell from the console dribble if you just have no data or the workers just disassociated from masters. Gino B. > On Jun 6, 2014, at 11:32 PM, Jeremy Lee > wrote: > > Yup, when it's running, DStream.print() will print o

Re: Interactive modification of DStreams

2014-06-03 Thread Gino Bustelo
Thanks for the reply. Are there plans to allow these runtime interactions with a DStream context? On the surface they seem doable. What is preventing this from working? Also... I implemented the modifiable windowdstream and it seemed to work well. Thanks for the pointer. Gino B. > On Jun 2, 2014

Re: how to construct a ClassTag object as a method parameter in Java

2014-06-03 Thread Gino Bustelo
A better way seems to be to use ClassTag$.apply(Class). I'm going by memory since I'm on my phone, but I just did that today. Gino B. > On Jun 3, 2014, at 11:04 AM, Michael Armbrust wrote: > > Ah, this is a bug that was fixed in 1.0. > > I think you should be able to work around it by using

Re: Akka Connection refused - standalone cluster using spark-0.9.0

2014-05-28 Thread Gino Bustelo
I've been playing with the amplab docker scripts and I needed to set spark.driver.host to the driver host IP, one that all Spark processes can reach. > On May 28, 2014, at 4:35 AM, jaranda wrote: > > Same here, got stuck at this point. Any hints on what might be going on? > > > > -- > Vie