Help with publishing to Kafka from Spark Streaming?

2015-05-01 Thread Pavan Sudheendra
Link to the question: http://stackoverflow.com/questions/29974017/spark-kafka-producer-not-serializable-exception Thanks for any pointers.
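
The usual cause behind that exception is closing over a Kafka producer created on the driver; producers are not serializable, so a common fix is to create one inside foreachPartition on the executors. A minimal sketch along those lines — the topic, broker address, and stream variable are assumptions, not taken from the original post:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    // Hypothetical sketch: build the producer on the executor, once per partition,
    // instead of shipping a non-serializable producer from the driver.
    dstream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092") // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)
        records.foreach { rec =>
          producer.send(new ProducerRecord[String, String]("output-topic", rec.toString))
        }
        producer.close()
      }
    }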

Re: Spark on Mesos

2015-05-01 Thread Tim Chen
Hi Stephen, It looks like the Mesos slave was most likely not able to launch some Mesos helper processes (the fetcher, probably?). How did you install Mesos? Did you build from source yourself? Please install Mesos through a package, or if building from source, actually run make install and run from the installed bina

Fwd: Event generator for SPARK-Streaming from csv

2015-05-01 Thread anshu shukla
I have the real DEBS taxi data in a CSV file. In order to operate over it, how can I simulate a "Spout"-like event generator using the timestamps in the CSV file? -- Thanks & Regards, Anshu Shukla

Re: Enabling Event Log

2015-05-01 Thread James King
Oops! well spotted. Many thanks Shixiong. On Fri, May 1, 2015 at 1:25 AM, Shixiong Zhu wrote: > "spark.history.fs.logDirectory" is for the history server. For Spark > applications, they should use "spark.eventLog.dir". Since you commented out > "spark.eventLog.dir", it will be "/tmp/spark-events
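
For reference, a minimal sketch of the application-side settings being discussed, assuming a shared HDFS directory for the event logs (the path is an assumption, not from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    // The application writes its event log here; the history server reads the same
    // directory via spark.history.fs.logDirectory.
    val conf = new SparkConf()
      .setAppName("event-log-example")
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "hdfs:///spark-events") // assumed shared directory
    val sc = new SparkContext(conf)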

Spark worker error on standalone cluster

2015-05-01 Thread Michael Ryabtsev (Totango)
Hi everyone, I have a Spark application that works fine on a standalone Spark cluster that runs on my laptop (master and one worker), but fails when I try to run it on a standalone Spark cluster deployed on EC2 (master and worker are on different machines). The application structure goes in the

Re: Event generator for SPARK-Streaming from csv

2015-05-01 Thread Juan Rodríguez Hortalá
Hi, Maybe you could use streamingContext.fileStream like in the example from https://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers, you can read "from files on any file system compatible with the HDFS API (that is, HDFS, S3, NFS, etc.)". You could split
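
A minimal sketch of that approach, assuming the CSV files are dropped into a monitored directory (the path and batch interval are assumptions):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Monitor a directory for newly arrived CSV files and split each line into fields.
    val conf = new SparkConf().setAppName("csv-file-stream")
    val ssc = new StreamingContext(conf, Seconds(5))
    val lines = ssc.textFileStream("hdfs:///data/taxi/incoming") // assumed drop-in directory
    val events = lines.map(_.split(","))
    events.print()
    ssc.start()
    ssc.awaitTermination()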

Re: how to pass configuration properties from driver to executor?

2015-05-01 Thread Michael Ryabtsev
Hi, We've had a similar problem, but with a log4j properties file. The only working way we found was to externally deploy the properties file to the Spark conf folder on the worker machine and configure the executor JVM options with: sparkConf.set("spark.executor.extraJavaOptions", "-Dlog4j.c
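
A minimal sketch of the configuration described above, assuming the properties file has already been copied to the same path on every worker (the path itself is an assumption):

    import org.apache.spark.SparkConf

    // Point executor JVMs at a log4j.properties file deployed on each worker machine.
    val sparkConf = new SparkConf()
      .setAppName("custom-executor-logging")
      .set("spark.executor.extraJavaOptions",
           "-Dlog4j.configuration=file:/opt/spark/conf/log4j.properties") // assumed deployed path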

NullPointerException with Avro + Spark.

2015-05-01 Thread ๏̯͡๏
I have this Spark app that simply needs to do a regular join between two datasets. It works fine with a tiny data set (2.5G of input for each dataset). When I run against 25G of each input and with .partitionBy(new org.apache.spark.HashPartitioner(200)), I see a NullPointerException; this trace do
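
For context, a minimal sketch of the shape of job being described: two keyed datasets pre-partitioned with the same HashPartitioner(200) and then joined (the toy data is an assumption):

    import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

    // Two pair RDDs sharing a partitioner, so the join itself avoids an extra shuffle.
    val sc = new SparkContext(new SparkConf().setAppName("partitioned-join"))
    val left  = sc.parallelize(Seq(("a", 1), ("b", 2))).partitionBy(new HashPartitioner(200))
    val right = sc.parallelize(Seq(("a", "x"), ("b", "y"))).partitionBy(new HashPartitioner(200))
    val joined = left.join(right)
    joined.collect().foreach(println)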

Driver memory default setting stops background jobs

2015-05-01 Thread Andreas Marfurt
Hi all, I encountered strange behavior with the driver memory setting, and was wondering if some of you experienced it as well, or know what the problem is. I want to start a Spark job in the background with spark-submit. If I have the driver memory setting in my spark-defaults.conf: spark.driver.

Exiting "driver" main() method...

2015-05-01 Thread James Carman
In all the examples, it seems that the Spark application doesn't really do anything special in order to exit. When I run my application, however, the spark-submit script just "hangs" there at the end. Is there something special I need to do to get that thing to exit normally?
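
One common reason for this kind of hang is a SparkContext that is never stopped (or other non-daemon threads left running). A minimal sketch of a driver main() that stops the context explicitly, with the job body as a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    object MyApp {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("exit-cleanly"))
        try {
          sc.parallelize(1 to 100).sum() // placeholder for the real work
        } finally {
          sc.stop() // stopping the context lets spark-submit return instead of hanging
        }
      }
    }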

Spark Streaming Kafka Avro NPE on deserialization of payload

2015-05-01 Thread Todd Nist
*Resending as I do not see that this made it to the mailing list; sorry if in fact it did and is just not reflected online yet.* I’m very perplexed with the following. I have a set of AVRO-generated objects that are sent to a Spark Streaming job via Kafka. The Spark Streaming job follows the receiver
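
A minimal sketch of the deserialization step in question, assuming the Kafka payload is the raw Avro binary for a specific record; MyAvroRecord is a hypothetical stand-in for one of the AVRO-generated classes mentioned above:

    import org.apache.avro.io.DecoderFactory
    import org.apache.avro.specific.SpecificDatumReader

    // Decode an Avro specific record from the raw Kafka byte payload.
    def decode(bytes: Array[Byte]): MyAvroRecord = {
      val reader = new SpecificDatumReader[MyAvroRecord](MyAvroRecord.getClassSchema)
      val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
      reader.read(null.asInstanceOf[MyAvroRecord], decoder)
    }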

ClassNotFoundException for Kryo serialization

2015-05-01 Thread Akshat Aranya
Hi, I'm getting a ClassNotFoundException at the executor when trying to register a class for Kryo serialization: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstanc
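
A minimal sketch of the kind of registration being attempted, with the class name a hypothetical stand-in; the registered class also has to be on the executor classpath (e.g. shipped via --jars):

    import org.apache.spark.SparkConf

    // Register application classes with Kryo; the class must be visible to executors.
    val conf = new SparkConf()
      .setAppName("kryo-registration")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[com.example.MyRow])) // hypothetical class name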

spark.logConf with log4j.rootCategory=WARN

2015-05-01 Thread roy
Hi, I have recently enabled log4j.rootCategory=WARN, console in the Spark configuration, but after that spark.logConf=true has become ineffective. So I just want to confirm: is this because of log4j.rootCategory=WARN? Thanks

Re: ClassNotFoundException for Kryo serialization

2015-05-01 Thread Ted Yu
bq. Caused by: java.lang.ClassNotFoundException: com.example.Schema$MyRow So the above class is in the jar which was in the classpath ? Can you tell us a bit more about Schema$MyRow ? On Fri, May 1, 2015 at 8:05 AM, Akshat Aranya wrote: > Hi, > > I'm getting a ClassNotFoundException at the exec

Re: ClassNotFoundException for Kryo serialization

2015-05-01 Thread Akshat Aranya
Yes, this class is present in the jar that was loaded in the classpath of the executor Java process -- it wasn't even lazily added as a part of the task execution. Schema$MyRow is a protobuf-generated class. After doing some digging around, I think I might be hitting up against SPARK-5470, the fi

Re: ClassNotFoundException for Kryo serialization

2015-05-01 Thread Akshat Aranya
I cherry-picked the fix for SPARK-5470 and the problem has gone away. On Fri, May 1, 2015 at 9:15 AM, Akshat Aranya wrote: > Yes, this class is present in the jar that was loaded in the classpath > of the executor Java process -- it wasn't even lazily added as a part > of the task execution. Sch