In which deployment environment are you running the streaming programs? Standalone? In that case you have to specify the max cores for each application; otherwise all of the cluster's resources may get consumed by a single application. http://spark.apache.org/docs/latest/spark-standalone.html
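For example, in code (a minimal sketch, assuming a standalone cluster; the app name and the cap of 2 cores are illustrative, so size the cap to your cluster):

import org.apache.spark.SparkConf

// Cap the total cores this application may take on the standalone
// cluster, so the scheduler has cores left over for a second app.
// Without a cap, a standalone app grabs all available cores by default.
val conf = new SparkConf()
  .setAppName("WordCountsWriter") // hypothetical app name
  .set("spark.cores.max", "2")    // illustrative cap

The same cap can also be set at submit time with spark-submit's --total-executor-cores option on a standalone cluster.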
TD

On Thu, Jul 24, 2014 at 4:57 PM, Barnaby <bfa...@outlook.com> wrote:
> I have the streaming program writing sequence files. I can find one of the
> files and load it in the shell using:
>
> scala> val rdd = sc.sequenceFile[String,
> Int]("tachyon://localhost:19998/files/WordCounts/20140724-213930")
> 14/07/24 21:47:50 INFO storage.MemoryStore: ensureFreeSpace(32856) called
> with curMem=0, maxMem=309225062
> 14/07/24 21:47:50 INFO storage.MemoryStore: Block broadcast_0 stored as
> values to memory (estimated size 32.1 KB, free 294.9 MB)
> rdd: org.apache.spark.rdd.RDD[(String, Int)] = MappedRDD[1] at sequenceFile
> at <console>:12
>
> So I got some type information, seems good.
>
> It took a while to research but I got the following streaming code to
> compile and run:
>
> val wordCounts = ssc.fileStream[String, Int, SequenceFileInputFormat[String,
> Int]](args(0))
>
> It works now and I offer this for reference to anybody else who may be
> curious about saving sequence files and then streaming them back in.
>
> Question:
> When running both streaming programs at the same time using spark-submit I
> noticed that only one app would really run. To get the one app to continue I
> had to stop the other app. Is there a way to get these running
> simultaneously?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/streaming-sequence-files-tp10557p10620.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
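For anyone following along, here is a minimal, self-contained sketch of the reading side. The fileStream call is taken from Barnaby's snippet above; the object name, 10-second batch interval, core cap, and print() output action are assumptions for illustration:

import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SequenceFileStreamReader {
  def main(args: Array[String]): Unit = {
    // Cap cores so the writing app can run on the same standalone
    // cluster at the same time (see the reply above).
    val conf = new SparkConf()
      .setAppName("SequenceFileStreamReader") // hypothetical name
      .set("spark.cores.max", "2")            // illustrative cap
    val ssc = new StreamingContext(conf, Seconds(10)) // assumed interval

    // Watch the directory in args(0) for new sequence files of
    // (String, Int) pairs, as in the snippet quoted above.
    val wordCounts = ssc.fileStream[String, Int,
      SequenceFileInputFormat[String, Int]](args(0))

    wordCounts.print() // assumed output action, for demonstration
    ssc.start()
    ssc.awaitTermination()
  }
}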