You must start the StreamingContext by calling ssc.start().

On Thu, May 28, 2015 at 6:57 PM, Animesh Baranawal <animeshbarana...@gmail.com> wrote:
> Hi,
>
> I am trying to extract the filenames from which a Dstream is generated by
> parsing the toDebugString method on RDD
> I am implementing the following code in spark-shell:
>
> import org.apache.spark.streaming.{StreamingContext, Seconds}
> val ssc = new StreamingContext(sc,Seconds(10))
> val lines = ssc.textFileStream(// directory //)
>
> def g : List[String] = {
>   var res = List[String]()
>   lines.foreachRDD{ rdd => {
>     if(rdd.count > 0){
>       val files = rdd.toDebugString.split("\n").filter(_.contains(":\"))
>       files.foreach{ ms => {
>         res = ms.split(" ")(2)::res
>       }}
>     }
>   }}
>   res
> }
>
> g.foreach(x => {println(x); println("************")})
>
> However when I run the code, nothing gets printed on the console apart
> from the logs. Am I doing something wrong?
> And is there any better way to extract the file names from DStream ?
>
> Thanks in advance
>
> Animesh

--
Sourav Chandra
Senior Software Engineer
sourav.chan...@livestream.com
o: +91 80 4121 8723
m: +91 988 699 3746
skype: sourav.chandra

Livestream
"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd Block,
Koramangala Industrial Area, Bangalore 560034
www.livestream.com
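
P.S. To expand on the ssc.start() point, here is a minimal sketch of how the quoted snippet could be rearranged. It is not tested against your setup and rests on a few assumptions: it runs in spark-shell (so `sc` already exists), `inputDir` is a hypothetical placeholder for your directory, and the exact layout of toDebugString is an implementation detail that may differ between Spark versions. The two changes are that the printing happens inside foreachRDD (that function runs on the driver once per batch, whereas `g` returns `res` before any batch has been processed) and that the context is started after the output operation is registered.

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumption: running in spark-shell, where `sc` is already defined.
val ssc = new StreamingContext(sc, Seconds(10))

// Hypothetical placeholder; substitute the directory you are monitoring.
val inputDir = "/path/to/input"
val lines = ssc.textFileStream(inputDir)

// foreachRDD only registers an output operation. The function body runs on
// the driver, once per batch interval, and only after ssc.start() is called.
lines.foreachRDD { rdd =>
  if (rdd.count > 0) {
    // Print the lineage for this batch. Where (and whether) file paths appear
    // in this string depends on the Spark version, so inspect the output
    // before writing a parser for it.
    rdd.toDebugString.split("\n").foreach { line =>
      println(line)
      println("************")
    }
  }
}

// Nothing above executes until the context is started.
ssc.start()

// In a standalone application you would normally also block here:
// ssc.awaitTermination()
```

Note that textFileStream only picks up files that appear in the directory after the context has started, so you need to move a new file in to see any output.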