Re: Historical Data as Stream

2014-05-17 Thread Soumya Simanta
@Laeeq - please see this example: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala#L47-L49 On Sat, May 17, 2014 at 2:06 PM, Laeeq Ahmed wrote: > @Soumya Simanta > > Right now it's just a proof of concept. Later I will ha

Re: Historical Data as Stream

2014-05-17 Thread Laeeq Ahmed
@Soumya Simanta Right now it's just a proof of concept. Later I will have a real stream. The data are EEG files of the brain. Later it can be used for real-time analysis of EEG streams. @Mayur Yes, the size is huge, so it's better to do this in a distributed manner, and as I said above I want to read as stream becaus

Re: Historical Data as Stream

2014-05-17 Thread Mayur Rustagi
The real question is why you are looking to consume the file as a stream:
1. It is too big to load as an RDD.
2. You need to operate on it in a sequential manner.
Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi
On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta wrot

Re: Historical Data as Stream

2014-05-16 Thread Soumya Simanta
A file is just a stream with a fixed length. Usually streams don't end, but in this case it would. On the other hand, if you read your file as a stream, you may not be able to use the entire data in the file for your analysis. Spark (given enough memory) can process large amounts of data quickly. > On M
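
To illustrate the point about a file being a stream with a fixed length, here is a minimal sketch in plain Python (not Spark). The `batches` helper, the batch size, and the six-sample `recorded` input are all hypothetical and chosen only for illustration. It shows that a computation run per batch sees only a slice of the data, while loading the whole file makes every sample available at once.

```python
import io
from itertools import islice
from typing import Iterator, List, TextIO


def batches(source: TextIO, batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of lines until the file is exhausted."""
    while True:
        batch = [line.rstrip("\n") for line in islice(source, batch_size)]
        if not batch:
            return  # unlike a live stream, a file eventually ends
        yield batch


# Hypothetical stand-in for a recorded file: six samples, one per line.
recorded = io.StringIO("1\n2\n3\n4\n5\n6\n")

# Streaming view: each batch only sees its own slice of the data.
batch_means = [sum(map(float, b)) / len(b) for b in batches(recorded, 4)]

# Whole-file view: every sample is available to a single computation.
recorded.seek(0)
samples = [float(line) for line in recorded]
overall_mean = sum(samples) / len(samples)

print(batch_means)
print(overall_mean)
```

The per-batch means differ from the overall mean, which is the trade-off mentioned above: an analysis that needs the entire recording is easier to express over the full dataset than over a sequence of batches.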