What if the window is 5 seconds and the file takes longer than 5 seconds to be scanned completely? Will it still attempt to load the whole file?
On Mon, Nov 10, 2014 at 6:24 PM, Soumitra Kumar <kumar.soumi...@gmail.com> wrote:

> Entire file in a window.
>
> On Mon, Nov 10, 2014 at 9:20 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
>
>> Hi,
>>
>> In my application I am doing something like this "new
>> StreamingContext(sparkConf, Seconds(10)).textFileStream("logs/")", and I
>> get some unknown exceptions when I copy a file with about 800 MB to that
>> folder ("logs/"). I have a single worker running with 512 MB of memory.
>>
>> Anyone can tell me if every 10 seconds spark reads parts of that big
>> file, or if it attempts to read the entire file in a single window? How
>> does it work?
>>
>> Thanks.