Hi guys, quick verification question:
Spark methods like textFile(…) and sequenceFile(…) support wildcards. So if I have a directory structure of the form "hdfs:///data/year/month/day" (e.g. "hdfs:///data/2015/12/17"), it should be possible to crawl a whole year of data consisting of sequence files with "sparkContext.sequenceFile('hdfs:///data/*/*/*/*.seq')", correct? Looking at our results, it seems to be working as described above.

Thanks,
Mark
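
P.S. For anyone who wants to sanity-check the pattern shape without a cluster, here is a minimal local sketch. It is only an analogy: it uses Python's standard glob module on a temporary directory as a stand-in for the HDFS layout, and Spark itself resolves input paths through Hadoop's own glob matching, which is a separate implementation. The helper names (make_tree, match_pattern, demo) and the part-00000.seq file name are made up for illustration. What it shows is that each "*" covers exactly one path level, so the pattern needs one "*" per directory level plus one for the file name.

```python
import glob
import os
import tempfile


def make_tree(root, days):
    """Create empty placeholder .seq files under root/year/month/day/."""
    for year, month, day in days:
        day_dir = os.path.join(root, year, month, day)
        os.makedirs(day_dir, exist_ok=True)
        # Hypothetical file name; real sequence files would hold data.
        open(os.path.join(day_dir, "part-00000.seq"), "w").close()


def match_pattern(root, pattern):
    """Return the sorted paths (relative to root) that match the pattern."""
    hits = glob.glob(os.path.join(root, pattern))
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in hits)


def demo():
    """Build a small year/month/day tree and try three patterns against it."""
    with tempfile.TemporaryDirectory() as root:
        make_tree(root, [("2015", "12", "17"),
                         ("2015", "12", "18"),
                         ("2014", "01", "05")])
        # One "*" per directory level, then "*.seq" for the files.
        all_days = match_pattern(root, "*/*/*/*.seq")
        # Pinning the first level restricts the match to a single year.
        only_2015 = match_pattern(root, "2015/*/*/*.seq")
        # One level short: "*" does not span directory separators here.
        too_shallow = match_pattern(root, "*/*/*.seq")
    return all_days, only_2015, too_shallow


if __name__ == "__main__":
    for label, result in zip(("all", "2015 only", "too shallow"), demo()):
        print(label, result)
```

In this local sketch the all-"*" pattern picks up every year under the root, while fixing the first component to 2015 limits the match to that year. If the same holds on HDFS, 'hdfs:///data/2015/*/*/*.seq' would be the form to use when only one year is wanted.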