Hi Guys,

Quick verification question:

Spark's methods such as textFile(…) and sequenceFile(…) support wildcards. 
So if I have a directory structure of the form "hdfs:///data/year/month/day" (e.g. 
"hdfs:///data/2015/12/17"), then it's possible to crawl a whole year of data 
consisting of sequence files with 
"sparkContext.sequenceFile('hdfs:///data/*/*/*/*.seq')", correct?

Looking at our results, it seems to work as described above.
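
For reference, here is a minimal sketch of how such glob paths could be built. Note that the pattern 'hdfs:///data/*/*/*/*.seq' matches every year under /data, so restricting the read to a single year means pinning the first component. The helper name seq_glob is hypothetical (not part of Spark), and the sketch assumes zero-padded month and day directories and a ".seq" file extension, as in the example path above.

```python
def seq_glob(year="*", month="*", day="*", root="hdfs:///data"):
    """Build a glob for the year/month/day layout; unset parts stay as '*'."""
    parts = []
    for value, width in ((year, 4), (month, 2), (day, 2)):
        # Zero-pad integers so 3 becomes "03"; pass strings such as "*" through.
        parts.append(value if isinstance(value, str) else str(value).zfill(width))
    return "/".join([root] + parts + ["*.seq"])


all_data = seq_glob()             # every year, month and day
year_2015 = seq_glob(year=2015)   # only the 2015 subtree
one_day = seq_glob(2015, 12, 17)  # a single day

# With a live SparkContext named `sc` (assumed here, not created by this sketch):
# rdd = sc.sequenceFile(year_2015)
```

The resulting string would be passed to sequenceFile(…) unchanged; the wildcard expansion itself happens when the path is resolved against HDFS, not in this helper.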

Thanks,
Mark
