Hi,

On Wed, Dec 3, 2014 at 5:31 PM, Bahubali Jain <bahub...@gmail.com> wrote:
>
> I am trying to use textFileStream("some_hdfs_location") to pick up new files
> from an HDFS location. I am seeing some pretty strange behavior though.
> textFileStream() is not detecting new files when I "move" them from a
> location within HDFS to the location that textFileStream() is checking for
> new files.
> But when I copy files from a location in the Linux filesystem to HDFS,
> textFileStream() does detect the new files.
>
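Is it possible that the modification timestamp of the moved files is actually older than those of previously processed files? I think only "new" files are picked up. A move within HDFS is a rename, which keeps the file's original modification time, whereas a copy from the local filesystem creates a new file with a current timestamp. Try moving the file and then setting its timestamp to now() to see if it makes a difference.

Tobias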
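
The move-then-touch suggestion can be tried out on a local filesystem first. The sketch below is only an analogy, not HDFS code: it uses Python's standard library, and the names `move_and_touch` and `demo` as well as the file names are made up for illustration. It shows that a plain move keeps the old modification time and that an explicit touch afterwards resets it to the current time. On HDFS the corresponding step would presumably be Hadoop's `FileSystem.setTimes` after the rename; whether that alone makes textFileStream() pick up the file is an assumption to verify.

```python
import os
import shutil
import tempfile
import time


def move_and_touch(src, dst):
    """Move src to dst, then set dst's timestamps to the current time."""
    shutil.move(src, dst)
    os.utime(dst, None)  # None means "use the current time"
    return os.path.getmtime(dst)


def demo():
    """Return (mtime after a plain move, mtime after move_and_touch)."""
    workdir = tempfile.mkdtemp()
    try:
        staging = os.path.join(workdir, "staging.txt")  # hypothetical names
        moved = os.path.join(workdir, "moved.txt")
        watched = os.path.join(workdir, "watched.txt")

        with open(staging, "w") as f:
            f.write("hello\n")

        # Pretend the file was written an hour ago.
        one_hour_ago = time.time() - 3600
        os.utime(staging, (one_hour_ago, one_hour_ago))

        # A plain move keeps the old modification time.
        shutil.move(staging, moved)
        mtime_after_move = os.path.getmtime(moved)

        # Moving and then touching makes the file look new.
        mtime_after_touch = move_and_touch(moved, watched)
        return mtime_after_move, mtime_after_touch
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    before, after = demo()
    print("age after plain move (s):", round(time.time() - before))
    print("age after move + touch (s):", round(time.time() - after))
```

If the first age is about an hour and the second is close to zero, the same effect on HDFS would be consistent with moved files being skipped as "old".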