Hi, I'm trying to join a DStream with an interval of, say, 20s against an RDD loaded from an HDFS folder whose contents change periodically; say, a new file arrives in the folder every 10 minutes.
How should this be done, given that files in the HDFS folder are periodically changed or added? Does an RDD automatically detect changes in the HDFS folder it was loaded from and reload itself? Thanks! Rendy