For HDFS, there is the inotify mechanism:
https://issues.apache.org/jira/browse/HDFS-6634
https://www.slideshare.net/Hadoop_Summit/keep-me-in-the-loop-inotify-in-hdfs

FYI

On Wed, Aug 16, 2017 at 9:41 AM, Conrad Crampton <conrad.cramp...@secdata.com> wrote:

> Hi,
>
> I have a simple text file stored in HDFS which I use in a
> RichFilterFunction by way of a DistributedCache file. The file is
> externally edited periodically to have other lines added to it. My
> FilterFunction also implements Runnable, whose run method is scheduled
> via scheduleAtFixedRate on a ScheduledExecutorService; it reloads the
> file and stores the results in a List in the Filter class.
>
> I have realized the error of my ways: the file that is reloaded is the
> cached file that was copied to a temporary file location on the node
> where this instance of the Filter class is loaded, and not the file from
> HDFS directly (as this was copied when the Flink job started).
>
> Can anyone suggest a solution to this? It is, I think, a similar problem
> to the one the Add Side Inputs in Flink [1] proposal is trying to
> address, but that is not finalized yet.
>
> Can anyone see a problem if I have a thread that reloads the HDFS file
> in the main body of my Flink program and registers the cache file
> within that reload process, e.g.
>
>     env.registerCachedFile(properties.getProperty("whitelist.location"), WHITELIST);
>
> i.e. does this actually copy the file again from HDFS to temporary files
> on each node? I think I'd still need the same schedule I have currently
> to reload within my Filter function, though, as all the previous process
> would do is push the HDFS file to the temp location and not actually
> refresh my List.
>
> Any suggestions would be welcome.
>
> Thanks
> Conrad
>
> [1] https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit#heading=h.pqg5z6g0mjm7
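
Independent of inotify, the periodic reload described in the quoted message could be kept but pointed at the source file instead of the DistributedCache copy. Below is a minimal sketch of that idea, not a tested Flink integration. The class name ReloadingWhitelist and the loader callback are hypothetical; the assumption is that the loader reads the HDFS path directly (for example through Flink's org.apache.flink.core.fs.FileSystem or the Hadoop client) and returns the file's lines.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a whitelist that is re-read from its source on a fixed schedule. */
class ReloadingWhitelist implements AutoCloseable {

    // Assumption: the loader reads the HDFS file itself, not the cached temp copy.
    private final Callable<List<String>> loader;

    // Daemon thread so a forgotten close() does not keep the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "whitelist-reloader");
                t.setDaemon(true);
                return t;
            });

    // Swapped atomically on reload; readers always see a complete snapshot.
    private volatile Set<String> entries = Collections.emptySet();

    ReloadingWhitelist(Callable<List<String>> loader) {
        this.loader = loader;
    }

    /** Loads once synchronously, then again every periodSeconds. */
    void start(long periodSeconds) {
        reload();
        scheduler.scheduleAtFixedRate(this::reload, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Replaces the snapshot; on failure the previous snapshot is kept. */
    void reload() {
        try {
            entries = Collections.unmodifiableSet(new HashSet<>(loader.call()));
        } catch (Exception e) {
            // Catch everything: an exception escaping here would cancel the schedule.
            System.err.println("Whitelist reload failed, keeping previous entries: " + e);
        }
    }

    boolean contains(String value) {
        return entries.contains(value);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a RichFilterFunction, the natural places would be open() to create the helper and call start(...), filter() to call contains(...), and close() to shut the scheduler down. Each parallel instance would then poll HDFS on its own, so the reload period should be chosen with the NameNode load in mind.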