What atomicity/visibility guarantees are made for the various operations on a FileSystem? I have multiple processes that write into local sequence files, then upload them into a remote directory in HDFS. A map/reduce job runs which operates on whatever is in the directory. The processes are not synchronized with the job, so it is entirely possible that the job starts while a file is still being uploaded. My concern is that the job may pick up a partially uploaded file if "FileSystem.copyFromLocalFile" is not atomic (in the sense that the file does not appear in the directory until all of its bytes are written).
Are any of the FileSystem APIs atomic in this sense? What about, at the very least, rename (e.g. first write to a temp HDFS location, then use rename to atomically flip the file into the live directory)?

Thanks,
Brian
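
P.S. For concreteness, here is a minimal sketch of the temp-then-rename pattern I have in mind. It is written against the local filesystem with java.nio.file, not HDFS, so it only illustrates the shape of the idea. The class name AtomicPublish, the method publish, and the "_staging" directory are hypothetical names made up for this example. On HDFS I would expect to swap in FileSystem.copyFromLocalFile for the copy and FileSystem.rename for the move, but whether that rename gives the same all-or-nothing visibility is exactly what I am asking.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicPublish {

    /**
     * Copies src into liveDir so that a reader listing liveDir should never
     * see a partially written file. stagingDir is a hypothetical scratch
     * directory; it must live on the same filesystem as liveDir for the
     * move to be atomic.
     */
    public static Path publish(Path src, Path stagingDir, Path liveDir) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(liveDir);

        // Step 1 (slow): write all bytes into the staging directory,
        // which the job never reads.
        Path staged = stagingDir.resolve(src.getFileName());
        Files.copy(src, staged, StandardCopyOption.REPLACE_EXISTING);

        // Step 2 (fast): move the finished file into the live directory.
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException if the
        // filesystem cannot perform the move atomically.
        Path live = liveDir.resolve(src.getFileName());
        return Files.move(staged, live, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Throwaway directories and a dummy file, purely for demonstration.
        Path root = Files.createTempDirectory("publish-demo");
        Path src = Files.write(root.resolve("part-0001.seq"),
                "example payload".getBytes(StandardCharsets.UTF_8));

        Path live = publish(src, root.resolve("_staging"), root.resolve("live"));
        System.out.println("published " + live);
    }
}
```

The point of the two steps is that the long-running copy happens somewhere the job does not look, and the only operation that touches the live directory is a single move.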
