[ https://issues.apache.org/jira/browse/HDFS-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy resolved HDFS-11218.
---------------------------------------
    Resolution: Workaround

The core issue described in this JIRA is no longer a problem. With the fix for HDFS-11402, we have a workaround to capture immutable copies of open files in the snapshots.

> Add option to skip open files during HDFS Snapshots
> ---------------------------------------------------
>
>                 Key: HDFS-11218
>                 URL: https://issues.apache.org/jira/browse/HDFS-11218
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: snapshots
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>
> *Problem:*
> When files are being written while HDFS Snapshots are taken in parallel,
> the Snapshots do capture all these files, but the files being written
> do not have their point-in-time file length captured in the Snapshots.
> At the time of file close, or of any other metadata modification operation on
> a file that was previously open, HDFS reconciles the file length and
> records the modification in the last taken Snapshot. All the previously taken
> Snapshots continue to have the same open file with no modification recorded.
> So, all those previous Snapshots end up using the final modification record
> in the next available Snapshot.
> *Proposal:*
> The HDFS Snapshot design goal was to have O(M) space usage for Snapshots, where M
> is the number of file modifications. So, it would be very expensive to record
> modifications for all the open files in all the Snapshots. For applications
> that do not want to capture incomplete or partially written binary files
> in the Snapshots, it would be preferable to have an extra option to skip open
> files. This way, they don't have to worry about restoring inconsistent files
> from the Snapshots.
> {noformat}
> hdfs dfs -createSnapshot -skipOpenFiles <snapDir> <snapName>
> {noformat}
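
For readers looking for the workaround mentioned in the resolution: the message itself does not spell it out. To the best of my knowledge, HDFS-11402 added a NameNode setting that freezes the length of open files at snapshot time, and it is off by default. A minimal sketch of the hdfs-site.xml entry follows, assuming the property name is dfs.namenode.snapshot.capture.openfiles; verify it against the hdfs-default.xml of your Hadoop release before relying on it.

```xml
<!-- hdfs-site.xml on the NameNode.
     Assumption: property name as introduced by HDFS-11402; it is not stated
     in this message. Check hdfs-default.xml for your release. -->
<property>
  <name>dfs.namenode.snapshot.capture.openfiles</name>
  <!-- true: record the point-in-time length of files open for write
       when a snapshot is taken; false (default): the old behaviour
       described in the Problem section above. -->
  <value>true</value>
</property>
```

Note that the -skipOpenFiles flag proposed in the description was never added; with the setting above, the regular hdfs dfs -createSnapshot <snapDir> <snapName> command is used unchanged.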