[ https://issues.apache.org/jira/browse/HDFS-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo (Nicholas), SZE resolved HDFS-4529.
------------------------------------------

       Resolution: Fixed
    Fix Version/s: Snapshot (HDFS-2802)
     Hadoop Flags: Reviewed

Created HDFS-4704 for implementing (4).

I have committed this.

> Decide the semantic of concat with snapshots
> --------------------------------------------
>
>                 Key: HDFS-4529
>                 URL: https://issues.apache.org/jira/browse/HDFS-4529
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: Snapshot (HDFS-2802)
>
>         Attachments: h4529_20130415.patch, h4529_20130416.patch
>
>
> The use case for concat is copying large files across clusters using the
> following steps.
> - Step 1: The blocks of a file in the source cluster are copied in parallel
> to transient files in the destination cluster.
> - Step 2: The transient files in the destination cluster are then
> concatenated in order to obtain the original file.
> If a snapshot is taken in the destination cluster before Step 2, some
> transient files may be captured in the snapshot. Then what should happen?
> The following are some alternatives:
> * (1) fail concat and keep the transient files in the snapshots;
> * (2) allow concat and keep the transient files in the snapshots;
> * (3) allow concat but remove the transient files from all snapshots.
> None of the solutions above is perfect. Here are their drawbacks:
> For (1) and (2), the transient files will remain in the system until the
> snapshots are deleted. This is inefficient for the system since the files
> are known to be transient. (1) may be able to force users to create files
> under some non-snapshottable tmp directory in the first place. However, it
> complicates user applications, and existing applications may need to be
> updated for the new policy. Also, a non-snapshottable directory may not
> exist, since an admin may set the system root directory to be snapshottable.
> For (3), the problem is that it seems to break the read-only snapshot
> contract: some files that appear in a snapshot may disappear later on.
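
For illustration only, the three alternatives listed in the description can be sketched with a toy in-memory model. Everything here is hypothetical: ToyNamespace, its methods, the demo helper, and the policy names "fail", "keep", and "remove" are made up for this sketch and are not HDFS APIs. The model assumes, as a simplification, that a snapshot is just a copy of the file-to-blocks map and that the concat target already exists.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNamespace:
    """Hypothetical stand-in for a snapshottable directory (not an HDFS API)."""
    files: dict = field(default_factory=dict)      # file name -> list of block ids
    snapshots: dict = field(default_factory=dict)  # snapshot name -> copy of files

    def create_snapshot(self, name):
        # Assumption: a snapshot is a plain copy of the current view.
        self.snapshots[name] = {f: list(b) for f, b in self.files.items()}

    def concat(self, target, sources, policy):
        # Source files that some snapshot has already captured.
        captured = [s for s in sources
                    if any(s in snap for snap in self.snapshots.values())]
        if policy == "fail" and captured:
            # Alternative (1): refuse, leaving files and snapshots untouched.
            raise RuntimeError(f"sources captured in a snapshot: {captured}")
        # Alternatives (2) and (3): append the source blocks to the target.
        for s in sources:
            self.files[target].extend(self.files.pop(s))
        if policy == "remove":
            # Alternative (3): also drop the sources from every snapshot,
            # which is what breaks the read-only contract.
            for snap in self.snapshots.values():
                for s in sources:
                    snap.pop(s, None)


def demo(policy):
    # Three transient files, a snapshot taken before Step 2, then concat.
    ns = ToyNamespace(files={"part-0": [0], "part-1": [1], "part-2": [2]})
    ns.create_snapshot("s1")
    ns.concat("part-0", ["part-1", "part-2"], policy)
    return ns
```

With "keep", the snapshot still lists part-1 and part-2 after the concat, so the transient files stay until the snapshot is deleted. With "remove", the snapshot contents change after the fact. With "fail", the caller gets an error and has to write the transient files somewhere else.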