I think we should take this to a jira rather than the merge heads-up thread. Nicholas, please suggest a jira where we can continue the
Some comments inline:

On Wed, Apr 24, 2013 at 1:25 PM, Todd Lipcon <t...@cloudera.com> wrote:

> On Fri, Apr 19, 2013 at 3:36 AM, Aaron T. Myers <a...@cloudera.com> wrote:
>
> > On Fri, Apr 19, 2013 at 6:53 AM, Tsz Wo Sze <szets...@yahoo.com> wrote:
> >
> > > HdfsAdmin is also for admin operations. However, createSnapshot etc
> > > methods aren't.
> > >
> >
> > I agree that they're not administrative operations in the sense that they
> > don't strictly require super user privilege, but they are "administrative"
> > in the sense that they will most-often be used by those administering HDFS.
> > The HdfsAdmin class should not be construed to contain only operations
> > which require super user privilege, even though that happens to be the case
> > right now. It's intended as just a public API for HDFS-specific operations.
> >
>

I have to disagree about adding this functionality to HdfsAdmin. The HdfsAdmin class is for admin operations. As Nicholas has said, the snapshot operations are no different from operations such as mkdir or create file.

> > Regardless, my point is not necessarily that these operations should go
> > into the HdfsAdmin class, but rather that they shouldn't go into the
> > FileSystem class, since the snapshots API doesn't seem to me like it will
> > generalize to other FileSystem implementations.
> >
> >
> Agreed. The cases of WAFL/ZFS were brought up -- in those file systems,
> even if users may take snapshots, they're done using FS-specific APIs
> rather than any standard Linux interface. So, I'm in favor of either
> putting the APIs in HdfsAdmin, or alternatively in DistributedFileSystem,
> forcing a user to down-cast if they want to use the HDFS-specific
> operation.

I have a hard time understanding the issue with adding these methods to the FileSystem API.
I think we already have many operations that, one might argue, do not belong in a generic file system, such as getting the block size or file checksum, copying from or to local, getting replication, etc. These operations are largely influenced by having HDFS as the dominant implementation. I also think some operations that exist only in DistributedFileSystem, such as concat, should be moved down to FileSystem.

I think it is perfectly okay for the base FileSystem to throw an unsupported-operation exception for such operations. The current approach of casting a FileSystem to a non-public DistributedFileSystem is not a good idea.

Other file systems that support snapshots could implement these methods. Implementing them does not mean they have to use the same snapshot path convention; they can document and provide their own convention for snapshot paths.

Regards,
Suresh

--
http://hortonworks.com/download/
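
To make the "base class throws unsupported" proposal concrete, here is a minimal sketch. It does not use the real Hadoop classes: `BaseFileSystem`, `PlainFileSystem`, `SnapshotCapableFileSystem`, `trySnapshot`, and the path layout returned by `createSnapshot` are all hypothetical stand-ins chosen for illustration, and the method signature is an assumption rather than the actual API.

```java
public class SnapshotSketch {

    /** Hypothetical stand-in for a generic file system base class. */
    static class BaseFileSystem {
        /** Optional operation: the base class rejects it by default. */
        public String createSnapshot(String dir, String name) {
            throw new UnsupportedOperationException(
                getClass().getSimpleName() + " does not support createSnapshot");
        }
    }

    /** An implementation without snapshots simply inherits the default. */
    static class PlainFileSystem extends BaseFileSystem { }

    /** An implementation with snapshots overrides the method and is free
     *  to pick its own snapshot path convention (this one is made up). */
    static class SnapshotCapableFileSystem extends BaseFileSystem {
        @Override
        public String createSnapshot(String dir, String name) {
            return dir + "/.snapshot/" + name;
        }
    }

    /** Callers program against the base type; no down-cast is needed. */
    static String trySnapshot(BaseFileSystem fs, String dir, String name) {
        try {
            return fs.createSnapshot(dir, name);
        } catch (UnsupportedOperationException e) {
            return null; // this file system has no snapshot support
        }
    }

    public static void main(String[] args) {
        System.out.println(trySnapshot(new SnapshotCapableFileSystem(), "/data", "s1"));
        System.out.println(trySnapshot(new PlainFileSystem(), "/data", "s1"));
    }
}
```

The alternative discussed in the thread would have the caller test for and cast to the concrete implementation class before calling the snapshot method, which ties client code to one implementation.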