On 7 August 2013 10:59, Jeff Dost <jd...@ucsd.edu> wrote:

> Hello,
>
> We work in a software development team at the UCSD CMS Tier2 Center. We
> would like to propose a mechanism that allows one to subclass
> DFSInputStream cleanly from an external package. First I'd like to give
> some motivation on why, and then I will proceed with the details.
>
> We maintain a 3 petabyte Hadoop cluster for the LHC experiment at CERN.
> Other T2 centers worldwide hold mirrors of the same data we host. We are
> working on an extension to Hadoop so that, when a file is read and no
> replicas of a block are available, an external interface is used to
> retrieve that block of the file from another data center. The external
> interface is necessary because not all T2 centers involved in CMS run a
> Hadoop cluster as their storage backend.
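For concreteness, the override being proposed would look roughly like the
sketch below. Everything other than DFSInputStream, DFSClient and
BlockMissingException is hypothetical, the constructor signature only
approximates the package-private one in the 2.0.x line, and the class
compiles from an external package only once the access modifiers are
relaxed as the patch proposes:

    import java.io.IOException;

    import org.apache.hadoop.hdfs.BlockMissingException;
    import org.apache.hadoop.hdfs.DFSClient;
    import org.apache.hadoop.hdfs.DFSInputStream;

    public class FooFSInputStream extends DFSInputStream {

      // Mirrors the package-private DFSInputStream constructor; a
      // subclass in another package can only call it after the patch.
      public FooFSInputStream(DFSClient client, String src, int bufferSize,
                              boolean verifyChecksum) throws IOException {
        super(client, src, bufferSize, verifyChecksum);
      }

      @Override
      public synchronized int read(byte[] buf, int off, int len)
          throws IOException {
        try {
          return super.read(buf, off, len);
        } catch (BlockMissingException e) {
          // No live replica of the current block in the local cluster:
          // fetch the byte range from another T2 site instead.
          return readFromExternalMirror(buf, off, len);
        }
      }

      // Hypothetical hook; the site-specific transfer protocol would
      // live here.
      private int readFromExternalMirror(byte[] buf, int off, int len)
          throws IOException {
        throw new IOException("external mirror fetch not implemented");
      }
    }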
You are relying on all these T2 sites being HDFS clusters with exactly the
same block sizing of a named file, so that you can pick up a copy of a
block from elsewhere? Why not just handle a FileNotFoundException on
open() and either relay to another site or escalate to T1/T0/Castor? That
would be far easier to implement as a front-end wrapper around any
FileSystem (see the wrapper sketch below). What you are proposing seems
far more brittle in both code and execution.

> In order to implement this functionality, we need to subclass
> DFSInputStream and override the read method, so we can catch
> IOExceptions that occur on client reads at the block level.

Specific IOEs related to missing blocks, or any IOE? The latter could
swallow other problems.

> The basic steps required:
>
> 1. Invent a new URI scheme for the customized "FileSystem" in
>    core-site.xml:
>
>      <property>
>        <name>fs.foofs.impl</name>
>        <value>my.package.FooFileSystem</value>
>        <description>My Extended FileSystem for foofs: uris.</description>
>      </property>

Assuming you used the new FS driver everywhere, you could just overwrite
the hdfs declaration (point fs.hdfs.impl at FooFileSystem) rather than
invent a new scheme.

> 2. Write new classes included in the external package that subclass the
>    following:
>
>      FooFileSystem subclasses DistributedFileSystem
>      FooFSClient subclasses DFSClient
>      FooFSInputStream subclasses DFSInputStream
>
> Now any client commands that explicitly use the foofs:// scheme in paths
> to access the hadoop cluster can open files with a customized
> InputStream that extends the functionality of the default hadoop client
> DFSInputStream. In order to make this happen for our use case, we had to
> change some access modifiers in the DistributedFileSystem, DFSClient,
> and DFSInputStream classes provided by Hadoop.

As well as marking such things as unstable/internal, it is critical that
they don't expose any synchronization problems (which the boolean closed
field may). Better to use protected accessors; see the accessor sketch
below.

> In addition, we had to comment out the check in the namenode code that
> only allows for URI schemes of the form "hdfs://".

That check is there for a reason, and removing it would break things. The
more elegant design would be for DistributedFileSystem to have a non-final
getSchemeName() method used in the validator; you'd just override it (see
the scheme sketch below).

> Attached is a patch file we apply to hadoop.

Patches and reviews should be done via JIRA.

> Note that we derived this patch by modding the Cloudera release
> hadoop-2.0.0-cdh4.1.1, which can be found at:
> http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.1.1.tar.gz

And they have to be done against Hadoop trunk in SVN, then potentially
backported to earlier ASF releases. Vendors are free to cherry-pick
anything.

> We would greatly appreciate any advice on whether or not this approach
> sounds reasonable, and if you would consider accepting these
> modifications into the official Hadoop code base.

The changes look somewhat minor, but there are the issues above: everyone
is worried about the stability of HDFS, and anything that changes
behaviour (which commenting out that check very much does) would be
rejected. For a patch to be considered:

1. It must not change the semantics of existing APIs, here the DFS client
   code.
2. It mustn't expose fields that could generate concurrency problems.
3. It has to be considered the right approach.
4. It has to come with tests.

These are things that could be covered in the JIRA.
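To illustrate the wrapper alternative: a minimal sketch, assuming an
invented remoteFallback() hook. Because FilterFileSystem is the standard
decorator base class, this sits in front of any FileSystem, not just
HDFS:

    import java.io.FileNotFoundException;
    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FilterFileSystem;
    import org.apache.hadoop.fs.Path;

    public class FallbackFileSystem extends FilterFileSystem {

      public FallbackFileSystem(FileSystem fs) {
        super(fs);
      }

      @Override
      public FSDataInputStream open(Path f, int bufferSize)
          throws IOException {
        try {
          return fs.open(f, bufferSize);
        } catch (FileNotFoundException e) {
          // Relay to another site, or escalate to T1/T0/Castor.
          return remoteFallback(f, bufferSize, e);
        }
      }

      // Hypothetical hook: retrieval from a mirror site would go here.
      private FSDataInputStream remoteFallback(Path f, int bufferSize,
          FileNotFoundException cause) throws IOException {
        throw cause;
      }
    }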
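On the accessor point, the field itself stays private and subclasses read
it under the same lock, so they cannot bypass the locking discipline.
Illustration only, not actual DFSInputStream code:

    public abstract class GuardedStream {
      private boolean closed;   // never exposed to subclasses directly

      protected synchronized boolean isClosed() {
        return closed;
      }

      public synchronized void close() {
        closed = true;
      }
    }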
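And the validator refactor would be along these lines; the class and
method names here are illustrative, since today the scheme check is
hard-coded:

    import java.io.IOException;
    import java.net.URI;

    // Route the hard-coded "hdfs" comparison through an overridable
    // method instead.
    public class SchemeCheckingFileSystem {

      /** A subclass registers its own URI scheme by overriding this. */
      protected String getSchemeName() {
        return "hdfs";
      }

      protected void checkScheme(URI uri) throws IOException {
        if (!getSchemeName().equalsIgnoreCase(uri.getScheme())) {
          throw new IOException("Invalid scheme for this filesystem: " + uri);
        }
      }
    }

    // FooFileSystem's half of the check then reduces to:
    class FooScheme extends SchemeCheckingFileSystem {
      @Override
      protected String getSchemeName() {
        return "foofs";
      }
    }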