Speaking as an Apache Hadoop user who must do something about the NameNode single point of failure this year, I don't subscribe to the view that moving that SPOF from the NameNode to an NFS filer is a reasonable thing to ask of those not already set up with NetApp or similar, of those running in a "cloud" environment, or of those (sadly) quite common deployments where legacy datacenter designs are not... ideal. I would be curious how common this opinion is (or not).

So we intend to run HDFS 2 in an HA configuration, using the QJM for edit log persistence, fencing, and recovery.

Also, there is a BookKeeper-based journal manager already under development in HDFS, in trunk and on branch-2. Occasionally I've broken it while patching up HDFS. I suppose that should come out too? But I would think that is not a good idea either, per the above reasoning.

On Thursday, September 27, 2012, Konstantin Shvachko wrote:

> Hi Todd,
>
> > I had said previously that it's worth
> > discussing if several other people believe the same.
>
> Well let's put it on to general list for discussion then?
> Seems to me an important issue for Hadoop evolution in general.
> We keep growing the HDFS umbrella with competing technologies
> (http/web HDFS as an example) within it.
> Which makes the project harder to stabilize and release.
> Not touching MR/Yarn here.
>
> > If at some point in the future, the internal APIs have fully
> > stabilized (security, IPC, edit log streams, JournalManager, metrics,
> > etc) then we can pull it out at that time.
>
> By that time it will monolithically grow into HDFS and vice versa.
>
> > I know that we plan to ship it as part of CDH and will be our
> > recommended way of running HA HDFS.
>
> Sounds like CDH is moving well in release plans and otherwise.
> My concern is that if we add another 6000 lines of code to Hadoop-2,
> it will take yet another x months for stabilization.
> While it is not clear why people cannot just use NFS filers for shared storage,
> as you originally designed.
>
> > distros. Moving it to an entirely separate standalone project will
> > just add extra work for these folks who, like us, think it's currently
> > the best option for HA log storage.
>
> Don't know who these folks are. I see it as less work for HDFS community,
> because there is no need for porting and supporting this project in two or
> more different versions.
>
> Thanks,
> --Konstantin
>
> On Wed, Sep 26, 2012 at 10:50 AM, Todd Lipcon <t...@cloudera.com> wrote:
> > On Tue, Sep 25, 2012 at 11:21 PM, Konstantin Shvachko
> > <shv.had...@gmail.com> wrote:
> >> I think this is a great work, Todd.
> >> And I think we should not merge it into trunk or other branches.
> >> As I suggested earlier on this list I think this should be spun off
> >> as a separate project or a subproject.
> >>
> >> - The code is well detached as a self contained package.
> >
> > The addition is mostly self-contained, but it makes use of a bunch of
> > "private" parts of HDFS and Common:
> > - Reuses all of the Hadoop security infrastructure, IPC, metrics, etc
> > - Coupled to the JournalManager interface which is still evolving. In
> > fact there were several patches in trunk which were done during the
> > development of this project, specifically to make this API more
> > general. There's still some further work to be done in this area on
> > the generic interface -- eg support for upgrade/rollback.
> > - The functional tests make use of a bunch of "private" HDFS APIs as well.
> >
> >> - It is a logically stand-alone project that can be replaced by other
> >> technologies.
> >> - If it is a separate project then there is no need to port it to
> >> other versions. You can package it as a dependent jar.
> >
> > Per above, it's not that separate, because in order to build it, we
> > had to make a number of changes to core HDFS internal interfaces. It
> > currently couldn't be used to store anything except for NN logs. It
> > would be a nice extension to truly separate it out into a
> > content-agnostic quorum-based edit log, but today it actually uses the
> > existing edit log validation code to determine valid lengths, etc.
> >
> >> - Finally, it will be a good precedent of spinning new projects out of
> >> HDFS rather than bringing everything under HDFS umbrella.
> >>
> >> Todd, I had a feeling you were in favor of this direction?
> >
> > I'm not in favor of it - I had said previously that it's worth
> > discussing if several other people believe the same.
> >
> > I know that we plan to ship it as part of CDH and will be our
> > recommended way of running HA HDFS. If the community doesn't accept
> > the contribution, and prefers that we maintain it in a fork on github,
> > then it's worth hearing. But I imagine that many other community
> > members will want to either use it or ship it as part of their
> > distros. Moving it to an entirely separate standalone project will
> > just add extra work for these folks who, like us, think it's currently
> > the best option for HA log storage.
> >
> > If at some point in the future, the internal APIs have fully
> > stabilized (security, IPC, edit log streams, JournalManager, metrics,
> > etc) then we can pull it out at that time.
> >
> > -Todd
> >
> >> On Tue, Sep 25, 2012 at 4:58 PM, Eli Collins <e...@cloudera.com> wrote:
> >>> +1 Awesome work Todd.
> >>>
> >>> On Tue, Sep 25, 2012 at 4:02 PM, Todd Lipcon <t...@cloudera.com> wrote:
> >>>> Dear fellow HDFS developers,
> >>>>
> >>>> Per my email thread last week ("Heads up: merge for QJM branch soon"
> >>>> at http://markmail.org/message/vkyh5culdsuxdb6t) I would like to
> >>>> propose merging the HDFS-3077 branch into trunk. The branch has been
> >>>> active since mid July and has stabilized significantly over the last
> >>>> two months. It has passed the full test suite, findbugs, and release
> >>>> audit, and I think it's ready to merge at this point.
> >>>>
> >>>> The branch has been fully developed using the standard
> >>>> 'review-then-commit' (RTC) policy, and the design is described in
> >>>> detail in a document attached to HDFS-3077 itself.
> >>>> The code itself has been contributed by me, Aaron, and Eli,
> >>>> but I'd be remiss not to also
> >>>> acknowledge the contributions to the design from discussions with
> >>>> Suresh, Sanjay, Henry Robinson, Patrick Hunt, Ivan Kelly, Andrew
> >>>> Purtell, Flavio Junqueira, Ben Reed, Nicholas, Bikas, Brandon, and
> >>>> others. Additionally, special thanks to Andrew Purtell and Stephen Chu
>

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)