+1

Niall

On Wed, May 14, 2008 at 12:27 AM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> Dear Incubator PMC,
>
> There has been some discussion around the Hama proposal,
> and we would now like to officially propose Hama to the Incubator
> for consideration, with Grant Ingersoll's +1.
>
> Please vote on accepting the Hama project for incubation. The full
> Hama proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/HamaProposal. We ask the
> Incubator PMC to sponsor the Hama podling, with myself, Ian Holsman,
> and Jeff Eastman as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept Hama as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> ----
> == Abstract ==
>
> Hama will develop a parallel matrix computational package based on
> [http://hadoop.apache.org Hadoop] Map/Reduce.
>
> == Proposal ==
>
> Hama will develop a parallel matrix computational package that
> provides a library of matrix operations for the large-scale
> processing development environment and the Map/Reduce framework. It
> targets large-scale numerical analysis and data mining tasks that
> need the intensive computation power of matrix inversion, e.g. linear
> regression, PCA, and SVM. It will also be useful for many scientific
> applications, e.g. physics computations, linear algebra,
> computational fluid dynamics, statistics, graphic rendering, and many
> more.
>
> == Background ==
>
> Currently, several shared-memory based parallel matrix solutions
> provide scalable, high-performance matrix operations, but their
> matrix resources do not scale in terms of complexity. In addition,
> Hadoop HDFS files and Map/Reduce can only be used by 1D blocked
> algorithms.
>
> == Rationale ==
>
> The Hama approach proposes using the 3-dimensional Row, Column
> (Qualifier), and Time space and the multi-dimensional Columnfamilies
> of [http://hadoop.apache.org/hbase Hbase], which can store large
> sparse matrices and various other types of matrices (e.g. triangular
> matrix, 3D matrix, etc.) and make use of the 2D blocked algorithm.
> Its auto-partitioned sparsity sub-structure will be efficiently
> managed and served by Hbase. Row and column operations can be done in
> linear time, and several algorithms, such as ''structured Gaussian
> elimination'' or ''iterative methods'', run in O(number of non-zero
> elements in the matrix / number of mappers) time on Hadoop
> Map/Reduce.
>
> == Current Status ==
>
> In its current state, 'hama' is buggy and needs filling out, but a
> generalized matrix interface and basic linear algebra operations have
> been implemented within a large prototype system. In the future, we
> will need new parallel algorithms based on Map/Reduce so that heavy
> decompositions and factorizations perform well. It also needs tools
> to compose an arbitrary matrix from only certain data filtered out of
> the Hbase array structure.
>
> == Meritocracy ==
>
> The initial developers are very familiar with meritocratic open source
> development, both at Apache and elsewhere. Apache was chosen
> specifically because the initial developers want to encourage this
> style of development for the project.
>
> === Community ===
>
> Hama seeks to develop developer and user communities during incubation.
>
> == Core Developers ==
>
> The initial set of committers includes folks from the
> [http://hadoop.apache.org Hadoop] & [http://hadoop.apache.org/hbase
> Hbase] communities. We have varying degrees of experience with
> Apache-style open source development, ranging from none to ASF
> Members.
>
> == Alignment ==
>
> The developers of Hama want to work with the Apache Software
> Foundation specifically because Apache has proven to provide a strong
> foundation and set of practices for developing standards-based
> infrastructure and server components.
>
> == Known Risks ==
> === Orphaned products ===
>
> Most of the active developers would like to become Hama Committers or
> PMC Members and have a long-term interest in developing, maintaining,
> and '''using''' the code.
>
> === Inexperience with Open Source ===
>
> We already have good experience with the Apache open source
> development process.
>
> === Homogenous Developers ===
>
> The current list of committers includes developers from several
> different organizations ([http://en.wikipedia.org/wiki/NHN NHN, corp],
> TMAX software, Korea Research Institute of Bioscience and
> Biotechnology, students) plus many independent volunteers. The
> committers are geographically distributed across Europe and Asia.
> They are experienced with working in a distributed environment.
>
> === Reliance on Salaried Developers ===
>
> It is expected that Hama development will occur on both salaried time
> and on volunteer time, after hours. While there is reliance on
> salaried developers (currently from [http://en.wikipedia.org/wiki/NHN
> NHN, corp], but it's expected that other companies' salaried
> developers will also be involved), the Hama Community is very active
> and things should balance out fairly quickly. In the meantime,
> [http://en.wikipedia.org/wiki/NHN NHN, corp] might support the project
> by dedicating 'work time' to Hama, so that there is a smooth
> transition.
>
> === Relationships with Other Apache Products ===
>
> Hama has a strong relationship with Apache [http://hadoop.apache.org
> Hadoop], [http://hadoop.apache.org/hbase Hbase] and
> [http://lucene.apache.org/mahout Mahout]. Being part of Apache could
> help bring about closer collaboration among these projects.
>
> === An Excessive Fascination with the Apache Brand ===
>
> We believe in the processes, systems, and framework Apache has put in
> place. The brand is nice, but is not why we wish to come to Apache.
>
> == Documentation ==
>
> * http://code.google.com/p/hama/w/list
>
> == Initial Source ==
>
> * http://code.google.com/p/hama/source/checkout
>
> == External Dependencies ==
>
> * Hadoop (HDFS, Map/Reduce) License: Apache License, 2.0
> * Hbase (Sparse Matrix Table) License: Apache License, 2.0
>
> == Required Resources ==
>
> * Developer and user mailing lists
>   * [EMAIL PROTECTED]
>   * [EMAIL PROTECTED]
>   * [EMAIL PROTECTED]
> * A subversion repository
>   * https://svn.apache.org/repos/asf/incubator/hama
> * A JIRA issue tracker
>
> == Initial Committers ==
>
> * Edward J. Yoon, (edward AT udanax DOT org)
> * Chanwit Kaewkasi, (chanwit AT gmail DOT com)
> * Cha MinChang, (minslovey AT gmail DOT com)
> * Suh ChangHee, (bluesvm AT gmail DOT com)
> * Ha Yongho, (yongho.ha AT gmail DOT com)
> * Hong Taehui, (hongtebari AT gmail DOT com)
> * Yoon JooSun, (ologist0 AT gmail DOT com)
> * Takkiel Shim, (tkshim AT gmail DOT com)
> * Donguk Choi, (alloe130 AT gmail DOT com)
>
> == Sponsors ==
> === Nominated Mentors ===
>
> * Ian Holsman, (ianh AT apache DOT org)
> * Jeff Eastman, (jeastman AT windwardsolutions DOT com)
> * Edward J. Yoon, (edward AT udanax DOT org)
>
> === Sponsoring Entity ===
> The Apache Incubator.
>
> --
> B. Regards,
> Edward J. Yoon,
> http://blog.udanax.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]