Hi,

If there are no more comments/concerns, I plan to call for a vote soon. Thanks!
On Wed, Aug 26, 2015 at 9:37 PM, Edward J. Yoon <edwardy...@apache.org> wrote:
> More diverse volunteers for the Horn project will be added, from the
> cldi-kaist team, which ranked 7th in last year's ImageNet ILSVRC.
> We're also seeking one more mentor (instead of me).
>
> Thanks.
>
> P.S. I'm CC'ing the current volunteers. If you're still not on the
> list, please subscribe so you can see what's going on.
>
> On Fri, Aug 21, 2015 at 2:23 PM, Edward J. Yoon <edwardy...@apache.org> wrote:
>>> multiple worker groups for asynchronous training---data parallelism; and
>>> multiple workers in one group for synchronous training---model parallelism.
>>
>> So, it's basically the execution of multiple asynchronous BSP (Bulk
>> Synchronous Parallel) jobs. This can simply be handled within a
>> single BSP job using region barriers, as mentioned in the proposal.
>> Moreover, since Apache Hama is a general-purpose BSP framework on top
>> of HDFS, it provides data partitioning, locality optimization,
>> job/task scheduling, messaging, and fault tolerance in a scalable
>> way by nature.
>>
>>> For the programming model, currently Horn proposes to support feed-forward
>>> There is plenty of room for collaboration indeed...
>>
>> Yeah, but it can still be improved further. Maybe we can discuss
>> simplified programming APIs and many other things, e.g., GPU support,
>> together in the future.
>>
>> On Fri, Aug 21, 2015 at 1:13 PM, ooibc <oo...@comp.nus.edu.sg> wrote:
>>>
>>> Hi,
>>>
>>> I am an initial committer of Apache (incubating) SINGA
>>> (http://singa.incubator.apache.org/).
>>>
>>> Both SINGA and the proposal follow the general parameter-server
>>> architecture: workers compute gradients; servers update parameters.
>>>
>>> SINGA has implemented the model and data parallelism discussed in
>>> the Horn proposal: multiple worker groups for asynchronous
>>> training---data parallelism; and multiple workers in one group for
>>> synchronous training---model parallelism.
>>>
>>> One feature of SINGA's architecture is that it can be extended to
>>> organize the servers in a hierarchical topology, which may help to
>>> reduce the communication bottleneck of servers organized in a flat
>>> topology.
>>>
>>> For the programming model, Horn currently proposes to support
>>> feed-forward models, e.g., MLP and auto-encoders, while SINGA
>>> supports all three categories of the known models: feed-forward
>>> models (e.g., MLP, CNN), energy models (e.g., RBM, DBM), and
>>> recurrent models (e.g., RNN). SINGA provides good support for users
>>> to code, e.g., to implement new parameter-updating protocols or
>>> layers, and is being integrated with HDFS as well.
>>>
>>> We will submit the first release and full documentation to the
>>> mentors this weekend, and if it is OK, we will announce the first
>>> full release soon. The GPU version is scheduled for an October
>>> release.
>>>
>>> Technical papers:
>>> http://www.comp.nus.edu.sg/~ooibc/singa-mm15.pdf
>>> http://www.comp.nus.edu.sg/~ooibc/singaopen-mm15.pdf
>>>
>>> and the project website (which has more details than the Apache web
>>> site):
>>> http://www.comp.nus.edu.sg/~dbsystem/singa/
>>>
>>> There is plenty of room for collaboration indeed...
>>>
>>> regards
>>> beng chin
>>> www.comp.nus.edu.sg/~ooibc
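To make the discussion above more concrete, here is a minimal Java sketch of how task groups might run inside a single Hama BSP job: each group trains on its own data split (data parallelism across groups) and exchanges partial results only among its own members (model parallelism within a group). The group size, the gradient computation, and the parameter-server step are placeholder assumptions, and the region barrier itself is hypothetical; stock Hama exposes only the global peer.sync(), which stands in for a per-group barrier below.

```java
// Illustrative sketch only: one "task group" of a single Hama BSP job,
// training on its own data split. The region barrier is hypothetical
// (stock Hama exposes only the global peer.sync(), used as a stand-in),
// and the gradient / parameter-server steps are placeholders.
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hama.bsp.BSP;
import org.apache.hama.bsp.BSPPeer;
import org.apache.hama.bsp.sync.SyncException;

public class GroupedTrainingBSP extends
    BSP<LongWritable, Text, NullWritable, NullWritable, DoubleWritable> {

  private static final int GROUP_SIZE = 4;       // tasks per group (assumed)
  private static final int MAX_ITERATIONS = 100; // mini-batch rounds (assumed)

  @Override
  public void bsp(
      BSPPeer<LongWritable, Text, NullWritable, NullWritable, DoubleWritable> peer)
      throws IOException, SyncException, InterruptedException {

    int myGroup = peer.getPeerIndex() / GROUP_SIZE;

    for (int i = 0; i < MAX_ITERATIONS; i++) {
      // Placeholder: compute a partial gradient from this task's data split.
      double partialGradient = Math.random();

      // Model parallelism: exchange partial results only with peers that
      // belong to the same group.
      for (int p = 0; p < peer.getNumPeers(); p++) {
        if (p / GROUP_SIZE == myGroup) {
          peer.send(peer.getPeerName(p), new DoubleWritable(partialGradient));
        }
      }

      // Hypothetical region barrier: only this group would synchronize here,
      // so other groups keep training asynchronously (data parallelism).
      // The global barrier below is what stock Hama provides today.
      peer.sync();

      // Aggregate the group's partial gradients delivered by the barrier.
      double sum = 0;
      DoubleWritable msg;
      while ((msg = peer.getCurrentMessage()) != null) {
        sum += msg.get();
      }
      // Placeholder: push 'sum' to a remote parameter server and pull the
      // updated weights before the next mini-batch.
    }
  }
}
```

With true per-group barriers, a slow group would no longer stall the others, which is what the asynchronous, DistBelief-style training across groups relies on.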
>>>
>>> On 2015-08-21 08:27, Edward J. Yoon wrote:
>>>>
>>>> Hi all,
>>>>
>>>> We'd like to propose Horn (혼), a fully distributed system for
>>>> large-scale deep learning, as an Apache Incubator project, and
>>>> start the discussion. The complete proposal can be found at:
>>>> https://wiki.apache.org/incubator/HornProposal
>>>>
>>>> Any advice and help is welcome! Thanks, Edward.
>>>>
>>>> = Horn Proposal =
>>>>
>>>> == Abstract ==
>>>>
>>>> Horn (tentatively named "Horn [hɔ:n]"; the Korean word 혼 means
>>>> "spirit") is a neuron-centric programming API and execution
>>>> framework for large-scale deep learning, built on top of Apache
>>>> Hama.
>>>>
>>>> == Proposal ==
>>>>
>>>> The goal of Horn is to provide a neuron-centric programming API
>>>> that allows users to easily define the characteristics and
>>>> structure of an artificial neural network model, together with an
>>>> execution framework that leverages the heterogeneous resources of
>>>> a Hama and Hadoop YARN cluster.
>>>>
>>>> == Background ==
>>>>
>>>> The initial ANN code was developed in the Apache Hama project by a
>>>> committer, Yexi Jiang (Facebook), in 2013. The motivation behind
>>>> this work is to build a framework that provides more intuitive
>>>> programming APIs, like Google's MapReduce or Pregel, and supports
>>>> applications that need large models with huge memory consumption
>>>> in a distributed way.
>>>>
>>>> == Rationale ==
>>>>
>>>> While many deep learning open source packages such as Caffe,
>>>> DeepDist, and NeuralGiraph are still data-parallel or
>>>> model-parallel only, we aim to support both data and model
>>>> parallelism, as well as a fault-tolerant system design. The basic
>>>> idea behind data and model parallelism is the use of a remote
>>>> parameter server to parallelize model creation and distribute
>>>> training across machines, and the BSP framework of Apache Hama for
>>>> performing asynchronous mini-batches. Within a single BSP job,
>>>> each task group works asynchronously using region barrier
>>>> synchronization instead of global barrier synchronization, and
>>>> trains a large-scale neural network model on its assigned data
>>>> sets in the BSP paradigm. Thus, we achieve both data and model
>>>> parallelism. This architecture is inspired by Google's DistBelief
>>>> (Jeff Dean et al., 2012).
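As a purely hypothetical illustration of the neuron-centric style mentioned in the Abstract and Proposal above (in the spirit of Pregel's vertex-centric model), the sketch below assumes a Neuron base class with forward/backward hooks; none of these class or method names come from the actual Horn code or proposal.

```java
// Hypothetical sketch of a neuron-centric API; the Neuron base class and its
// forward/backward hooks are assumptions for illustration, not the Horn API.
import java.util.List;

abstract class Neuron {
  private double output;

  // Invoked with the weighted activations arriving from the previous layer.
  abstract void forward(List<Double> inputs);

  // Invoked with the error terms propagated back from the next layer.
  abstract void backward(List<Double> deltas);

  protected void setOutput(double value) { this.output = value; }
  protected double getOutput() { return output; }
}

// A user-defined sigmoid unit written against the assumed API above; the
// framework would instantiate it per neuron and route messages between
// layers, possibly across the tasks of a BSP job.
class SigmoidNeuron extends Neuron {
  @Override
  void forward(List<Double> inputs) {
    double sum = 0;
    for (double in : inputs) {
      sum += in;
    }
    setOutput(1.0 / (1.0 + Math.exp(-sum)));
  }

  @Override
  void backward(List<Double> deltas) {
    double delta = 0;
    for (double d : deltas) {
      delta += d;
    }
    // Scale by the sigmoid derivative; a real framework would emit this
    // gradient as weight updates toward the parameter server.
    double gradient = delta * getOutput() * (1.0 - getOutput());
  }
}
```

The attraction of this style, as with MapReduce and Pregel, is that the user reasons about one neuron at a time while the framework owns partitioning the network and delivering messages, which is the kind of intuitive API the Background section argues for.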
>>>>
>>>> == Initial Goals ==
>>>>
>>>> Some current goals include:
>>>> * build a new community
>>>> * provide more intuitive programming APIs
>>>> * support both data and model parallelism
>>>> * run natively on both Hama and Hadoop2
>>>> * support GPUs and InfiniBand (FPGAs if possible)
>>>>
>>>> == Current Status ==
>>>>
>>>> === Meritocracy ===
>>>>
>>>> The core developers understand what it means to have a process
>>>> based on meritocracy. We will make a continuous effort to build an
>>>> environment that supports this, encouraging community members to
>>>> contribute.
>>>>
>>>> === Community ===
>>>>
>>>> A small community has formed within the Apache Hama project and at
>>>> some companies, such as an instant messenger service company and a
>>>> mobile manufacturing company, and many people are interested in a
>>>> large-scale deep learning platform itself. By bringing Horn into
>>>> Apache, we believe that the community will grow even bigger.
>>>>
>>>> === Core Developers ===
>>>>
>>>> Edward J. Yoon, Thomas Jungblut, and Dongjin Lee
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned Products ===
>>>>
>>>> Apache Hama is already a core open source component at Samsung
>>>> Electronics, and Horn will also be used by Samsung Electronics, so
>>>> there is no direct risk of this project being orphaned.
>>>>
>>>> === Inexperience with Open Source ===
>>>>
>>>> Some of the initial committers are very new, and the others have
>>>> experience using and/or working on Apache open source projects.
>>>>
>>>> === Homogeneous Developers ===
>>>>
>>>> The initial committers are from different organizations, such as
>>>> Microsoft, Samsung Electronics, and Line Plus.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>>
>>>> Few will work as full-time open source developers. Other
>>>> developers will also start working on the project in their spare
>>>> time.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>>
>>>> * Horn is based on Apache Hama
>>>> * Apache ZooKeeper is used as a distributed locking service
>>>> * Runs natively on Apache Hadoop and Mesos
>>>> * Horn may overlap somewhat with the SINGA podling (if possible,
>>>>   we'd also like to use SINGA or Caffe to do the heavy lifting)
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>>
>>>> Horn itself will hopefully benefit from Apache in terms of
>>>> attracting a community and establishing a solid group of
>>>> developers, but also from the relationship with Apache Hama, a
>>>> general-purpose BSP computing engine. These are the main reasons
>>>> for us to send this proposal.
>>>>
>>>> == Documentation ==
>>>>
>>>> An initial plan for Horn can be found at
>>>> http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
>>>>
>>>> == Initial Source ==
>>>>
>>>> The initial source code has been released as part of the Apache
>>>> Hama project, developed under the Apache Software Foundation. The
>>>> source code is currently hosted at
>>>>
>>>> https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
>>>>
>>>> == Cryptography ==
>>>>
>>>> Not applicable.
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Mailing Lists ===
>>>>
>>>> * horn-private
>>>> * horn-dev
>>>>
>>>> === Subversion Directory ===
>>>>
>>>> * Git is the preferred source control system: git://git.apache.org/horn
>>>>
>>>> === Issue Tracking ===
>>>>
>>>> * a JIRA issue tracker, HORN
>>>>
>>>> == Initial Committers and Affiliations ==
>>>>
>>>> * Thomas Jungblut (tjungblut AT apache DOT org)
>>>> * Edward J. Yoon (edwardyoon AT apache DOT org)
>>>> * Dongjin Lee (dongjin.lee.kr AT gmail DOT com)
>>>> * Minho Kim (minwise.kim AT samsung DOT com)
>>>> * Chia-Hung Lin (chl501 AT apache DOT org)
>>>> * Behroz Sikander (behroz.sikander AT tum DOT de)
>>>> * Hyok S. Choi (hyok.choi AT samsung DOT com)
>>>> * Kisuk Lee (ks881115 AT gmail DOT com)
>>>>
>>>> == Affiliations ==
>>>>
>>>> * Thomas Jungblut (Microsoft)
>>>> * Edward J. Yoon (Samsung Electronics)
>>>> * Dongjin Lee (LINE Plus)
>>>> * Minho Kim (Samsung Electronics)
>>>> * Chia-Hung Lin (Self)
>>>> * Behroz Sikander (Technical University of Munich)
>>>> * Hyok S. Choi (Samsung Electronics)
>>>> * Kisuk Lee (Seoul National University)
>>>>
>>>> == Sponsors ==
>>>>
>>>> === Champion ===
>>>>
>>>> * Edward J. Yoon <ASF member, edwardyoon AT apache DOT org>
>>>>
>>>> === Nominated Mentors ===
>>>>
>>>> * Luciano Resende <ASF member, lresende AT apache DOT org>
>>>> * Robin Anil <ASF member, robin.anil AT gmail DOT com>
>>>> * Edward J. Yoon <ASF member, edwardyoon AT apache DOT org>
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> The Apache Incubator
>>
>> --
>> Best Regards, Edward J. Yoon
>
> --
> Best Regards, Edward J. Yoon

--
Best Regards, Edward J. Yoon

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org