Updated Branches: refs/heads/master 302741d30 -> e17a0c23b
[GSoC]: Proposal from Dharmesh Kakadia

Project: http://git-wip-us.apache.org/repos/asf/cloudstack/repo
Commit: http://git-wip-us.apache.org/repos/asf/cloudstack/commit/e17a0c23
Tree: http://git-wip-us.apache.org/repos/asf/cloudstack/tree/e17a0c23
Diff: http://git-wip-us.apache.org/repos/asf/cloudstack/diff/e17a0c23

Branch: refs/heads/master
Commit: e17a0c23b8d4d7cbbcbf4a95cf545b0d8ae59aaa
Parents: 302741d
Author: Sebastien Goasguen <run...@gmail.com>
Authored: Fri Jun 7 06:03:42 2013 -0400
Committer: Sebastien Goasguen <run...@gmail.com>
Committed: Fri Jun 7 06:03:42 2013 -0400

----------------------------------------------------------------------
 docs/en-US/CloudStack_GSoC_Guide.xml |   1 +
 docs/en-US/gsoc-dharmesh.xml         | 149 +++++++++++++++++++++++++++++
 2 files changed, 150 insertions(+), 0 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/e17a0c23/docs/en-US/CloudStack_GSoC_Guide.xml
----------------------------------------------------------------------
diff --git a/docs/en-US/CloudStack_GSoC_Guide.xml b/docs/en-US/CloudStack_GSoC_Guide.xml
index b7ba61f..243a0ca 100644
--- a/docs/en-US/CloudStack_GSoC_Guide.xml
+++ b/docs/en-US/CloudStack_GSoC_Guide.xml
@@ -48,6 +48,7 @@
     </bookinfo>
     <xi:include href="gsoc-tuna.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
     <xi:include href="gsoc-imduffy15.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
+    <xi:include href="gsoc-dharmesh.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 </book>

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/e17a0c23/docs/en-US/gsoc-dharmesh.xml
----------------------------------------------------------------------
diff --git a/docs/en-US/gsoc-dharmesh.xml b/docs/en-US/gsoc-dharmesh.xml
new file mode 100644
index 0000000..5e2bf73
--- /dev/null
+++ b/docs/en-US/gsoc-dharmesh.xml
@@ -0,0 +1,149 @@
+<?xml version='1.0' encoding='utf-8' ?>
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML 
V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+<!ENTITY % BOOK_ENTITIES SYSTEM "CloudStack_GSoC_Guide.ent">
+%BOOK_ENTITIES;
+]>
+
+<!-- Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<chapter id="gsoc-dharmesh">
+    <title>Dharmesh's 2013 GSoC Proposal</title>
+    <para>This chapter describes Dharmesh's 2013 Google Summer of Code project within the &PRODUCT; ASF project. It is a copy of the submitted proposal.</para>
+    <section id="abstract-dharmesh">
+        <title>Abstract</title>
+        <para>
+            The project aims to bring a <ulink url="http://aws.amazon.com/cloudformation/"><citetitle>CloudFormation</citetitle></ulink>-like service to CloudStack. One of the prime use cases is running cluster computing frameworks on CloudStack. A CloudFormation service will give users and administrators of CloudStack the ability to manage and control a set of resources easily. It will allow booting and configuring a set of VMs to form a cluster. A simple example would be a LAMP stack. More complex clusters, such as Mesos or Hadoop clusters, require somewhat more advanced configuration. Chiradeep Vittal has already done some work on this front [5]. 
In this project, I will implement a server-side CloudFormation service for CloudStack and demonstrate how to run a Mesos cluster using it.
+        </para>
+    </section>
+
+    <section id="mesos">
+        <title>Mesos</title>
+        <para>
+            <ulink url="http://incubator.apache.org/mesos/"><citetitle>Mesos</citetitle></ulink> is a resource management platform for clusters. It aims to increase resource utilization of clusters by sharing cluster resources among multiple processing frameworks (like MapReduce, MPI, Graph Processing) or multiple instances of the same framework. It provides efficient resource isolation through the use of containers. It uses ZooKeeper for state maintenance and fault tolerance.
+        </para>
+    </section>
+
+    <section id="mesos-use">
+        <title>What can run on Mesos?</title>
+
+        <para><emphasis role="bold">Spark:</emphasis> A cluster computing framework based on the Resilient Distributed Datasets (RDDs) abstraction. RDDs are more general than MapReduce and can support iterative and interactive computation while retaining fault tolerance, scalability, data locality, etc.</para>
+
+        <para><emphasis role="bold">Hadoop:</emphasis> Hadoop is a fault-tolerant and scalable distributed computing framework based on the MapReduce abstraction.</para>
+
+        <para><emphasis role="bold">Bagel:</emphasis> A graph processing framework based on Pregel.</para>
+
+        <para>Other frameworks, such as MPI and Hypertable.</para>
+    </section>
+
+    <section id="mesos-deploy">
+        <title>How to deploy Mesos?</title>
+
+        <para>Mesos provides cluster installation <ulink url="https://github.com/apache/mesos/blob/trunk/docs/Deploy-Scripts.textile"><citetitle>scripts</citetitle></ulink> for cluster deployment. There are also scripts available to deploy a cluster on <ulink url="https://github.com/apache/mesos/blob/trunk/docs/EC2-Scripts.textile"><citetitle>Amazon EC2</citetitle></ulink>. 
It would be interesting to see whether these scripts can be leveraged in any way.</para>
+    </section>
+
+    <section id="deliverables-dharmesh">
+        <title>Deliverables</title>
+        <orderedlist>
+            <listitem>
+                <para>Deploy CloudStack and understand instance configuration/contextualization</para>
+            </listitem>
+            <listitem>
+                <para>Manually test and deploy Mesos on a set of CloudStack-based VMs. Design/propose an automation framework</para>
+            </listitem>
+            <listitem>
+                <para>Test stackmate and engage with Chiradeep (report bugs, make suggestions, make pull requests)</para>
+            </listitem>
+            <listitem>
+                <para>Create a CloudFormation template to provision a Mesos cluster</para>
+            </listitem>
+            <listitem>
+                <para>Compare with Apache Whirr or other cluster provisioning tools for the server-side implementation of the CloudFormation service.</para>
+            </listitem>
+        </orderedlist>
+    </section>
+
+    <section id="arch-and-tools">
+        <title>Architecture and Tools</title>
+
+        <para>The high-level architecture is as follows:</para>
+
+        <para>
+            <mediaobject>
+                <imageobject>
+                    <imagedata fileref="images/mesos-integration-arch.jpg"/>
+                </imageobject>
+            </mediaobject>
+        </para>
+
+
+        <para>It includes the following components:</para>
+
+        <orderedlist>
+            <listitem>
+                <para>CloudFormation Query API server:</para>
+                <para>This acts as the point of contact and exposes CloudFormation functionality as a Query API. It can be accessed directly or through the existing tools that Amazon AWS provides for its CloudFormation service. It will be easiest to start with a module that resides outside CloudStack, and I plan to use Dropwizard [3] for it. Later, the API server may be merged into the CloudStack core. I plan to use MySQL for storing details of clusters.</para>
+            </listitem>
+
+            <listitem>
+                <para>Provisioning:</para>
+
+                <para>The provisioning module is responsible for handling the booting process of the VMs through CloudStack. It uses the CloudStack APIs for launching VMs. 
I plan to use preconfigured templates/images with the required dependencies installed, which will make the cluster creation process much faster even for large clusters. Error handling is a very important part of this module. For example, what should happen if a few VMs in a cluster fail to boot?</para>
+            </listitem>
+
+            <listitem>
+                <para>Configuration:</para>
+
+                <para>This module deals with configuring the VMs to form a cluster. This can be done via manual scripts/code or via configuration management tools like chef/ironfan/knife. Workflow automation tools like Rundeck [4] could potentially be used as well. Apache Whirr and Provisionr are also options. I plan to explore these tools and select suitable ones.</para>
+            </listitem>
+
+        </orderedlist>
+    </section>
+
+    <section id="api">
+        <title>API</title>
+
+        <para>The Query <ulink url="http://docs.aws.amazon.com/AWSCloudFormation/latest/APIReference/API_Operations.html"><citetitle>API</citetitle></ulink> will be based on the Amazon AWS CloudFormation service. This will allow leveraging existing <ulink url="http://aws.amazon.com/developertools/AWS-CloudFormation"><citetitle>tools</citetitle></ulink> for AWS.</para>
+    </section>
+
+    <section id="timeline">
+        <title>Timeline</title>
+        <para>1-1.5 weeks: project design. 
Architecture, tool selection, API design</para>
+        <para>1-1.5 weeks: getting familiar with the CloudStack and stackmate codebases and architecture details</para>
+        <para>1-1.5 weeks: getting familiar with Mesos internals</para>
+        <para>1-1.5 weeks: setting up the dev environment and creating Mesos templates</para>
+        <para>2-3 weeks: build the provisioning and configuration modules</para>
+        <para>Midterm evaluation: provisioning module, configuration module</para>
+        <para>2-3 weeks: develop the CloudFormation server-side implementation</para>
+        <para>2-3 weeks: test and integrate</para>
+    </section>
+
+    <section id="future-work">
+        <title>Future Work</title>
+        <orderedlist>
+            <listitem>
+                <para><emphasis role="bold">Auto Scaling:</emphasis></para>
+                <para>Automatically adding VMs to or removing VMs from a Mesos cluster based on conditions such as utilization going above or below a static threshold. More sophisticated strategies are possible, based on prediction or on fine-grained metric collection tightly integrated with the Mesos framework.</para>
+            </listitem>
+            <listitem>
+                <para><emphasis role="bold">Cluster Simulator:</emphasis></para>
+                <para>Integrating with the existing simulator to simulate Mesos clusters. This can be useful in various scenarios, for example while developing a new scheduling algorithm or testing autoscaling.</para>
+            </listitem>
+        </orderedlist>
+    </section>
+</chapter>

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/e17a0c23/docs/en-US/images/mesos-integration-arch.jpg
----------------------------------------------------------------------
diff --git a/docs/en-US/images/mesos-integration-arch.jpg b/docs/en-US/images/mesos-integration-arch.jpg
new file mode 100644
index 0000000..e69de29
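
The Provisioning component in the proposal above asks what to do when a few VMs in a cluster fail to boot. One possible policy, sketched here in Python, is to retry each failed VM a bounded number of times and accept the cluster only if a minimum number of nodes come up. Everything in this sketch is hypothetical: `provision_cluster`, `boot_vm`, `max_retries`, `min_nodes`, and `make_flaky_launcher` are illustrative names, not part of the CloudStack API or of the proposal, and `boot_vm` stands in for whatever call actually launches a VM.

```python
from typing import Callable, Dict, List, Tuple


def provision_cluster(
    names: List[str],
    boot_vm: Callable[[str], bool],  # hypothetical launcher: True if the VM booted
    max_retries: int = 2,
    min_nodes: int = 1,
) -> Tuple[List[str], List[str]]:
    """Boot every VM in `names`, retrying failures up to `max_retries` times.

    Returns (booted, failed). Raises RuntimeError when fewer than
    `min_nodes` VMs booted, so the caller can tear the cluster down.
    """
    booted: List[str] = []
    failed: List[str] = []
    for name in names:
        # One initial attempt plus up to `max_retries` retries;
        # any() stops at the first successful attempt.
        ok = any(boot_vm(name) for _ in range(max_retries + 1))
        (booted if ok else failed).append(name)
    if len(booted) < min_nodes:
        raise RuntimeError(
            f"only {len(booted)} of {len(names)} VMs booted; need {min_nodes}"
        )
    return booted, failed


def make_flaky_launcher(failures_before_success: Dict[str, int]) -> Callable[[str], bool]:
    """Test double: each VM fails the given number of times, then boots."""
    remaining = dict(failures_before_success)

    def boot_vm(name: str) -> bool:
        if remaining.get(name, 0) > 0:
            remaining[name] -= 1
            return False
        return True

    return boot_vm
```

With a launcher that fails once for `slave-1` and five times for `slave-2`, calling `provision_cluster` with `max_retries=2` and `min_nodes=2` reports `slave-2` as failed while still accepting the cluster. Whether a partially booted cluster is acceptable is a design choice the proposal leaves open.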