Re: [DISCUSS/PROPOSAL] Upgrading Driver Model

Mike Tutkowski Tue, 20 Aug 2013 15:40:41 -0700

I agree, John - let's get consensus first, then talk time tables.


On Tue, Aug 20, 2013 at 4:31 PM, John Burwell <jburw...@basho.com> wrote:

> Mike,
>
> Before we can dig into timelines or implementations, I think we need to
> get consensus on the problem to be solved and the goals.  Once we have a
> proper understanding of the scope, I believe we can chunk the work across
> a set of development lifecycles.  The subject is vast, but it also has a
> far-reaching impact on both the storage and network layer evolution
> efforts.  As such, I believe we need to start addressing it as part of the
> next release.
>
> As a separate thread, we need to discuss the timeline for the next
> release.  I think we need to avoid the time compression caused by the
> overlap of the 4.1 stabilization effort and 4.2 development.  Therefore, I
> don't think we should consider development of the next release started
> until the first 4.2 RC is released.  I will try to open a separate discuss
> thread for this topic, and tie in the discussion of release code names.
>
> Thanks,
> -John
>
> On Aug 20, 2013, at 6:22 PM, Mike Tutkowski <mike.tutkow...@solidfire.com>
> wrote:
>
> > Hey John,
> >
> > I think this is some great stuff. Thanks for the write up.
> >
> > It looks like you have ideas around what might go into a first release of
> > this plug-in framework. Were you thinking we'd have enough time to squeeze
> > that first rev into 4.3? I'm just wondering (it's not a huge deal to hit
> > that release for this) because we would only have about five weeks.
> >
> > Thanks
> >
> >
> > On Tue, Aug 20, 2013 at 3:43 PM, John Burwell <jburw...@basho.com> wrote:
> >
> >> All,
> >>
> >> In capturing my thoughts on storage, my thinking backed into the driver
> >> model.  While we have the beginnings of such a model today, I see the
> >> following deficiencies:
> >>
> >>
> >>   1. *Multiple Models*: The Storage, Hypervisor, and Security layers
> >>   each have a slightly different model for allowing system
> >>   functionality to be extended/substituted.  These differences raise
> >>   the barrier to entry for vendors seeking to extend CloudStack and
> >>   accrete code paths to be maintained and verified.
> >>   2. *Leaky Abstraction*:  Plugins are registered through a Spring
> >>   configuration file.  In addition to being operator unfriendly (most
> >>   sysadmins are not Spring experts nor do they want to be), we expose
> >>   the core bootstrapping mechanism to operators.  Therefore, a
> >>   misconfiguration could negatively impact the injection/configuration
> >>   of internal management server components.  We are essentially
> >>   handing them a loaded shotgun pointed at our right foot.
> >>   3. *Nondeterministic Load/Unload Model*:  Because the core loading
> >>   mechanism is Spring, the management server has little control over
> >>   the timing and order of component loading/unloading.  Changes to the
> >>   Management Server's component dependency graph could break a driver
> >>   by causing it to be started at an unexpected time.
> >>   4. *Lack of Execution Isolation*: As Spring components, plugins are
> >>   loaded into the same execution context as core management server
> >>   components.  Therefore, an errant plugin can corrupt the entire
> >>   management server.
> >>
> >>
> >> For the next revision of the plugin/driver mechanism, I would like to
> >> see us migrate towards a standard pluggable driver model that supports
> >> all of the management server's extension points (e.g. network devices,
> >> storage devices, hypervisors, etc) with the following capabilities:
> >>
> >>
> >>   - *Consolidated Lifecycle and Startup Procedure*:  Drivers share a
> >>   common state machine and categorization (e.g. network, storage,
> >>   hypervisor, etc) that permits the deterministic calculation of
> >>   initialization and destruction order (i.e. network layer drivers ->
> >>   storage layer drivers -> hypervisor drivers).  Plugin
> >>   inter-dependencies would be supported between plugins sharing the
> >>   same category.
> >>   - *In-process Installation and Upgrade*: Adding or upgrading a driver
> >>   does not require the management server to be restarted.  This
> >>   capability implies a system that supports the simultaneous execution
> >>   of multiple driver versions and the ability to suspend the execution
> >>   of work on a resource while the underlying driver instance is
> >>   replaced.
> >>   - *Execution Isolation*: The deployment packaging and execution
> >>   environment allow different (and potentially conflicting) versions
> >>   of dependencies to be used simultaneously.  Additionally, plugins
> >>   would be sufficiently sandboxed to protect the management server
> >>   against driver instability.
> >>   - *Extension Data Model*: Drivers provide a property bag with a
> >>   metadata descriptor to validate and render vendor specific data.  The
> >>   contents of this property bag will be provided to every driver
> >>   operation invocation at runtime.  The metadata descriptor would be a
> >>   lightweight description that provides a label resource key, a
> >>   description resource key, data type (string, date, number, boolean),
> >>   required flag, and optional length limit.
> >>   - *Introspection*: Administrative APIs/UIs allow operators to
> >>   understand which drivers are present in the system, their
> >>   configuration, and their current state.
> >>   - *Discoverability*: Optionally, drivers can be discovered via a
> >>   project repository definition (similar to Yum) allowing drivers to be
> >>   remotely acquired and operators to be notified regarding update
> >>   availability.  The project would also provide, free of charge,
> >>   certificates to sign plugins.  This mechanism would support local
> >>   mirroring to support air gapped management networks.
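
To make the "deterministic calculation of initialization and destruction order" concrete, here is a minimal Python sketch. It is only an illustration, not anything from the proposal itself: the function names, the driver names, and the `(category, dependencies)` input shape are all assumptions. It walks the categories in the order the proposal gives (network, then storage, then hypervisor), rejects dependencies that cross categories, and breaks ties alphabetically so the result is the same on every run.

```python
# Category order taken from the proposal: network -> storage -> hypervisor.
CATEGORY_ORDER = ("network", "storage", "hypervisor")


def _order_category(members):
    """Order one category so every driver comes after its dependencies."""
    remaining = {name: set(deps) for name, deps in members.items()}
    ordered = []
    while remaining:
        # Sorting the ready set makes the result independent of input order.
        ready = sorted(name for name, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle among: " + ", ".join(sorted(remaining)))
        for name in ready:
            ordered.append(name)
            del remaining[name]
        for deps in remaining.values():
            deps.difference_update(ready)
    return ordered


def startup_order(drivers):
    """`drivers` maps a (hypothetical) driver name to (category, dependencies)."""
    unknown = sorted(n for n, (cat, _) in drivers.items() if cat not in CATEGORY_ORDER)
    if unknown:
        raise ValueError("unknown category for: " + ", ".join(unknown))
    ordered = []
    for category in CATEGORY_ORDER:
        members = {n: set(deps) for n, (cat, deps) in drivers.items() if cat == category}
        for name, deps in members.items():
            outside = deps - set(members)
            if outside:
                # Dependencies are only allowed between drivers of one category.
                raise ValueError(f"{name} depends on non-{category} driver(s): {sorted(outside)}")
        ordered.extend(_order_category(members))
    return ordered


def shutdown_order(drivers):
    """Destruction order is simply the reverse of initialization order."""
    return list(reversed(startup_order(drivers)))
```

For example, a storage driver `vendor-san` that depends on `iscsi-base` always starts after it, and both start after every network driver regardless of the order in which the drivers were discovered.
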
> >>
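
The extension data model can be sketched the same way. The descriptor fields below mirror the ones listed in the proposal (label resource key, description resource key, data type, required flag, optional length limit); the class name, the `validate` function, and the example property names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Data types named in the proposal: string, date, number, boolean.
_TYPES = {"string": str, "date": date, "number": (int, float), "boolean": bool}


@dataclass(frozen=True)
class PropertyDescriptor:
    name: str
    label_key: str          # resource key for the UI label
    description_key: str    # resource key for the help text
    data_type: str          # one of the keys of _TYPES
    required: bool = False
    max_length: Optional[int] = None


def validate(descriptors, bag):
    """Return a list of problems; an empty list means the property bag is valid."""
    problems = []
    known = {d.name for d in descriptors}
    for key in sorted(set(bag) - known):
        problems.append(f"{key}: not declared by the driver")
    for d in descriptors:
        if d.name not in bag:
            if d.required:
                problems.append(f"{d.name}: required")
            continue
        value = bag[d.name]
        # bool is a subclass of int in Python, so reject it explicitly for "number".
        is_bool_as_number = d.data_type == "number" and isinstance(value, bool)
        if not isinstance(value, _TYPES[d.data_type]) or is_bool_as_number:
            problems.append(f"{d.name}: expected {d.data_type}")
        elif d.max_length is not None and isinstance(value, str) and len(value) > d.max_length:
            problems.append(f"{d.name}: longer than {d.max_length}")
    return problems
```

The same descriptor list could drive both server-side validation and form rendering in the UI, which is what keeps vendor specific fields out of the core schema.
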
> >>
> >> Fundamentally, I do not want to turn CloudStack into an erector set with
> >> more screws than nuts, which is a risk with highly pluggable
> >> architectures.  As such, I think we would need to tightly bound the
> >> scope of drivers and their behaviors to prevent the loss of system
> >> usability and stability.  My thinking is that drivers would be packaged
> >> into a custom JAR, CAR (CloudStack ARchive), that would be structured as
> >> follows:
> >>
> >>
> >>   - META-INF
> >>      - MANIFEST.MF
> >>      - driver.yaml (driver metadata (e.g. version, name, description,
> >>      etc) serialized in YAML format)
> >>      - LICENSE (a text file containing the driver's license)
> >>   - lib (driver dependencies)
> >>   - classes (driver implementation)
> >>   - resources (driver message files and potentially JS resources)
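
Since a CAR would be a JAR, and a JAR is a ZIP archive, a layout check needs nothing beyond a ZIP reader. The sketch below assumes the three META-INF entries in the list above are mandatory (the proposal does not say which entries are required), and the `driver.yaml` content in the example archive is invented for illustration.

```python
import io
import zipfile

# Entries the proposed layout places under META-INF (assumed mandatory here).
REQUIRED_ENTRIES = ("META-INF/MANIFEST.MF", "META-INF/driver.yaml", "META-INF/LICENSE")


def missing_entries(car_file):
    """Return the required entries absent from a CAR (path or file-like object)."""
    with zipfile.ZipFile(car_file) as car:
        names = set(car.namelist())
    return [entry for entry in REQUIRED_ENTRIES if entry not in names]


def build_example_car():
    """Build a tiny in-memory archive that follows the proposed layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as car:
        car.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        car.writestr("META-INF/driver.yaml", "name: example-driver\nversion: 1.0.0\n")
        car.writestr("META-INF/LICENSE", "Apache License 2.0\n")
        car.writestr("classes/ExampleDriver.class", b"")
    buffer.seek(0)
    return buffer
```

A loader could call `missing_entries` before doing anything else with an archive and refuse to load a driver that returns a non-empty list.
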
> >>
> >>
> >> The management server would acquire drivers through a simple scan of a
> >> URL (e.g. file directory, S3 bucket, etc).  For every CAR object found,
> >> the management server would create an execution environment (likely a
> >> dedicated ExecutorService and Classloader), and transition the state of
> >> the driver to Running (the exact state model would need to be worked
> >> out).  To be really nice, we could develop a custom Ant task/Maven
> >> plugin/Gradle plugin to create CARs.  I can also imagine opportunities
> >> to add hooks to this model to register instrumentation information with
> >> JMX and authorization.
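
The scan-and-start flow might look roughly like the sketch below. It only illustrates the control flow: the three states are placeholders (the proposal leaves the state model open), the `init` callback stands in for whatever driver bootstrap would really happen, and a Python thread pool gives none of the class-loading isolation that a dedicated Java Classloader would.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum
from pathlib import Path


class DriverState(Enum):
    # Placeholder states; the real state model is still to be worked out.
    DISCOVERED = "Discovered"
    RUNNING = "Running"
    FAILED = "Failed"


class LoadedDriver:
    def __init__(self, car_path):
        self.car_path = car_path
        self.state = DriverState.DISCOVERED
        # One single-threaded pool per driver keeps its work off shared threads.
        self.executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=car_path.stem)

    def start(self, init):
        """Run the driver's initialization on its own executor."""
        try:
            self.executor.submit(init, self.car_path).result()
            self.state = DriverState.RUNNING
        except Exception:
            # A failing driver is marked Failed without affecting the others.
            self.state = DriverState.FAILED
            self.executor.shutdown(wait=False)


def load_drivers(directory, init):
    """Scan a directory for *.car files and start each one, in name order."""
    drivers = []
    for car_path in sorted(Path(directory).glob("*.car")):
        driver = LoadedDriver(car_path)
        driver.start(init)
        drivers.append(driver)
    return drivers
```

The point of the per-driver executor is that one driver failing during initialization leaves the rest of the scan untouched.
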
> >>
> >> To keep the scope of this email confined, we would introduce the general
> >> notion of a Resource, and (hand wave hand wave) eventually
> >> compartmentalize the execution of work around a resource [1].  This
> >> (hand waved) compartmentalization would allow us the controls necessary
> >> to safely and reliably perform in-place driver upgrades.  For an initial
> >> release, I would recommend implementing the abstractions, loading
> >> mechanism, extension data model, and discovery features.  With these
> >> capabilities in place, we could attack the in-place upgrade model.
> >>
> >> If we were to adopt such a pluggable capability, we would have the
> >> opportunity to decouple the vendor and CloudStack release schedules.
> >> For example, if a vendor were introducing a new product that required a
> >> new or updated driver, they would no longer need to wait for a
> >> CloudStack release to support it.  They would also gain the ability to
> >> fix high priority defects in the same manner.
> >>
> >> I have hand waved a number of issues that would need to be resolved
> >> before such an approach could be implemented.  However, I think we need
> >> to decide, as a community, that it is worth devoting energy and effort
> >> to enhancing the plugin/driver model, and agree on the goals of that
> >> effort, before diving head first into the deep rabbit hole of
> >> design/implementation.
> >>
> >> Thoughts? (/me ducks)
> >> -John
> >>
> >> [1]: My opinions on the matter from CloudStack Collab 2013 ->
> >> http://www.slideshare.net/JohnBurwell1/how-to-run-from-a-zombie-cloud-stack-distributed-process-management
> >>
> >
> >
> >
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkow...@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the
> > cloud<http://solidfire.com/solution/overview/?video=play>
> > *™*
>
>


-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the
cloud<http://solidfire.com/solution/overview/?video=play>
*™*
