Dennis,

        Essentially, it sounds like you're describing exactly what Amazon
Launch Configurations (LC) and Auto Scaling Groups (ASG) are meant to
do: launch new instances based on triggered events. You can either use
CloudWatch metrics as the triggers or have your own processes raise an
event through SNS/SQS to increase or decrease the number of nodes
running. That pretty much covers all of this, with some caveats for C.

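        As a rough sketch of how those pieces wire together with the AWS
CLI (the names, AMI ID, and thresholds below are placeholders, nothing
from my environment):

# Launch configuration: the template for new instances, including the
# user-data that bootstraps Puppet (more on that further down).
aws autoscaling create-launch-configuration \
  --launch-configuration-name web-lc \
  --image-id ami-xxxxxxxx \
  --instance-type m3.medium \
  --user-data file://web-userdata.sh

# Auto Scaling group that launches instances from that LC.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-configuration-name web-lc \
  --min-size 2 --max-size 10 \
  --availability-zones us-east-1a us-east-1b

# Scale-up policy; CloudWatch, or your own process via SNS, fires it.
POLICY_ARN=$(aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name web-scale-up \
  --scaling-adjustment 2 \
  --adjustment-type ChangeInCapacity \
  --query PolicyARN --output text)

# CloudWatch alarm that triggers the policy on sustained high CPU.
aws cloudwatch put-metric-alarm \
  --alarm-name web-high-cpu \
  --namespace AWS/EC2 --metric-name CPUUtilization \
  --statistic Average --period 300 \
  --evaluation-periods 2 --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=AutoScalingGroupName,Value=web-asg \
  --alarm-actions "$POLICY_ARN"

A matching scale-down policy and low-CPU alarm handle shrinking the
group again.
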
        The trick, or sticky part to use your phrasing, is the certificates
and the dynamic hostnames used within Amazon. You can work around this
by using an autosigning script similar to the one I have in my
environment [1]. Using it requires CSR attributes [2], which let you
automatically sign the CSRs that come in from valid instances.
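
        The moving parts on each side look roughly like this (paths are for
open source Puppet, and the pre-shared key here is only an
illustration; [1] is the actual script):

# On the instance, written before the first agent run:
# /etc/puppet/csr_attributes.yaml
custom_attributes:
  1.2.840.113549.1.9.7: "mySuperSecretPSK"   # challengePassword
extension_requests:
  pp_role: webserver

# On the master, in the [master] section of puppet.conf; autosign
# points at an executable policy script, which Puppet runs with the
# certname as its argument and the CSR on stdin. Exit 0 means sign.
autosign = /etc/puppet/autosigner.rb
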
        The next trick is designating how the host is supposed to be
configured. For this I look for facts that I can use within my Hiera
configuration on the master side of things, and then use server roles.
It's not hard to drop custom facts onto an instance at initialization
and then use them during the agent run to determine that host's purpose
and role in life.

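        A minimal Hiera setup that keys off such a fact might look like this
(names and paths are only examples; the role fact itself gets dropped
by the user-data sketched further down):

# /etc/puppet/hiera.yaml on the master
:backends:
  - yaml
:yaml:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - "role/%{::role}"
  - common

Each role then gets its own data file, e.g.
/etc/puppet/hieradata/role/webserver.yaml, holding the class parameters
(or a class list, if you use hiera_include) for that role.
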
        Everything I need to do this in Amazon is made very easy by using
cloud-init to set up the required configuration as user-data included
in the LC for the ASG. You just need to customize the user-data of each
LC for the different server roles; the ASG then handles scaling the
cluster of nodes using its assigned LC.

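        For example, web-userdata.sh from the hypothetical LC above could be
nothing more than a shell script that cloud-init runs on first boot (a
#cloud-config document works just as well); the role, master name, and
paths are placeholders:

#!/bin/bash
# Puppet is already baked into the base AMI (your item B below).
ROLE=webserver

# Drop an external fact that Facter picks up on every agent run.
mkdir -p /etc/facter/facts.d
echo "role=${ROLE}" > /etc/facter/facts.d/role.txt

# /etc/puppet/csr_attributes.yaml (shown earlier) gets written here as
# well, with pp_role set to ${ROLE}.

# First agent run: the CSR is autosigned and the catalog for this role
# is applied.
puppet agent --test --server puppet.example.com
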
        You don't have the instances just sitting around if the load doesn't
justify it, but when the trigger events occur the group grows or
shrinks as defined by your policy in the ASG. I believe this
accomplishes the end-state goal of what you're looking for, and it is
very much possible.

1. https://github.com/UGNS/standard-modules/blob/production/scripts/autosigner.rb
2. https://docs.puppetlabs.com/puppet/latest/reference/ssl_attributes_extensions.html

On 10/10/2014 09:10 PM, Dennis Gearon wrote:
> I want to be able to scale up different types of nodes both quickly and
> moderately quickly, AUTOMATICALLY WITH NO USER INPUT, with Puppet doing
> the configuration.
> 
> My idea so far is:
> 
> A/ There are several nodes of each type sleeping, enough per type to
> last 15 minutes of normal to abnormally high peak loads.
> 
> B/ There is a base Amazon image that I create for each node type, with
> Puppet on it stored, in case the load continues or business does :-)
> 
> C/ The system detects that the average load is increasing, or the short
> term load is getting excessive. A new instance is made from the correct
> type of instance, AND HERE'S the STICKY PART, the launching of the
> instance includes installing a unique cert on the instance for puppet,
> and also storing that cert/key in the puppet master (puppetmasters).
> Whichever of the agent or puppetmaster needs to make contact first
> does so, and the new instance is under puppet control.
> 
> D/ Whatever algorithm is needed to wait till things have calmed down, or
> immediately do an update using a catalog is to be determined (TBD).
> 
> E/ When the load is gone, the newer instance goes to sleep with the
> older instances.
> 
> F/ If the load average stays down, or to update the OS on older
> instances, the older instances are woken up one at a time, their
> contents switched to a new node (or they are just retired if there is a
> lot of redundancy), the node is destroyed, a new node is created from
> the correct Amazon image, it is updated using its catalog, and then any
> contents needed to function are pushed to it, and then it is put to sleep.
> 
> How does this sound? Is it possible? Also, I read somewhere that
> updating the OS of a puppet node requires COMPLETELY REMOVING PUPPET,
> EVERYTHING, doing the update, then putting puppet back on. Is this
> really true? Any automated way to do this?
> 

