On Thu, Apr 25, 2013 at 8:02 PM, Michael Tiernan
<michael.tier...@gmail.com> wrote:
> Can I ask a side question about this statement? On the whole, I can
> believe the statement but I'd like to ask for a bit more clarification
> on it. Not to question the statement in general but to learn more
> about the overall process.

Here's the simplest problem you'll see:

From your workstation, open 10,000 SSH connections (as if you are
updating 10,000 machines).  Chances are your machine doesn't have
enough RAM for that many /usr/bin/ssh processes.

So you decide to schedule them in batches.  Suppose you do 100 at a
time.  If each batch doesn't start until all 100 of the previous batch
finish, any connection that is a straggler (or a down host that takes
a while to time out) means that the next batch doesn't start for a
long, long time.  Now your 100 batches of 100 take days or weeks.
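
To put rough numbers on the batching problem, here is a small
back-of-the-envelope sketch.  It does not open any SSH connections; it
only adds up how long the batches would take.  The durations are
assumptions of mine for illustration (5 seconds for a healthy host,
120 seconds for a down host to time out, one down host per batch), not
measurements.

```python
def batched_makespan(durations, batch_size):
    """Wall-clock time when each batch must fully finish before the next starts."""
    total = 0.0
    for start in range(0, len(durations), batch_size):
        batch = durations[start:start + batch_size]
        # The whole batch waits for its slowest member.
        total += max(batch)
    return total


# Hypothetical workload: 10,000 hosts, 5 s each, except every 100th
# host is down and takes 120 s to time out.
healthy = [5.0] * 10_000
with_stragglers = [120.0 if i % 100 == 0 else 5.0 for i in range(10_000)]

print(batched_makespan(healthy, 100))          # 500.0
print(batched_makespan(with_stragglers, 100))  # 12000.0
```

With these made-up numbers, a 1% failure rate makes the whole run 24
times slower, because every batch pays for its one slow host.  Longer
timeouts or hung sessions make it correspondingly worse.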

A more sophisticated scheduling scheme would start 100 connections and
then start a new connection any time one connection completes.  This
will work if you have fewer than 100 down machines (which, if you have
10,000 machines, may not be true very often).  Even then you are
dealing with many crypto handshakes happening at the same time; a lot
of work even for an 8-core machine.
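
The "start a new one whenever one completes" scheme can be sketched
the same way.  Again this is only a simulation with assumed durations
(5 seconds for a healthy host, 120 seconds for a down host), not real
SSH code, and the function name is mine.

```python
import heapq


def sliding_window_makespan(durations, window):
    """Wall-clock time when a new job starts as soon as any slot frees up."""
    # Each heap entry is the time at which one of the `window` slots is free.
    slots = [0.0] * window
    heapq.heapify(slots)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(slots)   # earliest slot to become free
        end = start + d
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish


# One down host (120 s) only ties up one slot; the rest keep moving.
print(sliding_window_makespan([120.0, 5.0, 5.0, 5.0, 5.0], 2))  # 120.0

# If every slot is stuck on a down host, everything behind them waits.
print(sliding_window_makespan([120.0, 120.0, 5.0], 2))          # 125.0
```

The second call shows the failure mode described above: once the
number of down hosts in flight reaches the window size, the window
behaves no better than a batch.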

If all the clients connect to the master, then you have to deal with
the potential of 10,000 connections coming in all at once.  Luckily it
is easier to do this kind of scaling nowadays and many Puppet sites
do just that.

The benefit of a message queue system is that the distribution of
messages to all the endpoints is highly optimized.  Many of these
systems use multicast for all systems on a particular IP subnet and
other interesting techniques to make distribution fast and efficient.
I don't know enough about the system Salt uses to say much more but
you get the point.  The MCollective system that Puppet uses (which I
also don't have direct experience with) also uses a message queue for
distribution.

Hope that helps,
Tom

-- 
Skype: YesThatTom -- GTalk and GooglePlus: t...@whatexit.org
Blog:  http://EverythingSysadmin.com
Videos:  http://www.TomOnTime.com
_______________________________________________
Discuss mailing list
Discuss@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
