While you're logging into every host to install mcollective, there are some 
other things to think about (that are easily puppetizeable):

-remote syslogging, so that heavy logging doesn't fill the disks on your 
application hosts
-file system monitoring for your hosts, so you get an alert before things fill 
up
-trend analysis (graphs) on the hosts, so you get an alert when something's 
trending toward full (by inode as well as by block, depending on the host)
-something monitoring critical processes, so that if they stop responding it'll 
restart them (here I plug monit for simplicity's sake, but snmp agents and 
similar items can do this too)
-something monitoring the logs, which can alarm when an entry is present when 
it shouldn't be, or absent when it should be there
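
For the file system item, here is a minimal sketch of a check you could run 
from cron until real monitoring is in place. The function name over_threshold 
and the 90 percent limit are made up for illustration; it assumes df -P style 
output (use percentage in column 5, mount point in column 6) and mount points 
without spaces in their names.

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P`-style output on stdin and prints every
# mount point whose use percentage (column 5) is at or above the given limit.
over_threshold() {
    limit=$1
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)              # strip the trailing percent sign
        if (pct + 0 >= limit) print $6 " " pct "%"
    }'
}

# Example invocations, commented out so sourcing this file has no side effects:
# df -P  | over_threshold 90    # block usage
# df -Pi | over_threshold 90    # inode usage (the -i flag as accepted by GNU df)
```

Anything it prints is a volume worth looking at; an empty result means 
nothing crossed the limit.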

As to your immediate problem, try an ssh loop if you can run init scripts via 
sudo. Use -t so that sudo will have a tty. For security's sake you'll have to 
enter your password 60 times, but the experience will incentivize you to 
monitor for this problem.

cat <<XX >/tmp/h
host1
host2
XX

for h in $(cat /tmp/h); do ssh -t "$h" sudo /etc/init.d/puppet restart; done
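
If you'd rather not scroll back to see which hosts failed, the same loop can 
be wrapped so it collects them. This is only a sketch: the function name 
restart_puppet_on and the REMOTE variable are my own inventions, with REMOTE 
there so you can swap in echo for a dry run.

```shell
#!/bin/sh
# Hypothetical wrapper around the ssh loop: runs the restart on each host
# listed in a file and reports the hosts where the command failed.
# REMOTE is an assumption added for illustration; set REMOTE=echo for a dry run.
REMOTE="${REMOTE:-ssh -t}"

restart_puppet_on() {
    hostfile=$1
    failed=""
    for h in $(cat "$hostfile"); do
        # ssh exits non-zero if the connection or the remote command fails
        $REMOTE "$h" sudo /etc/init.d/puppet restart || failed="$failed $h"
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed"
        return 1
    fi
}

# Example invocation, commented out so sourcing this file contacts no hosts:
# restart_puppet_on /tmp/h
```

You still type your password once per host, but at the end you get a single 
line naming the ones that need another look.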


Good luck.


On Sat, Jan 28, 2012 at 08:53:37AM +1100, Denmat wrote:
> Hi,
> Puppet's sister project, MCollective would do it. An alternative would be 
> something like Rundeck.
> 
> Den
> 
> On 28/01/2012, at 3:52, Kyle Mallory <kyle.mall...@utah.edu> wrote:
> 
> > I am experiencing a curious event, and wondering if others have seen 
> > this... As well, I have a question related to it.
> > 
> > Today, I noticed my puppet summary report from Foreman this morning, that 
> > 60 of my 160 hosts all stopped reporting at nearly the exact same time, and 
> > have not since restarted.  Investigating, it appears that my puppetmaster 
> > temporarily ran out of disk space on the /var volume, probably in part due 
> > to logging.  I have log rollers running, which eventually freed up some 
> > disk space, but the 60 hosts, have not resumed reporting.
> > 
> > If I dig into the logs on one of the failing agents, there are no messages 
> > from puppet, past 4am (here is a snippet of my logs):
> > 
> > Jan 27 02:44:25 kmallory3 puppet-agent[15340]: Using cached catalog
> > Jan 27 02:44:25 kmallory3 puppet-agent[15340]: Could not retrieve catalog; 
> > skipping run
> > Jan 27 03:14:30 kmallory3 puppet-agent[15340]: Could not retrieve catalog 
> > from remote server: Error 400 on SERVER: No space left on device - 
> > /var/lib/puppet/yaml/facts/kmallory3.xxx.xxx.xxx.yaml
> > Jan 27 03:14:30 kmallory3 puppet-agent[15340]: Using cached catalog
> > Jan 27 03:14:30 kmallory3 puppet-agent[15340]: Could not retrieve catalog; 
> > skipping run
> > Jan 27 03:47:30 kmallory3 puppet-agent[15340]: Could not retrieve plugin: 
> > execution expired
> > Jan 27 04:01:02 kmallory3 puppet-agent[15340]: Could not retrieve catalog 
> > from remote server: execution expired
> > Jan 27 04:01:02 kmallory3 puppet-agent[15340]: Using cached catalog
> > Jan 27 04:01:02 kmallory3 puppet-agent[15340]: Could not retrieve catalog; 
> > skipping run
> > 
> > Forcing a run of puppet, I get the following message:
> > 
> > kmallory3:/var/log# puppetd --onetime --test
> > notice: Ignoring --listen on onetime run
> > notice: Run of Puppet configuration client already in progress; skipping
> > 
> > After stopping and restarting the puppet service, the agent started running 
> > properly.  It appears that the failure from the server has caused the agent 
> > to hang, from which it was not able to recover gracefully.  Has anyone 
> > experienced this before?  We are running 2.6.1 on the large majority of our 
> > hosts, including this one.  Many failed, but 2/3rds keep running properly.
> > 
> > Now, on to my question.. Anyone got some bright ideas for how I could force 
> > Puppet to restart itself on 60 machines, when Puppet isn't running??  I'm 
> > not really excited by the prospect of logging into 60 machines, and running 
> > a sudo command...  sigh.
> > 
> > 
> > --Kyle
> > 
> > -- 
> > You received this message because you are subscribed to the Google Groups 
> > "Puppet Users" group.
> > To post to this group, send email to puppet-users@googlegroups.com.
> > To unsubscribe from this group, send email to 
> > puppet-users+unsubscr...@googlegroups.com.
> > For more options, visit this group at 
> > http://groups.google.com/group/puppet-users?hl=en.
> > 
> 
> 
> 

