On 6/16/2014 9:42 AM, Dave Pooser wrote:
On 5/30/14 11:11 AM, "Kevin A. McGrail" <kmcgr...@pccc.com> wrote:

Good time for an update to the users list about the issue.  The box that
processed the updates at the ASF colo failed catastrophically during a
power surge that took down some other boxes as well. Unfortunately, while
the project requested backups in 2009, they were not implemented.
Now that the update box is back online (and thanks for all your hard work
on that! Systems archaeology is no fun at all), is there anything useful
the community can do to help prevent another such catastrophe? I'd be
willing to contribute hardware and/or VM space at $WORKPLACE for an
offsite replica as long as we wouldn't need to sync more than 2-4GB/day
after the initial setup completed.

If you have access to any SA boxes, make sure they have a scheduled backup (and make sure the backup works and has all important data!). If any systems do not have backups, report that to the appropriate list.
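
As a rough illustration of "make sure the backup works", here is a minimal sketch of a sanity check that could run from cron after the backup job. It assumes the backup is a tar archive; the function name check_backup, the 24-hour freshness limit, and the member names are hypothetical and not taken from any SA box. It only confirms that the archive exists, is recent, can be read, and lists the files you expect. It does not restore anything or compare file contents, so it is no substitute for a real test restore.

```python
import os
import tarfile
import time


def check_backup(archive_path, required_members=(), max_age_hours=24.0, now=None):
    """Return a list of problems found with a tar backup (empty list = looks OK).

    Hypothetical helper: the name, defaults, and checks are illustrative only.
    """
    problems = []
    if not os.path.isfile(archive_path):
        return [f"missing archive: {archive_path}"]

    # Freshness: a backup job that silently stopped running leaves a stale file.
    now = time.time() if now is None else now
    age_hours = (now - os.path.getmtime(archive_path)) / 3600.0
    if age_hours > max_age_hours:
        problems.append(f"stale archive: {age_hours:.1f}h old")

    # Readability: listing the members fails on empty or corrupt archives.
    try:
        with tarfile.open(archive_path) as tar:
            names = set(tar.getnames())
    except (tarfile.TarError, OSError) as exc:
        problems.append(f"unreadable archive: {exc}")
        return problems

    # Completeness: every path we consider important must be in the archive.
    if not names:
        problems.append("archive is empty")
    for member in required_members:
        if member not in names:
            problems.append(f"missing from archive: {member}")
    return problems
```

A cron wrapper could call check_backup with the archive path and the list of paths that must be present, then mail any returned problems to whoever maintains the box.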

Also make sure every task the box is designed to handle is appropriately documented, including user accounts required, libraries required and their versions, what crontabs should be, etc.
