Do you need a UPS? Sounds like a power-related problem.
tedc

On Sun, Mar 17, 2013 at 5:18 AM, Derek Atkins <warl...@mit.edu> wrote:
> Good morning, GnuCashers,
>
> Some (many?) of you may have noticed the outage of 'code.gnucash.org'
> starting with a lot of packet loss on Thursday and escalating into a
> complete outage by Friday. This took out our Subversion, Wiki, Email
> List, everything server. Well, as of 2:15pm US/EDT on Saturday
> (yesterday) everything should be back to normal and operational. If you
> don't want to hear the gory details of what happened feel free to stop
> reading now.
>
> The issue was multiple simultaneous failures of multiple pieces of
> equipment. What I thought was a power outage turned out to be caused by
> a failure in my main network switch. It started dropping ports, or
> causing ports to fail partially (dropping packets). This was also the
> main cause of the packet loss. However, I didn't discover this
> until later.
>
> My main DHCP server was off the net; I swapped ethernet cables and it
> appeared to fix the problem.
>
> My main database server, however, lost its main network controller so I
> had to install a new one (I have a few on hand, so it was a relatively
> painless operation -- I just had to remember the magic voodoo to get the
> system to call the new card 'eth0', but that was also only a few
> minutes).
>
> It was only after I got this working that I realized that it was the
> switch that had failed -- many of the ports connected to actual hosts
> had a 'dead link'. I also noticed that my main DHCP server was
> bouncing. It would come on the net, stay for a bit, and then go dark.
> Luckily I also had a few extra (smaller) switches lying around so I
> linked a few of them together and moved all the non-working ports over.
> This also fixed the bouncing DHCP server.
>
> Last, but not least, the VM Server Host's network was wedged, requiring
> a complete reboot to reset. This also required resetting all the VMs,
> some of which required a bit of hand-holding to come back (and many of
> which required a virtual disk fsck as well, taking even more time). The
> last of the systems returned to service shortly after 2pm.
>
> I do plan to acquire a new switch to replace the failing one, but what I
> have now is working so I'll watch it closely for now.
>
> Thanks,
>
> -derek
>
> --
> Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
> Member, MIT Student Information Processing Board (SIPB)
> URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
> warl...@mit.edu PGP key available
> _______________________________________________
> gnucash-devel mailing list
> gnucash-devel@gnucash.org
> https://lists.gnucash.org/mailman/listinfo/gnucash-devel