On Wed, Jun 26, 2013 at 9:38 PM, Nikola Pavlović <n...@riseup.net> wrote:
> Hi,
>
> Last night during a massive (~1 year worth :| ) portsnap fetch
> the server went unresponsive and ssh eventually disconnected. I decided
> to leave it during the night, and, sure enough, the situation was the
> same in the morning, so I had to do a hard reset. It came back up, but
> one of the two gmirror components was marked as broken and deactivated.
>
> The hang happened during the 'fetching new files or ports' (~24000 of
> them, there are currently ~10000 snapshots in /var/db/portsnap) phase
> of portsnap fetch.
>
> /var/log/messages was completely silent during the period between the
> hang and the reset.
>
> Googling around I found a mention that it's possible to sometimes get a
> 'blip'[*] during busy periods, so I decided to just bite the bullet and
> reinsert the component with
> # gmirror forget gm0
> # gmirror clean ad4
> # gmirror insert gm0 ad4
>
> Currently it's syncing and things *seem* OK. My question is how much
> should I be worried and what could be the cause of this? Is it possible
> that ports snapshot fetching caused this, or that perhaps it was the other
> way around (a failing disk causing the machine to choke during the huge
> portsnap fetch)? How to proceed? :)
>

The messages log definitely shows problems with your I/O. The SMART logs
of the disks are also at least mildly concerning and indicate that the
drives are in the early stages of dying. Some hard disk deaths take years
to complete. Expect random glitches and intermittently reduced performance
as the degradation continues.

You might be able to alleviate some of this by switching to the AHCI
driver and bumping up timeouts, but at the end of the day two flaky disks
in a mirror don't inspire confidence.

-- 
Adam Vande More