Hi.

On Fri, Oct 26, 2018 at 11:23:39AM +0200, Alessandro Vesely wrote:
> On Thu 25/Oct/2018 20:30:27 +0200 Brian wrote:
> > On Thu 25 Oct 2018 at 19:53:26 +0200, Alessandro Vesely wrote:
> >
> >> Hi all,
> >> early this morning a network card burned out. A few hours later, the
> >> server was not responding on any network address, nor on the system
> >> console. I had to power it down.
> >>
> >> Upon rebooting, network errors were detected an I arranged the server
> >> to work with the available hardware. The last line logged was an
> >> incoming email from a spammer in Brazil. It shouldn't have triggered
> >> any severe damage. I found no breakdown hint in the logs.
> >>
> >> My theory is that the system didn't realize that the card was broken,
> >> didn't turn the interface down, and kept storing outgoing stuff until
> >> it blew off. Is that reasonable or should I be more paranoid?
> >
> > You have given an exact diagnosis of your problem - the network
> > card failed. What's your problem? Replace it instead of agonising
> > and theorising.
>
> The problem is that the server froze. I don't think that's what it is
> supposed to do when a card fails.

It's my impression too.

> Contrast that with log lines about anything else, from non-redundant power
> supplies to failed GPG signatures. In part, the missing precise diagnosis
> must be a shortcoming on part of the card vendor. However, how come the
> kernel didn't realize that the link had to go down, log something, and just
> fail any subsequent call on that interface, instead of freezing? Or did it
> freeze for an unrelated reason?

I believe that it's impossible to answer this question.
It's highly likely that it was a kernel panic. Whether or not it was related
to the failed NIC is impossible to tell, since there's no kernel backtrace.

I'd install, say, kdump-tools for future incidents like this.

Reco