Package: linux-image-4.9.0-6-amd64
Version: 4.9.82-1+deb9u3

Hi!
Here is a short description of the problem. We have several Supermicro servers, all with the same configuration (hardware, kernels, packages, etc.). About a month ago, or maybe a bit later, all of these machines began crashing with kernel panics.

I can't find any pattern in the failures, but they happen very often. Some machines panic a couple of times a day, although usually a machine crashes about once every 3 to 6 days. All of these machines handle intensive network and I/O workloads.

I saved the dmesg log from one of these machines after a crash (see the attachment). As far as I can see, every machine probably has a problem with mlx4_en or GRO. I also see a list_add double add followed by list_del corruption.

Can I do anything to get more detailed logs? What additional information do you need to diagnose the problem better?

---
Best regards,
Буданов Евгений
System Administrator, «Рестрим»
Attachments: dmesg.log, lspci