Kenzo Iwami wrote:
Hi,

This problem originally occurred in a very large cluster system using snmp
for server management. About two servers panicked each day. The program I sent
is meant to reproduce this problem in a very short time. The problem does occur
under normal load when there are a lot of servers.
Hmm, not good. Does your snmp daemon use ethtool excessively? That would certainly be painful to the driver (any driver!).

I only looked at the panic message after this problem occurred.
I could tell that the snmp daemon caused the panic while trying to process
the ethtool ioctl, but I don't know how often this was called.
However, it shouldn't have been called excessively, because the panic occurred
on a production system while it was idle.

Anyway, as I said in the same e-mail, we're working on reducing the lock timeout to a reasonable time. This will unfortunately take some time, as we need to change some major components in the driver to make sure this doesn't happen.

How about the following approach?
If acquiring the semaphore fails inside the interrupt handler, the attempt
is abandoned immediately instead of waiting for the timeout.
However, I don't know whether this method affects other processes.
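
As a rough illustration of the approach proposed above (not the driver's actual code), the following user-space C sketch models it. All names here (hw_sem, hw_sem_acquire, the in_irq flag, max_tries) are hypothetical, and a C11 atomic flag stands in for the hardware semaphore.

```c
#include <stdbool.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the hardware semaphore; not a real driver symbol. */
static atomic_flag hw_sem = ATOMIC_FLAG_INIT;

/* Make exactly one attempt to take the semaphore; never waits. */
static bool hw_sem_try_acquire(void)
{
    return !atomic_flag_test_and_set(&hw_sem);
}

static void hw_sem_release(void)
{
    atomic_flag_clear(&hw_sem);
}

/*
 * Acquire the semaphore with a bounded number of attempts.
 * When in_irq is true (assumed to mean "called from the interrupt
 * handler"), the retry budget is ignored and a single attempt is made,
 * so the caller gives up immediately instead of spinning until timeout.
 */
static bool hw_sem_acquire(bool in_irq, int max_tries)
{
    int tries = in_irq ? 1 : max_tries;

    while (tries-- > 0) {
        if (hw_sem_try_acquire())
            return true;
    }
    return false; /* caller must cope with not having the hardware */
}
```

The open question in this sketch is the same one raised in the message: every caller that gets false back has to decide what to do without access to the hardware.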

With the current hardware being accessed simultaneously by several users in the kernel, that would lead to large problems. The watchdog task accesses it every 2 seconds as it reads the PHY link status, so when one of those accesses fails the driver would have no choice but to reset the entire device.
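
To make the concern concrete, here is a small C model of one watchdog tick, assuming a PHY read that fails at once when the semaphore is busy. The struct and function names (nic, phy_read_link, watchdog_tick) are made up for illustration and do not come from the driver.

```c
#include <stdbool.h>

/* Hypothetical device state, for illustration only. */
struct nic {
    bool sem_held_elsewhere; /* another kernel user holds the semaphore */
    bool phy_link;           /* link state as the PHY would report it */
    bool link_up;            /* driver's cached view of the link */
    int  resets;             /* number of full device resets performed */
};

/* Assumed fail-fast PHY read: gives up immediately if the semaphore is busy. */
static bool phy_read_link(const struct nic *nic, bool *link)
{
    if (nic->sem_held_elsewhere)
        return false;
    *link = nic->phy_link;
    return true;
}

/* One watchdog tick (every 2 seconds, per the message above). */
static void watchdog_tick(struct nic *nic)
{
    bool link;

    if (!phy_read_link(nic, &link)) {
        /* The link state is unknown, so the only recovery is a full reset. */
        nic->resets++;
        return;
    }
    nic->link_up = link;
}
```

In this model, any tick that coincides with another user holding the semaphore costs a device reset, which is why failing fast is not free.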

Auke