On Tue, Sep 06, 2005 at 02:47:14PM -0700, David S. Miller wrote: > From: Eugene Surovegin <[EMAIL PROTECTED]> > Date: Tue, 6 Sep 2005 00:40:01 -0700 > > > According to Documentation/networking/netdevices.txt > > dev->hard_start_xmit must be called with interrupts *enabled*. > > > > Unfortunately, current netconsole code always calls netpoll with local > > interrupts disabled: > > > > write_msg (local_irq_save) > > netpoll_send_udp > > netpoll_send_skb > > np->dev->hard_start_xmit. > > > > I'm not sure this can cause any problems, but quick grep has showed > > that some drivers indeed rely on the documented behavior. > > > > Also, it'd be nice if netpoll author updated netdevices.txt with info > > about dev->poll_controller sync rules :) (in fact, I stumbled upon > > this inconsistency when I was trying to figure out locking for > > dev->poll_controller implementation in my driver). > > Yes, several drivers rely on interrupts being enabled, I learned this > the hard way in the past. Also, disabling interrupts during this > potentially expensive function call can kill your interrupt latency > sensitive devices (such as serial ports) and allow them to overflow. > > I don't immediately see how this can be fixed, because the console > code does really need to disable interrupts in order to not try > to recursively take it's locks.
David, correct me if I'm wrong, but I think there is a major problem with current netconsole/netpoll approach. It breaks several assumptions about contexts in which driver can be called by the network stack (dev->poll and dev->hard_start_xmit). Drivers assume that these entry points won't be called from hardirq context. My understanding that this can happen with netconsole if, for example, I use printk from IRQ handler. Current netpoll tricks with remembering smp_processor_id() can only prevent recursion into hard_start_xmit and/or poll. But, let's say driver has a timer handler, which shouldn't allow access to TX queue for some reason during its execution. Currently, simple spin_lock(dev->xmit_lock) is enough, because timer is called with BH's disabled and this should prevent hard_start_xmit calls as well. With netconsole, I have to change this to IRQ disabling spinlock or use netpoll_poll_lock(). In either case, full locking audit is required, and quite possibly rather intrusive changes in some cases (e.g. in my driver all packet processing is done without IRQ disabling (full TX and RX NAPI)). There is one quick solution/hack I can think of, we can introduce network device flag, which will indicate if it's safe to call driver's poll and hard_start_xmit from hardirq context, if not, use current netpoll queue. -- Eugene - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html