On Tue, Sep 06, 2005 at 02:47:14PM -0700, David S. Miller wrote:
> From: Eugene Surovegin <[EMAIL PROTECTED]>
> Date: Tue, 6 Sep 2005 00:40:01 -0700
> 
> > According to Documentation/networking/netdevices.txt 
> > dev->hard_start_xmit must be called with interrupts *enabled*.
> > 
> > Unfortunately, current netconsole code always calls netpoll with local 
> > interrupts disabled:
> > 
> >   write_msg (local_irq_save)
> >     netpoll_send_udp
> >       netpoll_send_skb
> >         np->dev->hard_start_xmit.
> > 
> > I'm not sure this can cause any problems, but quick grep has showed 
> > that some drivers indeed rely on the documented behavior.
> > 
> > Also, it'd be nice if netpoll author updated netdevices.txt with info 
> > about dev->poll_controller sync rules :) (in fact, I stumbled upon 
> > this inconsistency when I was trying to figure out locking for 
> > dev->poll_controller implementation in my driver).
> 
> Yes, several drivers rely on interrupts being enabled, I learned this
> the hard way in the past.  Also, disabling interrupts during this
> potentially expensive function call can kill your interrupt latency
> sensitive devices (such as serial ports) and allow them to overflow.
> 
> I don't immediately see how this can be fixed, because the console
> code does really need to disable interrupts in order to not try
> to recursively take it's locks.

David, correct me if I'm wrong, but I think there is a major problem 
with current netconsole/netpoll approach.

It breaks several assumptions about contexts in which driver can be 
called by the network stack (dev->poll and dev->hard_start_xmit). 
Drivers assume that these entry points won't be called from hardirq 
context. My understanding that this can happen with netconsole if, for 
example, I use printk from IRQ handler. Current netpoll tricks with 
remembering smp_processor_id() can only prevent recursion into 
hard_start_xmit and/or poll. But, let's say driver has a timer 
handler, which shouldn't allow access to TX queue for some reason 
during its execution. Currently, simple spin_lock(dev->xmit_lock) is 
enough, because timer is called with BH's disabled and this should 
prevent hard_start_xmit calls as well. With netconsole, I have to 
change this to IRQ disabling spinlock or use netpoll_poll_lock(). In 
either case, full locking audit is required, and quite possibly rather 
intrusive changes in some cases (e.g. in my driver all packet 
processing is done without IRQ disabling (full TX and RX NAPI)).

There is one quick solution/hack I can think of, we can introduce 
network device flag, which will indicate if it's safe to call driver's 
poll and hard_start_xmit from hardirq context, if not, use current 
netpoll queue.

-- 
Eugene

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to