NAPI poll behavior in various Intel drivers

David Miller Fri, 04 Jan 2008 03:40:56 -0800

Several Intel networking drivers such as e1000, e1000e
and e100 all do this to exit NAPI polling:


        if ((!tx_cleaned && (work_done == 0)) ||
           !netif_running(poll_dev)) {

In the NAPI rework I tried to change this to:

        if ((!tx_cleaned && (work_done < budget)) ||
           !netif_running(poll_dev)) {

But that got reverted by:

        commit f7bbb9098315d712351aba7861a8c9fcf6bf0213

        e1000: Fix NAPI state bug when Rx complete
    
        Don't exit polling when we have not yet used our budget, this causes
        the NAPI system to end up with a messed up poll list.
    
        Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
        Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

I definitely would not have signed off on that :-)

That "tx_cleaned" thing clouds the logic in all of these drivers'
poll routines.

The one necessary precondition is that when work_done < budget
we exit polling and return a value less than budget.

If ->poll() returns a value less than budget, net_rx_action()
assumes that the device has been removed from the poll list.

                /* Drivers must not modify the NAPI state if they
                 * consume the entire weight.  In such cases this code
                 * still "owns" the NAPI instance and therefore can
                 * move the instance around on the list at-will.
                 */
                if (unlikely(work == weight))
                        list_move_tail(&n->poll_list, list);

The "work_done == 0" test in these drivers is thus wrong.  It
should be "work_done < budget", and the whole tx_cleaned thing needs to
be removed.
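
To make the difference concrete, the two exit tests can be written as
stand-alone predicates.  This is only a user-space sketch, not driver
code: the names exit_poll_current() and exit_poll_proposed() are made up
for illustration, "running" stands in for netif_running(poll_dev), and
the proposed test deliberately leaves the !netif_running() case out,
since that is a separate problem.

```c
#include <assert.h>
#include <stdbool.h>

/* Exit test as the drivers currently have it (hypothetical name). */
static bool exit_poll_current(bool tx_cleaned, int work_done, bool running)
{
	return (!tx_cleaned && work_done == 0) || !running;
}

/*
 * Exit test argued for in the text (hypothetical name): tx_cleaned is
 * gone and the RX work is compared against the budget.
 */
static bool exit_poll_proposed(int work_done, int budget)
{
	assert(budget > 0);	/* a poll is never called with no budget */
	return work_done < budget;
}
```

With one RX packet processed out of a budget of 64, the current test
says "stay on the poll list" even though the routine is about to return
1, which is less than budget; the proposed test says "exit".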

It happens to work only because we loop again and process the same
NAPI struct a second time.

As a result, E1000 devices get polled TWICE every time they
process at least one RX packet, but do not consume the whole
quota.
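
The double poll can be reproduced with a toy model.  The sketch below is
user-space C and makes several simplifying assumptions: there is a single
device, no packets arrive while it is being polled, TX cleanup is ignored,
and a device that stays on the list is simply polled again.  All names
(toy_dev, toy_poll, count_polls, the exit_rule values) are invented for
the example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

enum exit_rule {
	EXIT_WHEN_IDLE,		/* models the "work_done == 0" test */
	EXIT_BELOW_BUDGET	/* models the "work_done < budget" test */
};

struct toy_dev {
	int pending;	/* RX packets waiting to be processed */
	bool on_list;	/* still scheduled for polling */
};

/* One poll: process up to budget packets, then apply the exit rule. */
static int toy_poll(struct toy_dev *dev, int budget, enum exit_rule rule)
{
	int work_done = dev->pending < budget ? dev->pending : budget;
	bool done;

	dev->pending -= work_done;
	done = (rule == EXIT_WHEN_IDLE) ? (work_done == 0)
					: (work_done < budget);
	if (done)
		dev->on_list = false;	/* stands in for leaving the poll list */
	return work_done;
}

/* Count how many polls it takes until the device leaves the list. */
static int count_polls(int pending, int budget, enum exit_rule rule)
{
	struct toy_dev dev = { .pending = pending, .on_list = true };
	int polls = 0;

	assert(budget > 0);
	while (dev.on_list) {
		toy_poll(&dev, budget, rule);
		polls++;
	}
	return polls;
}
```

In this model count_polls(1, 64, EXIT_WHEN_IDLE) is 2, while
count_polls(1, 64, EXIT_BELOW_BUDGET) is 1: the second poll under the
first rule does no work and exists only to notice that the device is
idle.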

I smell a performance hack, and if so this is wrong and against
all of the principles of NAPI.  Either that or it's a workaround
for the "!netif_running()" case.

I noticed this while trying to work on a generic fix for the
"->poll() does not exit when device is brought down while being
bombed with packets" bug.