On Wed, 11 Jul 2007 22:39:49 +0100 "Daniel J Blueman" <[EMAIL PROTECTED]> wrote:
> On 11/07/07, Daniel J Blueman <[EMAIL PROTECTED]> wrote:
> > > > On 05/07/07, Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > > > > Well, it didn't fix my test, but it made it better. The following
> > > > > seemed to work longer...
> > > > >
> > > > > --- a/drivers/net/sky2.c 2007-07-05 09:09:45.000000000 -0700
> > > > > +++ b/drivers/net/sky2.c 2007-07-05 09:09:51.000000000 -0700
> > > > > @@ -2490,6 +2490,13 @@ static int sky2_poll(struct net_device *
> > > > >
> > > > >  	work_done = sky2_status_intr(hw, work_limit);
> > > > >  	if (work_done < work_limit) {
> > > > > +		/* Bug/Errata workaround?
> > > > > +		 * Need to kick the TX irq moderation timer.
> > > > > +		 */
> > > > > +		if (sky2_read8(hw, STAT_TX_TIMER_CTRL) == TIM_START) {
> > > > > +			sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_STOP);
> > > > > +			sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_START);
> > > > > +		}
> > > > >  		netif_rx_complete(dev0);
> > > > >
> > > > >  		/* end of interrupt, re-enables also acts as I/O synchronization */
> > > >
> > > > I spoke too soon on this. With the above patch on 2.6.22-rc7, it
> > > > failed much sooner than the previous patch with the
> > > > read32(B0_Y2_SP_LISR); I'll try to reproduce with the older patch.
> > > >
> > > > Note the ifconfig error/dropped/frame count at the time of failure:
> [snip]
> > > The last message means some how frame was received with checksum for count
> > > wrong. I have only seen it when coalescing is messed up.
> > >
> > > I ran for 2+ days with the patch, and only 20min without. Usually my ISP
> > > connection gives up after that because of crappy DSL box, and that makes
> > > DNS not work.
> >
> > It wedged when I was copying a few GBs of data from my server to a
> > local disk at the time, and running rsync over ssh on a large file on
> > my server to my laptop's disk.
> >
> > This would be the typical load that would cause the NIC to lockup from
> > missing an IRQ or otherwise, however, it did feel like the new code
> > didn't un-wedge the Yukon-EC's bus master unit.
> >
> > What other tricks can be used to reset the Yukon-EC's bus master unit?
> >
> > I'll try the read32(B0_Y2_SP_LISR) trick, as before.
>
> Nope, this still locks up as you found.
>
> I have a reliable reproducer:
>
> 1. export directory over NFS TCP on server
> 2. mount directory on client
> 3. run 'iozone -a' in directory on client
>
> I'm reproducing this with NFSv4 (with callbacks working) with 1500
> octet MTU with one client, all gigabit. It would be good to hear if
> you can reproduce the problem there.
>
> Daniel

Please try again with post 2.6.22 git version (1.16)?

-- 
Stephen Hemminger <[EMAIL PROTECTED]>