found 628444 linux-2.6/3.2.9-1
tags 628444 + upstream patch moreinfo
quit

Hi Dafydd,

Dafydd Harries wrote:

> I've been seeing similar problems with my "Intel Corporation Centrino
> Ultimate-N 6300".
>
> Like others, the problems seemed to start around 2.6.39.

Odd.  What kernel did you use before then?  (/var/log/dpkg.log might
tell.)

> Like others, the card flakes out a day or two after booting, and a reboot
> always fixes the problem. Occasionally it stays working for longer.
>
> Like others, I've added RAM. But as far as I can recall the upgrade
> happened well before any problems started appearing.

Interesting and useful.

> Any ASPM settings are at their default.
>
> I'll try wd_disable=1 as a workaround for now.
>
> Meenakshi, will the patch you mentioned be applied in 3.3?

Cc-ing her.  The patch currently seems to be part of the wireless-next
tree but not davem's net tree.

> Below is a syslog excerpt from around the time of failure. It seems to
> support Meenakshi's suggestion that it's related to the queue getting
> stuck.

Well, that can be tested.  Could you try the patch against current
"master"?  It works like this:

 0. Prerequisites:

	apt-get install git build-essential

 1. Get the kernel history, if you don't already have it:

	git clone \
	  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

 2. Configure and build:

	cd linux
	git checkout origin/master
	cp /boot/config-$(uname -r) .config; # current configuration
	make localmodconfig; # optional: minimize configuration
	make deb-pkg; # optionally with -j<num> for parallel build
	dpkg -i ../<name of package>; # as root
	reboot
	... test test test ...

 3. Hopefully it reproduces the problem.  So try the attached patch:

	git am -3sc <the patch>
	make deb-pkg; # maybe with -j4
	dpkg -i ../<name of package>; # as root
	reboot

If it works, we can pass this to Dave with information about what
happened and your test result, to get the patch fast-tracked.

Thanks,
Jonathan

> Below is a syslog excerpt from around the time of failure. It seems to
> support Meenakshi's suggestion that it's related to the queue getting
> stuck.
[...]
> iwlwifi 0000:02:00.0: Queue 4 stuck for 2000 ms.
> iwlwifi 0000:02:00.0: Current read_ptr 112 write_ptr 115
> iwlwifi 0000:02:00.0: On demand firmware reload
> iwlwifi 0000:02:00.0: Command REPLY_QOS_PARAM failed: FW Error
> iwlwifi 0000:02:00.0: Failed to update QoS
> iwlwifi 0000:02:00.0: fw recovery, no hcmd send
> iwlwifi 0000:02:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> iwlwifi 0000:02:00.0: Error clearing ASSOC_MSK on BSS (-5)
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
[...]
> ieee80211 phy0: Hardware restart was requested
> wpa_supplicant[1472]: CTRL-EVENT-DISCONNECTED bssid=00:50:7f:cb:4b:58 reason=4
> ieee80211 phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-2)
[....]
> iwlwifi 0000:02:00.0: Could not load the INST uCode section
> iwlwifi 0000:02:00.0: Failed to start RT ucode: -110
[...]
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
[...]
> I get some kind of OOPS but I'm guessing this is just because the driver can't
> communicate with the card when the module is being unloaded:
[...]
> WARNING: at
> /build/buildd-linux-2.6_3.2.9-1-amd64-KTPapN/linux-2.6-3.2.9/debian/build/source_amd64_none/drivers/net/wireless/iwlwifi/iwl-core.c:1330
> iwlagn_mac_remove_interface+0x48/0xdd [iwlwifi]()
> Hardware name: 3249CTO
> Modules linked in: uvcvideo videodev v4l2_compat_ioctl32 media snd_usb_audio
> snd_usbmidi_lib pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O)
> acpi_cpufreq mperf cpufreq_stats cpufreq_userspace cpu
> Mar 12 13:15:04 localhost kernel: sync_memcpy async_tx raid1 raid0 multipath
> linear md_mod sd_mod crc_t10dif usbhid hid ahci libahci ehci_hcd libata
> scsi_mod usbcore thermal thermal_sys usb_common e1000e [last unloaded:
> scsi_wait_scan]
> Mar 12 13:15:04 localhost kernel: [48290.674508] Pid: 1405, comm:
> NetworkManager Tainted: G O 3.2.0-2-amd64 #1
> Mar 12 13:15:04 localhost kernel: [48290.674511] Call Trace:
> Mar 12 13:15:04 localhost kernel: [48290.674520] [<ffffffff81046879>] ?
> warn_slowpath_common+0x78/0x8c
> Mar 12 13:15:04 localhost kernel: [48290.674531] [<ffffffffa03ea9af>] ?
> iwlagn_mac_remove_interface+0x48/0xdd [iwlwifi]
[...]
> Mar 12 13:15:04 localhost kernel: [48290.674647] [<ffffffff812a35a5>] ?
> netlink_rcv_skb+0x36/0x7a
[...]
> iwlwifi 0000:02:00.0: ctx->vif = (null), vif = ffff8801b1c72df0
> iwlwifi 0000:02:00.0: ID = 0: ctx = ffff8801b1a834b0 ctx->vif =
> (null)

From: Johannes Berg <johannes.b...@intel.com>
Date: Sun, 4 Mar 2012 08:50:46 -0800
Subject: iwlwifi: always monitor for stuck queues

commit 342bbf3fee2fa9a18147e74b2e3c4229a4564912 upstream.

If we only monitor while associated, the following
can happen:
 - we're associated, and the queue stuck check
   runs, setting the queue "touch" time to X
 - we disassociate, stopping the monitoring,
   which leaves the time set to X
 - almost 2s later, we associate, and enqueue
   a frame
 - before the frame is transmitted, we monitor
   for stuck queues, and find the time set to
   X, although it is now later than X + 2000ms,
   so we decide that the queue is stuck and
   erroneously restart the device

It happens more with P2P because there we can
go between associated/unassociated frequently.

Cc: sta...@vger.kernel.org
Reported-by: Ben Cahill <ben.m.cah...@intel.com>
Signed-off-by: Johannes Berg <johannes.b...@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w....@intel.com>
Signed-off-by: John W. Linville <linvi...@tuxdriver.com>
Signed-off-by: Jonathan Nieder <jrnie...@gmail.com>
---
 drivers/net/wireless/iwlwifi/iwl-core.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-core.c b/drivers/net/wireless/iwlwifi/iwl-core.c
index 7bcfa781e0b9..3abe9ede6990 100644
--- a/drivers/net/wireless/iwlwifi/iwl-core.c
+++ b/drivers/net/wireless/iwlwifi/iwl-core.c
@@ -1465,20 +1465,10 @@ void iwl_bg_watchdog(unsigned long data)
 	if (timeout == 0)
 		return;
 
-	/* monitor and check for stuck cmd queue */
-	if (iwl_check_stuck_queue(priv, priv->shrd->cmd_queue))
-		return;
-
-	/* monitor and check for other stuck queues */
-	if (iwl_is_any_associated(priv)) {
-		for (cnt = 0; cnt < hw_params(priv).max_txq_num; cnt++) {
-			/* skip as we already checked the command queue */
-			if (cnt == priv->shrd->cmd_queue)
-				continue;
-			if (iwl_check_stuck_queue(priv, cnt))
-				return;
-		}
-	}
+	/* monitor and check for stuck queues */
+	for (cnt = 0; cnt < hw_params(priv).max_txq_num; cnt++)
+		if (iwl_check_stuck_queue(priv, cnt))
+			return;
 
 	mod_timer(&priv->watchdog, jiffies +
 		  msecs_to_jiffies(IWL_WD_TICK(timeout)));
-- 
1.7.9.2
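
For anyone who wants to see the race from the commit message in isolation, here is a rough user-space sketch of the timeline. It is not the driver code: `struct queue_model`, `check_stuck()` and `replay_timeline()` are hypothetical names made up for illustration, the 500 ms watchdog tick is an assumption, and the model only assumes that an empty queue refreshes its "touch" time while a non-empty queue with a stale touch time counts as stuck. The 2000 ms limit is taken from the "stuck for 2000 ms" message in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Limit taken from the "Queue 4 stuck for 2000 ms" log line. */
#define STUCK_TIMEOUT_MS 2000UL
/* Assumed watchdog period for this model only. */
#define WATCHDOG_TICK_MS 500UL

/* Hypothetical, simplified model of one TX queue. */
struct queue_model {
	unsigned long touch_ms;	/* last time the queue looked healthy */
	unsigned int read_ptr;
	unsigned int write_ptr;
};

/*
 * Modelled stuck check: an empty queue refreshes the touch time,
 * a non-empty queue is reported stuck once the touch time is too old.
 */
bool check_stuck(struct queue_model *q, unsigned long now_ms)
{
	assert(now_ms >= q->touch_ms);	/* time never runs backwards here */

	if (q->read_ptr == q->write_ptr) {
		q->touch_ms = now_ms;
		return false;
	}
	return now_ms - q->touch_ms >= STUCK_TIMEOUT_MS;
}

/*
 * Replays the timeline from the commit message and returns true if the
 * watchdog would (wrongly) decide to restart the device.
 */
bool replay_timeline(bool monitor_only_when_associated)
{
	struct queue_model q = { 0, 0, 0 };
	bool associated = true;
	unsigned long now;

	for (now = 0; now <= 3000; now += WATCHDOG_TICK_MS) {
		if (now == 1000)
			associated = false;	/* disassociate */
		if (now == 3000) {
			associated = true;	/* associate again ... */
			q.write_ptr++;		/* ... and enqueue one frame */
		}

		/* Old behaviour: skip the check while unassociated. */
		if (monitor_only_when_associated && !associated)
			continue;

		if (check_stuck(&q, now))
			return true;
	}
	return false;
}
```

In this model, `replay_timeline(true)` (monitoring only while associated, as before the patch) reports a spurious restart because the touch time was last refreshed before the disassociation, while `replay_timeline(false)` (always monitoring, as with the patch) keeps the touch time fresh and does not.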