Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-21 Thread Luis Chamberlain
On Fri, May 22, 2020 at 08:12:59AM +0300, Emmanuel Grumbach wrote: > > > > On Tue, May 19, 2020 at 10:37 PM Emmanuel Grumbach > > wrote: > > > So I believe we already have this uevent, it is the devcoredump. All > > > we need is to add the unique id. > > > > I think there are a few reasons that d

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-21 Thread Emmanuel Grumbach
> > On Tue, May 19, 2020 at 10:37 PM Emmanuel Grumbach > wrote: > > So I believe we already have this uevent, it is the devcoredump. All > > we need is to add the unique id. > > I think there are a few reasons that devcoredump doesn't satisfy what > either Luis or I want. > > 1) it can be disable

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-21 Thread Brian Norris
On Tue, May 19, 2020 at 10:37 PM Emmanuel Grumbach wrote: > So I believe we already have this uevent, it is the devcoredump. All > we need is to add the unique id. I think there are a few reasons that devcoredump doesn't satisfy what either Luis or I want. 1) it can be disabled entirely [1], for

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-20 Thread Andy Shevchenko
On Wed, May 20, 2020 at 8:40 AM Emmanuel Grumbach wrote: > Since I have been involved quite a bit in the firmware debugging > features in iwlwifi, I think I can give a few insights here. > > But before this, we need to understand that there are several sources of > issues: > 1) the firmware may

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-19 Thread Emmanuel Grumbach
Hi all, Since I have been involved quite a bit in the firmware debugging features in iwlwifi, I think I can give a few insights here. But before this, we need to understand that there are several sources of issues: 1) the firmware may crash but the bus is still alive, you can still use the bus

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-19 Thread Brian Norris
Hi Luis, On Tue, May 19, 2020 at 7:02 AM Luis Chamberlain wrote: > On Mon, May 18, 2020 at 06:23:33PM -0700, Brian Norris wrote: > > On Sat, May 16, 2020 at 6:51 AM Johannes Berg > > wrote: > > > In addition, look what we have in iwl_trans_pcie_removal_wk(). If we > > > detect that the device i

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-19 Thread Luis Chamberlain
On Mon, May 18, 2020 at 06:23:33PM -0700, Brian Norris wrote: > On Sat, May 16, 2020 at 6:51 AM Johannes Berg > wrote: > > In addition, look what we have in iwl_trans_pcie_removal_wk(). If we > > detect that the device is really wedged enough that the only way we can > > still try to recover is b

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Brian Norris
On Sat, May 16, 2020 at 6:51 AM Johannes Berg wrote: > In addition, look what we have in iwl_trans_pcie_removal_wk(). If we > detect that the device is really wedged enough that the only way we can > still try to recover is by completely unbinding the driver from it, then > we give userspace a uev

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 03:16:45PM -0700, Jakub Kicinski wrote: > On Mon, 18 May 2020 21:22:02 + Luis Chamberlain wrote: > > Indeed my issue with devlink is that it did not seem generic enough for > > all devices which use firmware and for which firmware can crash. Support > > should not have t

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Jakub Kicinski
On Mon, 18 May 2020 21:22:02 + Luis Chamberlain wrote: > Indeed my issue with devlink is that it did not seem generic enough for > all devices which use firmware and for which firmware can crash. Support > should not have to be "add devlink support" + "now use this new hook", > but rather a ver

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 01:46:43PM -0700, Jakub Kicinski wrote: > On Mon, 18 May 2020 22:41:48 +0200 Johannes Berg wrote: > > On Mon, 2020-05-18 at 13:35 -0700, Jakub Kicinski wrote: > > > It's intended to be a generic netlink channel for configuring devices. > > > > > > All the firmware-related i

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 10:07:49PM +0200, Johannes Berg wrote: > On Mon, 2020-05-18 at 19:59 +, Luis Chamberlain wrote: > > > > Err, no. Those two are most definitely related. Have you looked at (most > > > or some or whatever) staging drivers recently? Those contain all kinds > > > of garbage

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Jakub Kicinski
On Mon, 18 May 2020 22:41:48 +0200 Johannes Berg wrote: > On Mon, 2020-05-18 at 13:35 -0700, Jakub Kicinski wrote: > > It's intended to be a generic netlink channel for configuring devices. > > > > All the firmware-related interfaces have no dependencies on netdevs, > > in fact that's one of the r

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Johannes Berg
On Mon, 2020-05-18 at 13:35 -0700, Jakub Kicinski wrote: > > It's intended to be a generic netlink channel for configuring devices. > > All the firmware-related interfaces have no dependencies on netdevs, > in fact that's one of the reasons we moved to devlink - we don't want > to hold rtnl lock

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Jakub Kicinski
On Mon, 18 May 2020 22:29:53 +0200 Johannes Berg wrote: > On Mon, 2020-05-18 at 13:28 -0700, Jakub Kicinski wrote: > > On Mon, 18 May 2020 21:25:09 +0200 Johannes Berg wrote: > > > It's pretty clear, but even then, first of all I doubt this is the case > > > for many of the places that you've spr

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Johannes Berg
On Mon, 2020-05-18 at 13:28 -0700, Jakub Kicinski wrote: > On Mon, 18 May 2020 21:25:09 +0200 Johannes Berg wrote: > > It's pretty clear, but even then, first of all I doubt this is the case > > for many of the places that you've sprinkled the annotation on, and > > secondly it actually hides usefu

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Jakub Kicinski
On Mon, 18 May 2020 21:25:09 +0200 Johannes Berg wrote: > It's pretty clear, but even then, first of all I doubt this is the case > for many of the places that you've sprinkled the annotation on, and > secondly it actually hides useful information. > > Regardless of the support issue, I think this

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Johannes Berg
On Mon, 2020-05-18 at 19:59 +, Luis Chamberlain wrote: > > Err, no. Those two are most definitely related. Have you looked at (most > > or some or whatever) staging drivers recently? Those contain all kinds > > of garbage that might do whatever with your kernel. > > No, I stay away :) :) >

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 09:25:09PM +0200, Johannes Berg wrote: > On Mon, 2020-05-18 at 19:09 +, Luis Chamberlain wrote: > > > > Unfortunately a "taint" is interpreted by many users as: "your kernel > > > is really F#*D up, you better do something about it right now." > > > Assuming they're pay

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Johannes Berg
On Mon, 2020-05-18 at 19:09 +, Luis Chamberlain wrote: > > Unfortunately a "taint" is interpreted by many users as: "your kernel > > is really F#*D up, you better do something about it right now." > > Assuming they're paying attention at all in the first place of course. > > Taint historicall

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 11:06:27AM -0700, Steve deRosier wrote: > On Mon, May 18, 2020 at 10:19 AM Luis Chamberlain wrote: > > From a support perspective it is a *crystal* clear sign that the device > > and / or device driver may be in a very bad state, in a generic way. > > > > Unfortunately a "

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Steve deRosier
On Mon, May 18, 2020 at 10:19 AM Luis Chamberlain wrote: > > On Mon, May 18, 2020 at 10:15:45AM -0700, Ben Greear wrote: > > > > > > On 05/18/2020 10:09 AM, Luis Chamberlain wrote: > > > On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote: > > > > > > > > > > > > On 05/18/2020 09:51 AM, Lui

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 10:15:45AM -0700, Ben Greear wrote: > > > On 05/18/2020 10:09 AM, Luis Chamberlain wrote: > > On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote: > > > > > > > > > On 05/18/2020 09:51 AM, Luis Chamberlain wrote: > > > > On Sat, May 16, 2020 at 03:24:01PM +0200, J

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Ben Greear
On 05/18/2020 10:09 AM, Luis Chamberlain wrote: On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote: On 05/18/2020 09:51 AM, Luis Chamberlain wrote: On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> module_fi

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote: > > > On 05/18/2020 09:51 AM, Luis Chamberlain wrote: > > On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: > > > On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> > > > module_firmware_crashed > > > > > > You di

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Ben Greear
On 05/18/2020 09:51 AM, Luis Chamberlain wrote: On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> module_firmware_crashed You didn't CC me or the wireless list on the rest of the patches, so I'm replying to a random on

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Sat, May 16, 2020 at 03:50:55PM +0200, Johannes Berg wrote: > On Sat, 2020-05-16 at 15:24 +0200, Johannes Berg wrote: > > > Instead of the kernel taint, IMHO you should provide an annotation in > > sysfs (or somewhere else) for the *struct device* that had its firmware > > crash. Or maybe, if i

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Luis Chamberlain
On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: > On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> > module_firmware_crashed > > You didn't CC me or the wireless list on the rest of the patches, so I'm > replying to a random one, but ... > > What is the point here? > >

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-16 Thread Johannes Berg
On Sat, 2020-05-16 at 15:24 +0200, Johannes Berg wrote: > Instead of the kernel taint, IMHO you should provide an annotation in > sysfs (or somewhere else) for the *struct device* that had its firmware > crash. Or maybe, if it's too complex to walk the entire hierarchy > checking for that, have a

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-16 Thread Johannes Berg
On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> module_firmware_crashed You didn't CC me or the wireless list on the rest of the patches, so I'm replying to a random one, but ... What is the point here? This should in no way affect the integrity of the system/kernel, for most device

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-15 Thread Rafael Aquini
On Fri, May 15, 2020 at 09:28:43PM +, Luis Chamberlain wrote: > This makes use of the new module_firmware_crashed() to help > annotate when firmware for device drivers crash. When firmware > crashes devices can sometimes become unresponsive, and recovery > sometimes requires a driver unload / r

[PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-15 Thread Luis Chamberlain
This makes use of the new module_firmware_crashed() to help annotate when firmware for device drivers crash. When firmware crashes devices can sometimes become unresponsive, and recovery sometimes requires a driver unload / reload and in the worst cases a reboot. Using a taint flag allows us to an