[PATCH] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-02-27 Thread Erwan Velu
code makes clear this is because the driver received a HARDWARE_ERROR/0x3e/0x1, which is a 'logical unit failure'. This patch is just about reporting that fact to help admins relate this event to the offlining. Signed-off-by: Erwan Velu --- drivers/scsi/smartpqi/

Re: [PATCH] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-02-28 Thread Erwan Velu
Hey, That makes me wonder why the 0x3e / 0x2 isn't handled here, aka 3E/02 DZTPROMAEBKVF TIMEOUT ON LOGICAL UNIT. Is it possible for the controller to send this kind of message to the kernel? If so, shouldn't we handle it here? Erwan, On 27/02/2019 at 17:31, Erwan Velu wrote:
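The two codes discussed in this thread can be told apart with a small lookup. The following is a minimal userspace sketch, not the smartpqi driver code: the helper name `describe_hw_error` and the macro `SENSE_KEY_HARDWARE_ERROR` are hypothetical, and only the two ASC 0x3E entries mentioned above are covered.

```c
#include <stddef.h>
#include <stdint.h>

/* SCSI sense key value for HARDWARE ERROR. */
#define SENSE_KEY_HARDWARE_ERROR 0x04

/*
 * Hypothetical helper (not part of the driver): map a sense key / ASC / ASCQ
 * triple to the text of the two ASC 0x3E codes named in the thread.
 * Returns NULL for anything else, so callers can fall back to generic logging.
 */
static const char *describe_hw_error(uint8_t sense_key, uint8_t asc, uint8_t ascq)
{
    if (sense_key != SENSE_KEY_HARDWARE_ERROR || asc != 0x3e)
        return NULL;

    switch (ascq) {
    case 0x01:
        return "LOGICAL UNIT FAILURE";
    case 0x02:
        return "TIMEOUT ON LOGICAL UNIT";
    default:
        return NULL;
    }
}
```

A caller would pass the three values taken from the sense data and log the returned string when it is not NULL. Whether the controller ever reports 0x3e/0x2 is the open question raised in the message above.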

[PATCH v2] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread Erwan Velu
part of the code makes clear this is because the driver received a HARDWARE_ERROR/0x3e/0x1, which is a 'logical unit failure'. This patch is just about reporting the reason behind the offlining to ease the analysis. Signed-off-by: Erwan Velu --- drivers/scsi/smartpqi/smartpqi_init.c | 6

Re: [PATCH] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread Erwan Velu
[...] > Be careful printing errors per-IO; you could get thousands of them if things go bad. The block layer print_req_error() uses printk_ratelimited(KERN_ERR) for that reason, and the SCSI layer scsi_io_completion_action() maintains a ratelimit on its own. > The dev_err_ratelimited

Re: [PATCH v2] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread Erwan Velu
On 01/03/2019 at 16:26, James Bottomley wrote: > [...] > Shouldn't this be a variant of sdev/scmd_printk? Otherwise it tells you what disk in the array terms is the problem but not what device in your actual system is affected. Hey James, My initial take on that was that pqi_take_device_o

Re: [PATCH v2] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread Erwan Velu
On 01/03/2019 at 16:56, James Bottomley wrote: > [...] > I was thinking just > > if (printk_ratelimit()) > scmd_printk(KERN_ERR, scmd, "received 'logical unit failure' from controller for scsi %d:%d:%d:%d\n", ... > > That will give all the necessary information I'm pretty new to this a
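The rate-limiting concern raised in this thread (per-IO errors can flood the log) can be illustrated with a fixed-window limiter. This is a simplified userspace sketch, not the kernel's printk_ratelimit() implementation: the `struct ratelimit` type, its fields, and `ratelimit_allow` are hypothetical names, and time is passed in as a plain tick counter chosen by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: at most `burst` messages per `interval` ticks. */
struct ratelimit {
    uint64_t interval;  /* window length, in caller-defined ticks */
    unsigned burst;     /* messages allowed inside one window */
    uint64_t begin;     /* tick at which the current window started */
    unsigned printed;   /* messages let through in the current window */
    unsigned missed;    /* messages suppressed since creation */
    bool started;       /* false until the first call opens a window */
};

/* Return true if a message may be printed at tick `now`, false to drop it. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
    /* Open a fresh window on first use or once the old one has expired. */
    if (!rl->started || now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
        rl->started = true;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    rl->missed++;
    return false;
}
```

With `interval = 5` and `burst = 2`, two messages pass in each five-tick window and the rest are counted in `missed`, which keeps a failing device from producing thousands of identical lines.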

[PATCH v3] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread Erwan Velu
part of the code makes clear this is because the driver received a HARDWARE_ERROR/0x3e/0x1, which is a 'logical unit failure'. This patch is just about reporting the reason behind the offlining to ease the analysis. Signed-off-by: Erwan Velu --- drivers/scsi/smartpqi/smartpqi_init.c | 6

[PATCH] scsi: megaraid_sas: Reporting evt_detail->code in megasas_decode_evt

2019-04-09 Thread Erwan Velu
When printing a megasas_decode_evt() message, the code member of the evt_detail is not reported. This makes debugging more complicated, as some code paths depend on this value. Reporting the code member makes the context more explicit. Signed-off-by: Erwan Velu --- drivers/scsi/megaraid

Re: [PATCH] scsi: smartpqi: Reporting unhandled SCSI errors

2019-04-10 Thread Erwan Velu
Hi there! Any reactions to this one? I didn't get a single comment. Cheers, Erwan, On Thu, 21 Mar 2019 at 10:49, Erwan Velu wrote: > > When a HARDWARE_ERROR is triggered for asc=0x3e, the current code only considers the case where ascq=0x1. > > Following the http