Re: [PATCH v1] drm/i915/hwmon: expose fan speed

2024-07-22 Thread Riana Tauro
Hi Raag On 7/12/2024 5:53 PM, Raag Jadav wrote: Add hwmon support for fan1_input attribute, which will expose fan speed in RPM. With this in place we can monitor fan speed using lm-sensors tool. $ sensors i915-pci-0300 Adapter: PCI adapter in0: 653.00 mV fan1:3833 RPM power1:
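
A minimal sketch of how userspace might read the fan speed this patch exposes, assuming the standard hwmon sysfs layout in which `fan1_input` holds the speed in RPM as a decimal integer. The function name `read_fan_rpm` and passing the hwmon directory explicitly are illustrative choices, not part of the patch.

```python
from pathlib import Path
from typing import Optional


def read_fan_rpm(hwmon_dir: str, channel: int = 1) -> Optional[int]:
    """Return the fan speed in RPM from <hwmon_dir>/fan<channel>_input.

    Returns None if the attribute is missing or does not contain an integer.
    """
    attr = Path(hwmon_dir) / f"fan{channel}_input"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        return None
```

On a real system the directory would be one of the entries under /sys/class/hwmon/; locating the one that belongs to the i915 device is left out of this sketch.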

Re: [PATCH v1] drm/i915/hwmon: expose fan speed

2024-07-23 Thread Riana Tauro
On 7/23/2024 3:53 PM, Raag Jadav wrote: On Mon, Jul 22, 2024 at 04:20:51PM +0530, Riana Tauro wrote: Hi Raag On 7/12/2024 5:53 PM, Raag Jadav wrote: Add hwmon support for fan1_input attribute, which will expose fan speed in RPM. With this in place we can monitor fan speed using lm-sensors

Re: [PATCH v2] drm/i915/hwmon: expose fan speed

2024-08-06 Thread Riana Tauro
: N/A (max = 43.00 W) energy1: 32.02 kJ v2: - Add mutex protection - Handle overflow - Add ABI documentation - Aesthetic adjustments Signed-off-by: Raag Jadav Add the name in front of the comments given by reviewer in version history With that Reviewed-by: Riana Tauro

Re: [PATCH 4/4] drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors

2025-06-03 Thread Riana Tauro
Hi Himal On 6/3/2025 3:01 PM, Ghimiray, Himal Prasad wrote: On 03-06-2025 13:44, Riana Tauro wrote: Add support to handle CSC firmware reported errors. When CSC firmware errors are encountered, an error interrupt is received by the GFX device as an MSI interrupt. Device Source control

Re: [PATCH 1/4] drm: Add a firmware flash method to device wedged uevent

2025-06-05 Thread Riana Tauro
Hi Raag On 6/4/2025 4:13 PM, Raag Jadav wrote: On Tue, Jun 03, 2025 at 01:43:57PM +0530, Riana Tauro wrote: A device is declared wedged when it is non-recoverable from the driver context. Some firmware errors can also cause the device to enter this state and the only method to recover from

Re: [PATCH 2/4] drm/xe: Add a helper function to set recovery method

2025-06-19 Thread Riana Tauro
Hi Raag Thank you for the review comments On 6/6/2025 8:42 PM, Raag Jadav wrote: On Tue, Jun 03, 2025 at 01:43:58PM +0530, Riana Tauro wrote: Add a helper function to set recovery method. The recovery method has to be set before declaring the device wedged and sending the drm wedged uevent

Re: [PATCH v2 1/5] drm: Add a firmware flash method to device wedged uevent

2025-06-24 Thread Riana Tauro
Hi Christian On 6/24/2025 5:56 PM, Christian König wrote: On 23.06.25 12:01, Riana Tauro wrote: A device is declared wedged when it is non-recoverable from the driver context. Well, not quite. I took this from the below document. Should it be changed? https://www.kernel.org/doc/html/v6.16

[PATCH 0/4] Handle Firmware reported Hardware Errors

2025-06-03 Thread Riana Tauro
: 50875, 53073, 53074, 53075, 53076 Riana Tauro (4): drm: Add a firmware flash method to device wedged uevent drm/xe: Add a helper function to set recovery method drm/xe: Add support to handle hardware errors drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors Documentation/gpu

[PATCH 1/4] drm: Add a firmware flash method to device wedged uevent

2025-06-03 Thread Riana Tauro
A device is declared wedged when it is non-recoverable from the driver context. Some firmware errors can also cause the device to enter this state and the only method to recover from this would be to do a firmware flash. Signed-off-by: Riana Tauro --- Documentation/gpu/drm-uapi.rst | 6

[PATCH 3/4] drm/xe: Add support to handle hardware errors

2025-06-03 Thread Riana Tauro
warm reset Add basic support to handle these errors Bspec: 50875, 53073, 53074, 53075, 53076 Co-developed-by: Himal Prasad Ghimiray Signed-off-by: Himal Prasad Ghimiray Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/Makefile| 1 + drivers/gpu/drm/xe/regs/xe_hw_error_regs.h

[PATCH 2/4] drm/xe: Add a helper function to set recovery method

2025-06-03 Thread Riana Tauro
Add a helper function to set recovery method. The recovery method has to be set before declaring the device wedged and sending the drm wedged uevent. If no method is set, default unbind/re-bind method will be set Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_device.c | 30
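
The ordering described above can be modelled in a few lines. This is a Python sketch of the logic only, not the driver's C code: the `Device` class, the function names, and the method strings are hypothetical stand-ins, and the `WEDGED=<method>` string format is assumed from the DRM uAPI documentation.

```python
from typing import List, Optional

# Illustrative method name; the real driver uses its own constants.
RECOVERY_REBIND = "rebind"


class Device:
    """Toy model of a device that can be declared wedged."""

    def __init__(self) -> None:
        self.recovery_method: Optional[str] = None
        self.wedged = False
        self.uevents: List[str] = []


def set_recovery_method(dev: Device, method: str) -> None:
    """Record the recovery method; must be called before declare_wedged()."""
    dev.recovery_method = method


def declare_wedged(dev: Device) -> None:
    """Mark the device wedged, then notify userspace."""
    if dev.recovery_method is None:
        # No method was chosen, so fall back to unbind/re-bind.
        dev.recovery_method = RECOVERY_REBIND
    dev.wedged = True
    # The uevent is sent last so that userspace sees a consistent state.
    dev.uevents.append(f"WEDGED={dev.recovery_method}")
```

A caller that needs a different method calls `set_recovery_method()` first; a caller that does not care simply calls `declare_wedged()` and gets the default.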

[PATCH 4/4] drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors

2025-06-03 Thread Riana Tauro
and userspace is notified with a drm uevent Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/regs/xe_gsc_regs.h | 2 + drivers/gpu/drm/xe/regs/xe_hw_error_regs.h | 7 ++- drivers/gpu/drm/xe/xe_device_types.h | 3 + drivers/gpu/drm/xe/xe_hw_error.c | 65

Re: [PATCH v5 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler

2025-07-15 Thread Riana Tauro
On 7/15/2025 10:28 PM, Summers, Stuart wrote: On Tue, 2025-07-15 at 22:09 +0530, Riana Tauro wrote: Hi Stuart On 7/15/2025 7:40 PM, Summers, Stuart wrote: On Tue, 2025-07-15 at 16:17 +0530, Riana Tauro wrote: Add a debugfs fault handler to trigger csc error handler that wedges the device

[PATCH v5 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-15 Thread Riana Tauro
ls to commit message (Sima, Rodrigo, Raag) add an example to the documentation (by Raag) Cc: André Almeida Cc: Christian König Cc: David Airlie Co-developed-by: Raag Jadav Signed-off-by: Raag Jadav Signed-off-by: Riana Tauro --- Documentation/gpu/drm-uapi.rst | 41 +

[PATCH v5 0/9] Handle Firmware reported Hardware Errors

2025-07-15 Thread Riana Tauro
able runtime survivability mode when csc errors are reported Rev4: refactor survivability code Rev5: Add more documentation add user friendly logs remove checks for BMG if not necessary fix other review comments Riana Tauro (9): drm: Add a vendor-specific recovery method to

[PATCH v5 2/9] drm/xe: Set GT as wedged before sending wedged uevent

2025-07-15 Thread Riana Tauro
Userspace should be notified after setting the device as wedged. Re-order function calls to set gt wedged before sending uevent. Cc: Matthew Brost Suggested-by: Raag Jadav Signed-off-by: Riana Tauro Reviewed-by: Matthew Brost --- drivers/gpu/drm/xe/xe_device.c | 12 1 file

[PATCH v5 4/9] drm/xe/xe_survivability: Refactor survivability mode

2025-07-15 Thread Riana Tauro
The patches in these series refactor the boot survivability code to allow adding runtime survivability Refactor existing code to separate both the modes This patch renames the functions and separates init and enable Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_device.c

[PATCH v5 3/9] drm/xe: Add a helper function to set recovery method

2025-07-15 Thread Riana Tauro
Add a helper function to set recovery method. The recovery method has to be set before declaring the device wedged and sending the drm wedged uevent. If no method is set, default unbind/re-bind method will be set Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_device.c | 26

[PATCH v5 7/9] drm/xe: Add support to handle hardware errors

2025-07-15 Thread Riana Tauro
Signed-off-by: Riana Tauro Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/xe/Makefile| 1 + drivers/gpu/drm/xe/regs/xe_hw_error_regs.h | 15 +++ drivers/gpu/drm/xe/regs/xe_irq_regs.h | 1 + drivers/gpu/drm/xe/xe_hw_error.c | 106

[PATCH v5 8/9] drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors

2025-07-15 Thread Riana Tauro
vendor recovery method with runtime survivability (Christian, Rodrigo, Raag) v3: move declare wedged to runtime survivability mode (Rodrigo) v4: update commit message Signed-off-by: Riana Tauro Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/xe/regs/xe_gsc_regs.h | 2 + drivers

[PATCH v5 6/9] drm/xe/doc: Document device wedged and runtime survivability

2025-07-15 Thread Riana Tauro
Add documentation for vendor specific device wedged recovery method and runtime survivability. v2: fix documentation (Raag) v3: add userspace tool for firmware update (Raag) Signed-off-by: Riana Tauro --- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe_device.rst

[PATCH v5 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler

2025-07-15 Thread Riana Tauro
Add a debugfs fault handler to trigger csc error handler that wedges the device and sends drm uevent v2: add debugfs only for bmg (Umesh) Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_debugfs.c | 3 +++ drivers/gpu/drm/xe/xe_hw_error.c | 11 +++ 2 files changed, 14 insertions

[PATCH v5 5/9] drm/xe/xe_survivability: Add support for Runtime survivability mode

2025-07-15 Thread Riana Tauro
that device is in survivability mode /sys/bus/pci/devices//survivability_mode v2: Fix kernel-doc (Umesh) v3: Add user friendly dmesg (Frank) Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_survivability_mode.c| 43 ++- drivers/gpu/drm/xe/xe_survivability_mode.h| 1

Re: [PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-09 Thread Riana Tauro
Hi Sima On 7/9/2025 7:11 PM, Simona Vetter wrote: On Wed, Jul 09, 2025 at 04:50:13PM +0530, Riana Tauro wrote: Certain errors can cause the device to be wedged and may require a vendor specific recovery method to restore normal operation. Add a recovery method 'WEDGED=vendor-specific
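
A minimal sketch of how a userspace consumer might interpret the recovery method carried by the wedged uevent, assuming the event environment is already available as a dictionary and that `WEDGED` holds a comma-separated list of method names as described in the DRM uAPI documentation. The helper names `parse_wedged` and `needs_vendor_tool` are hypothetical.

```python
from typing import Dict, List


def parse_wedged(env: Dict[str, str]) -> List[str]:
    """Return the recovery methods listed in a uevent's WEDGED property.

    An event without a WEDGED key yields an empty list.
    """
    value = env.get("WEDGED")
    if value is None:
        return []
    return [method for method in value.split(",") if method]


def needs_vendor_tool(env: Dict[str, str]) -> bool:
    """True if the driver asks for a vendor-specific recovery procedure."""
    return "vendor-specific" in parse_wedged(env)
```

What the consumer does once `needs_vendor_tool()` returns true depends on the vendor; this sketch only covers the parsing step.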

[PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-09 Thread Riana Tauro
umentation (Raag) Cc: André Almeida Cc: Christian König Cc: David Airlie Cc: Suggested-by: Raag Jadav Signed-off-by: Riana Tauro --- Documentation/gpu/drm-uapi.rst | 9 + drivers/gpu/drm/drm_drv.c | 2 ++ include/drm/drm_device.h | 4 3 files changed, 11 insertions(+), 4

Re: [PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-10 Thread Riana Tauro
at 12:52:05PM -0400, Rodrigo Vivi wrote: On Wed, Jul 09, 2025 at 05:18:54PM +0300, Raag Jadav wrote: On Wed, Jul 09, 2025 at 04:09:20PM +0200, Christian König wrote: On 09.07.25 15:41, Simona Vetter wrote: On Wed, Jul 09, 2025 at 04:50:13PM +0530, Riana Tauro wrote: Certain errors can cause the

Re: [PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-13 Thread Riana Tauro
, 2025 at 04:09:20PM +0200, Christian König wrote: On 09.07.25 15:41, Simona Vetter wrote: On Wed, Jul 09, 2025 at 04:50:13PM +0530, Riana Tauro wrote: Certain errors can cause the device to be wedged and may require a vendor specific recovery method to restore normal operation. Add a recovery method

Re: [PATCH v3 1/7] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-02 Thread Riana Tauro
On 7/3/2025 12:10 PM, Raag Jadav wrote: On Thu, Jul 03, 2025 at 10:50:53AM +0530, Riana Tauro wrote: On 7/3/2025 9:36 AM, Raag Jadav wrote: On Wed, Jul 02, 2025 at 07:41:11PM +0530, Riana Tauro wrote: Certain errors can cause the device to be wedged and may require a vendor specific

Re: [PATCH v3 1/7] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-02 Thread Riana Tauro
On 7/3/2025 9:36 AM, Raag Jadav wrote: On Wed, Jul 02, 2025 at 07:41:11PM +0530, Riana Tauro wrote: Certain errors can cause the device to be wedged and may require a vendor specific recovery method to restore normal operation. Add a recovery method 'WEDGED=vendor-specific' for s

[PATCH v3 1/7] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-02 Thread Riana Tauro
lmeida Cc: Christian König Cc: David Airlie Cc: Suggested-by: Raag Jadav Signed-off-by: Riana Tauro --- Documentation/gpu/drm-uapi.rst | 5 - drivers/gpu/drm/drm_drv.c | 2 ++ include/drm/drm_device.h | 4 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/Documen

Re: [PATCH v2 1/5] drm: Add a firmware flash method to device wedged uevent

2025-07-01 Thread Riana Tauro
On 7/1/2025 9:32 PM, Raag Jadav wrote: On Tue, Jul 01, 2025 at 04:35:42PM +0200, Christian König wrote: On 01.07.25 16:23, Raag Jadav wrote: On Tue, Jul 01, 2025 at 05:11:24PM +0530, Riana Tauro wrote: On 7/1/2025 5:07 PM, Riana Tauro wrote: On 6/30/2025 11:03 PM, Rodrigo Vivi wrote: On

Re: [PATCH v5 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler

2025-07-15 Thread Riana Tauro
Hi Stuart On 7/15/2025 7:40 PM, Summers, Stuart wrote: On Tue, 2025-07-15 at 16:17 +0530, Riana Tauro wrote: Add a debugfs fault handler to trigger csc error handler that wedges the device and sends drm uevent v2: add debugfs only for bmg (Umesh) Signed-off-by: Riana Tauro ---  drivers/gpu

Re: [PATCH v5 7/9] drm/xe: Add support to handle hardware errors

2025-07-15 Thread Riana Tauro
Hi Stuart On 7/15/2025 7:38 PM, Summers, Stuart wrote: On Tue, 2025-07-15 at 16:17 +0530, Riana Tauro wrote: Gfx device reports two classes of errors: uncorrectable and correctable. Depending on the severity, uncorrectable errors are further classified as Non-Fatal and Fatal. Correctable and Non

Re: [PATCH v2 1/5] drm: Add a firmware flash method to device wedged uevent

2025-07-01 Thread Riana Tauro
Hi Rodrigo/Christian On 6/30/2025 11:03 PM, Rodrigo Vivi wrote: On Mon, Jun 30, 2025 at 10:29:10AM +0200, Christian König wrote: On 27.06.25 23:38, Rodrigo Vivi wrote: Or at least print a big warning into the system log? I mean a firmware update is usually something which the system administr

Re: [PATCH v2 1/5] drm: Add a firmware flash method to device wedged uevent

2025-07-01 Thread Riana Tauro
On 7/1/2025 5:07 PM, Riana Tauro wrote: Hi Rodrigo/Christian On 6/30/2025 11:03 PM, Rodrigo Vivi wrote: On Mon, Jun 30, 2025 at 10:29:10AM +0200, Christian König wrote: On 27.06.25 23:38, Rodrigo Vivi wrote: Or at least print a big warning into the system log? I mean a firmware update is

Re: [PATCH v5 4/9] drm/xe/xe_survivability: Refactor survivability mode

2025-07-23 Thread Riana Tauro
On 7/23/2025 7:30 PM, Raag Jadav wrote: On Tue, Jul 15, 2025 at 04:17:24PM +0530, Riana Tauro wrote: The patches in these series refactor the boot survivability code to allow adding runtime survivability Refactor existing code to separate both the modes Punctuations please! This patch

Re: [PATCH v5 5/9] drm/xe/xe_survivability: Add support for Runtime survivability mode

2025-07-23 Thread Riana Tauro
On 7/23/2025 7:38 PM, Raag Jadav wrote: On Tue, Jul 15, 2025 at 04:17:25PM +0530, Riana Tauro wrote: Certain runtime firmware errors can cause the device to be in an unusable state requiring a firmware flash to restore normal operation. Runtime Survivability Mode indicates firmware flash is

Re: [PATCH v5 6/9] drm/xe/doc: Document device wedged and runtime survivability

2025-07-23 Thread Riana Tauro
On 7/23/2025 7:04 PM, Raag Jadav wrote: On Tue, Jul 15, 2025 at 04:17:26PM +0530, Riana Tauro wrote: Add documentation for vendor specific device wedged recovery method and runtime survivability. ... diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index

[PATCH v6 7/9] drm/xe: Add support to handle hardware errors

2025-07-24 Thread Riana Tauro
maintained through SBR Add basic support to handle these errors Bspec: 50875, 53073, 53074, 53075, 53076 v2: Format commit message (Umesh) v3: fix documentation (Stuart) Cc: Stuart Summers Co-developed-by: Himal Prasad Ghimiray Signed-off-by: Himal Prasad Ghimiray Signed-off-by: Riana Tauro

[PATCH v6 8/9] drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors

2025-07-24 Thread Riana Tauro
: use vendor recovery method with runtime survivability (Christian, Rodrigo, Raag) v3: move declare wedged to runtime survivability mode (Rodrigo) v4: update commit message Signed-off-by: Riana Tauro Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/xe/regs/xe_gsc_regs.h | 2

[PATCH v6 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler

2025-07-24 Thread Riana Tauro
Add a debugfs fault handler to trigger csc error handler that wedges the device and enables runtime survivability mode v2: add debugfs only for bmg (Umesh) v3: do not use csc_fault attribute if debugfs is not enabled Cc: Lucas De Marchi Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe

[PATCH v6 6/9] drm/xe/doc: Document device wedged and runtime survivability

2025-07-24 Thread Riana Tauro
Add documentation for vendor specific device wedged recovery method and runtime survivability. v2: fix documentation (Raag) v3: add userspace tool for firmware update (Raag) v4: use consistent documentation (Raag) Signed-off-by: Riana Tauro --- Documentation/gpu/xe/index.rst | 1

[PATCH v6 4/9] drm/xe/xe_survivability: Refactor survivability mode

2025-07-24 Thread Riana Tauro
The patches in these series refactor the boot survivability code to allow adding runtime survivability. Refactor existing code to separate both the modes This patch renames the functions and separates init and enable. Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_device.c

[PATCH v6 5/9] drm/xe/xe_survivability: Add support for Runtime survivability mode

2025-07-24 Thread Riana Tauro
that device is in survivability mode /sys/bus/pci/devices//survivability_mode v2: Fix kernel-doc (Umesh) v3: Add user friendly dmesg (Frank) Signed-off-by: Riana Tauro --- drivers/gpu/drm/xe/xe_survivability_mode.c| 43 ++- drivers/gpu/drm/xe/xe_survivability_mode.h| 1

[PATCH v6 0/9] Handle Firmware reported Hardware Errors

2025-07-24 Thread Riana Tauro
mode when csc errors are reported Rev4: refactor survivability code Rev5: Add more documentation add user friendly logs remove checks for BMG if not necessary fix other review comments Rev6: Use consistent words revert to include BMG checks Riana Tauro (9): drm: Add a vendo

[PATCH v6 3/9] drm/xe: Add a helper function to set recovery method

2025-07-24 Thread Riana Tauro
Add a helper function to set recovery method. The recovery method has to be set before declaring the device wedged and sending the drm wedged uevent. If no method is set, default unbind/re-bind method will be set v2: fix documentation (Raag) Signed-off-by: Riana Tauro Reviewed-by: Raag Jadav

[PATCH v6 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-24 Thread Riana Tauro
ore details to commit message (Sima, Rodrigo, Raag) add an example script to the documentation (Raag) v4: use consistent naming (Raag) Cc: André Almeida Cc: Christian König Cc: David Airlie Co-developed-by: Raag Jadav Signed-off-by: Raag Jadav Signed-off-by: Ria

[PATCH v6 2/9] drm/xe: Set GT as wedged before sending wedged uevent

2025-07-24 Thread Riana Tauro
Userspace should be notified after setting the device as wedged. Re-order function calls to set gt wedged before sending uevent. Cc: Matthew Brost Suggested-by: Raag Jadav Signed-off-by: Riana Tauro Reviewed-by: Matthew Brost --- drivers/gpu/drm/xe/xe_device.c | 12 1 file

Re: [PATCH v6 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-24 Thread Riana Tauro
On 7/24/2025 9:48 PM, Rodrigo Vivi wrote: On Thu, Jul 24, 2025 at 08:04:30PM +0530, Riana Tauro wrote: This patch addresses the need for a recovery method (firmware flash on Firmware errors) introduced in the later patches of Xe KMD. Whenever XE KMD detects a firmware error, a drm device

[PATCH v7 0/9] Handle Firmware reported Hardware Errors

2025-07-28 Thread Riana Tauro
mode when csc errors are reported Rev4: refactor survivability code Rev5: Add more documentation add user friendly logs remove checks for BMG if not necessary fix other review comments Rev6: Use consistent words revert to include BMG checks Rev7: fix cosmetic changes Riana

[PATCH v7 6/9] drm/xe/doc: Document device wedged and runtime survivability

2025-07-28 Thread Riana Tauro
Add documentation for vendor specific device wedged recovery method and runtime survivability. v2: fix documentation (Raag) v3: add userspace tool for firmware update (Raag) v4: use consistent documentation (Raag) Signed-off-by: Riana Tauro Reviewed-by: Rodrigo Vivi Reviewed-by: Raag Jadav

[PATCH v7 3/9] drm/xe: Add a helper function to set recovery method

2025-07-28 Thread Riana Tauro
Add a helper function to set recovery method. The recovery method can be set before declaring the device wedged and sending the drm wedged uevent. If no method is set, default unbind/re-bind method will be set. v2: fix documentation (Raag) Signed-off-by: Riana Tauro Reviewed-by: Raag Jadav

[PATCH v7 4/9] drm/xe/xe_survivability: Refactor survivability mode

2025-07-28 Thread Riana Tauro
Refactor survivability mode code to support both boot and runtime survivability. Signed-off-by: Riana Tauro Reviewed-by: Raag Jadav --- drivers/gpu/drm/xe/xe_device.c| 2 +- drivers/gpu/drm/xe/xe_heci_gsc.c | 2 +- drivers/gpu/drm/xe/xe_pci.c

[PATCH v7 2/9] drm/xe: Set GT as wedged before sending wedged uevent

2025-07-28 Thread Riana Tauro
Userspace should be notified after setting the device as wedged. Re-order function calls to set gt wedged before sending uevent. Cc: Matthew Brost Suggested-by: Raag Jadav Signed-off-by: Riana Tauro Reviewed-by: Matthew Brost --- drivers/gpu/drm/xe/xe_device.c | 12 1 file

[PATCH v7 5/9] drm/xe/xe_survivability: Add support for Runtime survivability mode

2025-07-28 Thread Riana Tauro
that device is in survivability mode /sys/bus/pci/devices//survivability_mode v2: Fix kernel-doc (Umesh) v3: Add user friendly dmesg (Frank) Signed-off-by: Riana Tauro Reviewed-by: Raag Jadav --- drivers/gpu/drm/xe/xe_survivability_mode.c| 43 ++- drivers/gpu/drm/xe

[PATCH v7 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-28 Thread Riana Tauro
add an example script to the documentation (Raag) v4: use consistent naming (Raag) v5: fix commit message Cc: André Almeida Cc: Christian König Cc: David Airlie Cc: Simona Vetter Co-developed-by: Raag Jadav Signed-off-by: Raag Jadav Signed-off-by: Riana Tauro Reviewed-by: Rodr

[PATCH v7 7/9] drm/xe: Add support to handle hardware errors

2025-07-28 Thread Riana Tauro
maintained through SBR. Add basic support to handle these errors. Bspec: 50875, 53073, 53074, 53075, 53076 v2: Format commit message (Umesh) v3: fix documentation (Stuart) Cc: Stuart Summers Co-developed-by: Himal Prasad Ghimiray Signed-off-by: Himal Prasad Ghimiray Signed-off-by: Riana Tauro

[PATCH v7 8/9] drm/xe/xe_hw_error: Handle CSC Firmware reported Hardware errors

2025-07-28 Thread Riana Tauro
: use vendor recovery method with runtime survivability (Christian, Rodrigo, Raag) v3: move declare wedged to runtime survivability mode (Rodrigo) v4: update commit message Signed-off-by: Riana Tauro Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/xe/regs/xe_gsc_regs.h | 2

[PATCH v7 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler

2025-07-28 Thread Riana Tauro
Add a debugfs fault handler to trigger csc error handler that wedges the device and enables runtime survivability mode. v2: add debugfs only for bmg (Umesh) v3: do not use csc_fault attribute if debugfs is not enabled Cc: Lucas De Marchi Signed-off-by: Riana Tauro Reviewed-by: Raag Jadav

Re: [PATCH v7 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-31 Thread Riana Tauro
On 7/31/2025 6:31 PM, Maxime Ripard wrote: On Thu, Jul 31, 2025 at 04:43:46PM +0530, Riana Tauro wrote: Hi Maxime On 7/31/2025 3:02 PM, Maxime Ripard wrote: Hi, On Wed, Jul 30, 2025 at 07:33:01PM +0530, Riana Tauro wrote: On 7/28/2025 3:57 PM, Riana Tauro wrote: Address the need for a

Re: [PATCH v7 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-31 Thread Riana Tauro
Hi Maxime On 7/31/2025 3:02 PM, Maxime Ripard wrote: Hi, On Wed, Jul 30, 2025 at 07:33:01PM +0530, Riana Tauro wrote: On 7/28/2025 3:57 PM, Riana Tauro wrote: Address the need for a recovery method (firmware flash on Firmware errors) introduced in the later patches of Xe KMD. Whenever XE KMD

Re: [PATCH v7 1/9] drm: Add a vendor-specific recovery method to drm device wedged uevent

2025-07-30 Thread Riana Tauro
On 7/28/2025 3:57 PM, Riana Tauro wrote: Address the need for a recovery method (firmware flash on Firmware errors) introduced in the later patches of Xe KMD. Whenever XE KMD detects a firmware error, a firmware flash is required to recover the device to normal operation. The initial