Hi Chris,

I'm not aware of how ROCM SMI using feature nodes. but not all the sysfs are 
intended to be used by upper level apps/libs. 

There are bunches of sysfs entries that have multiple line value. The most 
complicate one would be pp_power_profile_mode, which looks like. 

0 BOOTUP_DEFAULT*:
                    0(       GFXCLK)       0       0       1       0       4    
 800 4587520  -65536       0
                    1(       SOCCLK)       0       0       1       0       4    
 800  327680   -6553       0
                    2(         UCLK)       0       0       1       0       4    
 800  327680  -65536       0
.......
1 3D_FULL_SCREEN :
                    0(       GFXCLK)       0       1       1       0       4    
 800 4587520  -65536       0
                    1(       SOCCLK)       0       1       4     850       4    
 800  327680  -65536       0

Regards,
Hawking
-----Original Message-----
From: Christian König <ckoenig.leichtzumer...@gmail.com> 
Sent: 2019年8月8日 22:25
To: Zhang, Hawking <hawking.zh...@amd.com>; Russell, Kent 
<kent.russ...@amd.com>; Zhou1, Tao <tao.zh...@amd.com>; 
amd-gfx@lists.freedesktop.org; Pan, Xinhui <xinhui....@amd.com>; Freehill, 
Chris <chris.freeh...@amd.com>
Subject: Re: [PATCH] drm/amdgpu: update ras sysfs feature info

Hi Hawking,

looks like you skipped my response.

Even the current way how sysfs is used in the ras code is a clear NO-GO and 
should be fixed before this is pushed upstream.

A sysfs entry should seriously NOT return a multi line value which needs to be 
extensively parsed by the application.

Regards,
Christian.

Am 08.08.19 um 15:50 schrieb Zhang, Hawking:
> Understood and agree we should keep stable interfaces.
>
> But the information in feature node is not correct and makes people 
> confusing. Basically, each IP blocks can support all the four error types, 
> not just multi-uncorrectable. As a result, any upper level apps/libs that 
> read from this file will just get confusing information as well. The feature 
> mask is already good enough for this node.
>
> Regards,
> Hawking
> -----Original Message-----
> From: Russell, Kent <kent.russ...@amd.com>
> Sent: 2019年8月8日 20:51
> To: Zhang, Hawking <hawking.zh...@amd.com>; Zhou1, Tao 
> <tao.zh...@amd.com>; amd-gfx@lists.freedesktop.org; Pan, Xinhui 
> <xinhui....@amd.com>; Freehill, Chris <chris.freeh...@amd.com>
> Cc: Zhou1, Tao <tao.zh...@amd.com>
> Subject: RE: [PATCH] drm/amdgpu: update ras sysfs feature info
>
> +Chris Freehill
>
> While I can understand this change, this broke our SMI interface, which was 
> expecting a specific string format for the ras/features file. This has 
> happened a few times now, where changes to the RAS sysfs files has broke the 
> SMI CLI and/or SMI LIB. Can we please get a stable interface and sysfs format 
> set up before publishing patches? This is creating a lot of extra work for 
> developers with the SMI to constantly keep up with the changes being made to 
> sysfs files. Thank you.
>
>   Kent
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of 
> Zhang, Hawking
> Sent: Monday, August 5, 2019 4:15 AM
> To: Zhou1, Tao <tao.zh...@amd.com>; amd-gfx@lists.freedesktop.org; 
> Pan, Xinhui <xinhui....@amd.com>
> Cc: Zhou1, Tao <tao.zh...@amd.com>
> Subject: RE: [PATCH] drm/amdgpu: update ras sysfs feature info
>
> Reviewed-by: Hawking Zhang <hawking.zh...@amd.com>
>
> Regards,
> Hawking
> -----Original Message-----
> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Tao 
> Zhou
> Sent: 2019年8月5日 16:04
> To: amd-gfx@lists.freedesktop.org; Pan, Xinhui <xinhui....@amd.com>; 
> Zhang, Hawking <hawking.zh...@amd.com>
> Cc: Zhou1, Tao <tao.zh...@amd.com>
> Subject: [PATCH] drm/amdgpu: update ras sysfs feature info
>
> remove confused ras error type info
>
> Signed-off-by: Tao Zhou <tao.zh...@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 17 +++++------------
>   1 file changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index d2e8a85f6e38..369651247b23 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -787,25 +787,18 @@ static ssize_t amdgpu_ras_sysfs_features_read(struct 
> device *dev,
>       struct amdgpu_device *adev = ddev->dev_private;
>       struct ras_common_if head;
>       int ras_block_count = AMDGPU_RAS_BLOCK_COUNT;
> -     int i;
> +     int i, enabled;
>       ssize_t s;
> -     struct ras_manager *obj;
>   
>       s = scnprintf(buf, PAGE_SIZE, "feature mask: 0x%x\n", 
> con->features);
>   
>       for (i = 0; i < ras_block_count; i++) {
>               head.block = i;
> +             enabled = amdgpu_ras_is_feature_enabled(adev, &head);
>   
> -             if (amdgpu_ras_is_feature_enabled(adev, &head)) {
> -                     obj = amdgpu_ras_find_obj(adev, &head);
> -                     s += scnprintf(&buf[s], PAGE_SIZE - s,
> -                                     "%s: %s\n",
> -                                     ras_block_str(i),
> -                                     ras_err_str(obj->head.type));
> -             } else
> -                     s += scnprintf(&buf[s], PAGE_SIZE - s,
> -                                     "%s: disabled\n",
> -                                     ras_block_str(i));
> +             s += scnprintf(&buf[s], PAGE_SIZE - s,
> +                             "%s ras feature mask: %s\n",
> +                             ras_block_str(i), enabled?"on":"off");
>       }
>   
>       return s;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to