On 2023-11-13 12:04, Yat Sin, David wrote:

[AMD Official Use Only - General]


*From:* Zhu, James <james....@amd.com>
*Sent:* Monday, November 13, 2023 10:20 AM
*To:* Yat Sin, David <david.yat...@amd.com>; Zhu, James <james....@amd.com>; amd-gfx@lists.freedesktop.org *Cc:* Kuehling, Felix <felix.kuehl...@amd.com>; Greathouse, Joseph <joseph.greatho...@amd.com>
*Subject:* Re: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

On 2023-11-10 14:07, Yat Sin, David wrote:

    [AMD Official Use Only - General]

        -----Original Message-----

        From: Zhu, James<james....@amd.com>  <mailto:james....@amd.com>

        Sent: Friday, November 3, 2023 9:12 AM

        To:amd-gfx@lists.freedesktop.org

        Cc: Kuehling, Felix<felix.kuehl...@amd.com>  
<mailto:felix.kuehl...@amd.com>; Greathouse, Joseph

        <joseph.greatho...@amd.com>  <mailto:joseph.greatho...@amd.com>; Yat Sin, 
David<david.yat...@amd.com>  <mailto:david.yat...@amd.com>; Zhu,

        James<james....@amd.com>  <mailto:james....@amd.com>

        Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

        Enable pc sampling stop.

        Signed-off-by: James Zhu<james....@amd.com>  <mailto:james....@amd.com>

        ---

          drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +++++++++++++++++--

        -

          drivers/gpu/drm/amd/amdkfd/kfd_priv.h        |  2 ++

          2 files changed, 27 insertions(+), 3 deletions(-)

        diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

        b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

        index 33d003ca0093..2c4ac5b4cc4b 100644

        --- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

        +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

        @@ -108,10 +108,32 @@ static int kfd_pc_sample_start(struct

        kfd_process_device *pdd,

               return 0;

          }

        -static int kfd_pc_sample_stop(struct kfd_process_device *pdd)

        +static int kfd_pc_sample_stop(struct kfd_process_device *pdd,

        +                                     struct pc_sampling_entry 
*pcs_entry)

          {

        -     return -EINVAL;

        +     bool pc_sampling_stop = false;

        +

        +     pcs_entry->enabled = false;

        +     mutex_lock(&pdd->dev->pcs_data.mutex);

    For the START/STOP/DESTROY ioctls, I think we can hold 
pdd->dev->pcs_data.mutex through the whole IOCTL. Then we would not have to 
deal with the intermediate states where the START/STOP/DESTROY are happening at the 
same time.

[JZ] pdd->dev->pcs_data.mutex is per device, not per process. In the future, also it will share protection within different pc sampling methods on the same devices. So I don't think a bigger lock here is good idea. [David] I think the CREATE/START/STOP/DESTROY actions are not time critical. So if two processes are using the same GPU, it is ok for amdgpu to block the 2^nd process until amdgpu has fully completed the request from the 1^st process. I think we are making the code more complex for a use-case that would be very rare.

[JZ] IIRC, the init RFC version used bigger lock, and is questioned as an inefficient way,


        +     pdd->dev->pcs_data.hosttrap_entry.base.active_count--;

        +     if (!pdd->dev->pcs_data.hosttrap_entry.base.active_count) {

        +             WRITE_ONCE(pdd->dev-

            pcs_data.hosttrap_entry.base.stop_enable, true);

        +             pc_sampling_stop = true;

        +     }

        +     mutex_unlock(&pdd->dev->pcs_data.mutex);

        +     if (pc_sampling_stop) {

        +             kfd_process_set_trap_pc_sampling_flag(&pdd->qpd,

        +                     pdd->dev-

            pcs_data.hosttrap_entry.base.pc_sample_info.method,

        +false);

        +

        +             mutex_lock(&pdd->dev->pcs_data.mutex);

        +             pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0;

        +             pdd->dev->pcs_data.hosttrap_entry.base.target_wave_slot = 
0;

        +             WRITE_ONCE(pdd->dev-

            pcs_data.hosttrap_entry.base.stop_enable, false);

        +             mutex_unlock(&pdd->dev->pcs_data.mutex);

        +     }

        +

        +     return 0;

          }

          static int kfd_pc_sample_create(struct kfd_process_device *pdd, @@ 
-251,7

        +273,7 @@ int kfd_pc_sample(struct kfd_process_device *pdd,

                       if (!pcs_entry->enabled)

                               return -EALREADY;

                       else

        -                     return kfd_pc_sample_stop(pdd);

        +                     return kfd_pc_sample_stop(pdd, pcs_entry);

               }

               return -EINVAL;

        diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

        b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

        index 613910e0d440..badaa4d68cc4 100644

        --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

        +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

        @@ -259,6 +259,8 @@ struct kfd_dev;

          struct kfd_dev_pc_sampling_data {

               uint32_t use_count;         /* Num of PC sampling sessions */

               uint32_t active_count;      /* Num of active sessions */

        +     uint32_t target_simd;       /* target simd for trap */

        +     uint32_t target_wave_slot;  /* target wave slot for trap */

               bool stop_enable;           /* pc sampling stop in process */

               struct idr pc_sampling_idr;

               struct kfd_pc_sample_info pc_sample_info;

        --

        2.25.1

Reply via email to