On 2019-05-13 12:03 p.m., Zeng, Oak wrote:
> Hi Felix,
>
> See comments inline [Oak]
>
> Hi Kent, there is one FYI embedded, so be careful when you merge this change
> back to the kfd-staging branch.
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Kuehling, Felix <felix.kuehl...@amd.com>
> Sent: Friday, May 10, 2019 4:28 PM
> To: Zeng, Oak <oak.z...@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Keely, Sean <sean.ke...@amd.com>
> Subject: Re: [PATCH 7/8] drm/amdkfd: New IOCTL to allocate queue GWS
>
> On 2019-05-10 12:01 p.m., Zeng, Oak wrote:
>> Add a new kfd ioctl to allocate queue GWS. Queue GWS is released on
>> queue destroy.
>>
>> Change-Id: I60153c26a577992ad873e4292e759e5c3d5bbd15
>> Signed-off-by: Oak Zeng <oak.z...@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 45 ++++++++++++++++++++++++++++++++
>>    include/uapi/linux/kfd_ioctl.h           | 19 +++++++++++++-
>>    2 files changed, 63 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 38ae53f..17dd970 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -338,6 +338,11 @@ static int kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p,
>>    
>>      mutex_lock(&p->mutex);
>>    
>> +    if (pqm_get_gws(&p->pqm, args->queue_id)) {
>> +            amdgpu_amdkfd_remove_gws_from_process(p->kgd_process_info,
>> +                            pqm_get_gws(&p->pqm, args->queue_id));
>> +            pqm_set_gws(&p->pqm, args->queue_id, NULL);
>> +    }
> It would be more robust if this was done inside pqm_destroy_queue. That way 
> you'd handle other potential code paths that destroy queues (not sure there 
> are any) and you wouldn't need pqm_get_gws exported from PQM.
>
> [Oak] Good idea. Will do it. Even though currently there is no other code
> path destroying a queue using gws, your way is safer for future code changes.
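
Roughly what I had in mind, inside pqm_destroy_queue() right after the queue
node lookup (untested sketch; it assumes the buffer set by pqm_set_gws is
reachable as pqn->q->gws):

    pqn = get_queue_by_qid(pqm, qid);
    if (pqn && pqn->q && pqn->q->gws) {
            /* Drop the GWS BO attached to this queue before the queue
             * itself is torn down.
             */
            amdgpu_amdkfd_remove_gws_from_process(
                            pqm->process->kgd_process_info,
                            pqn->q->gws);
            pqn->q->gws = NULL;
    }

That way every path that destroys a queue gets the cleanup for free, and
pqm_get_gws doesn't need to be exported from PQM.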
>
>
>>      retval = pqm_destroy_queue(&p->pqm, args->queue_id);
>>    
>>      mutex_unlock(&p->mutex);
>> @@ -1559,6 +1564,44 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>>      return err;
>>    }
>>    
>> +static int kfd_ioctl_alloc_queue_gws(struct file *filep,
>> +            struct kfd_process *p, void *data)
>> +{
>> +    int retval;
>> +    struct kfd_ioctl_alloc_queue_gws_args *args = data;
>> +    struct kfd_dev *dev = NULL;
>> +    struct kgd_mem *mem;
>> +
>> +    if (args->num_gws == 0)
>> +            return -EINVAL;
>> +
>> +    dev = kfd_device_by_id(args->gpu_id);
>> +    if (!dev) {
>> +            pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
>> +            return -EINVAL;
>> +    }
>> +
>> +    if (!hws_gws_support ||
>> +            dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
>> +            return -EINVAL;
>> +
>> +    retval = amdgpu_amdkfd_add_gws_to_process(p->kgd_process_info,
>> +                    dev->gws, &mem);
>> +    if (unlikely(retval))
>> +            return retval;
>> +
>> +    mutex_lock(&p->mutex);
>> +    retval = pqm_set_gws(&p->pqm, args->queue_id, mem);
>> +    mutex_unlock(&p->mutex);
>> +
>> +    if (unlikely(retval))
>> +            amdgpu_amdkfd_remove_gws_from_process(p->kgd_process_info, mem);
>> +
>> +    /* The gws_array parameter is reserved for future extension */
>> +    args->gws_array[0] = 0;
>> +    return retval;
>> +}
>> +
>>    static int kfd_ioctl_get_dmabuf_info(struct file *filep,
>>              struct kfd_process *p, void *data)
>>    {
>> @@ -1761,6 +1804,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
>>      AMDKFD_IOCTL_DEF(AMDKFD_IOC_IMPORT_DMABUF,
>>                              kfd_ioctl_import_dmabuf, 0),
>>    
>> +    AMDKFD_IOCTL_DEF(AMDKFD_IOC_ALLOC_QUEUE_GWS,
>> +                    kfd_ioctl_alloc_queue_gws, 0),
>>    };
>>    
>>    #define AMDKFD_CORE_IOCTL_COUNT   ARRAY_SIZE(amdkfd_ioctls)
>> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
>> index 20917c5..1964ab2 100644
>> --- a/include/uapi/linux/kfd_ioctl.h
>> +++ b/include/uapi/linux/kfd_ioctl.h
>> @@ -410,6 +410,20 @@ struct kfd_ioctl_unmap_memory_from_gpu_args {
>>      __u32 n_success;                /* to/from KFD */
>>    };
>>    
>> +/* Allocate GWS for specific queue
>> + *
>> + * @gpu_id:      device identifier
>> + * @queue_id:    id of the queue that GWS is allocated for
>> + * @num_gws:     how many GWS to allocate
>> + * @gws_array:   used to return the allocated gws
>> + */
>> +struct kfd_ioctl_alloc_queue_gws_args {
>> +    __u32 gpu_id;           /* to KFD */
>> +    __u32 queue_id;         /* to KFD */
>> +    __u32 num_gws;          /* to KFD */
>> +    __u32 *gws_array;       /* from KFD */
> Don't use pointers in ioctl structures. Use uint64_t. Accessing user mode 
> pointers requires copy_to/from_user or similar.
>
> Also prefer to move 64-bit elements to the first element to ensure proper 
> alignment, and pad the structure to 64-bit for ABI compatibility.
>
> I'm not sure what your plan is for that gws_array. If it's a pointer to a 
> user mode array, then that array needs to be allocated by user mode. And user
> mode should probably pass down the size of the array it allocated in another 
> parameter.
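
For illustration, with a hypothetical __u64 gws_array_ptr field in the args
struct, the kernel side would look something like this (sketch only):

    void __user *out = u64_to_user_ptr(args->gws_array_ptr);
    __u32 first_gws = 0;

    /* Copy the result out instead of writing through a raw pointer */
    if (copy_to_user(out, &first_gws, sizeof(first_gws)))
            return -EFAULT;

rather than assigning through args->gws_array directly as the patch does now.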
>
> That said, I think what we want is not an array, but just the index of the 
> first GWS entry that was allocated for the queue, which is currently always 
> 0. So I'm not sure why you're calling this an "array".
>
> [Oak] For the current design, one queue always gets all 64 GWS, so returning
> the index of the first GWS (0) is not a problem. In the future, is it
> possible that a queue can only allocate a few non-contiguous GWS entries, for
> example GWS3 and GWS56? If that is the case, we will have to copy an array of
> gws back to user space.

Current HW only supports contiguous allocations of GWS. I don't foresee 
that changing. I think we can acknowledge that in the API.
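
E.g. something along these lines (a sketch; first_gws is just a suggested
name):

    struct kfd_ioctl_alloc_queue_gws_args {
            __u32 gpu_id;           /* to KFD */
            __u32 queue_id;         /* to KFD */
            __u32 num_gws;          /* to KFD */
            __u32 first_gws;        /* from KFD, currently always 0 */
    };

All members are __u32, so there is no 64-bit alignment concern and the struct
size is already a multiple of 8 bytes. User space would just read the index of
the first allocated GWS entry from first_gws after the ioctl returns.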


>
>> +};
>> +
>>    struct kfd_ioctl_get_dmabuf_info_args {
>>      __u64 size;             /* from KFD */
>>      __u64 metadata_ptr;     /* to KFD */
>> @@ -529,7 +543,10 @@ enum kfd_mmio_remap {
>>    #define AMDKFD_IOC_IMPORT_DMABUF          \
>>              AMDKFD_IOWR(0x1D, struct kfd_ioctl_import_dmabuf_args)
>>    
>> +#define AMDKFD_IOC_ALLOC_QUEUE_GWS          \
>> +            AMDKFD_IOWR(0x1E, struct kfd_ioctl_alloc_queue_gws_args)
>> +
> This will force us to move some ioctl numbers in amd-kfd-staging and the DKMS 
> branch, which will break the ABI of our ROCm releases. Not sure there is a 
> good way to avoid that other than leaving a hole in the upstream ioctl 
> space. I got push-back on that kind of thing when I originally upstreamed 
> KFD. So this is just an FYI.
> [Oak] Yes, when we merge these changes back to the amd-kfd-staging branch, we
> have to change the ioctl number for GWS allocation to 0x22, to accommodate
> extra kfd ioctls on the kfd-staging branch. @Russell, Kent FYI

No, the other way around. With APIs that are upstream we should have 
amd-kfd-staging match the upstream ABI. That means we have to move the 
other non-upstream ioctl numbers.

Regards,
   Felix


>
> Regards,
>     Felix
>
>>    #define AMDKFD_COMMAND_START              0x01
>> -#define AMDKFD_COMMAND_END          0x1E
>> +#define AMDKFD_COMMAND_END          0x1F
>>    
>>    #endif