On 16 May 2023, at 13:45, Maxime Coquelin wrote:

> On 5/16/23 13:36, Eelco Chaudron wrote:
>>
>>
>> On 16 May 2023, at 12:12, David Marchand wrote:
>>
>>> On Tue, May 16, 2023 at 10:53 AM Eelco Chaudron <echau...@redhat.com> wrote:
>>>> On 10 May 2023, at 13:44, David Marchand wrote:
>>>
>>> [snip]
>>>
>>>>>> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_socket 
>>>>>> *vsocket)
>>>>>>                  vsocket->path = NULL;
>>>>>>          }
>>>>>>
>>>>>> +       if (vsocket && vsocket->alloc_notify_ops) {
>>>>>> +#pragma GCC diagnostic push
>>>>>> +#pragma GCC diagnostic ignored "-Wcast-qual"
>>>>>> +               free((struct rte_vhost_device_ops *)vsocket->notify_ops);
>>>>>> +#pragma GCC diagnostic pop
>>>>>> +               vsocket->notify_ops = NULL;
>>>>>> +       }
>>>>>
>>>>> Rather than select the behavior based on a boolean (and here force the
>>>>> compiler to close its eyes), I would instead add a non const pointer
>>>>> to ops (let's say alloc_notify_ops) in vhost_user_socket.
>>>>> The code can then unconditionnally call free(vsocket->alloc_notify_ops);
>>>>
>>>> Good idea, I will make the change in v3.
>>>
>>> Feel free to use a better name for this field :-).
>>>
>>>>
>>>>>> +
>>>>>>          if (vsocket) {
>>>>>>                  free(vsocket);
>>>>>>                  vsocket = NULL;
>>>
>>> [snip]
>>>
>>>>>> +       /*
>>>>>> +        * Although the ops structure is a const structure, we do need to
>>>>>> +        * override the guest_notify operation. This is because with the
>>>>>> +        * previous APIs it was "reserved" and if any garbage value was 
>>>>>> passed,
>>>>>> +        * it could crash the application.
>>>>>> +        */
>>>>>> +       if (ops && !ops->guest_notify) {
>>>>>
>>>>> Hum, as described in the comment above, I don't think we should look
>>>>> at ops->guest_notify value at all.
>>>>> Checking ops != NULL should be enough.
>>>>
>>>> Not sure I get you here. If the guest_notify passed by the user is NULL, 
>>>> it means the previously ‘reserved[1]’ field is NULL, so we do not need to 
>>>> use a new structure.
>>>>
>>>> I guess your comment would be true if we would introduce a new field in 
>>>> the data structure, not replacing a reserved one.
>>>
>>> Hum, I don't understand my comment either o_O'.
>>> Too many days off... or maybe my evil twin took over the keyboard.
>>>
>>>
>>>>
>>>>>> +               struct rte_vhost_device_ops *new_ops;
>>>>>> +
>>>>>> +               new_ops = malloc(sizeof(*new_ops));
>>>>>
>>>>> Strictly speaking, we lose the numa affinity of "ops" by calling malloc.
>>>>> I am unclear of the impact though.
>>>>
>>>> Don’t think there is a portable API that we can use to determine the NUMA 
>>>> for the ops memory and then allocate this on the same numa?
>>>>
>>>> Any thoughts or ideas on how to solve this? I hope most people will 
>>>> memset() the ops structure and the reserved[1] part is zero, but it might 
>>>> be a problem in the future if more extensions get added.
>>>
>>> Determinining current numa is doable, via 'ops'
>>> get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in
>>> numa_realloc().
>>> The problem is how to allocate on this numa with the libc allocator
>>> for which I have no idea...
>>> We could go with the dpdk allocator (again, like numa_realloc()).
>>>
>>>
>>> In practice, the passed ops will be probably from a const variable in
>>> the program .data section (for which I think fields are set to 0
>>> unless explicitly initialised), or a memset() will be called for a
>>> dynamic allocation from good citizens.
>>> So we can probably live with the current proposal.
>>> Plus, this is only for one release, since in 23.11 with the ABI bump,
>>> we will drop this compat code.
>>>
>>> Maxime, Chenbo, what do you think?
>>
>> Wait for their response, but for now I assume we can just keep the numa 
>> unaware malloc().
>
> Let's keep it as is as we'll get rid of it in 23.11.

Thanks for confirming.

>>>
>>> [snip]
>>>
>>>>>
>>>>> But putting indentation aside, is this change equivalent?
>>>>> -               if ((vhost_need_event(vhost_used_event(vq), new, old) &&
>>>>> -                                       (vq->callfd >= 0)) ||
>>>>> -                               unlikely(!signalled_used_valid)) {
>>>>> +               if ((vhost_need_event(vhost_used_event(vq), new, old) ||
>>>>> +                               unlikely(!signalled_used_valid)) &&
>>>>> +                               vq->callfd >= 0) {
>>>>
>>>> They are not equal, but in the past eventfd_write() should also not have 
>>>> been called with callfd < 0, guess this was an existing bug ;)
>>>
>>> I think this should be a separate fix.
>>
>> ACK, will add a separate patch in this series to fix it.
>
> I also caught & fixed it while implementing my VDUSE series [0].
> You can pick it in your series, and I will rebase my series on top of
> it.

Thanks for the details I’ll include your patch in my series.

I will send out a new revision soon (after testing the changes with OVS).

Thanks,

Eelco

> Thanks,
> Maxime
>
> [0]: 
> https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/b976e1f226db5c09834148847d994045eb89be93
>
>
>>
>>>
>>>>
>>>>>> +                       vhost_vring_inject_irq(dev, vq);
>>>
>>>
>>> -- 
>>> David Marchand
>>

Reply via email to