On 16 May 2023, at 12:12, David Marchand wrote:
> On Tue, May 16, 2023 at 10:53 AM Eelco Chaudron <echau...@redhat.com> wrote:
>> On 10 May 2023, at 13:44, David Marchand wrote:
>
> [snip]
>
>>>> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_socket
>>>> *vsocket)
>>>> vsocket->path = NULL;
>>>> }
>>>>
>>>> + if (vsocket && vsocket->alloc_notify_ops) {
>>>> +#pragma GCC diagnostic push
>>>> +#pragma GCC diagnostic ignored "-Wcast-qual"
>>>> + free((struct rte_vhost_device_ops *)vsocket->notify_ops);
>>>> +#pragma GCC diagnostic pop
>>>> + vsocket->notify_ops = NULL;
>>>> + }
>>>
>>> Rather than select the behavior based on a boolean (and here force the
>>> compiler to close its eyes), I would instead add a non const pointer
>>> to ops (let's say alloc_notify_ops) in vhost_user_socket.
>>> The code can then unconditionnally call free(vsocket->alloc_notify_ops);
>>
>> Good idea, I will make the change in v3.
>
> Feel free to use a better name for this field :-).
>
>>
>>>> +
>>>> if (vsocket) {
>>>> free(vsocket);
>>>> vsocket = NULL;
>
> [snip]
>
>>>> + /*
>>>> + * Although the ops structure is a const structure, we do need to
>>>> + * override the guest_notify operation. This is because with the
>>>> + * previous APIs it was "reserved" and if any garbage value was
>>>> passed,
>>>> + * it could crash the application.
>>>> + */
>>>> + if (ops && !ops->guest_notify) {
>>>
>>> Hum, as described in the comment above, I don't think we should look
>>> at ops->guest_notify value at all.
>>> Checking ops != NULL should be enough.
>>
>> Not sure I get you here. If the guest_notify passed by the user is NULL, it
>> means the previously ‘reserved[1]’ field is NULL, so we do not need to use a
>> new structure.
>>
>> I guess your comment would be true if we would introduce a new field in the
>> data structure, not replacing a reserved one.
>
> Hum, I don't understand my comment either o_O'.
> Too many days off... or maybe my evil twin took over the keyboard.
>
>
>>
>>>> + struct rte_vhost_device_ops *new_ops;
>>>> +
>>>> + new_ops = malloc(sizeof(*new_ops));
>>>
>>> Strictly speaking, we lose the numa affinity of "ops" by calling malloc.
>>> I am unclear of the impact though.
>>
>> Don’t think there is a portable API that we can use to determine the NUMA
>> for the ops memory and then allocate this on the same numa?
>>
>> Any thoughts or ideas on how to solve this? I hope most people will memset()
>> the ops structure and the reserved[1] part is zero, but it might be a
>> problem in the future if more extensions get added.
>
> Determinining current numa is doable, via 'ops'
> get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in
> numa_realloc().
> The problem is how to allocate on this numa with the libc allocator
> for which I have no idea...
> We could go with the dpdk allocator (again, like numa_realloc()).
>
>
> In practice, the passed ops will be probably from a const variable in
> the program .data section (for which I think fields are set to 0
> unless explicitly initialised), or a memset() will be called for a
> dynamic allocation from good citizens.
> So we can probably live with the current proposal.
> Plus, this is only for one release, since in 23.11 with the ABI bump,
> we will drop this compat code.
>
> Maxime, Chenbo, what do you think?
Wait for their response, but for now I assume we can just keep the numa unaware
malloc().
>
> [snip]
>
>>>
>>> But putting indentation aside, is this change equivalent?
>>> - if ((vhost_need_event(vhost_used_event(vq), new, old) &&
>>> - (vq->callfd >= 0)) ||
>>> - unlikely(!signalled_used_valid)) {
>>> + if ((vhost_need_event(vhost_used_event(vq), new, old) ||
>>> + unlikely(!signalled_used_valid)) &&
>>> + vq->callfd >= 0) {
>>
>> They are not equal, but in the past eventfd_write() should also not have
>> been called with callfd < 0, guess this was an existing bug ;)
>
> I think this should be a separate fix.
ACK, will add a separate patch in this series to fix it.
>
>>
>>>> + vhost_vring_inject_irq(dev, vq);
>
>
> --
> David Marchand