On 5/1/20 12:58 AM, Ferruh Yigit wrote:
> On 4/30/2020 10:14 AM, Joyce Kong wrote:
>> In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
>> and backend are assumed to be implemented in software, that is they can
>> run on identical CPUs in an SMP configuration.
>> Thus a weak form of memory barriers like rte_smp_r/wmb, rather than
>> rte_cio_r/wmb, is sufficient for this case (vq->hw->weak_barriers == 1)
>> and yields better performance.
>> For the above case, this patch helps yield even better performance
>> by replacing the two-way barriers with C11 one-way barriers for the
>> used index in the split ring.
>>
>> Signed-off-by: Joyce Kong <joyce.k...@arm.com>
>> Reviewed-by: Gavin Hu <gavin...@arm.com>
>> Reviewed-by: Maxime Coquelin <maxime.coque...@redhat.com>
>
> <...>
>
>> @@ -464,8 +464,33 @@ virtio_get_queue_type(struct virtio_hw *hw, uint16_t vtpci_queue_idx)
>> return VTNET_TQ;
>> }
>>
>> -#define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_split.ring.used->idx - \
>> - (vq)->vq_used_cons_idx))
>> +/* virtqueue_nused has load-acquire or rte_cio_rmb inside */
>> +static inline uint16_t
>> +virtqueue_nused(const struct virtqueue *vq)
>> +{
>> +	uint16_t idx;
>> +
>> +	if (vq->hw->weak_barriers) {
>> +	/**
>> +	 * x86 prefers using rte_smp_rmb over __atomic_load_n as it
>> +	 * reports a slightly better perf, which comes from the saved
>> +	 * branch by the compiler.
>> +	 * The if and else branches are identical with the smp and cio
>> +	 * barriers both defined as compiler barriers on x86.
>> +	 */
>> +#ifdef RTE_ARCH_X86_64
>> +		idx = vq->vq_split.ring.used->idx;
>> +		rte_smp_rmb();
>> +#else
>> +		idx = __atomic_load_n(&(vq)->vq_split.ring.used->idx,
>> +				__ATOMIC_ACQUIRE);
>> +#endif
>> +	} else {
>> +		idx = vq->vq_split.ring.used->idx;
>> +		rte_cio_rmb();
>> +	}
>> +	return idx - vq->vq_used_cons_idx;
>> +}
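
For reference, the one-way barrier used in the non-x86 branch above can be
sketched in isolation. This is only a minimal sketch and not code from the
patch: struct demo_used_ring, demo_publish_used(), demo_nused() and
demo_selfcheck() are hypothetical names made up for illustration; only the
GCC/Clang __atomic builtins are real. It also shows why the 16-bit
subtraction stays correct when the used index wraps around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the used ring; only the index matters here. */
struct demo_used_ring {
	uint16_t idx;
};

/*
 * Producer side (assumed counterpart, not part of the patch): publish the
 * new used index with a store-release, so that earlier writes to the ring
 * entries become visible before the index does.
 */
static inline void
demo_publish_used(struct demo_used_ring *ring, uint16_t new_idx)
{
	__atomic_store_n(&ring->idx, new_idx, __ATOMIC_RELEASE);
}

/*
 * Consumer side: load-acquire the used index. Reads of ring entries that
 * follow this call cannot be reordered before the load. The result is
 * truncated to 16 bits, so it is still correct after idx wraps.
 */
static inline uint16_t
demo_nused(const struct demo_used_ring *ring, uint16_t used_cons_idx)
{
	uint16_t idx = __atomic_load_n(&ring->idx, __ATOMIC_ACQUIRE);

	return (uint16_t)(idx - used_cons_idx);
}

/* Small usage check, including the wrap-around case. */
static inline void
demo_selfcheck(void)
{
	struct demo_used_ring ring = { .idx = 0 };

	demo_publish_used(&ring, 3);
	assert(demo_nused(&ring, 0) == 3);

	/* Consumer is at 65534, producer wrapped around to 2: 4 entries. */
	demo_publish_used(&ring, 2);
	assert(demo_nused(&ring, 65534) == 4);
}
```

The acquire load only orders later accesses after it, which is why it can be
cheaper than a two-way read barrier on weakly ordered CPUs.
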
>
> The AltiVec implementation (virtio_rxtx_simple_altivec.c) also uses the
> 'VIRTQUEUE_NUSED' macro, so it needs to be updated with this change as well.
>
I reproduced and fixed the build issue.
You can fetch my tree with the fixed series.
Thanks,
Maxime