On Mon, May 07, 2018 at 11:49:49AM +0800, Tiwei Bie wrote: > On Fri, May 04, 2018 at 05:48:05PM +0200, Maxime Coquelin wrote: > > Hi Tiwei, > > > > On 05/03/2018 01:56 PM, Tiwei Bie wrote: > > > On Mon, Apr 30, 2018 at 05:59:54PM +0200, Maxime Coquelin wrote: > [...] > > > > +static __rte_always_inline void > > > > +vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue > > > > *vq) > > > > +{ > > > > + uint32_t *log_base; > > > > + int i; > > > > + > > > > + if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) || > > > > + !dev->log_base)) > > > > + return; > > > > + > > > > + log_base = (uint32_t *)(uintptr_t)dev->log_base; > > > > + > > > > + /* To make sure guest memory updates are committed before > > > > logging */ > > > > + rte_smp_wmb(); > > > > > > It seems that __sync_fetch_and_or() can be considered a full > > > barrier [1]. So do we really need this rte_smp_wmb()? > > > > That's a good point, thanks for the pointer. > > > > > Besides, based on the same doc [1], it seems that the __sync_ > > > version is deprecated in favor of the __atomic_ one. > > > > I will change to __atomic_. For the memory model, do you agree I should > > use __ATOMIC_SEQ_CST? > > Maybe we can use __ATOMIC_RELAXED and keep rte_smp_wmb(). > > Best regards, > Tiwei Bie
Yes I think if you do your own barriers, you can use __ATOMIC_RELAXED in theory. One issue is that as one lwn article noted "there will be some seriously suboptimal code production before gcc-7.1". So if you are using these new APIs, you should also check that a relatively new compiler is used. > > > > > [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html > > > > > > > + > > > > + for (i = 0; i < vq->log_cache_nb_elem; i++) { > > > > + struct log_cache_entry *elem = vq->log_cache + i; > > > > + > > > > + __sync_fetch_and_or(log_base + elem->offset, elem->val); > > > > + } > > > > + > > > > + vq->log_cache_nb_elem = 0; > > > > +} > > > > + > > > [...] > > > > > > > Thanks, > > Maxime