> -----Original Message-----
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Thursday, July 11, 2019 12:11 PM
> To: Liu, Yong <yong....@intel.com>; Bie, Tiwei <tiwei....@intel.com>;
> maxime.coque...@redhat.com; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH 02/13] add vhost packed ring fast enqueue
> function
>
>
> On 2019/7/10 3:30 PM, Liu, Yong wrote:
> >
> >> -----Original Message-----
> >> From: Jason Wang [mailto:jasow...@redhat.com]
> >> Sent: Wednesday, July 10, 2019 12:28 PM
> >> To: Liu, Yong <yong....@intel.com>; Bie, Tiwei <tiwei....@intel.com>;
> >> maxime.coque...@redhat.com; dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [RFC PATCH 02/13] add vhost packed ring fast
> >> enqueue function
> >>
> >>
> >> On 2019/7/9 1:13 AM, Marvin Liu wrote:
> >>> In fast enqueue function, will first check whether descriptors are
> >>> cache aligned. Fast enqueue function will check prerequisites in the
> >>> beginning. Fast enqueue function do not support chained mbufs, normal
> >>> function will handle that.
> >>>
> >>> Signed-off-by: Marvin Liu <yong....@intel.com>
> >> Any reason for not letting compiler to unroll the loops?
> >>
> > Hi Jason,
> > I'm not sure about how much compiler can help on unrolling loops as it
> > can't know how much loops will create in one call.
> > After force not using unroll-loop optimization by "-fno-unroll-loops",
> > virtio_dev_rx_packed function size remained the same.
> > So look like gcc unroll-loop optimization do not help here.
>
>
> I meant something like "pragma GCC unroll N" just before the loop you
> want unrolled.
>
> Thanks
>
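For reference, the suggestion quoted above could look roughly like the sketch below. It is not code from the patch: `BATCH_SIZE`, `batch_total_len()` and `total_len()` are hypothetical names made up for illustration, and the pragma assumes GCC 8 or newer. The pragma sits directly before a loop whose trip count is a compile-time constant, which is the case where the compiler can unroll fully.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_SIZE 4 /* hypothetical batch size, chosen for illustration */

/*
 * Hypothetical helper: sum the lengths of one fixed-size batch.
 * The trip count is a compile-time constant, so GCC can fully
 * unroll this loop when asked to via the pragma.
 */
static inline uint32_t
batch_total_len(const uint32_t lens[BATCH_SIZE])
{
	uint32_t total = 0;
	size_t i;

#pragma GCC unroll 4
	for (i = 0; i < BATCH_SIZE; i++)
		total += lens[i];

	return total;
}

/*
 * Hypothetical caller: handle full batches through the unrolled
 * helper, then fall back to a plain loop for the remainder.
 */
uint32_t
total_len(const uint32_t *lens, size_t count)
{
	uint32_t total = 0;
	size_t i = 0;

	for (; i + BATCH_SIZE <= count; i += BATCH_SIZE)
		total += batch_total_len(&lens[i]);

	for (; i < count; i++)
		total += lens[i];

	return total;
}
```

Whether this helps depends on the loop body. The pragma only requests unrolling of the loop it precedes and does not by itself inline or simplify the functions called inside it.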
Hi Jason,
I just tried this with gcc 8.3.0 and the master code, and saw only a 0.1 Mpps performance gain with "#pragma GCC unroll".
I think this compiler pragma does not help much in a big loop that contains so many functions.

Thanks,
Marvin

>
> >
> > And fast enqueue function not only did unroll loop, it also checked cache
> > alignment which can help performance in another side.
> >
> > Regards,
> > Marvin
> >
> >> Thanks
> >>