On Tue, Oct 11, 2016 at 02:57:49PM +0800, Yuanhan Liu wrote:
> > > > > > There was an example: the vhost enqueue optimization patchset from
> > > > > > Zhihong [0] uses memset, and it introduces more than 15% drop (IIRC)

Though it doesn't matter now, I verified it yesterday (with and without
memset): the drop could be up to 30+%. This is to let you know that it
could behave badly if memset is not inlined.

> > > > > > on my Ivybridge server: it has no such issue on his server though.
> > > > > >
> > > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > > > >
> > > > > > 	--yliu
> > > > >
> > > > > I'd say that's weird. what's your config? any chance you
> > > > > are using an old compiler?
> > > >
> > > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC,
> > > > he said the memset is not well optimized for Ivybridge server.
> > >
> > > The dst is remote in that case. It's fine on Haswell but has complication
> > > in Ivy Bridge which (wasn't supposed to but) causes serious frontend
> > > issue.
> > >
> > > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
> > >
> > > So try something like this then:
>
> Yes, I saw memset is inlined when this diff is applied.

I have another concern though: it's a trick that lets gcc do the inlining,
and I am not sure whether that holds true with other compilers (e.g. clang,
icc, or even older gcc).

For this case, I think I still prefer some trick like

	*(struct ..*) = {0, }

Or we could even introduce rte_memset(). IIRC, that has been proposed
before?

	--yliu
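
For reference, a minimal sketch of the struct-assignment trick next to the
memset() version. The struct layout and the demo_* names are made up for
illustration and are not taken from the vhost code. The elided cast above is
spelled out here as a C99 compound literal, which is one valid way to write
it. Whether either variant ends up as inline stores still depends on the
compiler and its flags, so the generated assembly should be checked per
toolchain.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 12-byte header; NOT the real vhost/virtio layout. */
struct demo_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;
};

/* Variant 1: library call; whether it is inlined is up to the compiler. */
static inline void demo_clear_memset(void *dst)
{
	memset(dst, 0, sizeof(struct demo_hdr));
}

/* Variant 2: plain struct assignment from a zeroed C99 compound literal. */
static inline void demo_clear_assign(void *dst)
{
	*(struct demo_hdr *)dst = (struct demo_hdr){0};
}

/* Returns 1 if every field is zero, 0 otherwise. */
static inline int demo_hdr_is_zero(const struct demo_hdr *h)
{
	return h->flags == 0 && h->gso_type == 0 && h->hdr_len == 0 &&
	       h->gso_size == 0 && h->csum_start == 0 &&
	       h->csum_offset == 0 && h->num_buffers == 0;
}
```

Both variants leave all fields zero. The difference under discussion is only
in the code the compiler emits for them, which this sketch does not measure.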