> On 21 Mar 2019, at 06:51, Nitin Saxena <nsax...@marvell.com> wrote:
>
> Hi,
>
> First of all, sorry for responding late to this mail chain. Please see my
> answers inline in blue.
>
> Thanks,
> Nitin
>
>
> From: Damjan Marion <dmar...@me.com>
> Sent: Monday, March 18, 2019 4:48 PM
> To: Honnappa Nagarahalli
> Cc: vpp-dev; Nitin Saxena
> Subject: [EXT] Re: [vpp-dev] 128 byte cache line support
>
> External Email
>
>
>> On 15 Mar 2019, at 04:52, Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com> wrote:
>>
>> Related to change 18278[1], I was wondering if there is really a benefit
>> in dealing with 128-byte cachelines like we do today.
>> Compiling VPP with the cacheline size set to 128 will basically just add
>> 64 bytes of unused space at the end of each cacheline, so vlib_buffer_t,
>> for example, will grow from 128 bytes to 256 bytes, but we will still need
>> to prefetch 2 cachelines like we do by default.
>>
>> [Nitin]: This is the existing model. In the case of forwarding, mainly the
>> first vlib cache line is used. We are utilising the existing hole (in the
>> first vlib cache line) by putting packet parsing info (size == 64B) there.
>> This has many benefits; one of them is avoiding the ipv4-input-no-chksum()
>> software checks. It gives us ~20 cycles of benefit on our platform, so I
>> do not want to lose that gain.
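For reference, a minimal, self-contained sketch of the size effect described
above (assuming GCC-style attributes; the names below are illustrative and
are not the real vlib_buffer_t or CLIB_CACHE_LINE_ALIGN_MARK definitions
from vppinfra):

#include <stdio.h>

#define CACHE_LINE_BYTES 128	/* assumed value for a 128B-cacheline build */

/* roughly what a cacheline-align marker expands to */
#define CACHE_LINE_ALIGN_MARK(name) \
  unsigned char name[0] __attribute__ ((aligned (CACHE_LINE_BYTES)))

typedef struct
{
  CACHE_LINE_ALIGN_MARK (cacheline0);
  unsigned char hot_metadata[64];	/* fields used on the forwarding path */
  CACHE_LINE_ALIGN_MARK (cacheline1);
  unsigned char cold_metadata[64];	/* less frequently touched fields */
} fake_buffer_t;

int
main (void)
{
  /* prints 128 when CACHE_LINE_BYTES is 64, 256 when it is 128: each
     64-byte section gets padded out to the configured cacheline size */
  printf ("sizeof (fake_buffer_t) = %zu\n", sizeof (fake_buffer_t));
  return 0;
}
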
That sounds like a terribly bad idea, and it will likely never be upstreamed.
vlib_buffer_t is a 128-byte data structure, and it is a perfect fit for
systems with a 128-byte cacheline. I don't see a point in growing it to 256
bytes. If you need more space, you can always grow the headroom space by an
additional cacheline and store whatever you want there.

>> What will happen if we just leave that to be 64?
>>
>> [Nitin]: This will create L1D holes on 128B targets, right?
>>
>> Unutilized holes are not acceptable, as they waste L1D space and thereby
>> hurt performance. On the contrary, we want to pack structures from 2x64B
>> into 1x128B cache lines to reduce the number of pending prefetches in the
>> core pipeline. VPP heavily prefetches the LOAD/STORE variants of 64B, and
>> our effort is to reduce them for our target.
>>

Not sure what you mean by L1D holes. My proposal is that we align all
per-thread data structures to 128 bytes, not to grow anything.

>> [Honnappa] Currently, ThunderX1 and Octeon TX have a 128B cache line. What
>> I have heard from Marvell folks is that the 64B cache line setting in DPDK
>> does not work. I have not gone into the details of what exactly does not
>> work. Maybe Nitin can elaborate.
>>
> I'm curious to hear details…
>>
>> 1. sometimes (and not very frequently) we will issue 2 prefetch
>> instructions for the same cacheline, but I hope the hardware is smart
>> enough to just ignore the 2nd one
>>
>> 2. we may face false-sharing issues if the first 64 bytes are touched by
>> one thread and the other 64 bytes are touched by another one
>>
>> The second one sounds to me like a real problem, but it can be solved by
>> aligning all per-thread data structures to 2 x cacheline size.
>>
>> [Honnappa] Sorry, I don't understand you here. Even if the data structure
>> is aligned on 128B (2 x 64B), 2 contiguous blocks of 64B data would be on
>> a single cache line.
>>
> I wanted to say that we can align all per-thread data structures to 128
> bytes, even on systems which have a 64-byte cacheline size.
>>
>> Actually, if I remember correctly, even on x86 some of the hardware
>> prefetchers deal with blocks of 2 cachelines.
>>
>> So unless I missed something, my proposal here is: instead of maintaining
>> special 128-byte images for some ARM64 machines, let's just align all
>> per-thread data structures to 128 and have just one ARM image.
>>
>> [Honnappa] When we run VPP compiled with a 128B cache line size on
>> platforms with a 64B cache line size, there is a performance degradation.
>>
> Yeah, sure, what I'm suggesting here is how to address that perf
> degradation.
> [Nitin]: Is this proposal for Intel as well? If yes, then I am fine with
> the proposal, but I think it will decrease performance on 64B architectures
> with the existing code.

I'm curious to hear why you think so…
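As a rough illustration of what "align all per-thread data structures to 128
bytes" could look like in practice (a minimal sketch assuming GCC-style
attributes; the type and field names are made up and this is not the actual
VPP code):

#include <stdint.h>
#include <stdio.h>

/* assumption: the build keeps the cacheline constant at 64B, but anything
   per-thread is padded/aligned out to 128B so two workers never share a
   128-byte block, even on CPUs with 64B lines or adjacent-line prefetchers */
#define PER_THREAD_ALIGN 128

typedef struct
{
  uint64_t rx_packets;
  uint64_t rx_bytes;
  /* ... other per-worker state ... */
} __attribute__ ((aligned (PER_THREAD_ALIGN))) per_worker_data_t;

/* one slot per worker (count is illustrative); neighbouring workers always
   end up in different 128-byte blocks */
static per_worker_data_t worker_data[8];

int
main (void)
{
  /* sizeof is rounded up to the 128-byte alignment, so the stride between
     worker slots is 128 even though only 16 bytes are actually used */
  printf ("slot size %zu, stride %zu\n",
	  sizeof (per_worker_data_t),
	  (size_t) ((char *) &worker_data[1] - (char *) &worker_data[0]));
  return 0;
}
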