On 2/12/20, Gleb Smirnoff <gleb...@freebsd.org> wrote:
> On Wed, Feb 12, 2020 at 11:12:14AM +0000, Mateusz Guzik wrote:
> M> Author: mjg
> M> Date: Wed Feb 12 11:12:13 2020
> M> New Revision: 357805
> M> URL: https://svnweb.freebsd.org/changeset/base/357805
> M>
> M> Log:
> M>   amd64: store per-cpu allocations subtracted by __pcpu
> M>
> M>   This eliminates a runtime subtraction from counter_u64_add.
> M>
> M>   before:
> M>   mov    0x4f00ed(%rip),%rax    # 0xffffffff80c01788 <numfullpathfail4>
> M>   sub    0x808ff6(%rip),%rax    # 0xffffffff80f1a698 <__pcpu>
> M>   addq   $0x1,%gs:(%rax)
> M>
> M>   after:
> M>   mov    0x4f02fd(%rip),%rax    # 0xffffffff80c01788 <numfullpathfail4>
> M>   addq   $0x1,%gs:(%rax)
> M>
> M>   Reviewed by:	jeff
> M>   Differential Revision:	https://reviews.freebsd.org/D23570
>
> Neat optimization! Thanks. Why didn't we do it back when we created
> counter?
Don't look at me -- I did not work on it.

You can top it for counters like the above -- most actual counters are
known to be there at compilation time and they never disappear, meaning
that in the simplest case they can just be part of one big array in
struct pcpu. Then the assembly could be reduced to

	addq	$0x1,%gs:(someoffset)

removing the mov that loads the address -- faster single-threaded and
less cache used. I'm confident I noted this at least a few times.
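
For reference, the committed change boils down to something like the
following in C. This is a simplified sketch of the counter(9) fast path,
not the verbatim sys/amd64/include/counter.h:

#include <stdint.h>

extern char __pcpu[];		/* base of the per-CPU areas (simplified) */

typedef uint64_t *counter_u64_t;

/*
 * Before r357805: the counter handle holds the raw address inside the
 * per-CPU region, so every increment first subtracts __pcpu to get the
 * %gs-relative offset -- the "sub" in the quoted disassembly.
 */
static inline void
counter_u64_add_before(counter_u64_t c, int64_t inc)
{
	__asm __volatile("addq %1,%%gs:(%0)"
	    :
	    : "r" ((char *)c - __pcpu), "ri" (inc)
	    : "memory", "cc");
}

/*
 * After r357805: the allocator hands out the address already subtracted
 * by __pcpu, so the handle itself is the %gs-relative offset and the
 * runtime subtraction disappears from the fast path.
 */
static inline void
counter_u64_add_after(counter_u64_t c, int64_t inc)
{
	__asm __volatile("addq %1,%%gs:(%0)"
	    :
	    : "r" (c), "ri" (inc)
	    : "memory", "cc");
}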
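
And the static variant would look roughly like this -- the struct
layout, the pc_numfullpathfail4 member and the PCPU_COUNTER_ADD macro
are made up for illustration:

#include <stddef.h>
#include <stdint.h>

struct pcpu {
	/* ... existing members ... */
	uint64_t	pc_numfullpathfail4;	/* hypothetical static counter */
};

/*
 * The counter's offset within struct pcpu is a compile-time constant,
 * so the increment needs no address load at all: it compiles down to a
 * single addq $0x1,%gs:(someoffset).
 */
#define	PCPU_COUNTER_ADD(member, inc)					\
	__asm __volatile("addq %0,%%gs:%c1"				\
	    :								\
	    : "ri" ((int64_t)(inc)),					\
	      "i" (offsetof(struct pcpu, member))			\
	    : "memory", "cc")

static inline void
numfullpathfail4_inc(void)
{
	PCPU_COUNTER_ADD(pc_numfullpathfail4, 1);
}

-- 
Mateusz Guzik <mjguzik gmail.com>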