On Sat, Aug 13, 2016 at 11:45 AM, Ingo Molnar <mi...@kernel.org> wrote:
>
> * Brian Gerst <brge...@gmail.com> wrote:
>
>> On Sat, Aug 13, 2016 at 1:16 PM, Linus Torvalds
>> <torva...@linux-foundation.org> wrote:
>> > On Sat, Aug 13, 2016 at 9:38 AM, Brian Gerst <brge...@gmail.com> wrote:
>> >> This patch set simplifies the switch_to() code, by moving the stack switch
>> >> code out of line into an asm stub before calling __switch_to(). This ends
>> >> up being more readable, and using the C calling convention instead of
>> >> clobbering all registers improves code generation. It also allows newly
>> >> forked processes to construct a special stack frame to seamlessly flow
>> >> to ret_from_fork, instead of using a test and branch, or an unbalanced
>> >> call/ret.
>> >
>> > Do you have performance numbers? Is it noticeable/measurable?
>>
>> How do I measure it? The perf documentation isn't easy to understand.
>
> Something like this:
>
>   taskset 1 perf stat -a -e '{instructions,cycles}' --repeat 10 perf bench sched pipe
>
> ... will give a very good idea about the general impact of these changes on
> context switch overhead.
>

I will be quite surprised if you can measure any effect at all. I've
never seen context switches take fewer than ~2k cycles, and on my
laptop, they take 8k-9k cycles. The scheduler is really, really slow.

(Why doesn't that perf command show cycles per context switch?)

--Andy
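
For a rough per-switch figure without perf, a pipe ping-pong between two processes can be timed directly. The following is only a sketch under stated assumptions: the function name pipe_pingpong_ns and the iteration count are made up for illustration, it reports wall-clock nanoseconds rather than cycles, and the result includes pipe read/write syscall overhead on top of the switch itself. It approximates two context switches per round trip only when both processes are pinned to one CPU.

```python
import os
import time


def pipe_pingpong_ns(round_trips=10000):
    """Return mean wall-clock nanoseconds per parent/child round trip.

    Hypothetical helper for illustration; not part of perf or the kernel.
    """
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo one byte back for every byte received, then exit
        # without running the parent's cleanup handlers.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(round_trips):
                os.read(p2c_r, 1)
                os.write(c2p_w, b"x")
        finally:
            os._exit(0)

    # Parent: close the ends it does not use, then time the ping-pong.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter_ns()
    for _ in range(round_trips):
        os.write(p2c_w, b"x")  # wake the child
        os.read(c2p_r, 1)      # block until the child answers
    elapsed = time.perf_counter_ns() - start

    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    return elapsed / round_trips


if __name__ == "__main__":
    per_trip = pipe_pingpong_ns()
    print(f"{per_trip:.0f} ns per round trip")
```

Running it under taskset 1, as in the perf command quoted above, keeps both processes on CPU 0, so each blocking read forces a switch to the other process. Halving the round-trip time then gives an upper bound on the cost of one switch, and multiplying by the CPU clock rate converts that to an approximate cycle count.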