> On Jan 11, 2019, at 8:07 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote:
>
> On Fri, Jan 11, 2019 at 03:48:59PM +0000, Nadav Amit wrote:
>>> I liked the idea, BUT, how would it work for callee-saved PV ops? In
>>> that case there's only one clobbered register to work with (rax).
>>
>> That’s would be more tricky. How about using a per-CPU trampoline code to
>> hold a direct call to the target and temporarily disable preemption (which
>> might be simpler by disabling IRQs):
>>
Allow me to simplify/correct:

>> Static-call modifier:
>>
>> 1. synchronize_sched() to ensure per-cpu trampoline is not used

No need for (1); we are going to sync using an IPI.

>> 2. Patches the jmp in a per-cpu trampoline (see below)
>> 3. Saves the call source RIP in [per-cpu scratchpad RIP] (below)

Neither of these needs to be per-cpu.

>> 4. Configures the int3 handler to use static-call int3 handler
>> 5. Patches the call target (as it currently does).

Note that text_poke_bp() would eventually do:

	on_each_cpu(do_sync_core, NULL, 1);

So you know that no core is still running the trampoline after (5).
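
To make the revised ordering concrete, here is a rough user-space model of the modifier with step (1) dropped and a single shared trampoline and scratchpad. It is only a sketch: every type and function name below is hypothetical, nothing here is a kernel API, and sc_model_poke_bp() merely stands in for text_poke_bp() by collapsing its int3 dance into "patch, then sync every core". The trampoline's jmp destination is passed in by the caller because the steps above do not spell it out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model only. Every name here is hypothetical, not a kernel API. */
enum sc_step {
	SC_PATCH_TRAMPOLINE,	/* step 2: retarget the jmp in the trampoline */
	SC_SAVE_CALL_RIP,	/* step 3: remember which call is being patched */
	SC_SET_INT3_HANDLER,	/* step 4: route int3 to the static-call handler */
	SC_PATCH_CALL,		/* step 5: rewrite the call target */
	SC_SYNC_CORES,		/* IPI sync issued at the end of the patching helper */
};

#define SC_MAX_STEPS 8

struct sc_model {
	uintptr_t trampoline_dest;	/* shared, not per-cpu */
	uintptr_t call_source_rip;	/* shared scratchpad, not per-cpu */
	int int3_handler_set;
	uintptr_t call_target;
	unsigned int cores_synced;
	enum sc_step log[SC_MAX_STEPS];
	size_t nr_steps;
};

static void sc_record(struct sc_model *m, enum sc_step s)
{
	assert(m->nr_steps < SC_MAX_STEPS);
	m->log[m->nr_steps++] = s;
}

/* Stand-in for text_poke_bp(): patch the call, then sync all cores. */
static void sc_model_poke_bp(struct sc_model *m, uintptr_t new_target,
			     unsigned int nr_cpus)
{
	m->call_target = new_target;
	sc_record(m, SC_PATCH_CALL);

	/* Models on_each_cpu(do_sync_core, NULL, 1). */
	m->cores_synced = nr_cpus;
	sc_record(m, SC_SYNC_CORES);
}

/* Revised modifier: no synchronize_sched(), nothing per-cpu. */
static void sc_model_update(struct sc_model *m, uintptr_t call_site,
			    uintptr_t trampoline_dest, uintptr_t new_target,
			    unsigned int nr_cpus)
{
	m->trampoline_dest = trampoline_dest;
	sc_record(m, SC_PATCH_TRAMPOLINE);

	m->call_source_rip = call_site;
	sc_record(m, SC_SAVE_CALL_RIP);

	m->int3_handler_set = 1;
	sc_record(m, SC_SET_INT3_HANDLER);

	sc_model_poke_bp(m, new_target, nr_cpus);
}
```

Calling sc_model_update() on a zero-initialised struct sc_model leaves the step log in the order 2, 3, 4, 5, sync. The model is meant to illustrate just that: the sync on every core is the last thing that happens, so by the time the update returns no core can still be inside the trampoline.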