On Wed, Apr 12, 2017 at 10:07:26PM +0200, Thomas Gleixner wrote:
> While dealing with the fallout of the scheduler cleanups on RT, we found
> several racy usage sites of the following scheme:
>
>     cpumask_copy(&save_cpus_allowed, &current->cpus_allowed);
>     set_cpus_allowed_ptr(current, cpumask_of(cpu));
>     do_stuff();
>     set_cpus_allowed_ptr(current, &save_cpus_allowed);
>
> That's racy in two aspects:
>
> 1) Nothing prevents the CPU from being unplugged after the temporary
>    affinity setting is in place. This results in code being executed on
>    the wrong CPU(s).
>
> 2) Nothing prevents a concurrent affinity setting from user space. That
>    also results in code being executed on the wrong CPU(s), and the
>    restore of the previous affinity setting overwrites the new one.
>
> Various variants of cleanups:
>
> - Removal, because the calling thread is already guaranteed to run on
>   the correct CPU.
>
> - Conversion to smp function calls (simple register read/write).
>
> - Conversion to work_on_cpu(). There were even files containing comments
>   to that effect.
>
> - The rest needs separate hotplug protection for work_on_cpu(). To avoid
>   open coding the
>
>       get_online_cpus();
>       if (cpu_online(cpu))
>               ret = do_stuff();
>       else
>               ret = -ENODEV;
>       put_online_cpus();
>
>   scheme, this series provides a new helper function, work_on_cpu_safe(),
>   which implements the above.
>
> Aside from fixing these races, this allows restricting access to
> current->cpus_allowed with a follow-up series.
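If I read the description right, the new helper presumably boils down to
something like this minimal sketch, i.e. work_on_cpu() wrapped in the
hotplug lock (just my reading of the cover letter, not the actual patch):

	#include <linux/cpu.h>		/* get_online_cpus() */
	#include <linux/workqueue.h>	/* work_on_cpu() */

	long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
	{
		long ret = -ENODEV;

		/* Hold the hotplug lock so @cpu cannot go away under us. */
		get_online_cpus();
		if (cpu_online(cpu))
			ret = work_on_cpu(cpu, fn, arg);
		put_online_cpus();

		return ret;
	}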
Looks good, thanks for tackling these!
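For the smp-function-call conversions, I'd expect the converted sites to
follow roughly the pattern below. The read_cpu_id() payload here is
hypothetical and purely illustrative; the real sites would read or write
a hardware register instead:

	#include <linux/smp.h>

	/* Payload runs in IPI context on the target CPU; keep it short. */
	static void read_cpu_id(void *info)
	{
		*(int *)info = smp_processor_id();
	}

	static int query_cpu_id(int cpu)
	{
		int id = -1;

		/* wait=1: block until the payload has run on @cpu. */
		smp_call_function_single(cpu, read_cpu_id, &id, 1);
		return id;
	}

Unlike the racy save/restore-affinity scheme, this never touches the
calling task's affinity at all.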