----- On Jul 24, 2017, at 5:58 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
> The sys_membarrier() system call has proven too slow for some use
> cases, which has prompted users to instead rely on TLB shootdown.
> Although TLB shootdown is much faster, it has the slight disadvantage
> of not working at all on arm and arm64. This commit therefore adds
> an expedited option to the sys_membarrier() system call.

Is this now possible because the synchronize_sched_expedited()
implementation no longer requires sending IPIs to all CPUs? I suspect
that using Tree SRCU now solves this somehow, but can you tell us a
bit more about why it is now OK to expose this to user-space?

The commit message here does not explain why it is OK, real-time-wise,
to expose this feature as a system call.

Thanks,

Mathieu

> 
> Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
> ---
>  include/uapi/linux/membarrier.h | 11 +++++++++++
>  kernel/membarrier.c             |  7 ++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
> index e0b108bd2624..ba36d8a6be61 100644
> --- a/include/uapi/linux/membarrier.h
> +++ b/include/uapi/linux/membarrier.h
> @@ -40,6 +40,16 @@
>   *                          (non-running threads are de facto in such a
>   *                          state). This covers threads from all processes
>   *                          running on the system. This command returns 0.
> + * @MEMBARRIER_CMD_SHARED_EXPEDITED: Execute a memory barrier on all
> + *                          running threads, but in an expedited fashion.
> + *                          Upon return from system call, the caller thread
> + *                          is ensured that all running threads have passed
> + *                          through a state where all memory accesses to
> + *                          user-space addresses match program order between
> + *                          entry to and return from the system call
> + *                          (non-running threads are de facto in such a
> + *                          state). This covers threads from all processes
> + *                          running on the system. This command returns 0.
>   *
>   * Command to be passed to the membarrier system call. The commands need to
>   * be a single bit each, except for MEMBARRIER_CMD_QUERY which is assigned to
> @@ -48,6 +58,7 @@
>  enum membarrier_cmd {
>  	MEMBARRIER_CMD_QUERY = 0,
>  	MEMBARRIER_CMD_SHARED = (1 << 0),
> +	MEMBARRIER_CMD_SHARED_EXPEDITED = (2 << 0),
>  };
>  
>  #endif /* _UAPI_LINUX_MEMBARRIER_H */
> diff --git a/kernel/membarrier.c b/kernel/membarrier.c
> index 9f9284f37f8d..b749c39bb219 100644
> --- a/kernel/membarrier.c
> +++ b/kernel/membarrier.c
> @@ -22,7 +22,8 @@
>   * Bitmask made from a "or" of all commands within enum membarrier_cmd,
>   * except MEMBARRIER_CMD_QUERY.
>   */
> -#define MEMBARRIER_CMD_BITMASK	(MEMBARRIER_CMD_SHARED)
> +#define MEMBARRIER_CMD_BITMASK	(MEMBARRIER_CMD_SHARED |	\
> +				 MEMBARRIER_CMD_SHARED_EXPEDITED)
>  
>  /**
>   * sys_membarrier - issue memory barriers on a set of threads
> @@ -64,6 +65,10 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
>  		if (num_online_cpus() > 1)
>  			synchronize_sched();
>  		return 0;
> +	case MEMBARRIER_CMD_SHARED_EXPEDITED:
> +		if (num_online_cpus() > 1)
> +			synchronize_sched_expedited();
> +		return 0;
>  	default:
>  		return -EINVAL;
>  	}
> -- 
> 2.5.2

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
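
For anyone who wants to experiment with the proposed command, here is a
minimal user-space sketch. It assumes a kernel carrying the patch above;
the MEMBARRIER_CMD_SHARED_EXPEDITED value (2 << 0) is copied from the
quoted diff and would normally come from the installed
<linux/membarrier.h> UAPI header. The membarrier() wrapper below is a
local helper around syscall(2), since libc does not wrap this syscall.

#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MEMBARRIER_CMD_QUERY
#define MEMBARRIER_CMD_QUERY		0
#endif
#ifndef MEMBARRIER_CMD_SHARED_EXPEDITED
#define MEMBARRIER_CMD_SHARED_EXPEDITED	(2 << 0)  /* value from the patch */
#endif

/* Local wrapper: libc provides no membarrier() function. */
static int membarrier(int cmd, int flags)
{
	return syscall(__NR_membarrier, cmd, flags);
}

int main(void)
{
	int supported;

	/* MEMBARRIER_CMD_QUERY returns a bitmask of supported commands. */
	supported = membarrier(MEMBARRIER_CMD_QUERY, 0);
	if (supported < 0 || !(supported & MEMBARRIER_CMD_SHARED_EXPEDITED)) {
		fprintf(stderr, "expedited membarrier not supported\n");
		return EXIT_FAILURE;
	}

	/*
	 * On success, every other running thread in the system has
	 * executed a full memory barrier before this call returns.
	 */
	if (membarrier(MEMBARRIER_CMD_SHARED_EXPEDITED, 0) != 0) {
		perror("membarrier");
		return EXIT_FAILURE;
	}
	return EXIT_SUCCESS;
}

Note the query-before-use pattern: since the command is a single bit in
the returned mask, the sketch can detect at run time whether the running
kernel supports the expedited variant before relying on it.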