On Sat, 24 Mar 2018 23:58:44 +0000 (UTC)
Jeff Roberson <j...@freebsd.org> wrote:

> Author: jeff
> Date: Sat Mar 24 23:58:44 2018
> New Revision: 331508
> URL: https://svnweb.freebsd.org/changeset/base/331508
> 
> Log:
>   Document new NUMA related syscalls and utility options.
>   
>   Sponsored by:       Netflix, Dell/EMC Isilon
> 
> Modified:
>   head/lib/libc/sys/Makefile.inc
>   head/lib/libc/sys/cpuset.2
>   head/lib/libc/sys/cpuset_getaffinity.2
>   head/share/man/man9/Makefile
>   head/share/man/man9/malloc.9
>   head/share/man/man9/zone.9
>   head/usr.bin/cpuset/cpuset.1
> 
> Modified: head/lib/libc/sys/Makefile.inc
> ==============================================================================
> --- head/lib/libc/sys/Makefile.inc    Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/lib/libc/sys/Makefile.inc    Sat Mar 24 23:58:44 2018        (r331508)
> @@ -174,6 +174,7 @@ MAN+=     abort2.2 \
>       connectat.2 \
>       cpuset.2 \
>       cpuset_getaffinity.2 \
> +     cpuset_getdomain.2 \
>       dup.2 \
>       execve.2 \
>       _exit.2 \
> @@ -371,6 +372,7 @@ MLINKS+=nanosleep.2 clock_nanosleep.2
>  MLINKS+=cpuset.2 cpuset_getid.2 \
>       cpuset.2 cpuset_setid.2
>  MLINKS+=cpuset_getaffinity.2 cpuset_setaffinity.2
> +MLINKS+=cpuset_getdomain.2 cpuset_setdomain.2
>  MLINKS+=dup.2 dup2.2
>  MLINKS+=execve.2 fexecve.2
>  MLINKS+=extattr_get_file.2 extattr.2 \
> 
> Modified: head/lib/libc/sys/cpuset.2
> ==============================================================================
> --- head/lib/libc/sys/cpuset.2        Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/lib/libc/sys/cpuset.2        Sat Mar 24 23:58:44 2018        (r331508)
> @@ -48,21 +48,21 @@
>  The
>  .Nm
>  family of system calls allow applications to control sets of processors and
> -assign processes and threads to these sets.
> -Processor sets contain lists of CPUs that members may run on and exist only
> -as long as some process is a member of the set.
> +memory domains and assign processes and threads to these sets.
> +Processor sets contain lists of CPUs and domains that members may run on
> +and exist only as long as some process is a member of the set.
>  All processes in the system have an assigned set.
>  The default set for all processes in the system is the set numbered 1.
>  Threads belong to the same set as the process which contains them,
>  however, they may further restrict their set with the anonymous
> -per-thread mask.
> +per-thread mask to bind to a specific CPU or subset of CPUs and memory domains.
>  .Pp
>  Sets are referenced by a number of type
>  .Ft cpuset_id_t .
>  Each thread has a root set, an assigned set, and an anonymous mask.
>  Only the root and assigned sets are numbered.
> -The root set is the set of all CPUs available in the system or in the
> -system partition the thread is running in.
> +The root set is the set of all CPUs and memory domains available in the system
> +or in the system partition the thread is running in.
>  The assigned set is a subset of the root set and is administratively
>  assignable on a per-process basis.
>  Many processes and threads may be members of a numbered set.
> @@ -72,7 +72,8 @@ set.
>  It is intended that administrators will manipulate numbered sets using
>  .Xr cpuset 1
>  while application developers will manipulate anonymous sets using
> -.Xr cpuset_setaffinity 2 .
> +.Xr cpuset_setaffinity 2 and
> +.Xr cpuset_setdomain 2 .
>  .Pp
>  To select the correct set a value of type
>  .Ft cpulevel_t
> @@ -175,9 +176,10 @@ with a process or thread is unsupported since
>  this references the unnumbered anonymous mask.
>  .Pp
>  The actual contents of the sets may be retrieved or manipulated using
> -.Xr cpuset_getaffinity 2
> -and
> -.Xr cpuset_setaffinity 2 .
> +.Xr cpuset_getaffinity 2 ,
> +.Xr cpuset_setaffinity 2 ,
> +.Xr cpuset_getdomain 2 , and
> +.Xr cpuset_setdomain 2 .
>  See those manual pages for more detail.
>  .Sh RETURN VALUES
>  .Rv -std
> @@ -220,6 +222,8 @@ for allocation.
>  .Xr cpuset 1 ,
>  .Xr cpuset_getaffinity 2 ,
>  .Xr cpuset_setaffinity 2 ,
> +.Xr cpuset_getdomain 2 ,
> +.Xr cpuset_setdomain 2 ,
>  .Xr pthread_affinity_np 3 ,
>  .Xr pthread_attr_affinity_np 3 ,
>  .Xr cpuset 9
> 
> Modified: head/lib/libc/sys/cpuset_getaffinity.2
> ==============================================================================
> --- head/lib/libc/sys/cpuset_getaffinity.2    Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/lib/libc/sys/cpuset_getaffinity.2    Sat Mar 24 23:58:44 2018        (r331508)
> @@ -160,6 +160,8 @@ See
>  .Xr cpuset 2 ,
>  .Xr cpuset_getid 2 ,
>  .Xr cpuset_setid 2 ,
> +.Xr cpuset_getdomain 2 ,
> +.Xr cpuset_setdomain 2 ,
>  .Xr pthread_affinity_np 3 ,
>  .Xr pthread_attr_affinity_np 3 ,
>  .Xr cpuset 9
> 
> Modified: head/share/man/man9/Makefile
> ==============================================================================
> --- head/share/man/man9/Makefile      Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/share/man/man9/Makefile      Sat Mar 24 23:58:44 2018        (r331508)
> @@ -1271,6 +1271,8 @@ MLINKS+=make_dev.9 destroy_dev.9 \
>       make_dev.9 make_dev_p.9 \
>       make_dev.9 make_dev_s.9
>  MLINKS+=malloc.9 free.9 \
> +     malloc.9 malloc_domain.9 \
> +     malloc.9 free_domain.9 \
>       malloc.9 mallocarray.9 \
>       malloc.9 MALLOC_DECLARE.9 \
>       malloc.9 MALLOC_DEFINE.9 \
> @@ -2213,10 +2215,12 @@ MLINKS+=vslock.9 vsunlock.9
>  MLINKS+=zone.9 uma.9 \
>       zone.9 uma_zalloc.9 \
>       zone.9 uma_zalloc_arg.9 \
> +     zone.9 uma_zalloc_domain.9 \
>       zone.9 uma_zcreate.9 \
>       zone.9 uma_zdestroy.9 \
>       zone.9 uma_zfree.9 \
>       zone.9 uma_zfree_arg.9 \
> +     zone.9 uma_zfree_domain.9 \
>       zone.9 uma_zone_get_cur.9 \
>       zone.9 uma_zone_get_max.9 \
>       zone.9 uma_zone_set_max.9 \
> 
> Modified: head/share/man/man9/malloc.9
> ==============================================================================
> --- head/share/man/man9/malloc.9      Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/share/man/man9/malloc.9      Sat Mar 24 23:58:44 2018        (r331508)
> @@ -46,9 +46,13 @@
>  .Ft void *
>  .Fn malloc "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void *
> +.Fn malloc_domain "size_t size" "struct malloc_type *type" "int domain" "int flags"
> +.Ft void *
>  .Fn mallocarray "size_t nmemb" "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void
>  .Fn free "void *addr" "struct malloc_type *type"
> +.Ft void
> +.Fn free_domain "void *addr" "struct malloc_type *type"
>  .Ft void *
>  .Fn realloc "void *addr" "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void *
> @@ -64,6 +68,14 @@ The
>  function allocates uninitialized memory in kernel address space for an
>  object whose size is specified by
>  .Fa size .
> +.Pp
> +The
> +.Fn malloc_domain
> +variant allocates the object from the specified memory domain.  Memory allocated
> +with this function should be returned with
> +.Fn free_domain .
> +See
> +.Xr numa 9 for more details.
>  .Pp
>  The
>  .Fn mallocarray
> 
> Modified: head/share/man/man9/zone.9
> ==============================================================================
> --- head/share/man/man9/zone.9        Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/share/man/man9/zone.9        Sat Mar 24 23:58:44 2018        (r331508)
> @@ -32,8 +32,10 @@
>  .Nm uma_zcreate ,
>  .Nm uma_zalloc ,
>  .Nm uma_zalloc_arg ,
> +.Nm uma_zalloc_domain ,
>  .Nm uma_zfree ,
>  .Nm uma_zfree_arg ,
> +.Nm uma_zfree_domain ,
>  .Nm uma_zdestroy ,
>  .Nm uma_zone_set_max ,
>  .Nm uma_zone_get_max ,
> @@ -55,11 +57,15 @@
>  .Fn uma_zalloc "uma_zone_t zone" "int flags"
>  .Ft "void *"
>  .Fn uma_zalloc_arg "uma_zone_t zone" "void *arg" "int flags"
> +.Ft "void *"
> +.Fn uma_zalloc_domain "uma_zone_t zone" "void *arg" "int domain" "int flags"
>  .Ft void
>  .Fn uma_zfree "uma_zone_t zone" "void *item"
>  .Ft void
>  .Fn uma_zfree_arg "uma_zone_t zone" "void *item" "void *arg"
>  .Ft void
> +.Fn uma_zfree_domain "uma_zone_t zone" "void *item" "void *arg"
> +.Ft void
>  .Fn uma_zdestroy "uma_zone_t zone"
>  .Ft int
>  .Fn uma_zone_set_max "uma_zone_t zone" "int nitems"
> @@ -78,10 +84,13 @@
>  .Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name access zone descr
>  .Sh DESCRIPTION
>  The zone allocator provides an efficient interface for managing
> -dynamically-sized collections of items of similar size.
> +dynamically-sized collections of items of identical size.
>  The zone allocator can work with preallocated zones as well as with
>  runtime-allocated ones, and is therefore available much earlier in the
> -boot process than other memory management routines.
> +boot process than other memory management routines.  The zone allocator
> +provides per-cpu allocation caches with linear scalability on SMP
> +systems as well as round-robin and first-touch policies for NUMA
> +systems.
>  .Pp
>  A zone is an extensible collection of items of identical size.
>  The zone allocator keeps track of which items are in use and which
> @@ -209,6 +218,11 @@ The zone is for the
>  subsystem.
>  .It Dv UMA_ZONE_VM
>  The zone is for the VM subsystem.
> +.It Dv UMA_ZONE_NUMA
> +The zone should use a first-touch NUMA policy rather than the round-robin
> +default. Callers that do not free memory on the same domain it is allocated
> +from will cause mixing in per-cpu caches.  See
> +.Xr numa 9 for more details.
>  .El
>  .Pp
>  To allocate an item from a zone, simply call
> @@ -243,12 +257,21 @@ The variations
>  .Fn uma_zalloc_arg
>  and
>  .Fn uma_zfree_arg
> -allow to
> +allow callers to
>  specify an argument for the
>  .Dv ctor
>  and
>  .Dv dtor
>  functions, respectively.
> +The 
> +.Fn uma_zalloc_domain
> +function allows callers to specify a fixed
> +.Xr numa 9 domain to allocate from.  This uses a guaranteed but slow path in
> +the allocator which reduces concurrency.  The 
> +.Fn uma_zfree_domain
> +function should be used to return memory allocated in this fashion.  This
> +function infers the domain from the pointer and does not require it as an
> +argument.
>  .Pp
>  Created zones,
>  which are empty,
> 
> Modified: head/usr.bin/cpuset/cpuset.1
> ==============================================================================
> --- head/usr.bin/cpuset/cpuset.1      Sat Mar 24 23:26:54 2018        (r331507)
> +++ head/usr.bin/cpuset/cpuset.1      Sat Mar 24 23:58:44 2018        (r331508)
> @@ -34,20 +34,24 @@
>  .Sh SYNOPSIS
>  .Nm
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list 
>  .Op Fl s Ar setid
>  .Ar cmd ...
>  .Nm
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list 
>  .Op Fl s Ar setid
>  .Fl p Ar pid
>  .Nm
>  .Op Fl c
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list 
>  .Fl C
>  .Fl p Ar pid
>  .Nm
>  .Op Fl c
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list 
>  .Op Fl j Ar jailid | Fl p Ar pid | Fl t Ar tid | Fl s Ar setid | Fl x Ar irq
>  .Nm
>  .Fl g
> @@ -57,8 +61,9 @@
>  The
>  .Nm
>  command can be used to assign processor sets to processes, run commands
> -constrained to a given set or list of processors, and query information
> -about processor binding, sets, and available processors in the system.
> +constrained to a given set or list of processors and memory domains, and query
> +information about processor binding, memory binding and policy, sets, and
> +available processors and memory domains in the system.
>  .Pp
>  .Nm
>  requires a target to modify or query.
> @@ -92,6 +97,15 @@ This last set is the list of all possible CPUs in the 
>  queried using
>  .Fl r .
>  .Pp
> +Most sets include NUMA memory domain and policy information.  This can be
> +inspected with
> +.Fl g
> +and set with
> +.Fl n .
> +This will specify which NUMA domains are visible to the process and
> +affect where anonymous memory and file pages will be stored on first access.
> +Files accessed first by other processes may specify conflicting policy.
> +.Pp
>  When running a command it may join a set specified with
>  .Fl s
>  otherwise a new set is created.
> @@ -110,7 +124,8 @@ Create a new cpuset and assign the target process to t
>  The requested operation should reference the cpuset available via the
>  target specifier.
>  .It Fl d Ar domain
> -Specifies a NUMA domain id as the target of the operation.
> +Specifies a NUMA domain id as the target of the operation.  This can only
> +be used to query the cpus visible in each numberd domain.
>  .It Fl g
>  Causes
>  .Nm
> @@ -130,6 +145,13 @@ numbers separated by '-' for ranges and commas separat
>  A special list of
>  .Dq all
>  may be specified in which case the list includes all CPUs from the root set.
> +.It Fl n Ar domain-list:policy
> +Specifies a list of domains and allocation policy to apply to a target.  Ranges
> +may be specified as in
> +.Fl l .
> +Valid policies include first-touch, ft, round-robin, rr, and prefer.  The prefer
> +policy accepts only a single domain in the set.  The parent of the set is
> +consulted if the preferred domain is unavailable.
>  .It Fl p Ar pid
>  Specifies a pid as the target of the operation.
>  .It Fl s Ar setid
> _______________________________________________
> svn-src-head@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-head
> To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


A buildkernel fails with:

[...]
--- all_subdir_lib/libc ---
make[4]: make[4]: don't know how to make cpuset_getdomain.2. Stop

make[4]: stopped in /usr/src/lib/libc


-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes
or for market or opinion research (§ 28 Abs. 4 BDSG).
