
Robert Watson Sun, 19 Apr 2009 16:50:39 -0700

On Mon, 20 Apr 2009, Ivan Voras wrote:

2009/4/20 Robert Watson <rwat...@freebsd.org>:
On Sun, 19 Apr 2009, Robert Watson wrote:
Now that the kernel defines CACHE_LINE_SIZE in machine/param.h, use that definition in the custom locking code for the run-time linker rather than local definitions.
This actually changes the line size used by the rtld code for pre-pthreads locking for several architectures. I think this is an improvement, but if architecture maintainers could comment on that, that would be helpful.
Will there be infrastructure for creating per-CPU structures or is using something like:
int mycounter[MAXCPU] __attribute__ ((aligned(CACHE_LINE_SIZE)));

For now, yes, something along these lines. I have a local prototype I'm using that has an API something like this:


 *     // Definitions
 *     struct foostat  *foostatp;
 *     void            *foostat_psp;
 *
 *     // Module load
 *     if (pcpustat_alloc(&foostat_psp, "foostat", sizeof(struct foostat),
 *         sizeof(u_long)) != 0)
 *             panic("foostat_init: pcpustat_alloc failed");
 *     foostatp = pcpustat_getptr(foostat_psp);
 *
 *     // Use the pointer for a statistic
 *     foostatp[curcpu].fs_counter1++;
 *
 *     // Retrieve summary statistics and store in a single instance
 *     struct foostat fs;
 *     pcpustat_fetch(foostat_psp, &fs);
 *
 *     // Reset summary statistics.
 *     pcpustat_reset(foostat_psp);
 *
 *     // Module unload
 *     pcpustat_free(foostat_psp);
 *     foostatp = foostat_psp = NULL;

The problem with the [curcpu] model is that it embeds the assumption that it's a good idea to have per-CPU fields in adjacent cache lines within a page. As the world rocks gently in the direction of NUMA, there's a legitimate question as to whether that's a good assumption to build in. It's a better assumption than the assumption that it's a good idea to use a single stat across all CPUs in a single cache line, of course. Depending on how we feel about the overhead of accessor interfaces, all this can be hidden easily enough.
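
To illustrate the "one cache line per CPU, hidden behind accessors" layout described above, here is a minimal userland sketch. It is not the prototype's code: every ex_/EX_ name is hypothetical, and the line size and CPU count are assumed values (in the kernel they would come from machine/param.h).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define EX_CACHE_LINE_SIZE 64
#define EX_MAXCPU          4

/* One counter per CPU, padded so that each slot fills a whole cache line. */
struct ex_pcpu_counter {
	unsigned long value;
	char          pad[EX_CACHE_LINE_SIZE - sizeof(unsigned long)];
};

static_assert(sizeof(struct ex_pcpu_counter) == EX_CACHE_LINE_SIZE,
    "each per-CPU slot must occupy exactly one cache line");

/* The array itself starts on a cache-line boundary. */
static _Alignas(EX_CACHE_LINE_SIZE) struct ex_pcpu_counter ex_counters[EX_MAXCPU];

/* Accessor: callers never index the array directly, so the layout can change. */
static void
ex_counter_inc(int cpu)
{
	ex_counters[cpu].value++;
}

/* Sum the per-CPU slots into a single value. */
static unsigned long
ex_counter_fetch(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < EX_MAXCPU; cpu++)
		sum += ex_counters[cpu].value;
	return (sum);
}
```

The padding matters because an aligned attribute on an array declaration aligns only the start of the array, not each element. Because callers go through ex_counter_inc() and ex_counter_fetch(), the slots could later be moved to per-domain memory without touching them.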

A facility I'd like to have would be an API to allocate memory on all CPUs at once, with the memory on each CPU at a constant offset from a per-CPU base address. That way you could calculate the location of the per-CPU structure using PCPU_GET(dynbase) + foostat_offset without adding an additional indirection.
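
The base-plus-offset idea can be sketched in userland roughly as follows. This is only an assumption about what such a facility might look like: the ex_ names, the region size, and the bump allocator are all hypothetical, and ex_dynbase[cpu] merely stands in for PCPU_GET(dynbase).

```c
#include <stddef.h>
#include <stdlib.h>

#define EX_NCPU        4	/* assumed CPU count */
#define EX_REGION_SIZE 4096	/* assumed size of each per-CPU region */

/* Hypothetical per-CPU base addresses (stand-in for PCPU_GET(dynbase)). */
static unsigned char *ex_dynbase[EX_NCPU];
static size_t ex_next_offset;

/* Allocate one zeroed region per CPU; returns 0 on success, -1 on failure. */
static int
ex_pcpu_init(void)
{
	for (int cpu = 0; cpu < EX_NCPU; cpu++) {
		ex_dynbase[cpu] = calloc(1, EX_REGION_SIZE);
		if (ex_dynbase[cpu] == NULL)
			return (-1);
	}
	return (0);
}

/*
 * Reserve size bytes at the same offset in every CPU's region.
 * align must be a power of two. Returns (size_t)-1 if the region is full.
 */
static size_t
ex_pcpu_alloc(size_t size, size_t align)
{
	size_t off = (ex_next_offset + align - 1) & ~(align - 1);

	if (off > EX_REGION_SIZE || size > EX_REGION_SIZE - off)
		return ((size_t)-1);
	ex_next_offset = off + size;
	return (off);
}

/* Locate a CPU's copy: base + offset, with no extra pointer to chase. */
static void *
ex_pcpu_ptr(int cpu, size_t offset)
{
	return (ex_dynbase[cpu] + offset);
}
```

A caller would store the offset once at load time and then compute ex_pcpu_ptr(cpu, offset) on each access. In a real kernel the per-CPU regions would presumably be allocated from memory local to each CPU, which is what makes the scheme attractive for NUMA.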


Robert N M Watson
Computer Laboratory
University of Cambridge

_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r191291 - in head: lib/libthr/thread libexec/rtld-elf/amd64 libexec/rtld-elf/arm libexec/rtld-elf/i386 libexec/rtld-elf/ia64 libexec/rtld-elf/mips libexec/rtld-elf/powerpc libexec/rtld-...
