[EMAIL PROTECTED] wrote:
> >Offtopic: Which still reminds me to write an email about that the
> >Solaris kernel is very very malloc()-happy (which is unnecessary in many
> >cases now that C99 semantics are allowed).
>
> You are wrong.

Thank you... ;-(( See below...

> We *malloc* not because of shortcomings in C but because
> we don't have any room on the stack to put stuff.

I know that. I have already done some research on this, and the only
reason I haven't finished the prototype patch is that I have to work
for food on other, non-Solaris-related projects. That leaves me only
1-2 hours per day for the interesting bits, and those are currently
spent on the ksh93-integration project. The research turned up
multiple options, including:

- Increase the stack size where possible, for example by using 64k
  pages on SPARC (excluding UltraSPARC-3 and older versions of
  SPARC64), maybe limited to 64-bit threads (this has several
  complicated reasons).
- Create a C macro |#define KMEM_TMP_ALLOC()| which expands to the
  following procedure:
  1. Measure the stack size and the space currently available on the
     stack. The first 512 bytes of an 8k default stack are available
     to allocations via C99 constructs; a 64k default stack would
     offer much more room. This value is a tunable, and setting it to
     |0| disables the stack allocations. If this fails, the size will
     simply be set to |0|.
  2. If [1] couldn't allocate space, a special "temporary space"
     allocator is called which fetches memory from per-process
     preallocated 4M pages. These are allocated from processor-local
     memory and split into 8 slices (where |8| is a tunable, too) with
     separate mutexes. The slice index is simply a hash over the
     thread id, which means that the same thread usually asks for the
     same slice. Temporary space is usually allocated in a linear
     manner (with exceptions per flags passed to |KMEM_TMP_ALLOC()|),
     so a different allocator algorithm will be used here
     (controllable via flags and stealing some of the ideas in AmigaOS
     (no flamewar, please) :-) ). Additional benefits are that
     accesses are (usually) from local memory and that, by default,
     4M pages are used (reducing the tax on the TLB).
  3. If [2] fails (for example no space left in per-processor temp.
     memory space, or the request size exceeds the maximum slice
     size, or flags tell us to skip [2]), we fall back to the normal
     allocator. And no, I don't want to add a 3rd-level "scratchspace"
     like the one used in SUPER-UX (erm, OK... NUMA machines may
     benefit from that... but then I'd better rely on the normal
     kernel allocator to deal with that...).

I know the normal allocator is fast and that it is heavily optimized
for scalability. But it's even better if such calls, or even any
function calls for temporary memory, could be avoided. Note that many
of the allocations are far below 128 bytes; sometimes the number of
bytes shuffled around by the function calls & co. is much bigger than
the allocation itself, which makes this a little bit ridiculous...
;-(

----

Bye,
Roland

--
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
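
P.S. To make the three-step fallback described above a bit more
concrete, here is a rough userspace sketch in C. It is not the
prototype patch and not kernel code. Every name in it (|tmp_pool|,
|tmp_alloc()|, |tmp_stack_budget|, ...) is made up for illustration,
the "hash over the thread id" is assumed to be a plain modulo, the
per-slice mutexes are left out, and the flags are not modelled at all.
The caller hands in its own stack scratch buffer, which stands in for
the C99 stack allocation of step [1].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model only. Every name here is hypothetical, and the
 * per-slice mutexes from step 2 are left out (single-threaded sketch). */

#define TMP_NSLICES    8u                       /* tunable */
#define TMP_REGION     (4u * 1024u * 1024u)     /* one 4M page */
#define TMP_SLICE_SIZE (TMP_REGION / TMP_NSLICES)

enum tmp_tier { TMP_TIER_STACK, TMP_TIER_SLICE, TMP_TIER_HEAP };

struct tmp_slice { unsigned char *base; size_t used; };
struct tmp_pool  { unsigned char *region; struct tmp_slice slices[TMP_NSLICES]; };
struct tmp_buf   { void *ptr; enum tmp_tier tier; };

/* Stack budget in bytes; setting it to 0 disables tier 1. */
size_t tmp_stack_budget = 512;

/* Carve the preallocated region into equally sized slices. */
int tmp_pool_init(struct tmp_pool *p)
{
    assert(p != NULL);
    p->region = malloc(TMP_REGION);
    if (p->region == NULL)
        return -1;
    for (size_t i = 0; i < TMP_NSLICES; i++) {
        p->slices[i].base = p->region + i * TMP_SLICE_SIZE;
        p->slices[i].used = 0;
    }
    return 0;
}

/* Tier 2: linear (bump) allocation from the slice picked by thread id. */
void *tmp_slice_alloc(struct tmp_pool *p, uintptr_t tid, size_t size)
{
    struct tmp_slice *s = &p->slices[tid % TMP_NSLICES];
    size_t need = (size + 15u) & ~(size_t)15u;  /* round up to 16 bytes */

    if (need < size || need > TMP_SLICE_SIZE - s->used)
        return NULL;                            /* overflow or slice full */
    void *out = s->base + s->used;
    s->used += need;
    return out;
}

/* Try stack scratch space, then the slice, then the normal allocator. */
struct tmp_buf tmp_alloc(struct tmp_pool *p, uintptr_t tid, size_t size,
                         void *stack_buf, size_t stack_avail)
{
    struct tmp_buf b = { NULL, TMP_TIER_HEAP };

    if (tmp_stack_budget != 0 && size <= tmp_stack_budget &&
        size <= stack_avail) {
        b.ptr = stack_buf;
        b.tier = TMP_TIER_STACK;
    } else if ((b.ptr = tmp_slice_alloc(p, tid, size)) != NULL) {
        b.tier = TMP_TIER_SLICE;
    } else {
        b.ptr = malloc(size);                   /* may still be NULL */
    }
    return b;
}

/* Only heap memory is released here; reclaiming slice space is not modelled. */
void tmp_free(struct tmp_buf b)
{
    if (b.tier == TMP_TIER_HEAP)
        free(b.ptr);
}
```

A caller would declare a small local array as the scratch buffer and
pass it in together with its size. That is also why the real thing
would have to be a macro rather than a function: the stack space has
to live in the caller's frame, not in the allocator's.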