On 21 Jan 2011, at 04:29, Sad Clouds wrote:
> On Thu, 20 Jan 2011 11:59:03 +0800
> Dennis Ferguson <dennis.c.fergu...@gmail.com> wrote:
>
>> Hello,
>>
>> Is there a way to obtain the correct cache line size for the machine
>> code is running on, both in the kernel and at user level? I see
>> there is a compile time constant CACHE_LINE_SIZE in <sys/param.h>
>> which currently seems to be always be set to 64, but I'm pretty
>> certain that is not necessarily a correct value. For example, I'm
>> pretty sure that for the PowerPC 32 and 128 are possibilities, and
>> the same binary could run on machines with either line size so there
>> is no correct compile-time answer.
>
> You probably won't find many processors with cache lines greater than
> 64 bytes. If you're optimising for a particular processor, read
> technical manuals to find out the size of cache lines, then simply
> define CACHE_LINE_SIZE or whatever compile time constant you're using
> to a different value.
I'm not sure about other processors, but I think all 64-bit PowerPC processors have 128-byte data cache lines; the G5 certainly does. Other models have 32- or 64-byte data cache lines, so the same (32-bit) binary could run on machines with any of those line sizes, or on a uniprocessor, where the best thing to do (at least when the concern is false sharing in threaded programs) is to ignore cache lines altogether. I don't see a good reason to optimize a program for just one of those cases when, if you could obtain the right number at run time (which the operating system should know), you could do the best thing for all of them.

The fact is, though, that the issue of false sharing is for the most part architecture and processor independent. Data caches on modern (and even old) multiprocessors all work pretty much the same way, have exactly the same issues with sharing, and have their performance improved by exactly the same considerations. With a single cache miss costing many hundreds of instructions, moving data which could make good use of the cache off of cache lines which are, by program design, bound to be frequently invalidated can easily make a significant difference. If you can obtain the right number at run time for the machine you are running on, it is usually fairly simple to write code which always does the right thing.

Dennis Ferguson
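To make the run-time approach concrete, here is a minimal user-level sketch. It is only an illustration under stated assumptions: _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension to sysconf() and I don't know of an equivalent name on every system, so the code tests for it at compile time and otherwise falls back to 64, which is just a guess. The names cache_line_size(), round_up() and alloc_slots() are made up for this example.

```c
#define _POSIX_C_SOURCE 200112L	/* for posix_memalign() */

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumption: a guess used only when the system cannot tell us. */
#define FALLBACK_LINE_SIZE 64

/* Return the L1 data cache line size in bytes, or the fallback guess. */
static size_t
cache_line_size(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE	/* glibc extension, not portable */
	long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

	if (n > 0)		/* some systems report 0 or -1 */
		return (size_t)n;
#endif
	return FALLBACK_LINE_SIZE;
}

/* Round n up to a multiple of line; line must be a power of two. */
static size_t
round_up(size_t n, size_t line)
{
	assert(line != 0 && (line & (line - 1)) == 0);
	return (n + line - 1) & ~(line - 1);
}

/*
 * Allocate nslots objects of slot_size bytes so that each one starts on
 * its own cache line.  The distance between consecutive slots is stored
 * in *stride.  Returns NULL on failure; release the memory with free().
 */
static void *
alloc_slots(size_t nslots, size_t slot_size, size_t *stride)
{
	size_t line = cache_line_size();
	void *p = NULL;

	*stride = round_up(slot_size, line);
	if (posix_memalign(&p, line, nslots * *stride) != 0)
		return NULL;
	return p;
}
```

A threaded program would call alloc_slots() once at startup and give thread i the slot at (char *)base + i * stride, so that frequently written per-thread data never shares a line with another thread's data. On a uniprocessor the same code could use a stride of slot_size instead.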