On Nov 10, 2011, at 23:40, Benjamin Herrenschmidt wrote:
> On Thu, 2011-11-10 at 18:38 -0600, Moffett, Kyle D wrote:
>> (2) Make the ppc64_caches struct apply to ppc32 as well, and
>> preinitialize it with a minimum value used by any platform being
>> compiled in (for "dcbXX"/"icbXX" purposes). This is safe because
>> the pagesize is always a multiple of the cache block size and the
>> kernel only uses dcbXX/icbXX on whole pages. The only impact is a
>> temporary small performance hit from flushing or zeroing the same
>> block 8 times if too small.
>
> Are you sure about dcbz ? Getting that wrong can be deadly ... I'd
> rather get rid of some fancy optims and use a soft value in some cases.
> That or we can compile multiple variants for the common case of some of
> the copy routines and use patching (alternate sections) to branch to the
> right one at runtime, at least for the common cases (32 and 128 for
> example for 440 and 476).
Well, all of the kernel loops that use dcbz are operating on whole pages,
and the PPC Book-E spec documents that the pagesize is an even multiple
of the cacheline size and that cachelines are always page-aligned.

So when you are clearing a whole page, there are only 2 things you can
do wrong with "dcbz":

(1) Call "dcbz" with an address outside of the page you want to zero.

(2) Omit calls to "dcbz" for some physical cachelines in the page.

Now, that's a totally different story from the userspace memset() calls
that caused the problem originally, because they were frequently given
memory much smaller than a page to clear, and if you didn't know exactly
how many bytes a "dcbz" was going to clear you couldn't use it at all.

But the kernel doesn't do that anywhere; it just uses dcbz for page
clears.

Cheers,
Kyle Moffett

--
Curious about my work on the Debian powerpcspe port?
I'm keeping a blog here: http://pureperl.blogspot.com/
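
The whole-page argument above can be sketched as a small user-space
simulation in C. This is not kernel code: sim_dcbz, clear_page_min_stride,
page_cleared_safely and SIM_PAGE_SIZE are hypothetical names, the 4096-byte
page and the block sizes are assumed values, and "dcbz" is modeled as a
memset of the aligned block that contains the given offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Model of dcbz: zero the whole hardware cache block containing `offset`.
 * `hw_block` is the real block size (a power of two), which the caller
 * of the clear loop may not know exactly. */
void sim_dcbz(unsigned char *page, size_t offset, size_t hw_block)
{
    size_t start = offset & ~(hw_block - 1);   /* align down to the block */
    memset(page + start, 0, hw_block);
}

/* Clear one page, stepping by `assumed_block` instead of the real block
 * size. Returns the number of simulated dcbz operations issued. */
size_t clear_page_min_stride(unsigned char *page, size_t assumed_block,
                             size_t hw_block)
{
    size_t count = 0;

    /* The property the argument relies on: page size is a multiple of
     * the real block size, so no block straddles the page boundary. */
    assert(SIM_PAGE_SIZE % hw_block == 0);

    for (size_t off = 0; off < SIM_PAGE_SIZE; off += assumed_block) {
        sim_dcbz(page, off, hw_block);
        count++;
    }
    return count;
}

/* Returns 1 if the page ends up fully zeroed and the guard pages on both
 * sides are untouched, 0 otherwise. */
int page_cleared_safely(size_t assumed_block, size_t hw_block)
{
    static unsigned char buf[3 * SIM_PAGE_SIZE];
    unsigned char *page = buf + SIM_PAGE_SIZE;

    memset(buf, 0xAA, sizeof buf);
    clear_page_min_stride(page, assumed_block, hw_block);

    for (size_t i = 0; i < SIM_PAGE_SIZE; i++) {
        if (buf[i] != 0xAA)                  /* wrote below the page: mistake (1) */
            return 0;
        if (page[i] != 0)                    /* skipped a block: mistake (2) */
            return 0;
        if (page[SIM_PAGE_SIZE + i] != 0xAA) /* wrote above the page: mistake (1) */
            return 0;
    }
    return 1;
}
```

Under these assumptions, page_cleared_safely(16, 128) returns 1: a stride
smaller than the real block only zeroes each 128-byte block 8 times over.
page_cleared_safely(128, 32) returns 0, because a stride larger than the
real block skips cachelines, which is why the preinitialized value has to
be the minimum across the platforms compiled in.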