> Right now, Simics tells Solaris that all of the
> memory is on a single board, even though my add-on
> module to Simics actually carries out the timing of
> NUMA. The bottom line is that we currently model
> the timing of NUMA, however Solaris does not do any
> memory placement optimization because it thinks
> memory is all on one board.
>
> Thus in order to get memory placement optimization
> working, I believe I need to bring up a newer version
> of Solaris, and to possibly get Simics to properly
> tell Solaris about NUMA hardware.

Right.

> I posted the same question in the code group.
> Apparently for SPARC, the platform-specific files
> statically define the lgroups, and that many/most of
> the platform-specific files are *not* included with
> OpenSolaris.

They are included.  The SPARC lgrp platform code lives in

        usr/src/uts/sun4/os/lgrpplat.c

You'll notice that many of the routines call a platform-specific version
(plat_lgrp_* vs lgrp_plat_*), and those functions live in various platform
modules (platmods).  For example, the routine associating CPUs with lgroups
on Serengeti-class systems lives in

        usr/src/uts/sun4u/serengeti/os/serengeti.c

/*
 * Return the platform handle for the lgroup containing the given CPU
 *
 * For Serengeti, lgroup platform handle == board number
 */
lgrp_handle_t
plat_lgrp_cpu_to_hand(processorid_t id)
{
        return (CPUID_TO_BOARD(id));
}

memnodes (expanses of physical memory) are also associated with lgroups.
That association is made in lgrpplat.c as well, although the code which
populates the mnode <-> lgrp platform handle lookup arrays lives in the
platmods (for SPARC systems).

So changing these routines such that they return something other than
LGRP_DEFAULT_HANDLE is at least the basis for doing what you want.  The
devil is always in the details though, so if you get stuck let us know.

> Thus it seems that maybe my best solution is to add a
> mechanism (system-call or something) so that I can
> manually define lgroups before running my database
> workload. OR, I can go with the Opteron approach and
> have Solaris manually probe the memory system at boot
> time in order to figure out the lgroups dynamically.
> Doing so might be easiest because it is easy for me
> to affect the timing of Simics, but it isn't so
> easy to make Simics return the right information to
> Solaris (static hardware platform).

Nah, it "should" be pretty easy. :)

To start, I would suggest just hardcoding what the lgrp_plat_* routines
return to see if you can get more than one lgroup created (rough sketch in
the P.S. below).  The lgrp observability tools (found in the perf
community) should prove helpful here.

This might also be a good time to mention that we've been talking for some
time about (re)examining the common/architecture/platform NUMA interfaces
to see if they can be refactored in a way that's more modular.  Enabling
MPO support for new architectures/platforms should be easier than it
is...and it would be nice to say: "just implement your platform's version
of the 4 functions defined in foo.h to enable/configure MPO"...
Adding/configuring MPO support by dropping in a new loadable kernel module
would be useful too.

Thanks,
-Eric

> Thanks for the interest and response. Would welcome
> any ideas.

Sure, any time.

-Eric
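
P.S. A minimal sketch of the kind of hardcoding suggested above: change the
platform routine shown in the serengeti.c snippet so it reports more than
one board.  It reuses the same lgrp_handle_t/processorid_t types; the
"four CPUs per board" split is made up for illustration and would have to
match however Simics lays out the simulated boards.

/*
 * Illustrative only: pretend every four consecutive CPU ids live on
 * their own board, so that more than one lgroup gets created at boot.
 */
lgrp_handle_t
plat_lgrp_cpu_to_hand(processorid_t id)
{
        return ((lgrp_handle_t)(id / 4));
}

With something along those lines in place, the lgrp observability tools
mentioned above should start showing more than one leaf lgroup.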
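
On the memory side, here is a similarly hand-wavy sketch of what a platmod
might do to tie each memnode to its board's lgroup handle.  The function
and array names are hypothetical stand-ins for the sun4 lookup-array code
mentioned above, not the real OpenSolaris identifiers, and it assumes one
memnode per board.

#define SIM_NBOARDS     2       /* however many boards Simics models */

/* hypothetical stand-ins for the mnode <-> lgrp handle lookup arrays */
static lgrp_handle_t    sim_mnode_to_lgrphand[SIM_NBOARDS];
static int              sim_lgrphand_to_mnode[SIM_NBOARDS];

/*
 * Hypothetical platmod hook: with one memnode per board, memnode i
 * simply maps to board (lgroup platform handle) i, and vice versa.
 */
void
sim_plat_fill_mnode_lgrp_map(void)
{
        int mnode;

        for (mnode = 0; mnode < SIM_NBOARDS; mnode++) {
                sim_mnode_to_lgrphand[mnode] = (lgrp_handle_t)mnode;
                sim_lgrphand_to_mnode[mnode] = mnode;
        }
}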