On Wed, 29 Mar 2006, Blaisorblade wrote:

On Tuesday 28 March 2006 01:42, David Lang wrote:
On Mon, 27 Mar 2006, David Lang wrote:
I foolishly attempted to start up 25 uml instances on one system (dual 252
opterons with 8G of ram, each uml instance getting 256M)

what I found was that they seem to be getting in each others way a LOT
(just on system boot), vmstat on the host is showing almost all of the
cpu time (80%+) being spent in the system, not in userspace (which
surprised me)

so before I spend much time gathering info to try and debug this I wanted
to ask what the current limits are, and if the limits should just be cpu
and ram, then I'll do more digging to find out what's happening in my
case.

well, I reduced the count to 19 instances, and upped the ram on each one
to 400M (they were hitting oom with only 256m each)

almost an hour later the machines still haven't finished booting with top
looking basically the same for the last half hour or so.

top - 16:42:18 up 4 days, 23:11, 25 users,  load average: 16.79, 16.65, 16.23
Tasks: 44193 total,  14 running, 139 sleeping, 44040 stopped,   0 zombie
Cpu0 :  2.3% us, 97.7% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1 :  3.2% us, 96.7% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.1% si
Mem:   8186088k total,  8145620k used,    40468k free,    11436k buffers
Swap:  2048276k total,        0k used,  2048276k free,  4959636k cached

so far it looks to me like ram is OK, but the high system percentage looks
strange to me. the system closest to finishing its boot has used a little
over 10 min of cpu time (>5x the normal wall clock time for the boot) so
I am running into contention at some point here.

I know that it's maybe a bad workaround, but what about sequential startup
both of UMLs and of the jobs inside them?

I'll try it for a test and let you know how it works
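
A minimal sketch of the sequential startup idea, assuming a fixed pause between launches is enough to keep the boots from overlapping. The function name, the delay value and the example command line are illustrative assumptions, not something taken from this thread.

```python
import subprocess
import time


def start_sequentially(commands, delay_s=60.0,
                       launch=subprocess.Popen, sleep=time.sleep):
    """Launch each command in order, pausing delay_s seconds between launches.

    launch and sleep are parameters so the pacing can be exercised
    without actually starting any UML instance.
    """
    started = []
    for index, cmd in enumerate(commands):
        if index > 0:
            # Give the previous instance time to get through its boot.
            sleep(delay_s)
        started.append(launch(cmd))
    return started


# Hypothetical usage: the binary path, root filesystem names and the
# 60 second delay are placeholders to adjust for the real setup.
#
# cmds = [["./linux", f"ubd0=root_fs_{n}", "mem=400M", f"umid=uml{n}"]
#         for n in range(19)]
# start_sequentially(cmds, delay_s=60.0)
```

A fixed delay is the crudest option; waiting until each instance reports that it has finished booting would be more robust, but needs some signal from inside the guest.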

I'd run "vmstat 1" to watch for increase of context switches - an excessive
amount of them is likely to burn you out.
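
For logging the same number from a script instead of watching vmstat, here is a rough sketch that samples the cumulative "ctxt" counter in /proc/stat on the host and turns two samples into a per-second rate. The function names and the one second interval are assumptions made for illustration.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")


def switches_per_second(before, after, interval_s):
    """Convert two cumulative counts into a rate over interval_s seconds."""
    return (after - before) / interval_s


def sample_rate(interval_s=1.0, path="/proc/stat"):
    """Read the counter twice, interval_s apart, and return the rate."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, interval_s)
```

Calling sample_rate() in a loop while the instances boot, and comparing against the rate on the idle host, should show whether context switching grows with the number of instances.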

I'll check for this, but this would surprise me. inside the UMLs the only thing that is actively running is heartbeat (linux-ha.org). even with a dozen copies running (one per uml) this should only generate a small amount of traffic (18 udp packets sent per second to the broadcast addresses for all 12 boxes combined)

David Lang


_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
