Hello,

I use my x64 Amilo laptop as a web server, workstation and development machine all at once. It has 3 GB of memory and 8 GB of ZFS swap space.
After less than one week of uptime it ran out of physical memory and kept allocating more and more by going ever deeper into swap. This time it was not the Xorg server that consumed most of the memory (only about 180 MB), and Firefox 3.x took only 522 MB. I had to stop VirtualBox because the whole system became sluggish once no free physical memory was left, so nothing like VirtualBox, Xen or qemu was running anymore. I also closed all gedit windows and tabs and even restarted Xorg. After that restart no graphical application was running except Gnome itself and one gnome-terminal window with just 8 tabs, in which I had ssh sessions to a few other systems.

So what consumed all the memory??? /usr/bin/top showed me that one of the 5 wget sessions I had started a week ago (for fetching opensolaris.org, sdlc.com/osol, genunix.org and a few Linux mirrors) had grown to 2 GB!!! One day later it was at 2.2 GB. So I killed that wget pid and started the same wget (same dir / website) again. The four other wget processes were between "only" 122 MB and about 250 MB.

One day later they had grown further, but here comes the real shocker: one top process was at 1340 MB!!! I mean, OK: I have got used to small daemons like the network-auto-magic manager consuming 86 MB, and to small, almost useless Gnome applets consuming around a hundred MB each (wnck-applet 104 MB, clock-applet 80 MB, mixer-applet2 91 MB, trashapplet 87 MB, gnome-panel 114 MB, etc.). Remember that each of those amounts would have been considered sufficient for a whole server machine until just a few years ago. But can this be? Is this normal? How can top consume 1340 MB? I killed pid 25420 and ... voila: 1.3 GB of mem/swap was freed. So it was not some mis-reporting by the other top process.

Performance-analysis experts: something seems to be wrong, even though you have a rich set of the most sophisticated test suites.
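
For anyone who wants to reproduce the measurement, here is a minimal sketch of how the growth of a single process could be logged over time instead of being read off top by hand. It is only an illustration under some assumptions: Python 3.7 or later is installed, and ps accepts "-o rss,vsz -p PID" and reports both values in kilobytes. The function names (parse_ps_output, sample, growth_kb_per_hour, watch) and the sampling interval are made up for this example and are not part of any tool mentioned above.

```python
import subprocess
import time


def parse_ps_output(text):
    """Return (rss_kb, vsz_kb) from the output of `ps -o rss,vsz -p PID`.

    The first line is the header ("RSS VSZ"); the last line holds the values.
    """
    rows = [line.split() for line in text.strip().splitlines()]
    rss, vsz = rows[-1]
    return int(rss), int(vsz)


def sample(pid):
    """Ask ps for the current resident and virtual size of one process."""
    result = subprocess.run(
        ["ps", "-o", "rss,vsz", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_output(result.stdout)


def growth_kb_per_hour(samples):
    """Average growth between the first and last (seconds, kb) sample."""
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples taken at different times")
    return (m1 - m0) * 3600.0 / (t1 - t0)


def watch(pid, interval_s=600, count=6):
    """Sample a process `count` times and print its average RSS growth."""
    samples = []
    for _ in range(count):
        rss_kb, vsz_kb = sample(pid)
        samples.append((time.time(), rss_kb))
        print(f"pid {pid}: rss={rss_kb} KB vsz={vsz_kb} KB")
        time.sleep(interval_s)
    print(f"average RSS growth: {growth_kb_per_hour(samples):.0f} KB/hour")
```

Calling watch(25420), for example, would take one sample every ten minutes for an hour. A process whose resident size climbs steadily while it is doing nothing new is at least a candidate for a leak, although this alone does not prove one.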
Even plain df or top shows an end user that something significant *is* wrong. How can that be? What will you tell a customer who has a support contract and is running a supported SXCE (with a Solaris Express service plan; I don't know if those still exist) or a supported Indiana?

Solaris has tended to be memory-hungry and sluggish for quite some time now (especially since Solaris 10, and even more so with 11). But in this case it must be a memory leak and therefore a severe bug. I'm not a dtrace expert (I'm normally more interested in other things) and leave the analysis that must follow to others. But I hereby fulfil my duty as an opensolaris.org (and, in a wider sense, Solaris) community member to report this problem. Shall I file this message as a bug?

I noticed it under this config (which happens to be my primary box):

bash-3.2$ psrinfo -v
Status of virtual processor 0 as of: 02/12/2009 03:09:19
  on-line since 02/03/2009 15:31:17.
  The i386 processor operates at 2000 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 02/12/2009 03:09:19
  on-line since 02/03/2009 15:31:23.
  The i386 processor operates at 2000 MHz,
        and has an i387 compatible floating point processor.
bash-3.2$ prtdiag -v
System Configuration: FUJITSU SIEMENS AMILO Notebook Pa 3515
BIOS Configuration: Phoenix Technologies LTD V1.13 10/06/2008

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
Athlon X2 QL-62                  Socket S1G2

==== Memory Device Sockets ================================

Type        Status Set Device Locator      Bank Locator
----------- ------ --- ------------------- ----------------
DDR2        in use 1   S1                  DIMM1
DDR2        in use 2   S2                  DIMM2

==== On-Board Devices =====================================
ATI RS690M
ESS 1869

==== Upgradeable Slots ====================================

ID  Status    Type             Description
--- --------- ---------------- ----------------------------
11  available PCI              MINI PCI

bash-3.2$ uname -a
SunOS unknown 5.11 snv_105 i86pc i386 i86pc
bash-3.2$

This is a full all-clusters plus OEM install with POSIX-C as the system locale and a mostly default config, except for Xorg, which I upgraded to a self-built version of the fox-gate from February 1st (hence server 1.5.3 with libpciaccess).

Please find two screenshots with top output, one taken before I killed pid 25420 and one afterwards. Note that the wget processes run in the background and log their output into text files via

wget -m -p -k xxx > logfile.lod 2>&1 &

http://natamar.org/content/bugs/Screenshot-12.png (image/png) 237K
http://natamar.org/content/bugs/Screenshot-14.png (image/png) 239K

Normally I would have attached them for the records/archives, but the limit is just 40K.

Regards,
- Martin Bochnig
Sun Contributor Agreement number OS0335
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org