On Wed, Aug 15, 2007 at 04:29:54PM -0400, Jim Mauro wrote:
>
> What would be interesting here is the paging statistics during your test
> case. What does "vmstat 1" and "vmstat -p 1" look like while you're
> generating this behavior?
>
> Is it really the case that reading/writing from swap is slow, or simply
> that the system on the whole is slow because it's dealing with a
> sustained memory deficit?
It tends to look something like this:

$ vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap     free    re mf   pi  po fr de sr cd  cd  m1 m1   in   sy   cs us sy id
 0 0 0  19625708 3285120  1   4    1  1  1  0  6   1   1  11 11  455  260  247  4  0 96
 0 1 70 16001276  645628  2  27 3428  0  0  0  0 442 447  0  0  3012  516 1982 97  3  0
 0 1 70 16001276  642208  0   0 3489  0  0  0  0 437 432  0  0  3074  381 2002 97  3  0
 0 1 70 16001276  638964  0   0 3343  0  0  0  0 417 417  0  0  2997  350 1914 98  2  0
 0 1 70 16001276  635504  0   0 3442  0  0  0  0 430 434  0  0  3067  536 2016 97  3  0
 0 1 70 16001276  632076  0   0 3434  0  0  0  0 429 425  0  0  3164  885 2125 97  3  0
 0 1 70 16001276  628548  0   0 3549  0  0  0  0 445 445  0  0  3185  582 2105 97  3  0
 0 1 70 16001276  625104  0   0 3459  0  0  0  0 463 469  0  0  3376  594 2100 97  3  0

$ vmstat -p 1
     memory           page          executable      anonymous      filesystem
   swap     free    re  mf  fr  de  sr  epi epo epf  api apo apf  fpi  fpo fpf
 19625616 3285052   1   4   1   0   6    0   0   0    0   0   0     1   0   1
 16001244  440392  21  31   0   0   0    0   0   0    0   0   0  2911   0   0
 16001244  437120  21   0   0   0   0    0   0   0    0   0   0  3188   0   0
 16001244  433592  14   0   0   0   0    0   0   0    0   0   0  3588   0   0
 16001244  429732  28   0   0   0   0    0   0   0    0   0   0  3712   0   0
 16001244  426036  18   0   0   0   0    0   0   0    0   0   0  3679   0   0
 16001244  422448   2   0   0   0   0    0   0   0    0   0   0  3468   0   0
 16001244  418980   5   0   0   0   0    0   0   0    0   0   0  3435   0   0
 16001244  416012   8   0   0   0   0    0   0   0    0   0   0  2855   0   0
 16001244  412648   8   0   0   0   0    0   0   0    0   0   0  3256   0   0
 16001244  409292  31   0   0   0   0    0   0   0    0   0   0  3426   0   0
 16001244  405760  10   0   0   0   0    0   0   0    0   0   0  3602   0   0

> Also, I'd like to understand better what you're looking to optimize for.
> In general, "tuning" for swap is a pointless exercise (and it's not my
> contention that that is what you're looking to do - I'm not actually sure),
> because the IO performance of the swap device is really a second order
> effect of having a memory working set size larger than physical RAM, which
> means the kernel spends a lot of time doing memory management things.

I think we're trying to make our use of swap have as little impact as
possible. With multiple large Java processes needing to run in as little
time as possible, and with business demands that make it impossible to keep
the total RSS below physical memory 100% of the time, we want to minimize
the impact of page-ins.

> The poor behavior of swap may really be just a symptom of other activities
> related to memory management.

Possibly.

> What kind of machine is this, and what does CPU utilization look like
> when you're inducing this behavior?

These are a variety of systems: IBM 360, Sun v20z, and x4100 (we have M1s
and M2s; I personally have only tested on M1 systems). This behavior seems
consistent on all of them.

The program we're using to pin memory is this:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char** argv) {
    if (argc != 2) {
        printf("Bad args\n");
        return 1;
    }

    const int count = atoi(argv[1]);
    if (count <= 3) {
        printf("Bad count: %s\n", argv[1]);
        return 1;
    }

    // Malloc
    const int nints = count >> 2;
    int* buf = (int*)malloc(count);
    if (buf == NULL) {
        perror("Failed to malloc");
        return 1;
    }

    // Init
    for (int i = 0; i < nints; i++) {
        buf[i] = rand();
    }

    // Maintain working set
    for (;;) {
        for (int i = 0; i < nints; i++) {
            buf[i]++;
        }
        //sleep(1);
    }

    return 0;
}

Nothing too complex.
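For reproducing it, the program can be built and run along these lines (the
file and binary names and the 1 GB size below are just illustrative, not the
exact commands from our runs; note that the size is parsed with atoi() into
an int, so a single instance can pin a little under 2 GB at most):

$ gcc -std=c99 -o memhog memhog.c
$ ./memhog 1073741824    # touch roughly 1 GB of anonymous memory (size in bytes)

The idea is simply to push the total resident set past physical RAM so the
system is forced to page.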
Reads and writes to /tmp and /var/tmp in our tests were all done with dd.

I am following up with Sun support on this, but in the meantime I am curious
whether you or anyone else out there sees the same behavior?

Thanks,
-Peter

--
The 5 year plan: In five years we'll make up another plan.
Or just re-use this one.

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org