John Regehr wrote:
> No need to use me as an excuse to vent your feelings about
> microbenchmarks vs. good benchmarks. I'm showing how to use a
> user-space instrumented application to measure scheduling behavior, not
> trying to make any claims about the relative merits of the operating
> systems in realistic conditions. These can and should be separate
> activities.
I was only trying to point out why you were seeing the behaviour; you
would have gotten the same answer on your own, eventually, from the
context switch measurement suggestions that you received in answer to
your question. My comments on the relative merits of the microbenchmarks
had more to do with pointing out that the Linux optimization is a special
case. I included my idealized benchmarking process because it would
expose the problems I described with the Linux approach, with regard to
thread group affinity.

> Maybe I should include a few comments to this effect in the paper, in
> order to forestall reactions like yours? The last thing I want is to
> get into some sort of Linux vs. FreeBSD thing. Maybe I can prevent some
> of this by telling people that I used to hack the Windows 2000
> scheduler! :)

Heh. The problem is that it's not clear what the graphs you posted are
comparing. In the context of the paper, this will probably be mitigated
somewhat. However, there are a lot of people who will turn directly to
the graphs in any paper, and yell about them, so I doubt you are safe,
no matter what you do.

It would be useful, I think, to indicate in the paper what the benchmarks
*aren't* measuring, in addition to what they *are*, so that people don't
misapply them. I'm not saying that they aren't figures of merit, only
that the scope of the merit is not well defined.

> If it would help draw the flames now, while I can still do something
> about it (paper is due around Apr 15), I'd be happy to post a pointer to
> my paper.

I think that would help, but if you are going to publish and present,
you probably want to limit distribution. 8-(.

> Technical comments follow.
>
> > Because you are attempting a comparative benchmark, I would
> > suspect that you are probably running a significantly
> > quiescent system (just the benchmark itself, and the code
> > being benchmarked, running). I expect that you have not
> > stopped "cron" (which runs once a second), nor have you
> > stopped other "system processes" which will end up in the
> > scheduling queue,
>
> No, I haven't stopped these activities. However, I'm only measuring the
> times for context switches between threads in my test application, so
> the things you mention are not throwing off the numbers. How is this
> accomplished? When other apps get to run during the test, this shows up
> as large gaps in the CPU time seen by my application, and these are
> thrown out as outliers -- they don't influence the statistics. The test
> is actually quite robust in the sense that a fair amount of background
> activity doesn't throw off the numbers, but in this case more care has
> to be taken to throw out the outliers in a sensible way.

What I mentioned is specific to the code in /sys/i386/i386/swtch.s. The
interesting statistics are the ones guarded by SWTCH_OPTIM_STATS. The
particular counter that you need to be looking at is _tlb_flush_count.

If you look at the Linux code, when they move from one thread in a
process to another thread in the same process (they are really in the
same process), you will see that they do not reload CR3, and therefore
don't engage in any TLB flushing. FreeBSD does TLB flushing in the idle
loop (this is arguably a stupid thing for it to do, since the last
process to run may be the next process rescheduled... indeed, your test
is an example of where this would be the case).
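To make that concrete, here is a minimal sketch of the optimization I'm
describing; it is not the actual Linux or FreeBSD code, and the structure
names and the load_page_directory() helper are purely illustrative. The
point is that a reload of %cr3 (which flushes the non-global TLB entries
on i386) can be skipped whenever the outgoing and incoming tasks share
the same address space, which is exactly the case for two LinuxThreads
threads of one process:

/*
 * Illustrative sketch only -- not the real kernel code.  It shows the
 * optimization under discussion: the %cr3 reload (and therefore the
 * TLB flush) is skipped when the previous and next tasks share one
 * address space.
 */
struct mm;                              /* opaque address-space descriptor */

struct task {
        struct mm       *mm;            /* address space the task runs in */
        /* ... saved registers, kernel stack, etc. ... */
};

/*
 * Hypothetical helper: load a new page directory base into %cr3.
 * On i386 this implicitly flushes the non-global TLB entries.
 */
extern void load_page_directory(struct mm *mm);

static unsigned long tlb_flush_count;   /* analogous to _tlb_flush_count */

void
switch_address_space(struct task *prev, struct task *next)
{
        if (prev->mm == next->mm) {
                /*
                 * Same address space (e.g. two threads of the same
                 * process): leave %cr3 alone, TLB contents stay valid.
                 */
                return;
        }
        load_page_directory(next->mm);  /* %cr3 reload => TLB flush */
        tlb_flush_count++;
}

If your benchmark only ever switches among the ten threads of a single
process, a _tlb_flush_count that climbs with every switch is a sign that
the shared-address-space case is not being short-circuited on that path.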
It also does an unnecessary reload of the address space, when moving
from one VM to another, even though they are the same because the RFMEM
flag was set on their creation. Check out the "p_vmspace" references in
i386/i386/pmap.c, i386/i386/trap.c, and i386/i386/vm_machdep.c.

I don't know if you are using USER_LDT in your compiled FreeBSD kernels,
or if you have done any other tuning of the FreeBSD kernel to make it
perform worse (or better) than GENERIC, as shipped, but USER_LDT will
seriously drop performance as well.

> I'm not running on an SMP either.

This is actually better for Linux, since the patches for per-CPU
schedulers didn't go in until 2.5.2.

> > Right now, you are comparing apples and oranges.
>
> Sure, if:
>
> apples == expected time to execute the Linux context switch code when
> switching between two Linuxthreads, when the system load consists of 10
> CPU-bound threads and very little other activity
>
> oranges == expected time to execute the FreeBSD context switch code when
> switching between two Linuxthreads, when the system load consists of 10
> CPU-bound threads and very little other activity

apples == newer version of Linux.

oranges == maintenance release of FreeBSD that was never supposed to
happen, because 5.0 was supposed to be out by now.

You are testing specifically for something that was intentionally left
unoptimized.

> Thanks for the detailed answer,

No problem. Good luck with the paper!

I always appreciate real academic work; if you had been Joe Schmoe
making the same observation, I would have considered you a troll, and
not even bothered to answer. 8-). As it is, I thought that it was more
important to ensure that you were operating on correct assumptions about
the code, and about what you were really seeing, rather than about what
people were posting telling you what you might be seeing.

-- Terry