I looked at the CPU sampling data to try to confirm what is going on. The profile data (captured by zoom, using perf) shows ~90% of the samples hitting in various invocations of expr. The tool won't summarize call stacks across processes so I manually looked at a bunch of them. I looked at samples in 15 invocations of expr. Each invocation runs for less than a ms and the sampling frequency is one KHz so no individual process got more than one sample. The samples I saw were:
- page_fault in module vmlinux - page_fault-> flush_tlb_page in module vmlinux - do_lookup_x in ld-2.15.so - perf_event__output_id_sample in vmlinux - page_fault-> do_page_fault in vmlinux - do_lookup_x in ld-2.15.so - stub_execve-> load_elf_binary-> exec_mmap in vmlinux - page_fault-> handle_pte_fault-> copy_page_c in vmlinux - page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux - system_call_fastpath-> sys_mmap-> mmap_region in vmlinux - page_fault-> do_page_fault-> handle_mm_fault in vmlinux - <Unknown Symbol> in unknown module - page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux - <Unknown Symbol> in unknown module - page_fault-> do_page_fault-> up_read in vmlinux To summarize, the zoom profiler says that ~90% of CPU time when running ExprCount() is spent in expr. Of that time it says that more than half of it is spent in page_fault or mmap. That sounds consistent with the claim that it is process startup that is consuming the missing CPU time that 'time' fails to account for. I see no indications in any of the data of a significant amount of CPU time being spent in the OS scheduler. Neither the context switch data nor the sampling data supports that conclusion. I could be wrong of course but I think my conclusion makes sense, and it is supported by significant evidence. FWIW. -----Original Message----- From: Bruce Dawson [mailto:brucedaw...@cygnus-software.com] Sent: Wednesday, March 20, 2013 7:44 AM To: 'Linda Walsh' Cc: 'bug-bash' Subject: RE: Bug/limitation in 'time' (kernel setings?)... Note that the ~2ms/process create is when the power management fails to get invoked. It drops to ~0.6ms/process create when the CPU ramps up to full speed. The chart you list from twelve years ago shows 6.0 ms to spawn a new process, so my machine is about ten times faster -- seems about right. Cloning a process is not comparable because in my test the kernel has to load and exec() expr, not just fork() bash. > Right there, it shows you that the scheduler is taking almost 33% of your time. I don't follow your logic. How do you know that it is the scheduler? I can see that when you run your schedtool script it shows 15% to 34.4% of CPU time consumed. I say that the rest is process startup, you say scheduler. I'm not sure what evidence points to the scheduler, and I'm not sure what math you used to come up with the 33% figure. You'll have to elucidate. > Then your kernel is misconfigured for your workload. What? My kernel is executing the tasks fine. When my CPU runs at full speed it is slightly faster than yours, and when it runs at 1/3 speed it is slower than yours. In both cases 'time' reports that the CPU is not pegged when it is. Our kernels are behaving quite similarly. > What documentation would you have updated? The bash documentation where it mentions the 'time' command, and the 'time' documentation. -----Original Message----- From: Linda Walsh [mailto:b...@tlinx.org] Sent: Wednesday, March 20, 2013 1:12 AM To: Bruce Dawson Cc: bug-bash Subject: Re: Bug/limitation in 'time' (kernel setings?)... Bruce Dawson wrote: > The reply below is way too long, but oh well. > > The summary is: > I am 100% certain that my ExprCount() loop is CPU bound. ---- Your stats claim 530 iterations/second when spawning expr. That's ~2ms/process create. In an article from 14 years ago on a 200Mhz Pentium, Linux process creates were ~1ms. http://www.linuxjournal.com/article/2941 (table @ http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/029 /2941/2941f1.jpg) If you believe that a modern processor running at 1.6GHz, takes twice as long as a 14 year old pentium, then there is little likelihood that you will be convinced by any logical argument. You need to rebuild your kernel with the config option turned on -- in fact, you need to build your own kernel if you want to do any sort of kernel benchmarks. > You seem convinced that the task is not CPU bound on my machine, but > I'm convinced that it is. --- Then your kernel is misconfigured for your workload. You said: > I am 100% certain that my ExprCount() loop is CPU bound. It is > unfortunate that time gives inaccurate information about this. It > would be nice if the documentation acknowledged that in order to save > future generations from confusion. ---- What documentation would you have updated? Here is a perl program: #!/usr/bin/perl die "need count" unless @ARGV; my $count=$ARGV[0]; my $counter=$count; while ($counter-->0) { my $pid; if (my $pid=fork() ) { waitpid($pid,0); }else { exec "/bin/true"; # use 'exit(0)' to just time forks w/no image load } } ----------------------------- > /usr/bin/time /tmp/sp.pl 10000 time schedtool -a 3 -e /tmp/sp.pl 10000 6.83sec 0.05usr 0.97sys (15.05% cpu) --- But if I don't confine it to 1 cpu: > time /tmp/sp.pl 10000 10.54sec 0.20usr 2.59sys (26.52% cpu) Just forks: > time /tmp/sp.pl 10000 7.23sec 0.12usr 2.36sys (34.35% cpu) > time /tmp/sp.pl 10000 7.21sec 0.12usr 2.35sys (34.39% cpu) --- 1cpu forks: > time schedtool -a 3 -e /tmp/sp.pl 10000 4.32sec 0.06usr 0.95sys (23.69% cpu) Ishtar:law> time schedtool -a 3 -e /tmp/sp.pl 10000 4.29sec 0.05usr 0.96sys (23.80% cpu) ----------------- Right there, it shows you that the scheduler is taking almost 33% of your time. That may be cpu, but it is not cpu that your program is using. That's why I say your accounting is off. >From these figures, I would say it's closer to 1000/second