On Mon, Jan 17, 2005 at 05:52:04PM +0900, GOTO Masanori wrote:
> > > > Yes, and if ev67 is instruction upper compatible with ev56 (I
> > > > guess so), I think it's acceptable to add a symlink "ln -sf
> > > > lib/ev67/libfoo.so lib/ev56/libfoo.so".
> > >
> > > Ugh... that pushes the burden of maintaining support for new
> > > architectures to the package.
>
> Yeah - I think it's trade off - whether we support library
> optimization package or we don't get a bit performance improvement.

So, you are trading maintenance cost for a rather subjective speed
improvement?  Or should I say, preventing some performance degradation?
Keep reading.

> > > Please bear with me, but I'm trying to understand the issue: is
> > > the cost of calling access(2) or stat(2) really so high?
> >
> > I'd consider it quite acceptable in this case. However, as I tried
> > to express, it's not possible with glibc's current "design", and I
> > didn't feel like changing that.
>
> Note that we should keep in mind: imagine most binaries on all debian
> system over the world start to consume access(2)/stat(2) system call
> cost in each binary execution time - "Many a little makes a mickle".

Ok, I stopped buying this kind of argument long ago.  There's a SIGGRAPH
paper (2001 IIRC) which justifies a certain kind of rather complex
optimization because a (graphics) context switch is "too expensive",
without clearly defining the situation that triggers the context switch.
In my own testing, context switches of the kind described in that paper
are at least a factor of 100 _faster_ than what the authors claim.

Attached is a program that measures the time a single stat(2) call
takes.  I get circa 5 microseconds per stat(2) call on my computer (AMD
Athlon 1600+, can't recall what kind of memory it has right now).  Note
that the code that isn't directly related to the stat(2) call has rather
low overhead (circa 1 ns on my system).

What that means is that you need to make about 2000 stat(2) calls
(2000 x 5 microseconds = 10 ms) to get _anywhere_ near what's measurable
by a human, and about 20000 (100 ms) to start getting said human
annoyed.  If a biggish GNOME program (Epiphany Browser) links to 60
libraries, you need to perform a lookup in ~30 paths for the start up
delay to be measurable and ~300 for it to be annoying.  ls(1) links to 6
libraries.  That's one order of magnitude less, IOW, you need a path
with ~3000 components for it to start being annoying.

So, what exactly are you talking about?

> > > I see for example that on start up the file /etc/ld.so.nohwcap is
> > > accessed multiple times (and it's not present, isn't that a race?
> > > what happens if the file suddenly appears in the middle of
> > > program start up? what's that file anyway, I can't find it
> > > mentioned in the documentation).
> >
> > It's supposed to disable the use of hwcaps. Stating it multiple
> > times seems like a bug.

The contents do not matter?

> Debian glibc has been applied a special patch to check
> /etc/ld.so.nohwcap before loading libraries each time. You can see
> it in debian-glibc package ldso-disable-hwcap.dpatch written by Ben
> and Daniel. It enables us to upgrade smoothly even if we use
> optimized libraries - this effort is one of debian's nice features.
> But the drawback is it needs to pay access(2) lookup cost as you
> pointed out.
>
> Checking /etc/ld.so.nohwcap each time (some binaries call multiple
> times) is the current patch design

Why?  I just can't see a valid reason for "wanting" the file to suddenly
pop up while the program is running.

> I think this is safer than checking /etc/ld.so.nohwcap once in
> program startup time.

Safer in what way?

Again, I just don't buy that "system calls are too expensive" argument.
Anyone writing shell scripts cares about a whole lot of things *but*
performance.  And I'm not talking about increasing running time by a
factor of anything, I'm talking about adding a bunch of microseconds,
which get lost in the middle of filesystem stalls, page faults and other
rather common events.

Marcelo
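
#include <cmath>
#include <cstdio>
#include <ctime>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    const int N = 6;

    // Start from the file name "000000" (N digits, NUL-terminated).
    char name[N+1];
    for(int i=0; i < N; ++i)
        name[i] = '0';
    name[N] = 0;

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);

    // Issue 10^N stat(2) calls, one per N-digit name.
    for(int i=0; i < N;) {
        struct stat buf;
        stat(name, &buf);

        // Advance the name like a decimal odometer.  i reaches N only
        // when every digit has wrapped around, which ends the outer loop.
        for(i=0; i != N && ++name[i] == '9'+1; ++i)
            name[i]='0';
    }

    gettimeofday(&t1, NULL);

    // Elapsed wall-clock time in seconds, then the average per call.
    float dt = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec)*1E-6;
    printf("%g\n", dt/powf(10, N));
    return 0;
}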