A quick look at the source reveals:

A MAXMAGIS constant in file.h that estimates a limit of 1000 lines in
magic.  (The real number is 4802.)

An array sized on MAXMAGIS, which is reallocated every ALLOC_INCR (20)
lines of magic once MAXMAGIS is exceeded.

The patch updates MAXMAGIS to 5000 (to give a bit of room to grow) and
replaces ALLOC_INCR with a variable, alloc_incr, that starts larger (256)
and doubles every time it is used, to attenuate the problem if magic
ever ends up with 10000 entries.

Results on a 90MHz Pentium:

new version:
time ./file ./file
./file: FreeBSD/i386 compact demand paged dynamically linked executable not stripped
        0.14 real         0.11 user         0.02 sys

old version:
./file: FreeBSD/i386 compact demand paged dynamically linked executable not stripped
        0.79 real         0.60 user         0.16 sys

--
Peter.

Peter Jeremy wrote:
>
> Ville-Pertti Keinonen <[EMAIL PROTECTED]> wrote:
> >[EMAIL PROTECTED] (Peter Jeremy) writes:
> >> I can't believe these figures.
>
> Based on the figures below, maybe I was overly hasty in this statement.
> The changes between 2.x and 3.x magic files have far more impact than
> I would have expected.
>
> >What are your results, then?
>
> All timings with everything cached (although the 386 only has 8MB
> which limits the cacheability).  For the 2.2.5 systems, I give timings
> with both the 2.2.5 magic and the 4.0 magic (which is the same as
> 3.2-RELEASE, in /tmp).
>
> i386SX-25 running 2.2.5 (roughly as posted earlier):
> % /usr/bin/time file src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         2.82 real         1.92 user         0.84 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         4.05 real         2.67 user         1.23 sys
>
> 486DX2-50 running 2.2.5:
> % /usr/bin/time file src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         1.43 real         0.96 user         0.38 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         2.15 real         1.62 user         0.44 sys
>
> PII-266 running 4.0-CURRENT:
> % /usr/bin/time file src/Z/dhcp-1.4.0p6.tar.gz
> src/Z/dhcp-1.4.0p6.tar.gz: gzip compressed data, deflated, last modified: Wed Mar  3 20:57:52 1999, os: Unix
>         0.13 real         0.09 user         0.03 sys
>
> When I profile file in a slow system (like a 386 or 486), there is an
> obvious performance bottleneck: The problem is the memcpy() invoked
> from fgets().  The only solution would seem to be to mmap() magic
> and parse it, rather than using fgets() to read it.  This bottleneck
> will also be far more obvious on bandwidth-starved systems (like
> 386SX and 486DX2/4), whereas virtually the whole thing fits into the
> L2 cache on my P-II.
>
> Peter
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-hackers" in the body of the message
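The quoted message suggests mmap()ing magic and parsing it in place instead of copying each line through fgets(). A minimal sketch of that idea follows, assuming a POSIX system. The function name for_each_line_mmap and the line_fn callback type are made up for illustration and are not part of file(1). Whether this would actually beat fgets() on a 386 or 486 has not been measured here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical callback: receives one line in place (NOT NUL-terminated). */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/*
 * Map `path` read-only and hand each line to `fn` without copying it.
 * Returns the number of lines seen, or -1 on error.
 */
long for_each_line_mmap(const char *path, line_fn fn, void *ctx)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length maps */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    long lines = 0;
    const char *p = base;
    const char *end = base + size;
    while (p < end) {
        /* Find the end of the current line inside the mapping. */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) : (size_t)(end - p);

        if (fn != NULL)
            fn(p, len, ctx);
        lines++;
        p += len + (nl ? 1 : 0);    /* skip the newline, if there was one */
    }

    munmap(base, size);
    return lines;
}
```

A parser built on this would have to work with (pointer, length) pairs rather than NUL-terminated strings, because the mapping is read-only and the lines are not terminated.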
Common subdirectories: file/Magdir and file.new/Magdir
diff -c file/apprentice.c file.new/apprentice.c
*** file/apprentice.c   Wed Jan 28 07:36:21 1998
--- file.new/apprentice.c       Wed Jul 21 12:35:21 1999
***************
*** 50,55 ****
--- 50,56 ----
  static void eatsize   __P((char **));
  
  static int maxmagic = 0;
+ static int alloc_incr = 256;
  
  static int apprentice_1       __P((char *, int));
  
***************
*** 180,188 ****
        struct magic *m;
        char *t, *s;
  
- #define ALLOC_INCR    20
        if (nd+1 >= maxmagic){
!           maxmagic += ALLOC_INCR;
            if ((magic = (struct magic *) realloc(magic,
                                                  sizeof(struct magic) *
                                                  maxmagic)) == NULL) {
--- 181,188 ----
        struct magic *m;
        char *t, *s;
  
        if (nd+1 >= maxmagic){
!           maxmagic += alloc_incr;
            if ((magic = (struct magic *) realloc(magic,
                                                  sizeof(struct magic) *
                                                  maxmagic)) == NULL) {
***************
*** 192,198 ****
                else
                        exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * ALLOC_INCR);
        }
        m = &magic[*ndx];
        m->flag = 0;
--- 192,199 ----
                else
                        exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * alloc_incr);
!           alloc_incr *= 2;
        }
        m = &magic[*ndx];
        m->flag = 0;
diff -c file/file.h file.new/file.h
*** file/file.h Wed Jul 21 12:37:00 1999
--- file.new/file.h     Wed Jul 21 12:35:40 1999
***************
*** 35,41 ****
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 1000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  
--- 35,41 ----
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 5000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  
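
To see why the doubling increment matters, here is a small standalone sketch that counts realloc() calls under the two growth policies in the patch: a fixed step of 20, and a step that starts at 256 and doubles. It is not the code from apprentice.c. The struct entry and struct table types and the function names are invented for illustration, the table starts empty rather than at MAXMAGIS, and unlike the patch it zeroes exactly the newly added tail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct magic; only its size matters here. */
struct entry { char payload[64]; };

struct table {
    struct entry *items;
    size_t used;        /* entries in use */
    size_t capacity;    /* entries allocated */
    size_t incr;        /* size of the next growth step */
    unsigned reallocs;  /* realloc() calls made so far */
};

/* Make room for one more entry. Returns 0 on success, -1 on failure. */
int table_reserve(struct table *t, int doubling)
{
    if (t->used + 1 >= t->capacity) {
        size_t new_cap = t->capacity + t->incr;
        struct entry *p = realloc(t->items, new_cap * sizeof *p);

        if (p == NULL)
            return -1;
        /* Zero only the entries that were just added. */
        memset(p + t->capacity, 0, (new_cap - t->capacity) * sizeof *p);
        t->items = p;
        t->capacity = new_cap;
        t->reallocs++;
        if (doubling)
            t->incr *= 2;   /* next step is twice as large */
    }
    return 0;
}

/*
 * Append n entries and return how many realloc() calls were needed.
 * Returns 0 if an allocation fails.
 */
unsigned count_reallocs(size_t n, size_t first_incr, int doubling)
{
    struct table t = { NULL, 0, 0, first_incr, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        if (table_reserve(&t, doubling) != 0) {
            free(t.items);
            return 0;
        }
        t.used++;
    }
    free(t.items);
    return t.reallocs;
}
```

Calling count_reallocs(4802, 20, 0) and count_reallocs(4802, 256, 1) compares the two policies for the 4802 magic lines mentioned above. In this sketch the fixed step needs a few hundred realloc() calls, while the doubling step needs only a handful.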