A quick look at the source reveals:

A MAXMAGIS constant in file.h that assumes a limit of 1000 lines in
magic. (The real number is 4802.)

An array sized on MAXMAGIS, which is reallocated every ALLOC_INCR lines
of magic once MAXMAGIS is exceeded.

The patch raises MAXMAGIS to 5000 (giving a bit of room to grow)
and replaces ALLOC_INCR with a variable that starts larger and doubles
every time it is used, to attenuate the problem if magic ever ends up
with 10000 entries.
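
To make the growth strategy concrete, here is a minimal standalone sketch
of the same idea. It is not the code from apprentice.c: struct entry,
struct table, table_init() and table_reserve() are names made up for
illustration, and unlike file(1) the table starts empty rather than at
MAXMAGIS slots.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct magic; only its size matters here. */
struct entry {
    char desc[50];
};

/* Growable table: 'cap' slots are allocated, 'incr' is the next growth step. */
struct table {
    struct entry *items;
    size_t cap;
    size_t incr;
    unsigned reallocs;   /* number of realloc() calls, for illustration only */
};

/* Start with no storage; the first growth step adds 'first_incr' slots. */
static void table_init(struct table *t, size_t first_incr)
{
    assert(t != NULL && first_incr > 0);
    t->items = NULL;
    t->cap = 0;
    t->incr = first_incr;
    t->reallocs = 0;
}

/*
 * Make sure slot 'idx' exists, growing the array if needed.
 * Returns 0 on success, -1 if realloc() fails (the old array stays valid).
 */
static int table_reserve(struct table *t, size_t idx)
{
    while (idx >= t->cap) {
        size_t newcap = t->cap + t->incr;
        struct entry *p = realloc(t->items, newcap * sizeof *p);
        if (p == NULL)
            return -1;
        /* Zero only the slots that were just added. */
        memset(p + t->cap, 0, t->incr * sizeof *p);
        t->items = p;
        t->cap = newcap;
        t->incr *= 2;    /* double the step so later growth gets rarer */
        t->reallocs++;
    }
    return 0;
}
```

Starting from an empty table with a step of 256, reserving 4802 slots
takes 5 calls to realloc(). With the old scheme (1000 slots up front,
then a fixed step of 20), the same magic file needs on the order of 190.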

Results on a 90MHz Pentium:

new version:

time ./file ./file
./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
        0.14 real         0.11 user         0.02 sys

old version:

./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
        0.79 real         0.60 user         0.16 sys




--
Peter.



Peter Jeremy wrote:
> 
> Ville-Pertti Keinonen <w...@iki.fi> wrote:
> >jere...@gsmx07.alcatel.com.au (Peter Jeremy) writes:
> >> I can't believe these figures.
> 
> Based on the figures below, maybe I was overly hasty in this statement.
> The changes between 2.x and 3.x magic files have far more impact than
> I would have expected.
> 
> >What are your results, then?
> 
> All timings with everything cached (although the 386 only has 8MB
> which limits the cacheability).  For the 2.2.5 systems, I give timings
> with both the 2.2.5 magic and the 4.0 magic (which is the same as
> 3.2-RELEASE, in /tmp).
> 
> i386SX-25 running 2.2.5 (roughly as posted earlier):
> % /usr/bin/time file src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: 
> Thu Jan  1 10:00:00 1970, os: Unix
>         2.82 real         1.92 user         0.84 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: 
> Thu Jan  1 10:00:00 1970, os: Unix
>         4.05 real         2.67 user         1.23 sys
> 
> 486DX2-50 running 2.2.5:
> % /usr/bin/time file src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last 
> modified: Thu Jan  1 10:00:00 1970, os: Unix
>         1.43 real         0.96 user         0.38 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last 
> modified: Thu Jan  1 10:00:00 1970, os: Unix
>         2.15 real         1.62 user         0.44 sys
> 
> PII-266 running 4.0-CURRENT:
> % /usr/bin/time file src/Z/dhcp-1.4.0p6.tar.gz
> src/Z/dhcp-1.4.0p6.tar.gz: gzip compressed data, deflated, last modified: Wed 
> Mar  3 20:57:52 1999, os: Unix
>         0.13 real         0.09 user         0.03 sys
> 
> When I profile file in a slow system (like a 386 or 486), there is an
> obvious performance bottleneck:  The problem is the memcpy() invoked
> from fgets().  The only solution would seem to be to mmap() magic
> and parse it, rather than using fgets() to read it.  This bottleneck
> will also be far more obvious on bandwidth-starved systems (like
> 386SX and 486DX2/4), whereas virtually the whole thing fits into the
> L2 cache on my P-II.
> 
> Peter
> 
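
For what it's worth, the mmap() idea from the quoted message could look
roughly like the sketch below. This is not what file(1) does and it is
not part of the attached patch. for_each_line_mmap() and count_entry()
are hypothetical names, and treating blank lines and '#' lines as
non-entries is a simplification of the magic format.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Called once per line; 'line' points into the mapping and is NOT NUL-terminated. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/*
 * Map 'path' read-only and call 'fn' for each line, without copying
 * the data into a separate buffer the way fgets() does.
 * Returns the number of lines seen, or -1 on error.
 */
static long for_each_line_mmap(const char *path, line_fn fn, void *ctx)
{
    struct stat st;
    long count = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    char *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (base == MAP_FAILED)
        return -1;

    const char *p = base;
    const char *end = base + st.st_size;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) : (size_t)(end - p);
        fn(p, len, ctx);
        count++;
        p += len + (nl ? 1 : 0);   /* step over the newline, if there was one */
    }

    munmap(base, (size_t)st.st_size);
    return count;
}

/* Example callback: count lines that are neither blank nor '#' comments. */
static void count_entry(const char *line, size_t len, void *ctx)
{
    if (len > 0 && line[0] != '#')
        ++*(long *)ctx;
}
```

A caller would pass the path to magic and a callback that parses each
line in place. Whether this actually beats fgets() on a 386 or 486 would
still have to be measured.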
Common subdirectories: file/Magdir and file.new/Magdir
diff -c file/apprentice.c file.new/apprentice.c
*** file/apprentice.c   Wed Jan 28 07:36:21 1998
--- file.new/apprentice.c       Wed Jul 21 12:35:21 1999
***************
*** 50,55 ****
--- 50,56 ----
  static void eatsize   __P((char **));
  
  static int maxmagic = 0;
+ static int alloc_incr = 256;
  
  static int apprentice_1       __P((char *, int));
  
***************
*** 180,188 ****
        struct magic *m;
        char *t, *s;
  
- #define ALLOC_INCR    20
        if (nd+1 >= maxmagic){
!           maxmagic += ALLOC_INCR;
            if ((magic = (struct magic *) realloc(magic, 
                                                  sizeof(struct magic) * 
                                                  maxmagic)) == NULL) {
--- 181,188 ----
        struct magic *m;
        char *t, *s;
  
        if (nd+1 >= maxmagic){
!           maxmagic += alloc_incr;
            if ((magic = (struct magic *) realloc(magic, 
                                                  sizeof(struct magic) * 
                                                  maxmagic)) == NULL) {
***************
*** 192,198 ****
                else
                        exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * ALLOC_INCR);
        }
        m = &magic[*ndx];
        m->flag = 0;
--- 192,199 ----
                else
                        exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * alloc_incr);
!           alloc_incr *= 2;
        }
        m = &magic[*ndx];
        m->flag = 0;
diff -c file/file.h file.new/file.h
*** file/file.h Wed Jul 21 12:37:00 1999
--- file.new/file.h     Wed Jul 21 12:35:40 1999
***************
*** 35,41 ****
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 1000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  
--- 35,41 ----
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 5000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  
