A quick look at the source reveals:
A MAXMAGIS constant in file.h that assumes a limit of 1000 lines in
magic. (The real number is 4802.)
An array sized on MAXMAGIS that is reallocated every ALLOC_INCR (20)
lines of magic once MAXMAGIS is exceeded.
The patch raises MAXMAGIS to 5000 (to give a bit of room to grow)
and replaces ALLOC_INCR with a variable, alloc_incr, that starts larger
(256) and doubles every time it is used, to attenuate the problem if
magic ever ends up with 10000 entries.
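To put a rough number on the difference, here is a minimal sketch (not code from file itself) that simulates both growth policies using the same `nd+1 >= maxmagic` test as apprentice.c. The function count_grows and its parameters are names made up for this illustration, and it assumes the table starts out with MAXMAGIS slots as described above.

```c
#include <stddef.h>

/*
 * Count how many times a table has to be grown to hold `entries` items
 * when it starts with `initial` slots and grows by `incr` slots at a time.
 * If `doubling` is nonzero, the increment doubles after each growth step
 * (the patched behaviour); otherwise it stays fixed (the old behaviour).
 * Hypothetical helper, for illustration only.
 */
static int count_grows(int entries, int initial, int incr, int doubling)
{
    int capacity = initial;
    int grows = 0;
    int nd;

    for (nd = 0; nd < entries; nd++) {
        if (nd + 1 >= capacity) {   /* same test as in apprentice.c */
            capacity += incr;       /* this is where realloc() would run */
            if (doubling)
                incr *= 2;
            grows++;
        }
    }
    return grows;
}

/*
 * count_grows(4802, 1000, 20, 0)   -> 191  (old code, current magic)
 * count_grows(4802, 5000, 256, 1)  -> 0    (patched, current magic)
 * count_grows(10000, 1000, 20, 0)  -> 451  (old code, 10000 entries)
 * count_grows(10000, 5000, 256, 1) -> 5    (patched, 10000 entries)
 */
```

Each growth step is a realloc() that may have to copy the whole table, so the old code can end up copying the table up to 191 times for the current magic file, while the patched code never grows it at all.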
Results on a 90MHz Pentium:
New version:
time ./file ./file
./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
0.14 real 0.11 user 0.02 sys
Old version:
./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
0.79 real 0.60 user 0.16 sys
--
Peter.
Peter Jeremy wrote:
>
> Ville-Pertti Keinonen <w...@iki.fi> wrote:
> >jere...@gsmx07.alcatel.com.au (Peter Jeremy) writes:
> >> I can't believe these figures.
>
> Based on the figures below, maybe I was overly hasty in this statement.
> The changes between 2.x and 3.x magic files have far more impact than
> I would have expected.
>
> >What are your results, then?
>
> All timings with everything cached (although the 386 only has 8MB
> which limits the cacheability). For the 2.2.5 systems, I give timings
> with both the 2.2.5 magic and the 4.0 magic (which is the same as
> 3.2-RELEASE, in /tmp).
>
> i386SX-25 running 2.2.5 (roughly as posted earlier):
> % /usr/bin/time file src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified:
> Thu Jan 1 10:00:00 1970, os: Unix
> 2.82 real 1.92 user 0.84 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified:
> Thu Jan 1 10:00:00 1970, os: Unix
> 4.05 real 2.67 user 1.23 sys
>
> 486DX2-50 running 2.2.5:
> % /usr/bin/time file src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last
> modified: Thu Jan 1 10:00:00 1970, os: Unix
> 1.43 real 0.96 user 0.38 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last
> modified: Thu Jan 1 10:00:00 1970, os: Unix
> 2.15 real 1.62 user 0.44 sys
>
> PII-266 running 4.0-CURRENT:
> % /usr/bin/time file src/Z/dhcp-1.4.0p6.tar.gz
> src/Z/dhcp-1.4.0p6.tar.gz: gzip compressed data, deflated, last modified: Wed
> Mar 3 20:57:52 1999, os: Unix
> 0.13 real 0.09 user 0.03 sys
>
> When I profile file in a slow system (like a 386 or 486), there is an
> obvious performance bottleneck: The problem is the memcpy() invoked
> from fgets(). The only solution would seem to be to mmap() magic
> and parse it, rather than using fgets() to read it. This bottleneck
> will also be far more obvious on bandwidth-starved systems (like
> 386SX and 486DX2/4), whereas virtually the whole thing fits into the
> L2 cache on my P-II.
>
> Peter
>
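For reference, the mmap() approach suggested in the quoted message could look roughly like the sketch below. It is not part of the attached patch and not how file currently reads magic. The names line_fn and for_each_line_mmap are invented for this illustration, and a real parser would replace the callback with the magic-entry parsing.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical callback: gets a pointer into the mapping and the line
 * length. The line is NOT NUL-terminated and must not be modified. */
typedef void (*line_fn)(const char *line, size_t len, void *arg);

/*
 * Map `path` read-only and call `fn` (if non-NULL) once per line, without
 * copying each line into a separate buffer the way fgets() does.
 * Returns the number of lines seen, or -1 on error.
 */
static long for_each_line_mmap(const char *path, line_fn fn, void *arg)
{
    struct stat st;
    const char *p, *end;
    char *base;
    size_t size;
    long nlines = 0;
    int fd;

    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length maps */
        close(fd);
        return 0;
    }
    size = (size_t)st.st_size;
    base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    for (p = base, end = base + size; p < end; ) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) : (size_t)(end - p);

        if (fn != NULL)
            fn(p, len, arg);
        nlines++;
        p += len + (nl ? 1 : 0);    /* skip the newline, if there was one */
    }
    munmap(base, size);
    return nlines;
}
```

Whether this actually beats fgets() on a 386 or 486 would have to be measured. The sketch only shows that the per-line copy can be avoided, at the cost of a parser that works on unterminated (pointer, length) lines.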
Common subdirectories: file/Magdir and file.new/Magdir
diff -c file/apprentice.c file.new/apprentice.c
*** file/apprentice.c Wed Jan 28 07:36:21 1998
--- file.new/apprentice.c Wed Jul 21 12:35:21 1999
***************
*** 50,55 ****
--- 50,56 ----
static void eatsize __P((char **));
static int maxmagic = 0;
+ static int alloc_incr = 256;
static int apprentice_1 __P((char *, int));
***************
*** 180,188 ****
struct magic *m;
char *t, *s;
- #define ALLOC_INCR 20
if (nd+1 >= maxmagic){
! maxmagic += ALLOC_INCR;
if ((magic = (struct magic *) realloc(magic,
sizeof(struct magic) *
maxmagic)) == NULL) {
--- 181,188 ----
struct magic *m;
char *t, *s;
if (nd+1 >= maxmagic){
! maxmagic += alloc_incr;
if ((magic = (struct magic *) realloc(magic,
sizeof(struct magic) *
maxmagic)) == NULL) {
***************
*** 192,198 ****
else
exit(1);
}
! memset(&magic[*ndx], 0, sizeof(struct magic) * ALLOC_INCR);
}
m = &magic[*ndx];
m->flag = 0;
--- 192,199 ----
else
exit(1);
}
! memset(&magic[*ndx], 0, sizeof(struct magic) * alloc_incr);
! alloc_incr *= 2;
}
m = &magic[*ndx];
m->flag = 0;
diff -c file/file.h file.new/file.h
*** file/file.h Wed Jul 21 12:37:00 1999
--- file.new/file.h Wed Jul 21 12:35:40 1999
***************
*** 35,41 ****
#ifndef HOWMANY
# define HOWMANY 8192 /* how much of the file to look at */
#endif
! #define MAXMAGIS 1000 /* max entries in /etc/magic */
#define MAXDESC 50 /* max leng of text description */
#define MAXstring 32 /* max leng of "string" types */
--- 35,41 ----
#ifndef HOWMANY
# define HOWMANY 8192 /* how much of the file to look at */
#endif
! #define MAXMAGIS 5000 /* max entries in /etc/magic */
#define MAXDESC 50 /* max leng of text description */
#define MAXstring 32 /* max leng of "string" types */