On Sat, Mar 03, 2001 at 12:00:19PM +0000, Philip Blundell wrote:
> It's certainly very strange.  Perhaps you could try disassembling the
> gzip that works for you, and comparing it to one that doesn't.  If
> that doesn't yield any clues then I think the only thing for it is
> for you to dig into one of the crashing binaries with the debugger
> and find out what it is that makes it go wrong.

I've looked at it some more, but haven't found the cause. It seems
like either argv[0] is corrupted somewhere, or parameters get
corrupted, e.g. when the program calls glibc. The generated code for
gzip's main() is very different between the two binaries, and I have
no idea what's happening in either version. The code for basename()
is identical.

I guess I will have to try with gdb. <sigh> I hardly know anything
about what happens before a program's main() is executed under Linux.
Where's the best place to look? Also, where can I get information
about ELF?

It would be nice to have a crashing version of gzip with debugging
symbols. (I compiled gzip on rameau yesterday, but that binary works.
Still have to try debussy.) Where did the gzip in question (=current
potato version) get compiled?

This is what happens: When gzip is invoked, it calls one of its
functions, basename(), to strip e.g. '/usr/bin/' from argv[0]:

int main (argc, argv)
    int argc;
    char **argv;
{
    int file_count;     /* number of files to process */
    int proglen;        /* length of progname */
    int optc;           /* current option */

    progname = basename(argv[0]);
    ...

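To make the stripping step concrete, here is a minimal standalone sketch of what that call is expected to do with a valid argv[0]. It is not gzip's code: strip_dir() is a hypothetical name (chosen so it cannot clash with a libc basename()), and PATH_SEP is assumed to be '/'.

```c
#include <assert.h>
#include <string.h>

#define PATH_SEP '/'    /* assumption: Unix-style path separator */

/* strip_dir() is a hypothetical stand-in for gzip's basename().
   It returns a pointer into fname, just past the last separator,
   or fname itself if there is no separator at all. */
const char *strip_dir(const char *fname)
{
    const char *p;

    /* This only catches NULL; a bogus non-NULL pointer would still
       fault inside strrchr(). */
    assert(fname != NULL);

    p = strrchr(fname, PATH_SEP);   /* last '/' in the string, or NULL */
    return p != NULL ? p + 1 : fname;
}
```

With a sane argument such as "/usr/bin/gzip" this yields "gzip", and a name without any '/' comes back unchanged. Nothing in it can fault unless the pointer passed in is already invalid.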
----------------------------------------------------------------------
basename() calls glibc's strrchr():

char *basename(fname)
    const char *fname;
{
    char *p;

    if ((p = strrchr(fname, PATH_SEP)) != NULL) fname = p+1;
    if ('A' == 'a') strlwr(fname);  /* optimized away */
    return fname;
}

 2009230:  e1a0c00d       mov    ip, sp
 2009234:  e92dd810       stmdb  sp!, {r4, fp, ip, lr, pc}
 2009238:  e24cb004       sub    fp, ip, #4      ; 0x4
 200923c:  e1a04000       mov    r4, r0
 2009240:  e3a0102f       mov    r1, #47         ; 0x2f
 2009244:  ebffde6d  [5]  bl     0x2000c00       ; <== strrchr()
 2009248:  e3500000       cmp    r0, #0          ; 0x0
 200924c:  12800001       addne  r0, r0, #1      ; 0x1
 2009250:  01a00004       moveq  r0, r4
 2009254:  e91ba810       ldmdb  fp, {r4, fp, sp, pc}
----------------------------------------------------------------------
In glibc, strrchr makes a call to strchr(), which is weak-aliased to
index():

char *
strrchr (const char *s, int c)
{
  register const char *found, *p;

  c = (unsigned char) c;

  /* Since strchr is fast, we use it rather than the obvious loop.  */

  if (c == '\0')
    return strchr (s, '\0');

  found = NULL;
  while ((p = strchr (s, c)) != NULL)
    {
      found = p;
      s = p + 1;
    }

  return (char *) found;
}

strrchr:
   7b30c:  e1a0c00d       mov    ip, sp
   7b310:  e92dd830       stmdb  sp!, {r4, r5, fp, ip, lr, pc}
   7b314:  e24cb004       sub    fp, ip, #4      ; 0x4
   7b318:  e1a04001       mov    r4, r1
   7b31c:  e21440ff       ands   r4, r4, #255    ; 0xff
   7b320:  1a000002       bne    0x7b330
   7b324:  e1a01004       mov    r1, r4
   7b328:  ebfe8bb3       bl     0x1e1fc
   7b32c:  e91ba830       ldmdb  fp, {r4, r5, fp, sp, pc}
   7b330:  e3a05000       mov    r5, #0          ; found = NULL
   7b334:  ea000001  [4]  b      0x7b340         ; while
   7b338:  e1a05000       mov    r5, r0
   7b33c:  e2850001       add    r0, r5, #1      ; 0x1
   7b340:  e1a01004  [2]  mov    r1, r4
   7b344:  ebfe9190       bl     0x1f98c         ; strchr(s, c)
   7b348:  e3500000       cmp    r0, #0
   7b34c:  1afffff9  [3]  bne    0x7b338
----------------------------------------------------------------------
Finally, the crash occurs in index():

char *
strchr (s, c_in)
     const char *s;
     int c_in;
{
  const unsigned char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, magic_bits,
    charmask;
  unsigned reg_char c;

  c = (unsigned char) c_in;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = s; ((unsigned long int) char_ptr
                      & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == c)
      return (void *) char_ptr;
    else if (*char_ptr == '\0')
      return NULL;
  ...

index/strchr:
   79e30:  e1a0c00d  [1]  mov    ip, sp
   79e34:  e92dd810       stmdb  sp!, {r4, fp, ip, lr, pc}
   79e38:  e24cb004       sub    fp, ip, #4
   79e3c:  e1a03000       mov    r3, r0          ; char_ptr = s
   79e40:  e3130003       tst    r3, #3          ; 0x3
   79e44:  e20110ff       and    r1, r1, #255    ; c = (uchar) c_in;
   79e48:  0a000007       beq    0x79e6c
   79e4c:  e5d30000       ldrb   r0, [r3]        ; <== CRASH HERE
----------------------------------------------------------------------
*** Segmentation fault
Register dump:

 R0: 00000001   R1: 0000002f   R2: bffffb9c   R3: 00000001
 R4: 0000002f   R5: 00000000   R6: 02000a18   R7: 400207a4
 R8: 00000001   R9: 02000fe4   SL: 4013c2c8   FP: bffffb0c
 IP: bffffb10   SP: bffffafc   LR: 400a7348   PC: 400a5e4c

 CPSR: 20000010   Trap: 0000000e   Error: 00000002   OldMask: 00000000

Backtrace:
/lib/libc.so.6(index+0x1c)[0x400a5e4c]
/lib/libc.so.6(strrchr+0x3c)[0x400a7348]
./gzip.crashes(basename+0x18)[0x2009248]
./gzip.crashes(strcpy+0x304)[0x2001004]
/lib/libc.so.6(__libc_start_main+0x108)[0x4004be50]
./gzip.crashes(strcpy+0x34)[0x2000d34]

The register dump implies that at [1] strchr() was called with R0=1.
Thus, R0 must also have been 1 at [2]. [2] can be reached either via
the branch at [4] or by looping back from [3]. For the loop to leave
R0=1 at [2], R0 would have had to be 0 at [3], but in that case the
branch at [3] is not taken. So [2] was reached via [4], which means
R0 was 1 on entry to strrchr(), and also when strrchr() was called
at [5].

Confused,

  Richard

-- 
  __   _
  |_) /|  Richard Atterer     |  CS student at the Technische  |  GPG key:
  | \/¯|  http://atterer.net  |  Universität München, Germany  |  888354F7
  ¯ ´` ¯