Re: tunefs -p doesn't work for read-write mounts
I just send-pr'ed a patch for this: misc/17143

David Malone wrote:
>
> On Fri, Mar 03, 2000 at 11:23:33AM +0200, Sheldon Hearn wrote:
>
> > Shouldn't I be able to show the current tuneables for a given filesystem?
> >
> > # tunefs -p /usr
> > tunefs: cannot work on read-write mounted file system
>
> Tunefs seems to have had trouble with being given filesystem names
> for a bit. Giving the device name works in 3.4, but not in 4.0.
>
> David.
>
> 4.0-CURRENT:
>
> 10:04:gonzo 3# df /usr
> Filesystem  1024-blocks   Used Avail Capacity  Mounted on
> /dev/ad1s1f      858367 783550  6148    99%    /usr
> 10:04:gonzo 4# tunefs -p /usr
> tunefs: cannot work on read-write mounted file system
> 10:04:gonzo 5# tunefs -p /dev/ad1s1f
> tunefs: cannot open /dev/ad1s1f: Device busy
> 10:04:gonzo 6# tunefs -p /dev/rad1s1f
> tunefs: cannot open /dev/rad1s1f: Device busy
>
> 3.4-STABLE:
>
> 10:05:walton 1# df /usr
> Filesystem  1024-blocks   Used Avail Capacity  Mounted on
> /dev/da0s1f      762223 673236 28010    96%    /usr
> 10:05:walton 2# tunefs -p /usr
> tunefs: cannot work on read-write mounted file system
> 10:05:walton 3# tunefs -p /dev/rda0s1f
> tunefs: soft updates: (-n) disabled
> tunefs: maximum contiguous block count: (-a) 15
> tunefs: rotational delay between contiguous blocks: (-d) 0 ms
> tunefs: maximum blocks per file in a cylinder group: (-e) 2048
> tunefs: minimum percentage of free space: (-m) 8%
> tunefs: optimization preference: (-o) time

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: C shell scripts (was: Re: which(1), rewritten in C?)
Assar Westerlund wrote:
>
> There's a real reason for not writing this in csh. Because the
> built-in function will return results for csh, which might not be the
> right ones for other shells.
>

I got bitten by this with HP-UX 10's csh-based "which". My Solaris-hosted NFS home directory had the default Solaris .cshrc, which changes $path. Needless to say, "which" was somewhat misleading :-)

Ugh.
--
Peter.
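The point about shell built-ins can be illustrated with a stand-alone lookup that consults only the PATH-style string handed to it, so the answer cannot be skewed by any shell's startup files. This is a minimal sketch and not the which(1) being discussed in the thread; `find_in_path` and `is_executable_file` are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if `path` is a regular file we may execute. */
static int
is_executable_file(const char *path)
{
	struct stat sb;

	return (stat(path, &sb) == 0 && S_ISREG(sb.st_mode) &&
	    access(path, X_OK) == 0);
}

/*
 * Hypothetical helper: search the colon-separated `pathlist` for an
 * executable called `name`.  On success the full path is stored in
 * `out` and 0 is returned; otherwise `out` is emptied and -1 returned.
 */
static int
find_in_path(const char *name, const char *pathlist, char *out, size_t outlen)
{
	const char *p = pathlist;

	if (name == NULL || pathlist == NULL || out == NULL || outlen == 0)
		return (-1);
	for (;;) {
		size_t len = strcspn(p, ":");
		int n;

		/* An empty component conventionally means the current directory. */
		if (len == 0)
			n = snprintf(out, outlen, "./%s", name);
		else
			n = snprintf(out, outlen, "%.*s/%s", (int)len, p, name);
		if (n > 0 && (size_t)n < outlen && is_executable_file(out))
			return (0);
		if (p[len] == '\0')
			break;
		p += len + 1;
	}
	out[0] = '\0';
	return (-1);
}
```

A caller would normally pass `getenv("PATH")` as `pathlist`, which is the environment every shell exports, rather than csh's private `$path` as rewritten by a .cshrc.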
tput(1) appears broken.
Hi,

(Sorry if this hits the list a couple of times, I had some problems with my mail system)

I never thought I'd see "clear" break :-)

Here's a fix. It seems to be down to an ncurses change in tgetstr(). The implementation in ncurses just ignores the "area" parameter, while the one in libtermcap does something quite bizarre with it. tput.c relies on the old functionality.

A better patch would probably be to the manpage, though. It doesn't even mention the "area" parameter, apart from in the synopsis.

I wonder if "cls" ever broke on DOS?
--
Peter.

*** tput.c.orig	Wed Sep  1 00:00:14 1999
--- tput.c	Tue Aug 31 23:55:22 1999
***************
*** 64,70 ****
  	extern char *optarg;
  	extern int optind;
  	int ch, exitval, n;
! 	char *cptr, *p, *term, buf[1024], tbuf[1024];
  
  	term = NULL;
  	while ((ch = getopt(argc, argv, "T:")) != -1)
--- 64,70 ----
  	extern char *optarg;
  	extern int optind;
  	int ch, exitval, n;
! 	char *cptr, *p, *term, *capstr, buf[1024], tbuf[1024];
  
  	term = NULL;
  	while ((ch = getopt(argc, argv, "T:")) != -1)
***************
*** 105,112 ****
  			break;
  		}
  		cptr = buf;
! 		if (tgetstr(p, &cptr))
! 			argv = process(p, buf, argv);
  		else if ((n = tgetnum(p)) != -1)
  			(void)printf("%d\n", n);
  		else
--- 105,112 ----
  			break;
  		}
  		cptr = buf;
! 		if (capstr = tgetstr(p, &cptr))
! 			argv = process(p, capstr, argv);
  		else if ((n = tgetnum(p)) != -1)
  			(void)printf("%d\n", n);
  		else
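The essence of the tput fix is to use the pointer that the lookup returns, rather than assuming the capability was copied into the caller's buffer. The sketch below models that with two stand-in lookup functions so that it builds without any terminal library; neither is the real tgetstr(). `lookup_ignoring_area`, `lookup_filling_area` and `get_capability` are hypothetical names, and the copying stand-in follows the traditional documented behaviour of filling the area and advancing the pointer, which may not match the libtermcap in question.

```c
#include <stddef.h>
#include <string.h>

/* A sample "clear screen" sequence; the value is illustrative only. */
static char clear_seq[] = "\033[H\033[2J";

/* Stand-in (hypothetical): returns the string and never touches *area. */
static char *
lookup_ignoring_area(const char *id, char **area)
{
	(void)area;
	return (strcmp(id, "cl") == 0 ? clear_seq : NULL);
}

/* Stand-in (hypothetical): copies the string into *area and advances it. */
static char *
lookup_filling_area(const char *id, char **area)
{
	char *start;

	if (strcmp(id, "cl") != 0)
		return (NULL);
	start = *area;
	memcpy(start, clear_seq, sizeof(clear_seq));	/* includes the NUL */
	*area += sizeof(clear_seq);
	return (start);
}

typedef char *(*lookup_fn)(const char *, char **);

/*
 * Caller written the way the patch writes it: trust the return value,
 * not the contents of `buf`.  Works with either stand-in.
 */
static const char *
get_capability(lookup_fn lookup, const char *id, char *buf)
{
	char *cptr = buf;

	return (lookup(id, &cptr));
}
```

With the first stand-in the buffer stays empty even though the lookup succeeds, which is the situation the old `process(p, buf, argv)` call could not cope with.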
fstat(1) breakage + fix
Hi,

fstat(1) should be able to take a set of filenames as arguments to limit the results of its output to the specified files. However, it doesn't work at the moment, because of the existence of udev_t. (It compares the st_dev from the stat structure used by stat(2) with in-kernel dev_t structures. As it stands, "fstat <file>" will never produce any output other than a header.)

I've attached a patch that appears reliable. Can someone review it (and possibly commit it)?
--
Peter.

fstat.c.patch
Re: vn.ko load/unload/mount = panic
I had a longer look at this, and a more complete patch is logged as PR kern/18270 (try at your own risk: it works for me). I'd appreciate someone more experienced having a look at it and commenting.

Cheers,
Peter.

Wilko Bulte wrote:
>
> On Wed, Apr 26, 2000 at 04:25:46PM +0100, Peter Edwards (local) wrote:
>
> How about send-pr ing this stuff?
>
> Wilko
>
> > Hi,
> > After a (very) quick look at the source it looks like there's a missing
> > cdevsw_remove() missing from the MOD_UNLOAD/MOD_SHUTDOWN event handling
> > I haven't time to test it, but try this:
> >
> > *** vn.c.old	Wed Apr 26 16:23:03 2000
> > --- vn.c	Wed Apr 26 16:24:06 2000
> > ***************
> > *** 762,767 ****
> > --- 762,768 ----
> >   	case MOD_UNLOAD:
> >   		/* fall through */
> >   	case MOD_SHUTDOWN:
> > + 		cdevsw_remove(&vn_cdevsw);
> >   		for (;;) {
> >   			vn = SLIST_FIRST(&vn_list);
> >   			if (!vn)
> >
> >
> > Maxim Sobolev wrote:
> > >
> > > Hi,
> > >
> > > I've already submitted this crash report earlier but it seems that developers
> > > in -current list are too busy discussing whether Matt allowed to commit his SMP
> > > work into 4.0 to pay attention to "ordinary" panic reports :-(. Following is
> > > slightly simplified course of actions which is known to produce kernel panic on
> > > both 4.0 and 5.0:
> > >
> > > root@notebook# kldstat
> > > Id Refs Address    Size     Name
> > >  1    2 0xc010     1c2f48   kernel
> > >  2    1 0xc02c3000 30c8     splash_bmp.ko
> > > root@notebook# mount /dev/vn0c /mnt
> > > mount: Device not configured
> > > root@notebook# kldload /modules/vn.ko
> > > root@notebook# kldstat
> > > Id Refs Address    Size     Name
> > >  1    3 0xc010     1c2f48   kernel
> > >  2    1 0xc02c3000 30c8     splash_bmp.ko
> > >  3    1 0xc0823000 3000     vn.ko
> > > root@notebook# kldunload -i 3
> > > root@notebook# mount /dev/vn0c /mnt
> > > [BINGO]
> > > Fatal trap 12: page fault while in kernel mode
> > > [...]
> > >
> > > -Maxim
>
> --
> Wilko Bulte   Powered by FreeBSD   http://www.freebsd.org
> http://www.tcja.nl
Re: vn.ko load/unload/mount = panic
Nick Hibma wrote:
>
> Correct me if I am wrong, but I don't think you actually have to
> disassociate any dev_t's from the driver (by clearing the si_drv[12]
> fields) because we call destroy_dev and cdevsw_remove, so any later uses
> of dev_t's get an error because the device has gone away.
>
> Apart from that I don't think we the calls to cdevsw_add/_destroy in the
> first place, because we create the cdevsw on demand (with make_dev).
>
> Poul, is this correct?
>
> nick

Hi,

For the specific case of vn, there is a one-to-many correspondence between vn instances and dev_ts. The make_dev()/destroy_dev() pairs only bracket one of the dev_ts. Opens on the same instance of the vn through different dev_ts just assign the former's softc to the latter's si_drv1 field. There is no corresponding make_dev() or destroy_dev() call for such a dev_t in the vn driver, so the si_drv1 fields dangle after the driver is unloaded.

The cdevsw_remove() only stops access to the offending dev_ts until the driver is loaded again. After that, the old dev_ts' si_drv1 fields are still dangling, and the vn driver ends up with a pointer to a free()d vn softc, and bites the dust.

After reading the rest of the discussion on this thread, and moving out of my depth a little, I assume vn should probably be using disk_create()/disk_destroy(), and attaching its softc to the disk object rather than the device object. (However, I suppose given the special nature of vn, there might be reasons for not using this interface.) I'll gladly "disk"ify vn as a mini junior-kernel-hacker task if someone indicates that it is needed, and if no one more qualified wants to do it.
--
Peter.
gdb of C++ program broken due to missing objects from Makefile.
Hi, Debugging trivial C++ programs in GDB is knackered, complaining about "ABI doesn't define required function XXX" when doing operations such as printing classes and structures which inherit from others, and setting breakpoints in virtual functions, etc. Adding "gnu-v2-abi.c" and "gnu-v3-abi.c" to XSRCS in src/gnu/usr.bin/binutils/gdb/Makefile fixes the problem. This seems like a reasonably large problem with GDB as it stands currently. Does someone fancy committing this? patch and before/after example follow. Index: Makefile === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/gnu/usr.bin/binutils/gdb/Makefile,v retrieving revision 1.62 diff -u -r1.62 Makefile --- Makefile12 Oct 2002 21:23:53 - 1.62 +++ Makefile3 Jan 2003 11:40:52 - @@ -37,6 +37,7 @@ ui-file.c ui-out.c wrapper.c cli-out.c \ cli-cmds.c cli-cmds.h cli-decode.c cli-decode.h cli-script.c\ cli-script.h cli-setshow.c cli-setshow.h cli-utils.c cli-utils.h +XSRCS+= gnu-v2-abi.c gnu-v3-abi.c XSRCS+=freebsd-uthread.c kvm-fbsd.c SRCS= init.c ${XSRCS} nm.h tm.h xm.h gdbversion.c xregex.h petere@rocklobster$ cat > test.cc << . heredoc> petere@rocklobster$ cat > test.cc << . heredoc> #include heredoc> heredoc> struct A { heredoc> const char *fieldOfA; heredoc> A() : fieldOfA("A's field") {} heredoc> }; heredoc> heredoc> struct B : public A { heredoc> const char * fieldOfB; heredoc> B() : fieldOfB("B's field") {} heredoc> }; heredoc> heredoc> main() heredoc> { heredoc> B myB; heredoc> pause(); heredoc> } heredoc> . petere@rocklobster$ g++ -g test.cc petere@rocklobster$ /usr/bin/gdb ./a.out GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-undermydesk-freebsd"... 
(gdb) b 16 Breakpoint 1 at 0x80485b7: file test.cc, line 16. (gdb) run Starting program: /local/petere/a.out Breakpoint 1, main () at test.cc:16 16 pause(); (gdb) p myB $1 = {ABI doesn't define required function baseclass_offset (gdb) quit The program is running. Exit anyway? (y or n) y petere@rocklobster$ /tmp/peters_fixed_gdb ./a.out GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-undermydesk-freebsd"... (gdb) b 16 Breakpoint 1 at 0x80485b7: file test.cc, line 16. (gdb) run Starting program: /local/petere/a.out Breakpoint 1, main () at test.cc:16 16 pause(); (gdb) p myB $1 = { = { fieldOfA = 0x8048692 "A's field" }, members of B: fieldOfB = 0x8048688 "B's field" } (gdb) To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Opening /dev/tty in session leader after controlling terminal is revoked causes panic.
Attached are a panic trace and a patch for the problem in the Subject line.

The problem is in kern/tty_tty.c:ctty_clone. It assumes that if the process has its P_CONTROLT flag set, then its session has a valid vnode for its controlling terminal. This doesn't hold if the terminal was revoked.

Cheers,
Peter Edwards.

GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: bdwrite: buffer is not busy
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc020795a
stack pointer           = 0x10:0xcdd7e7b8
frame pointer           = 0x10:0xcdd7e7c8
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 986 (mozilla-bin)
trap number             = 12
panic: page fault

syncing disks, buffers remaining... panic: bdwrite: buffer is not busy
Uptime: 20h40m24s
Dumping 256 MB
ata0: resetting devices .. 
done 16 32 48 64 80 96 112 128 144 160 176 192[CTRL-C to abort] 208 224 240 --- #0 doadump () at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:232 232 dumping++; (kgdb) p/a 0xc020795a $1 = 0xc020795a (kgdb) bt #0 doadump () at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:232 #1 0xc01d1969 in boot (howto=260) at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:364 #2 0xc01d1bc3 in panic () at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:531 #3 0xc0214ea0 in bdwrite (bp=0xc7783bf0) at /pub/FreeBSD/current/src/sys/kern/vfs_bio.c:955 #4 0xc028abab in ffs_update (vp=0xc27d3b18, waitfor=0) at /pub/FreeBSD/current/src/sys/ufs/ffs/ffs_inode.c:125 #5 0xc029f62f in ffs_fsync (ap=0xcdd7e5c0) at /pub/FreeBSD/current/src/sys/ufs/ffs/ffs_vnops.c:315 #6 0xc029e5f7 in ffs_sync (mp=0xc2691000, waitfor=2, cred=0xc0eb6e80, td=0xc034f000) at vnode_if.h:612 #7 0xc022a7db in sync (td=0xc034f000, uap=0x0) at /pub/FreeBSD/current/src/sys/kern/vfs_syscalls.c:138 #8 0xc01d154c in boot (howto=256) at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:273 #9 0xc01d1bc3 in panic () at /pub/FreeBSD/current/src/sys/kern/kern_shutdown.c:531 #10 0xc02f9c42 in trap_fatal (frame=0xcdd7e778, eva=0) at /pub/FreeBSD/current/src/sys/i386/i386/trap.c:844 #11 0xc02f9922 in trap_pfault (frame=0xcdd7e778, usermode=0, eva=136) at /pub/FreeBSD/current/src/sys/i386/i386/trap.c:758 #12 0xc02f93f0 in trap (frame= {tf_fs = -841547752, tf_es = -841547760, tf_ds = -1069416432, tf_edi = -1033343948, tf_esi = - 841487372, tf_ebp = -841488440, tf_isp = -841488476, tf_ebx = -841488376, tf_edx = -841488260, tf_ec x = -1070401078, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1071613606, tf_cs = 8, tf_eflags = 66050, tf_esp = -841488264, tf_ss = -1070401082}) at /pub/FreeBSD/current/src/sys/i386/i386/trap.c:445 #13 0xc02e94a8 in calltrap () at {standard input}:98 #14 0xc0195f38 in devfs_lookupx (ap=0x0) at /pub/FreeBSD/current/src/sys/fs/devfs/devfs_vnops.c:382 #15 0xc01961db in devfs_lookup 
(ap=0xcdd7e914) at /pub/FreeBSD/current/src/sys/fs/devfs/devfs_vnops.c:448 #16 0xc021ed12 in lookup (ndp=0xcdd7ebcc) at vnode_if.h:52 #17 0xc021e71b in namei (ndp=0xcdd7ebcc) at /pub/FreeBSD/current/src/sys/kern/vfs_lookup.c:181 #18 0xc0231eb9 in vn_open_cred (ndp=0xcdd7ebcc, flagp=0xcdd7eccc, cmode=420, cred=0xc2afb500) at /pub/FreeBSD/current/src/sys/kern/vfs_vnops.c:122 #19 0xc0231e59 in vn_open (ndp=0x0, flagp=0x0, cmode=0) at /pub/FreeBSD/current/src/sys/kern/vfs_vnops.c:86 #20 0xc022b683 in kern_open (td=0xc264d8c0, path=0x0, pathseg=UIO_USERSPACE, flags=1538, mode=438) at /pub/FreeBSD/current/src/sys/kern/vfs_syscalls.c:662 #21 0xc022b490 in open (td=0x0, uap=0x0) at /pub/FreeBSD/current/src/sys/kern/vfs_syscalls.c:625 #22 0xc02f9f6a in syscall (frame= {tf_fs = -65489, tf_es = -1078001617, tf_ds = -1078001617, tf_edi = 703138768, tf_esi = 678015 352, tf_ebp = -1077956120, tf_isp = -841486988, tf_ebx = 677213476, tf_edx = 1537, tf_ecx = 67801535 2, tf_eax = 5, tf_trapno = 12, tf_err = 2, tf_eip = 677550019, tf_cs = 31, tf_eflags = 518, tf_esp = -1077956148, tf_ss = 47}) at /pub/FreeBSD/current/src/sys/i386/i386/trap.c:1033 #23 0xc02e94fd in Xint0x80_syscall () at {standard input}:140 ---Can't read userspace from dump, or kernel process--- (kgdb) disas ctty_clone Dump of assembler code for function ctty_clone: 0xc0207910 :push %ebp 0xc0207911 : mov%esp,%ebp 0xc0207913
Re: Opening /dev/tty in session leader after controlling terminal is revoked causes panic.
[EMAIL PROTECTED] wrote:
>
> In message <[EMAIL PROTECTED]>, "Peter Edwards" writes:
>
> >The problem is in kern/tty_tty.c:ctty_clone. It's assuming that if the process
> >has its P_CONTROLT flag set, then it's session has a valid vnode for it's
> >controlling terminal. This doesn't hold if the terminal was revoked.
>
> Can you try this patch ?

[snip patch]

Yes, this patch also fixes the problem.
--
Peter Edwards.
/bin/ps buggette
The keyword "mtxname" was changed to "lockname" in the list of keywords in bin/ps/keyword.c. Unfortunately, this table needs to be sorted, and the change broke the sort order. The incredibly complex patch is included below, if someone wants to commit it. :-)

Index: keyword.c
===================================================================
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/bin/ps/keyword.c,v
retrieving revision 1.60
diff -u -r1.60 keyword.c
--- keyword.c	19 Jan 2003 00:31:15 -	1.60
+++ keyword.c	5 Feb 2003 14:56:16 -
@@ -98,6 +98,8 @@
 	{"label", "LABEL", NULL, LJUST|DSIZ, label, s_label, SHRT_MAX, 0, CHAR,
 		NULL, 0},
 	{"lim", "LIM", NULL, 0, maxrss, NULL, 5, 0, CHAR, NULL, 0},
+	{"lockname", "LOCK", NULL, LJUST, lockname, NULL, 6, 0, CHAR, NULL,
+		0},
 	{"login", "LOGIN", NULL, LJUST, logname, NULL, MAXLOGNAME-1, 0, CHAR,
 		NULL, 0},
 	{"logname", "", "login", 0, NULL, NULL, 0, 0, CHAR, NULL, 0},
@@ -111,8 +113,6 @@
 		LONG, "ld", 0},
 	{"msgsnd", "MSGSND", NULL, USER, rvar, NULL, 4, ROFF(ru_msgsnd),
 		LONG, "ld", 0},
-	{"lockname", "LOCK", NULL, LJUST, lockname, NULL, 6, 0, CHAR, NULL,
-		0},
 	{"mwchan", "MWCHAN", NULL, LJUST, mwchan, NULL, 6, 0, CHAR, NULL, 0},
 	{"ni", "", "nice", 0, NULL, NULL, 0, 0, CHAR, NULL, 0},
 	{"nice", "NI", NULL, 0, kvar, NULL, 2, KOFF(ki_nice), CHAR, "d",
--
Peter Edwards.
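A table like this typically has to stay sorted because lookups use a binary search; that ps does so via bsearch(3) is my assumption here, not something the message states. The sketch below uses a cut-down, hypothetical `struct kw` and two small tables (one in the patched order, one with "lockname" left where "mtxname" used to sit) to show the kind of check that would catch the misordering.

```c
#include <stdlib.h>
#include <string.h>

/* Cut-down, hypothetical stand-in for the real keyword entry. */
struct kw {
	const char *name;
	const char *header;
};

/* Order after the patch: "lockname" sits between "lim" and "login". */
static const struct kw kw_sorted[] = {
	{"label", "LABEL"}, {"lim", "LIM"}, {"lockname", "LOCK"},
	{"login", "LOGIN"}, {"logname", ""}, {"msgsnd", "MSGSND"},
	{"mwchan", "MWCHAN"},
};

/* Order before the patch: "lockname" left in the old "mtxname" slot. */
static const struct kw kw_broken[] = {
	{"label", "LABEL"}, {"lim", "LIM"}, {"login", "LOGIN"},
	{"logname", ""}, {"msgsnd", "MSGSND"}, {"lockname", "LOCK"},
	{"mwchan", "MWCHAN"},
};

static int
kw_cmp(const void *a, const void *b)
{
	return (strcmp(((const struct kw *)a)->name,
	    ((const struct kw *)b)->name));
}

/* Binary search by name; only reliable when the table is sorted. */
static const struct kw *
kw_find(const struct kw *tab, size_t n, const char *name)
{
	struct kw key;

	key.name = name;
	key.header = NULL;
	return (bsearch(&key, tab, n, sizeof(*tab), kw_cmp));
}

/* Returns 1 if names are in strictly ascending strcmp() order. */
static int
kw_is_sorted(const struct kw *tab, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (strcmp(tab[i - 1].name, tab[i].name) >= 0)
			return (0);
	return (1);
}
```

Running something like `kw_is_sorted()` once at start-up, or in a regression test, is a cheap guard against a renamed keyword silently becoming unfindable.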
[resend] gdb, threads, and corefiles
Sorry if this appears twice: my webmail client appears to have dropped the original message on the floor. gdb didn't find threads in corefiles: The support was just missing. The attached patch does the job. Also attached is a small test program which easily generates a corefile with threads in it, if anyone wants to test/commit it. Before and after output below: > petere@rocklobster$ cc -o threadcore -Wall -g -pthread threadcore.c > petere@rocklobster$ ./threadcore > started thread 0 > started thread 1 > started thread 2 > started thread 3 > zsh: bus error (core dumped) ./threadcore > petere@rocklobster$ gdb threadcore threadcore.core > GNU gdb 5.2.1 (FreeBSD) > Copyright 2002 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you > are welcome to change it and/or distribute copies of it under certain > conditions. Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for > details. This GDB was configured as "i386-undermydesk-freebsd"... > Core was generated by `threadcore'. > Program terminated with signal 10, Bus error. > Reading symbols from /usr/lib/libc_r.so.5...done. > Loaded symbols for /usr/lib/libc_r.so.5 > Reading symbols from /usr/lib/libc.so.5...done. > Loaded symbols for /usr/lib/libc.so.5 > Reading symbols from /usr/libexec/ld-elf.so.1...done. > Loaded symbols for /usr/libexec/ld-elf.so.1 > #0 0x280d5463 in kill () from /usr/lib/libc.so.5 > (gdb) i thr > * 1 process 41359 0x280d5463 in kill () from /usr/lib/libc.so.5 > (gdb) quit > petere@rocklobster$ /usr/src/gnu/usr.bin/binutils/gdb/gdb threadcore > threadcore.core GNU gdb 5.2.1 (FreeBSD) > Copyright 2002 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you > are welcome to change it and/or distribute copies of it under certain > conditions. Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. 
Type "show warranty" for > details. This GDB was configured as "i386-undermydesk-freebsd"... > Core was generated by `threadcore'. > Program terminated with signal 10, Bus error. > Reading symbols from /usr/lib/libc_r.so.5...done. > Loaded symbols for /usr/lib/libc_r.so.5 > Reading symbols from /usr/lib/libc.so.5...done. > Loaded symbols for /usr/lib/libc.so.5 > Reading symbols from /usr/libexec/ld-elf.so.1...done. > Loaded symbols for /usr/libexec/ld-elf.so.1 > #0 0x280d5463 in kill () from /usr/lib/libc.so.5 > (gdb) i thr > 7 Process 41359 0x280d5463 in kill () from /usr/lib/libc.so.5 > 6 Process 41359 0x2807b7ec in _thread_kern_sched () from > /usr/lib/libc_r.so.5 5 Process 41359 0x2807b7ec in _thread_kern_sched > () from /usr/lib/libc_r.so.5 4 Process 41359 0x2807b7ec in > _thread_kern_sched () from /usr/lib/libc_r.so.5 3 Process 41359 > 0x2807b7ec in _thread_kern_sched () from /usr/lib/libc_r.so.5 2 > Process 41359 0x2807b7ec in _thread_kern_sched () from > /usr/lib/libc_r.so.5 > * 1 Process 41359 0x280d5463 in kill () from /usr/lib/libc.so.5 > (gdb) Enjoy, Peter. #include #include #include #include #include #include void *thr(void *a) { printf("started thread %d\n", (int)a); pause(); return 0; } int main(int argc, char **argv) { const int THREADCOUNT = 4; int i; pthread_t threads[THREADCOUNT]; for (i = 0; i < THREADCOUNT; i++) if (pthread_create(&threads[i], 0, thr, (void *)i) != 0) errx(-1, "cannot create thread"); sleep(1); kill(0, SIGBUS); return 0; } Index: freebsd-uthread.c === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/gnu/usr.bin/binutils/gdb/freebsd-uthread.c,v retrieving revision 1.10 diff -u -r1.10 freebsd-uthread.c --- freebsd-uthread.c 4 Jan 2003 17:35:54 - 1.10 +++ freebsd-uthread.c 7 Feb 2003 12:34:19 - @@ -46,6 +46,10 @@ #include #include "gdbcore.h" +int coreops_suppress_target = 1; /* Ugh. Override the version in corelow.c. 
*/ +extern struct target_ops core_ops; /* target vector for corelow.c */ +static struct target_ops orig_core_ops;/* target vector for corelow.c */ + extern int child_suppress_run; extern struct target_ops child_ops; /* target vector for inftarg.c */ @@ -60,6 +64,7 @@ /* Pointer to the next function on the objfile event chain. */ static void (*target_new_objfile_chain) (struct objfile *objfile); +static void freebsd_uthread_find_new_threads (void); static void freebsd_uthread_resume PARAMS ((ptid_t pid, int step, enum target_signal signo)); @@ -472,7 +477,10 @@ if (freebsd_uthread_attaching || TIDGET(inferior_ptid) == 0) { - child_ops.to_fetch_registers (regno); + if (target_has_execution) + child_ops.to_fetch_registers (regno); + else + orig_core_ops.to_fetch_registers (regno); return; } @@ -481,7 +489,10 @
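The header names in the threadcore test program above were lost in archiving. As a hedged reconstruction of the same idea, the sketch below starts a handful of threads that block in pause() and provides a separate routine that sends the process SIGBUS so that it dumps core with the threads still alive. The include list, the names `start_sleepers` and `crash_with_sigbus`, and the use of `intptr_t` casts are my assumptions. It also signals only its own pid, where the original used kill(0, SIGBUS), which targets the whole process group.

```c
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define THREADCOUNT 4

/* Thread body: announce ourselves, then block until a signal arrives. */
static void *
thr(void *a)
{
	printf("started thread %d\n", (int)(intptr_t)a);
	pause();
	return (NULL);
}

/*
 * Hypothetical helper: start `n` sleeping threads.  Returns the number
 * of threads actually created (equal to `n` on success).
 */
static int
start_sleepers(pthread_t *threads, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (pthread_create(&threads[i], NULL, thr,
		    (void *)(intptr_t)i) != 0)
			break;
	return (i);
}

/*
 * Hypothetical helper: give the threads a moment to reach pause(),
 * then raise SIGBUS against this process only, so that it terminates
 * and (core limits permitting) leaves a core file behind.
 */
void
crash_with_sigbus(void)
{
	sleep(1);
	kill(getpid(), SIGBUS);
}
```

A main() that calls `start_sleepers(threads, THREADCOUNT)` followed by `crash_with_sigbus()`, built with `cc -Wall -g -pthread` as in the transcript, should leave a core file that the patched gdb can list threads from.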
Re: bge & vlan stranges
Hm. A bit of a stab in the dark, but from sys/dev/bge/if_bge.c, line 3185 (on 5.1 release, 2399)

> /* Specify MTU. */
> CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu +
>     ETHER_HDR_LEN + ETHER_CRC_LEN);
>

Wonder if this should be

> /* Specify MTU. */
> CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu +
>     ETHER_HDR_LEN + ETHER_CRC_LEN + ETHER_VLAN_ENCAP_LEN);
>

Given that bge advertises IFCAP_VLAN_MTU??

_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: bge & vlan stranges
John Polstra <[EMAIL PROTECTED]> wrote:
> Peter Edwards <[EMAIL PROTECTED]> wrote:
> > > CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu +
> > >     ETHER_HDR_LEN + ETHER_CRC_LEN + ETHER_VLAN_ENCAP_LEN);
> Good guess, but the approved way of doing it is to add this code
> near the point where IFCAP_VLAN_MTU is set:
>
> ifp->if_data.ifi_hdrlen = sizeof(struct ether_vlan_header);
>
> See "sys/dev/fxp/if_fxp.c" for an example that works.

Sorry for being obtuse, but just to clarify: fxp just seems to have an "allow long frames" flag, rather than a "max frame size" register in the hardware, so you never seem to have to tell the hardware the max size of a frame it needs to accept.

I assume you mean that, after setting if_hdrlen, you still need to write to the PCI register, like this:

CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu + ifp->if_hdrlen + ETHER_CRC_LEN);

I don't have a bge device, so I can't muck about with it to try.
Re: bge & vlan stranges
John Polstra <[EMAIL PROTECTED]> wrote:
> No, you are right. I didn't read the posting carefully enough.
> Sorry!

No problem.

[snip]
>> I assume you mean, that after setting if_hdrlen,
[snip]
> I think you also have to set if_data.ifi_hdrlen as I said
[snip]

My fault: I jumped from one term for the same thing to the other without explanation. if_hdrlen is a macro for if_data.ifi_hdrlen. I'm not a big fan of hiding that kind of detail with macros, but I assume they're defined by smarter people than me in order that they be used :-)
Re: bge & vlan stranges
Ok. After all that, and given I've gone this far... Boris, does the patch included fix your problem?
--
Peter Edwards.

Index: sys/dev/bge/if_bge.c
===================================================================
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.46
diff -u -r1.46 if_bge.c
--- sys/dev/bge/if_bge.c	25 Jul 2003 20:33:43 -	1.46
+++ sys/dev/bge/if_bge.c	1 Aug 2003 18:33:56 -
@@ -2356,6 +2356,7 @@
 	ifp->if_watchdog = bge_watchdog;
 	ifp->if_init = bge_init;
 	ifp->if_mtu = ETHERMTU;
+	ifp->if_hdrlen = sizeof(struct ether_vlan_header);
 	ifp->if_snd.ifq_maxlen = BGE_TX_RING_CNT - 1;
 	ifp->if_hwassist = BGE_CSUM_FEATURES;
 	ifp->if_capabilities = IFCAP_HWCSUM | IFCAP_VLAN_HWTAGGING |
@@ -3181,8 +3182,8 @@
 	ifp = &sc->arpcom.ac_if;
 
 	/* Specify MTU. */
-	CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu +
-	    ETHER_HDR_LEN + ETHER_CRC_LEN);
+	CSR_WRITE_4(sc, BGE_RX_MTU, ifp->if_mtu + ifp->if_hdrlen +
+	    ETHER_CRC_LEN);
 
 	/* Load our MAC address. */
 	m = (u_int16_t *)&sc->arpcom.ac_enaddr[0];
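The arithmetic behind the patch is easy to check by hand. The sketch below recomputes the value written to BGE_RX_MTU before and after the change; the constants are redefined locally with an `X_` prefix (hypothetical names) using the conventional Ethernet sizes, so that it compiles without kernel headers.

```c
#include <stdint.h>

/* Conventional Ethernet sizes, redefined locally for the sketch. */
enum {
	X_ETHERMTU = 1500,		/* default payload MTU */
	X_ETHER_HDR_LEN = 14,		/* dst + src + ethertype */
	X_ETHER_CRC_LEN = 4,		/* trailing FCS */
	X_ETHER_VLAN_ENCAP_LEN = 4	/* 802.1Q tag */
};

/* Largest frame the receiver must accept for a given MTU and header size. */
static uint32_t
rx_frame_limit(uint32_t mtu, uint32_t hdrlen)
{
	return (mtu + hdrlen + X_ETHER_CRC_LEN);
}
```

With a plain 14-byte header the limit is 1518 bytes; with the 18-byte VLAN header it is 1522. If the hardware is told 1518, a full-size tagged frame is four bytes too long and is liable to be dropped, which would fit the symptom reported in the thread.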
Fun with gdb and threads...
Hi.

This might be of interest to anyone who has tried debugging multi-threaded programs (of the libc_r variety) with gdb. This has been bugging me for months, and I finally got frustrated enough to find out what was going on.

The symptom:

Once you call any function that puts a thread to sleep, the target process crashes (simple program, 1.c, attached, and a log of gdb killing it in crash.txt).

The problem:

I traced this to an interaction between gdb and the threads scheduler. The initial crash comes from gdb adding internal breakpoints in the "(_)?(sig)?longjmp" functions. This breakpoint gets hit when the thread scheduler calls "_thread_kern_sched".

After handling the breakpoint, gdb then needs to reset the instruction pointer in the "current thread" to re-run the instruction the breakpoint was at. However, at that point, gdb's freebsd_uthread_store_registers() barfs, thinking that the thread in question is not "active", because it's not in state PS_RUNNING (it's just about to go to sleep). As a result, it mucks up the resetting of the instruction pointer, because it thinks it just needs to twiddle with the thread's context, rather than the "live" registers. Once the process is resumed, it starts in the middle of whatever instruction the breakpoint overwrote, and things generally get fscked up from there.

The fix:

I added a couple of "nop"s to "___longjmp", and created a new entrypoint below them called "___longjmp_raw". This provides a way for the libc_r library to avoid hitting the gdb breakpoints at sensitive moments. All other consumers still work the exact same way (modulo the time spent executing a couple of nops). The patch is attached, and makes gdb behave perfectly for me.

Does anyone have any comments on this, or ideas on how to improve on it? The only penalty I can see is an extra "nop" instruction for normal longjmps, which I'll gladly trade for a usable debugger.
PS: before anyone suggests it, I initially tried changing freebsd_uthread.c to check for the active thread more effectively, as is done in freebsd_uthread_fetch_registers, by comparing it with "_pthread_run", rather than checking the state. This improved things, but gdb still got confused, and started stopping unexpectedly when it lost it's breakpoints, etc, so I figured the other approach was probably going to be more stable. Index: lib/libc/i386/gen/_setjmp.S === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/lib/libc/i386/gen/_setjmp.S,v retrieving revision 1.16 diff -u -r1.16 _setjmp.S --- lib/libc/i386/gen/_setjmp.S 23 Mar 2002 02:05:17 - 1.16 +++ lib/libc/i386/gen/_setjmp.S 7 Aug 2003 20:42:08 - @@ -66,6 +66,17 @@ .weak CNAME(_longjmp) .set CNAME(_longjmp),CNAME(___longjmp) ENTRY(___longjmp) +/* + * Debuggers tend to put breakpoints in longjmp, while + * threads libraries don't like to be interrupted. + * The extra nop for the exposed "_longjmp" stops + * ___longjmp getting mucked about with by the debugger + * The threads library can then call ___longjmp_raw + * with impunity. 
+ */ + nop + nop +ENTRY(___longjmp_raw) movl 4(%esp),%edx movl 8(%esp),%eax movl 0(%edx),%ecx Index: lib/libc_r/uthread/uthread_kern.c === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/lib/libc_r/uthread/uthread_kern.c,v retrieving revision 1.45 diff -u -r1.45 uthread_kern.c --- lib/libc_r/uthread/uthread_kern.c 5 Oct 2002 02:22:26 - 1.45 +++ lib/libc_r/uthread/uthread_kern.c 7 Aug 2003 20:39:44 - @@ -95,7 +95,7 @@ curthread->check_pending = 1; /* Switch to the thread scheduler: */ - ___longjmp(_thread_kern_sched_jb, 1); + ___longjmp_raw(_thread_kern_sched_jb, 1); } @@ -165,7 +165,7 @@ } } /* Switch to the thread scheduler: */ - ___longjmp(_thread_kern_sched_jb, 1); + ___longjmp_raw(_thread_kern_sched_jb, 1); } void @@ -582,7 +582,7 @@ #if NOT_YET _setcontext(&curthread->ctx.uc); #else - ___longjmp(curthread->ctx.jb, 1); + ___longjmp_raw(curthread->ctx.jb, 1); #endif /* This point should not be reached. */ PANIC("Thread has returned from sigreturn or longjmp"); [EMAIL PROTECTED] gcc -o 1 -g -Wall -pthread 1.c [EMAIL PROTECTED] gdb ./1 GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-undermydesk-freebsd"... (gdb) b threadFunc Breakpoint 1 at 0x804861e: file 1.c, line 10. (gdb) run Starting program: /local/petere/1 Breakpoint 1, threadFunc (arg=0x0) at 1.c:10 10 sleep(1); (gdb) n Program received signal SIGSEGV, Segmentation fault. 0x280d0138 in _longjmp () from /usr/lib/libc.so.5 (gdb) _
Re: 5.1, Data Corruption, Intel, Oh my! [patch] - Fatal trap 12
On Wed, 2003-08-13 at 07:38, Terry Lambert wrote:
> Peter Edwards wrote:
> > > ... He might also want to look for any function pointer
> > > that takes 5 arguments;
> >
> > Nice tactic, but misleading in this case, methinks.
> >
> > I assume your basing this on the 5 arguments shown in the backtrace.
> > The 5 arguments passed to the "function" at 0x5949 is probably just
> > defaulted; I doubt it has any significance.
> >
> > Long version:
> >
> > ddb tries to work out the number of arguments passed to a function at a
> > particular stack frame first based on symbolic information for the
> > function itself (obviously not an option here), then based on the
> > instruction at the return address in that frame. This works at best
> > sporadically in the face of -O compiled C code. The fact that there's no
> > function under the "(null)" would strongly suggest that ddb got confused
> > with the frame pointer here and didn't get any useful information with
> > which to work out the argument count.
>
> I don't know how accurate this assumption is. I don't thing
> DDB is confused, because the NULL is consistent with the reported
> fault address. Even if we assume that it's confused, the PC is
> enough information to locate the function pointer dereference that
> is occurring. I also have to assume that the function pointer is
> in scope, since it's able to call through it to fault the kernel.
>
> > In the face of failure, ddb just wildly prints out the 5 words under the
> > stack pointer.
>
> I did suggest that the correct thing to do would be to decode
> what those words were pointing at, and thereby what types the
> arguments were...

My main point was really just a comment on your original statement that "He might also want to look for any function pointer that takes 5 arguments". I was assuming that you were suggesting this based on the fact that the stack frame containing the "(null)" indicated 5 arguments passed to the function at 0x5949.

DDB has no symbol for this address (it's certainly not a function) and does not know where it returns to (there is no function below it on the stack). DDB has no other way of working out how many arguments were passed in a particular stack frame. As a result, it is merely showing the first 5 _possible_ argument values to the function.

Agreed?
Re: 5.1, Data Corruption, Intel, Oh my! [patch] - Fatal trap 12
On Tue, 2003-08-12 at 12:52, Terry Lambert wrote:
> Bosko Milekic wrote:
> > > db> trace
> > > _mtx_lock_flags(0,0,c07aa287,11e,c0c21aaa) at _mtx_lock_flags+0x43
> > > vm_fault(c102f000,c000,2,0,c08205c0) at vm_fault+0x2b4
> > > trap_pfault(c0c21b9e,0,c4d8,10,c4d8) at trap_pfault+0x152
> > > trap(6c200018,10,1bc40060,1c,0) at trap+0x30d
> > > calltrap() at calltrap+0x5
> > > --- trap 0xc, eip = 0x5949, esp = 0xc0c21bde, dbp = 0xc0c21be4 ---
> > > (null)(1bf80058,0,530e0102,80202,505a61) at 0x5949
> > > db>
>
> ... He might also want to look for any function pointer
> that takes 5 arguments;

Nice tactic, but misleading in this case, methinks.

I assume you're basing this on the 5 arguments shown in the backtrace. The 5 arguments passed to the "function" at 0x5949 is probably just defaulted; I doubt it has any significance.

Long version:

ddb tries to work out the number of arguments passed to a function at a particular stack frame first based on symbolic information for the function itself (obviously not an option here), then based on the instruction at the return address in that frame. This works at best sporadically in the face of -O compiled C code. The fact that there's no function under the "(null)" would strongly suggest that ddb got confused with the frame pointer here and didn't get any useful information with which to work out the argument count. In the face of failure, ddb just wildly prints out the 5 words under the stack pointer.

Given that there's no real function at 0x5949, the stack frame won't have been set up at all, so the frame pointer is still pointing to the caller's frame, which could be foobar anyway.

What can be useful is to print out the values on the stack symbolically (in gdb, p/a ((void **)$sp)[EMAIL PROTECTED] I'm sure ddb can do something similar, but no idea how...), and hope to find the caller's return address lying in the output.

HTH,
Peter.
Re: Crash in g_dev_strategy / CURRENT as of yesterday.
On Tue, 2003-08-12 at 20:39, Poul-Henning Kamp wrote:
> I'm not much of a gdb wizard either, but I think you type "up" or "down" until
> you are at stack frame #12, and then simply say "print *bp->b_dev"

This might help. The original stack trace had this:

> #10 0xc04f3c65 in trap (frame=
>       {tf_fs = -1059913704, tf_es = -890109936, tf_ds = -1070268400,
>       tf_edi = -1040540480, tf_esi = -978597456, tf_ebp = -890095148,
>       tf_isp = -890095220, tf_ebx = 0, tf_edx = 0, tf_ecx = 0,
>       tf_eax = 16343040, tf_trapno = 12, tf_err = 2, tf_eip = -1070560519,
>       tf_cs = 8, tf_eflags = 66054, tf_esp = -978597456,
>       tf_ss = -1067143852}) at /usr/src/sys/i386/i386/trap.c:420

Look at tf_eip: -1070560519 = 0xc0308af9

What does "list *0xc0308af9" show in gdb?
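The conversion of tf_eip above is just a reinterpretation of a signed 32-bit value as unsigned. A minimal sketch of it in C follows, in case it saves anyone reaching for a calculator; the helper names `reg_to_addr` and `print_list_cmd` are made up for illustration and are not part of gdb or the kernel.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * gdb printed the trapframe registers as signed 32-bit decimals.
 * Reinterpreting the same 32 bits as unsigned gives the address.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t reg_to_addr(int32_t printed)
{
    return (uint32_t)printed;
}

/* Print the command one would then type at the gdb prompt. */
static void print_list_cmd(int32_t printed)
{
    printf("list *0x%08" PRIx32 "\n", reg_to_addr(printed));
}
```

Calling `print_list_cmd(-1070560519)` prints the "list" command for the faulting instruction's address quoted in the mail.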
Re: 5.1-R: zero byte core file.
On Wed, 2003-08-20 at 22:11, Yogeshwar Shenoy wrote:
> While using 5.1-RELEASE, I find that if my application program seg
> faults, it produces "programname.core"; but it is 0 bytes.
> I ran the exact same program on another machine that was running
> 4.4-RELEASE, and I do get a core file that I can use with gdb.
> I'd really appreciate if someone could help me resolve this.
>
> Additional details:
> - It is not specific to the application program. I tried a 2 line program:
>       char p[8];
>       memcpy(p, "1234567890123456789012345678901234567890", 40);
>   with same results on 5.1-R (0 byte core file) and 4.4-R (usable core file)
>
> - "ulimit -a" on the 5.1-R machine gives
>       core file size (blocks, -c) unlimited
>
> - Just to be sure I used getrlimit() to find what the limit for
>   RLIMIT_CORE is in my processes, and it is RLIM_INFINITY.
>
> - I did the basic checks like write permission on current directory, it
>   looks fine.
>
> Can someone help me resolve this?
>
> Thanks,
> Yogeshwar.

Hi,

Is the core file being dumped to an NFS mount? If so, you need to be running rpc.lockd, or use the "-L" option for the NFS mount. Otherwise, you'll get the behaviour you've noticed.

When dumping core, the kernel attempts to get an advisory lock on the corefile, to reduce the overhead in the pathological case where a large number of processes simultaneously start dumping core to the same file. There's a conflict between two default behaviours: rpc.lockd doesn't run by default (disabled in /etc/defaults/rc.conf), and NFS mounts try to use the service it provides unless you use the "-L" option. So, by default, advisory locking (and, by extension, core dumping) doesn't work on NFS mounts.

The -stable branch doesn't do client-side NFS locking: its advisory locking for NFS mounts "works", but is only visible to the local client.

HTH,
Peter.
useful workaround and analysis of vnode-backed md deadlock
There have been a few reports of deadlocks in md on the lists recently, and I walked into it trying to generate flash images for my shiny new Soekris box. In particular, a previous mail mentioned something getting stuck in "wdrain": (Message-ID <[EMAIL PROTECTED]> from [EMAIL PROTECTED])

For the impatient, a way I found around the problem was to mount the md-backed filesystems with the "sync" option.

I analysed the deadlock a little, and here's a synopsis, in case it's of use to anyone. I tracked this down as well as I could, and it appears to be an interaction between three processes. This may not be (and most likely isn't) the only md deadlock, but as long as I otherwise leave the backing file alone, I don't experience any problems once I mount the filesystem sync. And, because the underlying filesystem is async, access to the md filesystem isn't painfully slower than normal.

1: One thread is operating on the filesystem. In general, this thread is creating dirty buffers for later processing by the bufdaemon, and also making direct write requests. This doesn't actually participate in the deadlock, but does set the stage for it.

2: The "md" thread, processing requests from (1), attempts to lock the vnode for the underlying md device, in order to fulfill a queued write request on the md device.

3: Meanwhile the bufdaemon has kicked in, and is flushing dirty buffers. Some of these are for the files on the md filesystem, some are for the vnode backing the md device itself (actually, I assume that the flushing of the former causes a sudden surge in the latter, as the writes to the md filesystem are converted to writes to the backing vnode). The bufdaemon has locked the md vnode in order to write bufs to it. However, it needs to wait for "runningbufspace", which is designed to limit the number of in-flight async buffer writes. Once the running buffer space exceeds a high threshold, the thread issuing the write is blocked, to be awakened when completed async writes bring it under the low threshold.

However, a large chunk of the running buf space is sitting queued for the md thread to process. The md thread can't continue without the vnode lock, so the running buffer space will not fall, and the bufdaemon cannot continue without running buffer space, so will never release the vnode lock.

--
Peter Edwards.
Re: Text file busy
Terry Lambert wrote:
> Wesley Morgan wrote:
> > It's also unfortunate that this protection does not seem to extend to
> > libraries. I've had some in-use X libraries get overwritten with some
> > very colorful results.
>
> So send patches.

I did a year ago :-) See PR 37554. (Not the original patch, the self-follow-up). That was for 4.5-STABLE: It's been running on a box that does nightly builds of -current and -stable (and infrequent installworlds of -stable) since then without any ill effects.

A -current equivalent (with a sysctl knob, "vm.mmap_exec_immutable", to turn the behaviour on/off) is attached, in case anyone's interested. As noted in the original PR, the choice of PROT_EXEC to decide to add VV_TEXT to the vnode might be better done with a new mmap flag, say, PROT_IMMUTABLE or something, but PROT_EXEC works fine for me.

Index: sys/vm/vm_mmap.c
===
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/vm/vm_mmap.c,v
retrieving revision 1.165
diff -u -r1.165 vm_mmap.c
--- sys/vm/vm_mmap.c 7 Sep 2003 18:47:54 - 1.165
+++ sys/vm/vm_mmap.c 15 Sep 2003 13:36:46 -
@@ -91,6 +91,11 @@
 static int max_proc_mmap;
 SYSCTL_INT(_vm, OID_AUTO, max_proc_mmap, CTLFLAG_RW, &max_proc_mmap, 0, "");
 
+static int mmap_exec_immutable = 1;
+SYSCTL_INT(_vm, OID_AUTO, mmap_exec_immutable, CTLFLAG_RW,
+    &mmap_exec_immutable, 1, "mmap(2) of a regular file for execute access "
+    "marks the file as immutable");
+
 /*
  * Set the maximum number of vm_map_entry structures per process. Roughly
  * speaking vm_map_entry structures are tiny, so allowing them to eat 1/100
@@ -443,8 +448,18 @@
         error = vm_mmap(&vms->vm_map, &addr, size, prot, maxprot,
             flags, handle, pos);
         mtx_lock(&Giant);
-        if (error == 0)
+        if (error == 0) {
+                /*
+                 * If mapping a regular file as PROT_EXEC, and configured to,
+                 * mark the file as immutable
+                 */
+                if (mmap_exec_immutable &&
+                    handle != NULL && vp != NULL &&
+                    (prot & PROT_EXEC) && vp->v_type == VREG)
+                        vp->v_vflag |= VV_TEXT;
                 td->td_retval[0] = (register_t) (addr + pageoff);
+        }
+
 done:
         if (vp)
                 vput(vp);
Re: Error assigning master socket: Too many open files
Andreas Klemm wrote:
> Hi,
>
> Urgent question, wanna help a colleague, who secured a router, but
> trying to scan ports fails with -current.
>
> I don't want to blame anybody, I know what the policy of current is.
> If I can't get quick help on this I use a Windows tool, no problem.
> I only want to save me the work to install this Win tool and I think
> it's interesting, to find out, that there might be a problem.
>
> The machine was freshly booted
>
> Is there a workaround ?
>
> [EMAIL PROTECTED] /usr/ports/security/portscanner/work/PortScanner-1.2
> portscanner -vv -v -v -b 1 -e 6 xx.xxx.xxx.xx xx.xxx.xx.xx
> Error assigning master socket: Too many open files
> Exit 255

The patch applied by the port appears bogus. It adds braces around an "if" that stops it executing the way it was intended. I've a sneaking suspicion that the braces were added for "clarity", but the indentation in the original file is so badly off that the terminating brace was put in the wrong place.

Try replacing patch-ab with this:

--- portscanner.c.orig Wed Aug 19 18:37:44 1998
+++ portscanner.c Wed Oct 22 15:28:05 2003
@@ -25,8 +25,8 @@
 /***/
 #include
-#include
 #include
+#include
 #include
 #include
 #include
RE: lockmgr panic on shutdown
>I can confirm the lockmgr panic on shutdown reported by someone else
>earlier (whose message I mistakenly deleted).
>
>It looks like swapper is trying to undo a lock from pagedaemon and runs
>into trouble. This is probably related to the Giant pushdown of
>vm_pageout() that alc did last week.
>
>I'm building with INVARIANTS to see if that will catch more info. Will
>report back soon.

Just happened to me too. I think I see the problem:

When boot() calls sync(), it passes &thread0 as the thread argument. This gets propagated up to ffs_sync, which:
    calls vget(), which takes a thread argument.
    does some stuff
    calls vput(), which does _not_ take a thread argument

The vget() is passed thread0, as passed from boot. The vput() gets the current thread, which is the process calling boot. The unlocking in vput is asserting that the same thread that acquired the lock is releasing it, which seems reasonable.

The obvious solution might be to change line 1161 of ffs_vfsops to pass vget() "curthread" rather than td. I assume there's a good reason why "thread0" is passed from boot(), but I can't see why that's of any use to the vnode locking. i.e.:

Index: ffs_vfsops.c
===
RCS file: /usr/cvs/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.221
diff -u -r1.221 ffs_vfsops.c
--- ffs_vfsops.c 1 Nov 2003 05:51:54 - 1.221
+++ ffs_vfsops.c 2 Nov 2003 03:06:42 -
@@ -1158,7 +1158,7 @@
                         continue;
                 }
                 mtx_unlock(&mntvnode_mtx);
-                if ((error = vget(vp, lockreq, td)) != 0) {
+                if ((error = vget(vp, lockreq, curthread)) != 0) {
                         mtx_lock(&mntvnode_mtx);
                         if (error == ENOENT)
                                 goto loop;

How come the parameters to vget and vput are lopsided like this? This might have something to do with the commit of revision 1.218 of ffs_vfsops.c, but I'm not sure.
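To make the mismatch described above concrete, here is a toy model in C of a lock that remembers its owner and refuses a release from anyone else. It is only a sketch: `struct thread`, `struct owned_lock` and the three functions are stand-ins invented for illustration, not the kernel's lockmgr, vget() or vput(), and where the real kernel panics this model just returns -1.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct thread (illustration only). */
struct thread { int id; };

/* A lock that records which thread acquired it. */
struct owned_lock { const struct thread *owner; };

/* Like vget(): the caller says which thread should own the lock. */
static int lock_acquire(struct owned_lock *lk, const struct thread *td)
{
    if (lk->owner != NULL)
        return -1;              /* already held */
    lk->owner = td;
    return 0;
}

/* Like vput(): always releases on behalf of the running thread. */
static int lock_release(struct owned_lock *lk, const struct thread *running)
{
    if (lk->owner != running)
        return -1;              /* the real kernel panics at this point */
    lk->owner = NULL;
    return 0;
}

/* Models one pass of the sync loop: lock as `passed`, unlock as `running`. */
static int sync_one_vnode(const struct thread *passed,
                          const struct thread *running)
{
    struct owned_lock lk = { NULL };

    if (lock_acquire(&lk, passed) != 0)
        return -1;
    return lock_release(&lk, running);
}
```

With `passed` set to a thread0 stand-in and `running` set to the thread calling boot(), `sync_one_vnode()` fails; passing the running thread for both, which is what the curthread change amounts to, succeeds.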
RE: lockmgr panic on shutdown
>> For giggles I'm rolling back vfs_default.c back to 1.87 since its along
>> the backtrace path.
>
>This didn't work so -CURRENT is fully broke.
>
>I'd suggest staying on 10/30 not before 4PM PST if you want to not crash
>on shutdown.
>

The patch worked for me. (Well, a slightly modified one: I passed 0 for the thread argument to vget: It recognises that as special). Included here is the patch to both the ffs and default "sync" operations. I didn't exercise the default one, but the ffs case is certainly behaving itself.

Index: kern/vfs_default.c
===
RCS file: /usr/cvs/FreeBSD-CVS/src/sys/kern/vfs_default.c,v
retrieving revision 1.89
diff -u -r1.89 vfs_default.c
--- kern/vfs_default.c 1 Nov 2003 05:51:54 - 1.89
+++ kern/vfs_default.c 2 Nov 2003 03:36:03 -
@@ -898,7 +898,7 @@
                 }
                 mtx_unlock(&mntvnode_mtx);
-                if ((error = vget(vp, lockreq, td)) != 0) {
+                if ((error = vget(vp, lockreq, 0)) != 0) {
                         mtx_lock(&mntvnode_mtx);
                         if (error == ENOENT)
                                 goto loop;
Index: ufs/ffs/ffs_vfsops.c
===
RCS file: /usr/cvs/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.221
diff -u -r1.221 ffs_vfsops.c
--- ufs/ffs/ffs_vfsops.c 1 Nov 2003 05:51:54 - 1.221
+++ ufs/ffs/ffs_vfsops.c 2 Nov 2003 03:22:13 -
@@ -1158,7 +1158,7 @@
                         continue;
                 }
                 mtx_unlock(&mntvnode_mtx);
-                if ((error = vget(vp, lockreq, td)) != 0) {
+                if ((error = vget(vp, lockreq, 0)) != 0) {
                         mtx_lock(&mntvnode_mtx);
                         if (error == ENOENT)
                                 goto loop;
Bug: nmount(2) lacks parameter checking.
Hi,

Looking over the code for nmount(), I think I noticed a few bugs. (tried send-pr, but the lack of a web-front-end at freebsd.org, and a decent mail system locally means that's not a runner)

nmount() calls vfs_nmount() pretty much directly after copying in the io vector from userland. vfs_nmount() then calls vfs_buildopts() as the first thing it does. There are a couple of problems here.

Firstly, there's no check up to this point that the user passing the options in is indeed root. So, all the bugs mentioned can be tickled by a non-root user.

vfs_buildopts doesn't ensure that the option's "name" is a null terminated string, but it later calls vfs_sanitizeopts(), which assumes this. By passing in strings just at and just over the pagesize, a non-root user can cause a crash in vfs_buildopts reasonably reliably when strcmp hits an unmapped page. (Program available on request)

vfs_buildopts also leaks memory if it jumps to "bad": anything in the current option is lost to the woods.

There's also no checking on how much memory is actually acquired by vfs_buildopts(): it can be passed up to MAX_IOVCOUNT (1024) elements in the iovec, and each of these can be up to 64K in size. That's 64M of memory, plus some overhead for option structures, which would be a lot to start chewing up in the kernel.

The source I based these observations on is from today, while my kernel is a few weeks old, and I no longer have source for it. Given the traffic on the list recently, I figure now is Not A Good Time to install a fresh kernel, so the patch attached is tested to the point that it compiles, but I think something like it is required.

Index: kern/vfs_mount.c
===
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/kern/vfs_mount.c,v
retrieving revision 1.111
diff -u -r1.111 vfs_mount.c
--- kern/vfs_mount.c 26 Sep 2003 09:07:27 - 1.111
+++ kern/vfs_mount.c 4 Nov 2003 21:46:44 -
@@ -246,6 +246,8 @@
         struct vfsopt *opt;
         unsigned int i, iovcnt;
         int error, namelen, optlen;
+        size_t memTotal = 0;
+        static const size_t maxMemTotal = 1024 * 64;
 
         iovcnt = auio->uio_iovcnt;
         opts = malloc(sizeof(struct vfsoptlist), M_MOUNT, M_WAITOK);
@@ -256,6 +258,26 @@
                 optlen = auio->uio_iov[i + 1].iov_len;
                 opt->name = malloc(namelen, M_MOUNT, M_WAITOK);
                 opt->value = NULL;
+                opt->len = optlen;
+
+                /*
+                 * Do this early, so jumps to "bad" will free the current
+                 * option
+                 */
+                TAILQ_INSERT_TAIL(opts, opt, link);
+                memTotal += sizeof (struct vfsopt) + optlen + namelen;
+
+                /*
+                 * Avoid consuming too much memory, and attempts to overflow
+                 * memTotal
+                 */
+                if (memTotal > maxMemTotal ||
+                    optlen > maxMemTotal ||
+                    namelen > maxMemTotal) {
+                        error = EINVAL;
+                        goto bad;
+                }
+
                 if (auio->uio_segflg == UIO_SYSSPACE) {
                         bcopy(auio->uio_iov[i].iov_base, opt->name, namelen);
                 } else {
@@ -264,7 +286,11 @@
                         if (error)
                                 goto bad;
                 }
-                opt->len = optlen;
+                /* Ensure names are null-terminated strings */
+                if (opt->name[namelen - 1] != '\0') {
+                        error = EINVAL;
+                        goto bad;
+                }
                 if (optlen != 0) {
                         opt->value = malloc(optlen, M_MOUNT, M_WAITOK);
                         if (auio->uio_segflg == UIO_SYSSPACE) {
@@ -277,7 +303,6 @@
                                         goto bad;
                         }
                 }
-                TAILQ_INSERT_TAIL(opts, opt, link);
         }
         vfs_sanitizeopts(opts);
         *options = opts;
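The two checks the patch adds (names must be null-terminated, and the total memory taken must be capped) can be sketched as a standalone userland function. This is an illustration under assumptions, not the kernel code: `check_mount_opt` and `MAX_OPT_MEM` are hypothetical names, the cap simply mirrors the 64K used in the patch, and the per-option struct overhead is left out. Unlike the patch as posted, the sketch also rejects a zero-length name, since indexing name[namelen - 1] would otherwise read before the buffer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical cap on total option memory, mirroring maxMemTotal above. */
#define MAX_OPT_MEM ((size_t)64 * 1024)

/*
 * Validate one name/value pair before any copying is done.
 * `total` is the running sum of bytes accepted so far.
 * Returns 0 if the option is acceptable, EINVAL otherwise.
 */
static int check_mount_opt(const char *name, size_t namelen, size_t optlen,
                           size_t *total)
{
    /* Reject empty names and individually oversized pieces first. */
    if (namelen == 0 || namelen > MAX_OPT_MEM || optlen > MAX_OPT_MEM)
        return EINVAL;

    /* The name must be a null-terminated string. */
    if (name[namelen - 1] != '\0')
        return EINVAL;

    /* Check the running total without ever letting the sum wrap. */
    if (*total > MAX_OPT_MEM - namelen ||
        *total + namelen > MAX_OPT_MEM - optlen)
        return EINVAL;

    *total += namelen + optlen;
    return 0;
}
```

The caller keeps one `total` across the whole iovec, so a request made of many small options is refused as soon as the sum would pass the cap.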
ATAPI-CD corruption since GEOMification (& possible fix)
With a -current built after atapi-cd was changed over to GEOM, reads from a filesystem mounted on a CD device are being corrupted, with junk being inserted into the file from offset 63489 onwards.

I had a quick look around atapi-cd.c, and I think I spotted the problem: applying this patch certainly stopped the corruption I was seeing. Anyone else seeing this? Can someone verify that this is indeed the correct fix?

My CD device probes as:
acd1: CDRW at ata1-slave PIO4
if it's of any interest.

Index: atapi-cd.c
===
RCS file: /usr/cvs/FreeBSD-CVS/src/sys/dev/ata/atapi-cd.c,v
retrieving revision 1.152
diff -u -r1.152 atapi-cd.c
--- atapi-cd.c 7 Nov 2003 08:31:09 - 1.152
+++ atapi-cd.c 8 Nov 2003 21:06:15 -
@@ -1018,7 +1018,7 @@
         u_int pos, size = cdp->iomax - cdp->iomax % bp->bio_to->sectorsize;
         struct bio *bp2;
 
-        for (pos = 0; pos < bp->bio_length; pos += bp->bio_length) {
+        for (pos = 0; pos < bp->bio_length; pos += size) {
                 if (!(bp2 = g_clone_bio(bp))) {
                         bp->bio_error = ENOMEM;
                         break;

--
Peter Edwards.
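The effect of the one-word change in that loop can be shown with a small userland model. It assumes, and this is an assumption since the rest of the loop body is not quoted above, that each cloned request covers at most `size` bytes starting at `pos`; `bytes_covered` is a made-up name for illustration.

```c
#include <stddef.h>

/*
 * Model of the splitting loop: walk a request of `length` bytes,
 * issuing one piece of at most `size` bytes per iteration and then
 * advancing by `step`. Returns how many bytes the pieces cover.
 * `step` must be non-zero or the loop would never advance.
 */
static size_t bytes_covered(size_t length, size_t size, size_t step)
{
    size_t pos, covered = 0;

    for (pos = 0; pos < length; pos += step) {
        size_t chunk = length - pos < size ? length - pos : size;
        covered += chunk;
    }
    return covered;
}
```

Stepping by the full request length (the old code) issues a single piece and stops, so everything past the first `size` bytes is never read; stepping by `size` covers the whole request. The figures used when trying this out, such as a 63488-byte piece size, are illustrative guesses chosen to sit near the offset reported in the mail, not values taken from the driver.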
Fwd: propgagate_priority() crashes: recursive msleep() ??
(Apologies if this message is a duplicate: The original is AWOL for quite a while now)

Hi,

I'm getting a crash in propagate priority, as mentioned by a few people recently. Bug reports and comments about it seemed to have dropped off, so given that I can reliably reproduce it, I was trying to work out why it's going on.

One thing I found quite odd was the following stack trace. It appears that msleep() is being called recursively via cursig() calling stopevent. When msleep calls cursig(), it has temporarily dropped Giant. Surely this is bogus? (This is from a kernel updated in the last few hours)

#0 sched_switch (td=0xc4b30780) at /scratch/src/sys/kern/sched_4bsd.c:606
#1 0xc050d8db in mi_switch () at /scratch/src/sys/kern/kern_synch.c:514
#2 0xc050cf7f in msleep (ident=0xc4dc2bc8, mtx=0xc4dc2b04, priority=92, wmesg=0x0, timo=0)
   at /scratch/src/sys/kern/kern_synch.c:255
#3 0xc0534255 in stopevent (p=0xc4dc2a98, event=2, val=2)
   at /scratch/src/sys/kern/sys_process.c:740
#4 0xc0509362 in issignal (td=0xc4b30780) at /scratch/src/sys/kern/kern_sig.c:2082
#5 0xc0504eb8 in cursig (td=0xc4b30780) at /scratch/src/sys/sys/signalvar.h:227
#6 0xc050d0f2 in msleep (ident=0xc4dc2a98, mtx=0xc4dc2b04, priority=348, wmesg=0x0, timo=0)
   at /scratch/src/sys/kern/kern_synch.c:294
#7 0xc04eb82f in wait1 (td=0xc4b30780, uap=0xddcd6d10, compat=0)
   at /scratch/src/sys/kern/kern_exit.c:766
#8 0xc04eab90 in wait4 (td=0x0, uap=0x0) at /scratch/src/sys/kern/kern_exit.c:548
#9 0xc06241d0 in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134899628,
      tf_esi = 134912305, tf_ebp = -1077943784, tf_isp = -573739660,
      tf_ebx = 772, tf_edx = 135012352, tf_ecx = 13, tf_eax = 7,
      tf_trapno = 12, tf_err = 2, tf_eip = 134525375, tf_cs = 31,
      tf_eflags = 646, tf_esp = -1077943812, tf_ss = 47})
   at /scratch/src/sys/i386/i386/trap.c:1010
Re: Who needs these silly statfs changes...
Bernd Walter wrote:
> On Thu, Nov 13, 2003 at 12:54:18AM -0800, Kris Kennaway wrote:
> > On Thu, Nov 13, 2003 at 06:44:25PM +1100, Peter Jeremy wrote:
> > > On Wed, Nov 12, 2003 at 06:04:00PM -0800, Kris Kennaway wrote:
> > > > ...my sparc machine reports that my i386 nfs server has 15 exabytes of
> > > > free space!
> > > >
> > > > enigma# df -k
> > > > Filesystem   1K-blocks     Used             Avail Capacity  Mounted on
> > > > rot13:/mnt2   56595176 54032286 18014398507517260       0%  /rot13/mnt2
> > >
> > > 18014398507517260 = 2^54 - 1964724. and 2^54KB == 2^64 bytes. Is it
> > > possible that rot13:/mnt2 has negative free space? (ie it's into the
> > > 8-10% reserved area).
> >
> > Yes, that's precisely what it is..the bug is either in df or the kernel
> > (I suspect the latter, i.e. something in the nfs code).
>
> And it's nothing new - I'm seeing this since several years now.

The NFS protocols have unsigned fields where statfs has signed equivalents: NFS can't represent negative available disk space (without the knowledge of the underlying filesystem on the server, negative free space is a little nonsensical anyway, I suppose).

The attached patch stops the NFS server assigning negative values to unsigned fields in the statfs response, and works against my local Solaris box. Seem reasonable?

Index: nfs_serv.c
===
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.137
diff -u -r1.137 nfs_serv.c
--- nfs_serv.c 24 Oct 2003 18:36:49 - 1.137
+++ nfs_serv.c 14 Nov 2003 13:27:42 -
@@ -3812,7 +3812,7 @@
                 tval = (u_quad_t)sf->f_bfree;
                 tval *= (u_quad_t)sf->f_bsize;
                 txdr_hyper(tval, &sfp->sf_fbytes);
-                tval = (u_quad_t)sf->f_bavail;
+                tval = sf->f_bavail > 0 ? (u_quad_t)sf->f_bavail : 0;
                 tval *= (u_quad_t)sf->f_bsize;
                 txdr_hyper(tval, &sfp->sf_abytes);
                 sfp->sf_tfiles.nfsuquad[0] = 0;
@@ -3827,7 +3827,8 @@
                 sfp->sf_bsize = txdr_unsigned(sf->f_bsize);
                 sfp->sf_blocks = txdr_unsigned(sf->f_blocks);
                 sfp->sf_bfree = txdr_unsigned(sf->f_bfree);
-                sfp->sf_bavail = txdr_unsigned(sf->f_bavail);
+                sfp->sf_bavail = txdr_unsigned(sf->f_bavail > 0 ?
+                    sf->f_bavail : 0);
         }
 nfsmout:
         if (vp)
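The arithmetic behind the "15 exabytes" figure, and what the clamp in the patch does to it, can be reproduced in a few lines of C. This is a sketch only: the function names are invented for illustration, and the 1024-byte block size is an assumption made so that the numbers line up with the df -k output quoted above.

```c
#include <stdint.h>

/*
 * What the server did before the patch: cast a (possibly negative)
 * available-block count straight to unsigned and scale it to bytes.
 * Unsigned arithmetic wraps modulo 2^64.
 */
static uint64_t avail_bytes_unclamped(int64_t bavail, uint64_t bsize)
{
    return (uint64_t)bavail * bsize;
}

/* What the patch does: treat a negative count as "nothing available". */
static uint64_t avail_bytes_clamped(int64_t bavail, uint64_t bsize)
{
    uint64_t blocks = bavail > 0 ? (uint64_t)bavail : 0;

    return blocks * bsize;
}
```

With bavail = -1964724 and 1K blocks, the unclamped result is 2^64 - 1964724 * 1024 bytes, which df -k shows as 2^54 - 1964724 = 18014398507517260 1K-blocks, the number in the report. The clamped version reports zero instead.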
Re: Fwd: propgagate_priority() crashes: recursive msleep() ??
John Baldwin wrote:
> On 14-Nov-2003 Peter Edwards wrote:
> > (Apologies if this message is a duplicate: The original is AWOL for
> > quite a while now)
> >
> > Hi,
> >
> > I'm getting a crash in propagate priority, as mentioned by a few
> > people recently. Bug reports and comments about it seemed to have
> > dropped off, so given that I can reliably reproduce it, I was trying
> > to work out why it's going on.
> >
> > One thing I found quite odd was the following stack trace. It appears
> > that msleep() is being called recursively via cursig() calling
> > stopevent. When msleep calls cursig(), it has temporarily dropped
> > Giant. Surely this is bogus? (This is from a kernel updated in the
> > last few hours)
> >
> > #0 sched_switch (td=0xc4b30780) at /scratch/src/sys/kern/sched_4bsd.c:606
> > #1 0xc050d8db in mi_switch () at /scratch/src/sys/kern/kern_synch.c:514
> > #2 0xc050cf7f in msleep (ident=0xc4dc2bc8, mtx=0xc4dc2b04, priority=92, wmesg=0x0, timo=0)
> >    at /scratch/src/sys/kern/kern_synch.c:255
> > #3 0xc0534255 in stopevent (p=0xc4dc2a98, event=2, val=2)
> >    at /scratch/src/sys/kern/sys_process.c:740
> > #4 0xc0509362 in issignal (td=0xc4b30780) at /scratch/src/sys/kern/kern_sig.c:2082
> > #5 0xc0504eb8 in cursig (td=0xc4b30780) at /scratch/src/sys/sys/signalvar.h:227
> > #6 0xc050d0f2 in msleep (ident=0xc4dc2a98, mtx=0xc4dc2b04, priority=348, wmesg=0x0, timo=0)
> >    at /scratch/src/sys/kern/kern_synch.c:294
> > #7 0xc04eb82f in wait1 (td=0xc4b30780, uap=0xddcd6d10, compat=0)
> >    at /scratch/src/sys/kern/kern_exit.c:766
> > #8 0xc04eab90 in wait4 (td=0x0, uap=0x0) at /scratch/src/sys/kern/kern_exit.c:548
> > #9 0xc06241d0 in syscall (frame=
> >       {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134899628,
> >       tf_esi = 134912305, tf_ebp = -1077943784, tf_isp = -573739660,
> >       tf_ebx = 772, tf_edx = 135012352, tf_ecx = 13, tf_eax = 7,
> >       tf_trapno = 12, tf_err = 2, tf_eip = 134525375, tf_cs = 31,
> >       tf_eflags = 646, tf_esp = -1077943812, tf_ss = 47})
> >    at /scratch/src/sys/i386/i386/trap.c:1010
>
> Are you using gdb or something else that does ptrace? Jeff has pointed
> out why pp panics here, because this thread owns the sigacts lock while
> asleep. However, doing a double sleep like this is very bogus and bad. G.

I was using "truss": the actual command I ran was

# truss mount unreachablehost:/mnt /mnt

(where "unreachablehost" was the IP address of a host I had no route to)

IIRC, the panicking thread was in softclock (possibly handling the terminal ^C, not sure), the mount command was waiting on the mount_nfs child to finish, and I assume the mount_nfs child was waiting in vain for a response it was never going to get. But, I suppose any traced process arriving in msleep (or cursig) is problematic.

Silly question: Could the STOPEVENT stuff in issignal() just be delayed until userret()? I thought that was done for some other similar circumstances.
Re: Who needs these silly statfs changes...
Bruce Evans wrote:
> On Fri, 14 Nov 2003, Peter Edwards wrote:
> > Bernd Walter wrote:
> > > On Thu, Nov 13, 2003 at 12:54:18AM -0800, Kris Kennaway wrote:
> > > > On Thu, Nov 13, 2003 at 06:44:25PM +1100, Peter Jeremy wrote:
> > > > > On Wed, Nov 12, 2003 at 06:04:00PM -0800, Kris Kennaway wrote:
> > > > > > ...my sparc machine reports that my i386 nfs server has 15
> > > > > > exabytes of free space!
> > > > > >
> > > > > > enigma# df -k
> > > > > > Filesystem   1K-blocks     Used             Avail Capacity  Mounted on
> > > > > > rot13:/mnt2   56595176 54032286 18014398507517260       0%  /rot13/mnt2
> > > > >
> > > > > 18014398507517260 = 2^54 - 1964724. and 2^54KB == 2^64 bytes. Is
> > > > > it possible that rot13:/mnt2 has negative free space? (ie it's
> > > > > into the 8-10% reserved area).
> > > >
> > > > Yes, that's precisely what it is..the bug is either in df or the
> > > > kernel (I suspect the latter, i.e. something in the nfs code).
> > >
> > > And it's nothing new - I'm seeing this since several years now.
> >
> > The NFS protocols have unsigned fields where statfs has signed
> > equivalents: NFS can't represent negative available disk space ( Without
> > the knowledge of the underlying filesystem on the server, negative free
> > space is a little nonsensical anyway, I suppose)
> >
> > The attached patch stops the NFS server assigning negative values to
> > unsigned fields in the statfs response, and works against my local
> > solaris box. Seem reasonable?
>
> The client attempts to fix this by pretending that the unsigned fields
> are signed. -current tries to do more to support file system sizes larger
> than 1TB, but the code for this is not even wrong except it may be wrong
> enough to break the negative values. See my reply to one of the PRs
> for more details.
>
> I just got around to testing the patch in that reply:
>
> %%%
> Index: nfs_vfsops.c
> ===
> RCS file: /home/ncvs/src/sys/nfsclient/nfs_vfsops.c,v

Your patch to nfs_vfsops won't apply to my Solaris kernel :-)
The protocol says "abytes" is unsigned, so the server shouldn't be lying by sending a huge positive value for available space on a full filesystem. No?
Re: Who needs these silly statfs changes...
>On Fri, 14 Nov 2003, Peter Edwards wrote:
>
>> Bruce Evans wrote:
>>
>> > On Fri, 14 Nov 2003, Peter Edwards wrote:
>
>> >> The NFS protocols have unsigned fields where statfs has signed
>> >> equivalents: NFS can't represent negative available disk space ( Without
>> >> the knowledge of the underlying filesystem on the server, negative free
>> >> space is a little nonsensical anyway, I suppose)
>> >>
>> >> The attached patch stops the NFS server assigning negative values to
>> >> unsigned fields in the statfs response, and works against my local
>> >> solaris box. Seem reasonable?
>> >
>> > The client attempts to fix this by pretending that the unsigned fields
>> > are signed. -current tries to do more to support file system sizes larger
>> > than 1TB, but the code for this is not even wrong except it may be wrong
>> > enough to break the negative values. See my reply to one of the PRs
>> > for more details.
>> >
>> > I just got around to testing the patch in that reply:
>> > ...
>>
>> Your patch to nfs_vfsops won't apply to my Solaris kernel :-)
>> The protocol says "abytes" is unsigned, so the server shouldn't be lying
>> by sending a huge positive value for available space on a full
>> filesystem. No?
>
>Possibly not, but the protocol is broken if it actually requires that.

What makes you say that? I would think the utility of negative counts for disk sizes and available spaces is marginal. Solaris, POSIX, and NFS seem to get on fine without it. What am I (and they) missing?

>The "free" fields are signed in struct statfs so that they can be negative.
>However, this is broken in POSIX's struct statvfs (all count fields have
>type fsblkcnt_t or fsfilcnt_t and these are specified to be unsigned).
>Is Solaris bug for bug compatible with that?

I'm away from any Solaris boxes at the moment, but I know that "df" certainly reports huge free space when it sees the high bit set in the abytes field of the NFS response.

I'm not sure that this is directly relevant to NFS, though. The operating system's own representation of FSSTATres has little to do with proper implementation of the protocol. I've no idea what Win32 represents this data in, but I'm sure it's very different from statv?fs. It'll provide and consume NFS services, though.

>
>Anyway, my patch is mainly supposed to fix the scaling. The main bug
>in the initial scaling patch was that the huge positive values were
>scaled before they were interpreted as negative values, so they became
>not so huge but still preposterous values that could not be interpreted
>as negative values.

Ok, understood.

>
>The type pun to negative values is in most versions of BSD:
>
[snip code snippets and bug]

That's great for interacting with other BSDs, but it's still abusing the protocol. As filesystems approaching 2^64 bytes in size become possible, it probably has more of an impact. I understand that this is probably not a big enough problem to break historic behaviour for, of course.

>More changes are needed here to catch up with the recent changes to struct
>statfs in FreeBSD. The casts to long are now just wrong since the block
>count fields don't have type long.

Sure.
Re: Who needs these silly statfs changes...
(CC's trimmed, I'm sure I'm boring people at this stage.) >Peter Edwards wrote: >> >>>On Wed, Nov 12, 2003 at 06:04:00PM -0800, Kris Kennaway wrote: >> >>>>...my sparc machine reports that my i386 nfs server has 15 exabytes >of >> >>>>free space! > >[ ... ] > >> The NFS protocols have unsigned fields where statfs has signed >> equivalents: NFS can't represent negative available disk space ( Without >> the knowledge of the underlying filesystem on the server, negative free >> space is a little nonsensical anyway, I suppose) >> >> The attached patch stops the NFS server assigning negative values to >> unsigned fields in the statfs response, and works against my local >> solaris box. Seem reasonable? > >I disagree. > >The intent of the negative number from df is to subtract the amount >used from the total amount available, in order to get the amount >remaining. I just don't see how you can possibly infer from the NFS spec that "abytes" is anything other than an unsigned quantity. I just think assuming the client will interpret a massive value as probably negative is a bit of a leap of faith. >This is an artifact of implementation on the server, and should >not be second-guessed by the client. If the server tells the client that there are 2^64 - 1 bytes remaining on the server, it's not second guessing anything by presenting that to the user. > >The problem in this case is on the client, not the server, in not >doing the conversion as an unsigned operation. The place for the >subtraction to occur is in the "df" program. In other words, the >statfs->f_bavail should be recalculated locally from the values >of statfs->f_blocks and statfs->f_bfree, not used directly out of >the (unsigned) NFS values... or the values should be converted to >signed values coming out of NFS prior to their sign extension to >the size type. 
(Note that NFS also gives an "fbytes", indicating the number of free bytes, as opposed to "available to a particular user".) "bavail" can really only be worked out by the server. The server is reserving a percentage for non-root users. The client can't work out what that reserve percentage is. > > >On a slightly related note, the standards mandated interfaces say >that the values should be fsblkcnt_t, which must be an unsigned >integer type. This coordinates well with my point of the sign >conversion on legacy interface needing to happen at presentation >time. > Maybe we're talking past each other. That's my main point: the server shouldn't put huge values in an unsigned field and expect the client to interpret them in a way that the spec sets no precedent for. >Also, if you read the ISO C99 standard, you'll see that on an >ILP32 system, there is no way to legitimately define an integer >type in excess of 32 bits, unless long is larger than 32 bits >(see section 3.6), so defining these things as 64 bits without >compiler changes is wrong anyway. As far as implementing NFS is concerned, that's probably not relevant: it doesn't have to be implemented in ISO C. The FreeBSD compiler provides a 64-bit integer type that its implementation is free to use :-)
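As a rough illustration of the server-side approach argued for in this message, the sketch below clamps a negative signed block count to zero before it is placed in an unsigned byte-count field. The helper name, its parameters, and the saturation on overflow are assumptions for illustration only; the actual patch is not reproduced in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a signed "available blocks" count (as in
 * struct statfs) into the unsigned byte count an NFS reply carries.
 * A negative count means the reserve is already in use, so report 0
 * rather than letting the value wrap to something near 2^64. */
uint64_t demo_avail_bytes(int64_t bavail_blocks, uint32_t block_size)
{
    assert(block_size > 0);
    if (bavail_blocks <= 0)
        return 0;
    /* Saturate instead of overflowing on very large filesystems. */
    if ((uint64_t)bavail_blocks > UINT64_MAX / block_size)
        return UINT64_MAX;
    return (uint64_t)bavail_blocks * block_size;
}
```

Without the clamp, a plain cast of -1 to `uint64_t` gives 2^64 - 1, which is the kind of value a strict client would display as an enormous amount of free space.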
Re: imgact_gzip.c
On Sat, 2003-06-07 at 10:13, Poul-Henning Kamp wrote: > In message <[EMAIL PROTECTED]>, David Yeske writes: >> imgact_gzip.c seems to be pretty stale. Has anyone considered fixing this? If >> this were fixed >> then kldload() / linker_load_module() could deal with a gzipped .ko file, and >> gzipped elf >> executables would work also? > > At least originally imgact_gzip.c was heavily a.out aware. Interesting. Making imgact_gzip elf-aware would not make the kernel capable of loading gzipped modules, only executables. There's a separate link_elf.c that the kernel uses for linking ELF images into itself (rather than activating ELF executables at exec() time, with imgact_elf) I've been fiddling a little with compressed data in the kernel already, and was able to hack together a patch for link_elf pretty quickly. The "quickly" means that there's no boot loader support, and the gzip handling is quite braindamaged, extracting the entire zipped file into allocated memory before parsing the ELF structure. This is mainly because the ELF parsing bits of link_elf assume they can make random access to the file. It'd take a bit of rework to make it work with a serial data stream. The whole thing's very rough around the edges, but I can gzip most of /boot/kernel/*.ko, and load the gzipped versions. I can polish this up, and/or add gzipped executable support, if there's any interest in reviewing or committing it. The patch adds "GZLOADER" and "INFLATE" options for the kernel, removing "GZIP" (which was busted anyway, and considered "inflate.c" to be part of the ELF support, while it's pretty much a standalone decompressor.) There's a "COMPAT_GZAOUT" option added, but it's just as bust as GZIP was before. E&OE. Patch may crash your kernel, delete your data, make your cat unwell, etc. 
Cheers, Peter.Index: conf/files === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/conf/files,v retrieving revision 1.791 diff -u -r1.791 files --- conf/files 9 Jun 2003 19:25:06 - 1.791 +++ conf/files 11 Jun 2003 12:48:40 - @@ -1011,7 +1011,7 @@ isofs/cd9660/cd9660_vnops.coptional cd9660 kern/imgact_elf.c standard kern/imgact_shell.cstandard -kern/inflate.c optional gzip +kern/inflate.c optional inflate kern/init_main.c standard kern/init_sysent.c standard kern/kern_acct.c standard Index: conf/files.i386 === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/conf/files.i386,v retrieving revision 1.445 diff -u -r1.445 files.i386 --- conf/files.i386 31 May 2003 17:06:19 - 1.445 +++ conf/files.i386 11 Jun 2003 12:51:27 - @@ -407,7 +407,7 @@ isa/syscons_isa.c optionalsc isa/vga_isa.c optionalvga kern/imgact_aout.c optionalcompat_aout -kern/imgact_gzip.c optionalgzip +kern/imgact_gzip.c optionalcompat_gzaout libkern/divdi3.c standard libkern/moddi3.c standard libkern/qdivrem.c standard Index: conf/options === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/conf/options,v retrieving revision 1.393 diff -u -r1.393 options --- conf/options18 May 2003 03:46:30 - 1.393 +++ conf/options11 Jun 2003 12:54:32 - @@ -603,3 +603,8 @@ # options for hifn driver HIFN_DEBUG opt_hifn.h HIFN_RNDTEST opt_hifn.h + +# options for gzip/"inflate" related functionality +INFLATEopt_inflate.h +COMPAT_GZAOUT opt_gzaout.h +GZLOADER opt_gzloader.h Index: kern/link_elf.c === RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/sys/kern/link_elf.c,v retrieving revision 1.73 diff -u -r1.73 link_elf.c --- kern/link_elf.c 12 May 2003 15:08:10 - 1.73 +++ kern/link_elf.c 11 Jun 2003 13:24:50 - @@ -28,6 +28,7 @@ #include "opt_ddb.h" #include "opt_mac.h" +#include "opt_gzloader.h" #include #include @@ -42,6 +43,10 @@ #include #include +#ifdef GZLOADER +#include +#endif + #include #ifdef GPROF #include @@ -98,9 +103,40 @@ #endif } *elf_file_t; +struct vnreader { + struct vnode *vnodep; + 
struct thread *thread; +}; + +#ifdef GZLOADER +#define MAXGZPAGES (1024 * 1024 / PAGE_SIZE) // Allow modules up to 1MB (uncompressed) + +struct gzreader { + /* reading from gzipped file. */ + int error; + struct vnode *vn; + unsigned char *inPage; + struct thread *td; + int inPageSize; + int inPageOffset; + off_t inFileOffset; + int inPageCount; + + /* gzip context */ + struct inflate inflator; + + /* Writing to inflated output */ + int outPa
Re: NFS problem
Hi, > > All the files are 0-sized, dates are set back to the epoch and > > directories are seen as files. Exporting ufs2 filesystems works as > > expected. I've had problems like this exporting CDs via NFS to Solaris. Sorry the details are murky, but if it's the same problem, there's a workaround. Check the dmesg output: does it complain about an "RRIP field" from the cd9660 code? From the source, I think it was "RRIP without PX field?" The CDs in question were official Sun CDs with Solaris applications (which, of course, doesn't mean they're properly compliant with a standard, just that it's likely others will run into the same problem). If this is the issue, then mounting it with NFS v2 actually fixed the problem for me: I assume the richer operations from v3 were tickling a problem not noticed with v2. -- Peter
Re: Looking for comments on a new utility...
Solaris has something similar in /usr/proc/bin/ptree. One of the things it lets you do is specify _which_ user to use. Isn't the kvm_*() interface somewhat frowned upon? Is there anything missing from /proc that you need kvm_* for? -- Cheers, Peter. Juli Mallett wrote: > Hej, > > As some of you may have noticed, I've done some poking of ps(1) lately, and > this has brought attention of people who have ideas for things that they > would like to see done to ps(1) :) The most notable request was for a > feature I've missed having in our ps(1) for a while, the ability to get a > tree of processes printed so you can tell who is whose child, etc. > > ps(1)'s internals, however, didn't seem quite right to me, but after about > 10 minutes reading kvm(3) manpages and recalling some tricks with recursive > programming to produce an N-level tree with as many as N-1 elements, I had > come up with a simple utility to print out a "process tree". > > You can find the code here: > http://people.freebsd.org/~jmallett/.proctree/proctree.c > > And some example output from a cluster machine here: > http://people.freebsd.org/~jmallett/.proctree/proctree.out > > Lots of people have given feedback that they don't care much for the \_ > formatting of the tree, and I'm willing to look at patches that provide > noticably more readable output. > > I'd actually like to hear what information otherwise could better be > included along with associated login, pid, cpu, etc. > > And I'd really like to hear thoughts about inclusion of this into the tree. > Does anyone hold the opinion that it absolutely cannot be included? Does > anyone have any suggestions to make the code better? > > I'm asking you guys, the CURRENT userbase, since you are users who obviously > seem to take more of an interest in FreeBSD's future, etc. :) > > Thanks, > juli. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
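For readers unfamiliar with the kind of output under discussion, here is a minimal sketch of the recursive approach to printing a process tree. It works on a made-up in-memory table rather than kvm(3) or /proc, and every name in it (`demo_proc`, `demo_render_tree`, `demo_sample_tree`) is hypothetical; it is not taken from proctree.c or ptree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified process record (not the kvm(3) structure). */
struct demo_proc {
    int pid;
    int ppid;
    const char *name;
};

/* Append every child of `parent` to `out`, indented by depth, then
 * recurse into each child so grandchildren follow their parent. */
static void render_children(const struct demo_proc *tab, size_t n,
                            int parent, int depth, char *out, size_t outsz)
{
    for (size_t i = 0; i < n; i++) {
        /* Skip non-children, and a process that is its own parent. */
        if (tab[i].ppid != parent || tab[i].pid == parent)
            continue;
        size_t used = strlen(out);
        snprintf(out + used, outsz - used, "%*s%d %s\n",
                 depth * 2, "", tab[i].pid, tab[i].name);
        render_children(tab, n, tab[i].pid, depth + 1, out, outsz);
    }
}

/* Render the whole table, starting from the children of `root_ppid`. */
void demo_render_tree(const struct demo_proc *tab, size_t n, int root_ppid,
                      char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    render_children(tab, n, root_ppid, 0, out, outsz);
}

/* Small made-up table used as a usage example. */
const char *demo_sample_tree(void)
{
    static const struct demo_proc tab[] = {
        { 1, 0, "init" }, { 10, 1, "sh" }, { 11, 10, "ps" }, { 20, 1, "cron" },
    };
    static char buf[128];
    demo_render_tree(tab, sizeof(tab) / sizeof(tab[0]), 0, buf, sizeof(buf));
    return buf;
}
```

The sketch rescans the whole table for each parent, which is quadratic but keeps the code short; a real tool would probably index children by parent first.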
Re: different packing of structs in kernel vs. userland ?
Hi, > He's making the valid point that for: > > struct foo *fee; > > It's possible that: > > sizeof(struct foo) != (((char *)&fee[1]) - ((char *)&fee[0])) Wouldn't that mean that struct X *xarr = malloc(sizeof (struct X) * arrayLen); wouldn't produce a usable array of struct X of length arrayLen? That can't be right. -- Peter Edwards.
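The point can be checked directly: in C, sizeof includes any trailing padding, so adjacent array elements are always exactly sizeof(struct) bytes apart. The sketch below uses a made-up struct (`demo`) and helper name; it is an illustration, not code from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct whose members differ in alignment, so that the
 * compiler is likely to add trailing padding after `c`. */
struct demo {
    double d;
    char   c;
};

/* Returns 1 if every adjacent pair of elements in a malloc'ed array of
 * n elements is exactly sizeof(struct demo) bytes apart, 0 otherwise
 * (including when the allocation fails). */
int stride_equals_sizeof(size_t n)
{
    assert(n > 0);
    struct demo *arr = malloc(sizeof(struct demo) * n);
    if (arr == NULL)
        return 0;
    int ok = 1;
    for (size_t i = 0; i < n; i++) {
        /* &arr[n] is one past the end, which is valid to compute. */
        ptrdiff_t step = (char *)&arr[i + 1] - (char *)&arr[i];
        if ((size_t)step != sizeof(struct demo))
            ok = 0;
    }
    free(arr);
    return ok;
}
```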
Bug unmounting readonly NTFS partitions uncovered by GEOM
ntfs_unmount has had this bug since v 1.1: there's a "ronly" variable that should be used to detect if the mount is read-only and affect the flags passed to the VOP_CLOSE of the device vnode accordingly. It's never set to anything other than zero, but the matching VOP_OPEN in ntfs_mount() gets it right.

GEOM notices the mismatch between the VOP_OPEN and VOP_CLOSE on the device, and panics in g_access_rel() when unmounting a read-only mounted NTFS partition.

Does someone want to commit the obvious patch?

petere@celery$ cvs -R diff -u sys/fs/ntfs
cvs diff: Diffing sys/fs/ntfs
Index: sys/fs/ntfs/ntfs_vfsops.c
===
RCS file: /usr/FreeBSD-CVS/src/sys/fs/ntfs/ntfs_vfsops.c,v
retrieving revision 1.47
diff -u -r1.47 ntfs_vfsops.c
--- sys/fs/ntfs/ntfs_vfsops.c 27 Sep 2002 18:27:06 - 1.47
+++ sys/fs/ntfs/ntfs_vfsops.c 13 Oct 2002 15:30:01 -
@@ -508,6 +508,7 @@
 	vinvalbuf(ntmp->ntm_devvp, V_SAVE, NOCRED, td, 0, 0);

+	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 	error = VOP_CLOSE(ntmp->ntm_devvp, ronly ? FREAD : FREAD|FWRITE,
 		NOCRED, td);

-- Peter Edwards
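The idea behind the fix is that the mode used to close the device must be derived from the same mount flag that determined the mode used to open it. A userland sketch of that invariant follows; the `DEMO_*` constants and function names are stand-ins invented for illustration and are not the kernel's definitions.

```c
#include <assert.h>

/* Stand-in constants for illustration only; the real FREAD, FWRITE and
 * MNT_RDONLY come from the kernel headers. */
enum { DEMO_FREAD = 0x1, DEMO_FWRITE = 0x2, DEMO_MNT_RDONLY = 0x1 };

/* Derive the device open/close mode from the mount flags, so that the
 * mount and unmount paths cannot disagree. */
int demo_devvp_mode(int mnt_flag)
{
    int ronly = (mnt_flag & DEMO_MNT_RDONLY) != 0;
    return ronly ? DEMO_FREAD : (DEMO_FREAD | DEMO_FWRITE);
}

/* Stand-in for the consistency check described above: releasing a mode
 * that was never acquired is treated as a fatal error. */
void demo_release(int opened_mode, int closing_mode)
{
    assert(opened_mode == closing_mode);
}
```

With `ronly` stuck at zero, the close side would always pass the read-write mode, which is exactly the mismatch that trips the check for a read-only mount.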
Floating point problems
Hi,

There was some discussion about issues with interactions between the floating point context and signal handling in a thread a week or so ago, and a suggestion that someone try and get a simple test that would fail. I was surprised how easy it was: the following program just spins calculating the value of 6.0 / 3.0, and traps SIGINT.

If you run it on -current (as of a few hours ago), 99% of the time, hitting Ctrl-C will cause the program to exit with an error. A 4.5 kernel never causes any problems.

I'm pretty sure this is what's causing the stalls and crashes in X. I've taken stack traces of crashes, and from "spinning" processes, and I can spot NaNs on the stack that shouldn't be there, etc.

I just thought it might be useful to send this, to firmly take the blame away from the X server, and give someone a way of reproducing the bug.

fptest.c:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Updated from the signal handler, hence volatile sig_atomic_t. */
volatile sig_atomic_t count = 0;

void
sigint(int sig)
{
	(void)sig;
	write(2, "INT\n", 4);
	count++;
}

int
main(int argc, char **argv)
{
	double x = 6.0;
	double y = 3.0;
	double d;
	double err;
	struct sigaction sa;

	sa.sa_handler = sigint;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGINT, &sa, 0);
	/* Spin on a division whose result should be exactly 2.0. */
	while (count < 30) {
		d = x / y;
		err = 2.0 - d;
		if (err != 0.0) {
			fprintf(stderr, "err %f!\n", err);
			exit(-1);
		}
	}
	return 0;
}

-- Peter Edwards.
Re: Floating point problems
Well, that's certainly fixed the problems my test app had. As for X: I was regularly able to hurt X by clicking randomly on the "transfers" window in Opera, and switching between it and other internal frames: symptoms included SEGVs, minute-long hangs, etc. Invoking such rain-dances failed to reproduce any of those symptoms after about 5 minutes, which is 3-4 times longer than it's ever been up before under the same stress. I'll report back in about 24 hours either way, but I think that's cured it. -- Peter. Bruce Evans <[EMAIL PROTECTED]> wrote: > > On Thu, 24 Oct 2002, Peter Edwards wrote: > > > There was some discussion about issues with interactions between the floating > > point context and signal handling in a thread a week or so ago, and a suggestion > > that someone try and get a simple test that would fail. I was surprised how > > easy it was: The following program just spins calculating the value of 6.0 / > > 3.0, and traps SIGINT. > > > > If you run it on -current (as of a few hours ago), 99% of the time, hitting > > ctl-C will cause the program to exit with an error. A 4.5 kernel never causes > > any problems. > > > > I'm pretty sure this is what's causing the stalls and crashes in X. I've taken > > stack traces of crashes, and from "spinning" processes, and I can spot NaNs on > > the stack that shouldn't be there, etc. > > Thanks. This makes the main bug clear. The PCB_NPXINITDONE bit in the > state was not being restored. This was confusing to debug because gdb > doesn't understand this bug so it shows the state that should have been > restored until npxdna() unrestores it consistently. Try this fix. 
> > %%% > Index: npx.c > === > RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v > retrieving revision 1.133 > diff -u -2 -r1.133 npx.c > --- npx.c 20 Oct 2002 17:30:30 - 1.133 > +++ npx.c 24 Oct 2002 14:20:33 - > @@ -1004,4 +1007,5 @@ > bcopy(addr, &td->td_pcb->pcb_save, sizeof(*addr)); > } > + curthread->td_pcb->pcb_flags |= PCB_NPXINITDONE; > } > > %%% > > Bruce > > -- Peter Edwards. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Floating point problems
I can also confirm that my X server has been rock solid since applying the patch. I think Bruce deserves a lollipop. Juli Mallett <[EMAIL PROTECTED]> wrote: > > * De: Bruce Evans <[EMAIL PROTECTED]> [ Data: 2002-10-24 ] > [ Subjecte: Re: Floating point problems ] > > Thanks. This makes the main bug clear. The PCB_NPXINITDONE bit in the > > state was not being restored. This was confusing to debug because gdb > > doesn't understand this bug so it shows the state that should have been > > restored until npxdna() unrestores it consistently. Try this fix. > > FWIW, this fixes every reproducible hang I've had with X and related. > > Thanks! > juli. > -- > Juli Mallett <[EMAIL PROTECTED]> | FreeBSD: The Power To Serve > Will break world for fulltime employment. | finger [EMAIL PROTECTED] > http://people.FreeBSD.org/~jmallett/ | Support my FreeBSD hacking! > -- Peter Edwards.
Re: Contemplating THIS change to signals.
I mentioned something similar for a different reason. Go look at the last part of the following message in the recent -hackers archives: > Subject: ptrace bug? > MessageId: <[EMAIL PROTECTED]> (this was for -stable, BTW) Having the suspend for the ptrace()ing parent done in issignal() is a pain when issignal() is called multiple times by the debuggee before getting back to userland. The debugger sees wait() return more than once reporting the same signal, depending on where the victim process was when it stopped, and the code path back to userland. It's particularly noticeable when the ptrace(PT_CONTINUE) tries to continue the process with a signal, and the signal arrives back at the debugger. The debugger has no idea whether it sees the signal because it tried to send it, or because the same signal was raised for some other reason. The results of this can be seen in GDB frequently when you use the "signal" command. I suggested moving the ptrace handling to postsig(), but anything along the lines of what you are mentioning seems like an improvement to me, and I'm sure you're much more likely to know what you're doing :-) I only delved in here briefly to try and work out some of the non-obvious behaviour of ptrace(). Oh, and while you're mucking with issignal(), any chance of looking at the bug raised (and the patch) at the start of that message, and in PR kern/35175? :-) -- Peter Julian Elischer wrote: > > Maybe this should be in -arch.. I couldn't make my mind up, > but.. > > There is some behaviour in signals which seems > 1/ unnecessary > 2/ potentially dangerous. > in addition it is > 3/ Definitely incompatible with KSEs. > > I am hoping that someone can give me a good reason why it is done, and > failing that, I'm hoping people can give comments on my thoughts. 
> > The behaviour in question was inherited from BSD4.4-LITE2 > > When the sleep code (tsleep, msleep, cv{stuff}) > checks to see if there is a pending signal that might cause the sleep > to abort, it calls CURSIG() which calls issignal(), > which in turn might decide to actually suspend the process. > (if the user hit ^Z for example) > > This is fine when CURSIG is called from userret(), because we are on the > user boundary, however calling it from the sleep() > call seems a rather UN-NICE thing to do. > > One could argue that it is safe because you are not allowed to sleep > while holding resources (um is it not possible to sleep > while holding a vnode?) but it seems that it is possible to hit ^Z > at the right moment while something is holding some resource > (during what it expects to be a very short term sleep) and end up > blocking the whole system. > > I would argue that a process can be considered to be suspended even while > it is running in kernel space. My definition of a suspended process > would be one that is not running any user code: it is not making any > headway on the userland program. Thus I put it to the group that > it is sufficient to only suspend a process when it is crossing the user > boundary (returning to user space). > > My suggestion is to remove the code in issignal() that performs the > blocking actions and create a separate function that does that action. > I would then call that function from userret() immediately after the call > to issignal(). The result would be that > suspended processes would still not reach userland, but processes would > not have the option of suspending indefinitely at sleep(). > > The signal would still cut short the sleep, but the process would be > allowed to proceed to the user boundary, at which point it would > be suspended as before. > > If anyone has any reasons they think this is a bad idea, then please speak > up. 
Neither Matt (Dillon) nor I can see that stopping in msleep > is required, and both of us are in fact uneasy with it. > > In a THREADED world it gets even more complicated, because > the SUSPENDED state is a PER_PROCESS state, which means that > you are not suspended until ALL THREADS have left userland > and been counted as 'suspended'. > Having some threads stopped 'near' msleep and others stopped at the > userland boundary is asking for trouble in my opinion. > > I can not think of any downside to making the suspension > (whether from ptrace, or a signal) only occur at the user boundary. > > If I hear NO arguments I'll take it that no-one can think of any reasons > to not change the code. If you have a reason PLEASE speak up so that we > can discuss it and try to figure out whether it is real or can be > gotten around in some manner. > > Julian
Re: comparing executables
>Compare them without the ELF headers, a section at a time, so >that the timestamps are irrelevant. From what I recall, there _are_ no timestamps in ELF images, and compiling the same executable multiple times locally here seems to bear that out: "cmp" reports two successive outputs as identical. The only timestamps that could be there would be in the ident/which strings buried in the code, which would appear in the ELF section payload, so stripping off the headers won't help.
Re: vn.ko load/unload/mount = panic
Hi,

After a (very) quick look at the source, it looks like there's a cdevsw_remove() call missing from the MOD_UNLOAD/MOD_SHUTDOWN event handling. I haven't time to test it, but try this:

*** vn.c.old	Wed Apr 26 16:23:03 2000
--- vn.c	Wed Apr 26 16:24:06 2000
***************
*** 762,767 ****
--- 762,768 ----
  	case MOD_UNLOAD:
  		/* fall through */
  	case MOD_SHUTDOWN:
+ 		cdevsw_remove(&vn_cdevsw);
  		for (;;) {
  			vn = SLIST_FIRST(&vn_list);
  			if (!vn)

Maxim Sobolev wrote:
>
> Hi,
>
> I've already submitted this crash report earlier but it seems that developers
> in -current list are too busy discussing whether Matt is allowed to commit his SMP
> work into 4.0 to pay attention to "ordinary" panic reports :-(. Following is
> slightly simplified course of actions which is known to produce kernel panic on
> both 4.0 and 5.0:
>
> root@notebook# kldstat
> Id Refs Address    Size     Name
>  1    2 0xc010     1c2f48   kernel
>  2    1 0xc02c3000 30c8     splash_bmp.ko
> root@notebook# mount /dev/vn0c /mnt
> mount: Device not configured
> root@notebook# kldload /modules/vn.ko
> root@notebook# kldstat
> Id Refs Address    Size     Name
>  1    3 0xc010     1c2f48   kernel
>  2    1 0xc02c3000 30c8     splash_bmp.ko
>  3    1 0xc0823000 3000     vn.ko
> root@notebook# kldunload -i 3
> root@notebook# mount /dev/vn0c /mnt
> [BINGO]
> Fatal trap 12: page fault while in kernel mode
> [...]
>
> -Maxim
Re: vn.ko load/unload/mount = panic
Sorry, I think that fix is incomplete (though it'll prolly stop the crashes). I think there should be a destroy_dev() call for each created device in the MOD_UNLOAD case also. I'll make a patch and send-pr it once I get back to my home machine, unless someone more experienced feels the need to do it. -- Peter. "Peter Edwards (local)" wrote: > > Hi, > After a (very) quick look at the source it looks like there's a missing > cdevsw_remove() missing from the MOD_UNLOAD/MOD_SHUTDOWN event handling > I haven't time to test it, but try this: > > *** vn.c.oldWed Apr 26 16:23:03 2000 > --- vn.cWed Apr 26 16:24:06 2000 > *** > *** 762,767 > --- 762,768 > case MOD_UNLOAD: > /* fall through */ > case MOD_SHUTDOWN: > + cdevsw_remove(&vn_cdevsw); > for (;;) { > vn = SLIST_FIRST(&vn_list); > if (!vn) > > Maxim Sobolev wrote: > > > > Hi, > > > > I've already submitted this crash report earlier but it seems that developers > > in -current list are too busy discussing whether Matt allowed to commit his SMP > > work into 4.0 to pay attention to "ordinary" panic reports :-(. Following is > > slightly simplified course of actions which is known to produce kernel panic on > > both 4.0 and 5.0: > > > > root@notebook# kldstat > > Id Refs AddressSize Name > > 12 0xc010 1c2f48 kernel > > 21 0xc02c3000 30c8 splash_bmp.ko > > root@notebook# mount /dev/vn0c /mnt > > mount: Device not configured > > root@notebook# kldload /modules/vn.ko > > root@notebook# kldstat > > Id Refs AddressSize Name > > 13 0xc010 1c2f48 kernel > > 21 0xc02c3000 30c8 splash_bmp.ko > > 31 0xc0823000 3000 vn.ko > > root@notebook# kldunload -i 3 > > root@notebook# mount /dev/vn0c /mnt > > [BINGO] > > Fatal trap 12: page fault while in kernel mode > > [...] 
> > > > -Maxim
buildworld breakage in getconf
Compiling 5.0-CURRENT on 4.0-STABLE generates problems in getconf:

===> usr.bin/getconf
gperf -t -L ANSI-C -C -k 1,2,7-10,21,'$' /usr/current/src/usr.bin/getconf/confstr.gperf >confstr.c
/* starting time is 11:48:16 */
gperf: unrecognized option `-L'
usage: gperf [-acCdDef[num]gGhHijkKlnNoprsStTv].
(type gperf -h for help)
*** Error code 1

Stop in /usr/current/src/usr.bin/getconf.
*** Error code 1

Stop in /usr/current/src/usr.bin.
*** Error code 1

Stop in /usr/current/src.
*** Error code 1

Stop in /usr/current/src.
*** Error code 1

Stop in /usr/current/src.

It seems the makefile is using /usr/bin/gperf, which doesn't have the -L option on 4.0-STABLE.

-- Peter.