ffs_blkfree: freeing free block (was Re: Panic (pmap))
In message <[EMAIL PROTECTED]>, Matthew Dillon writes:
>:(Mind you, we're seeing "freeing free block" panics on a NFS server
>:with full disks on 3.4).
>:
>: David.
>
>With softupdates turned on? Softupdates has known problems when a
>disk runs out of space.

No, softupdates is not enabled on this filesystem. Other filesystems
on the same machine that are used lightly and are not full do have
softupdates enabled though. (We have v1.21.2.3 of ffs_balloc.c anyway.)

The panics have been occurring every few days recently, and their
timing always corresponds within a few minutes to the filesystem
becoming full (/var/log/messages will have a flurry of 'file system
full' messages just before the reboot).

An example trace is below. Let me know if any further information
would be useful.

Ian

#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
#1  0xc0148311 in panic (fmt=0xc0234c11 "ffs_blkfree: freeing free block")
    at ../../kern/kern_shutdown.c:446
#2  0xc01ca29f in ffs_blkfree (ip=0xc2303000, bno=18512, size=8192)
    at ../../ufs/ffs/ffs_alloc.c:1328
#3  0xc01ccab2 in ffs_indirtrunc (ip=0xc2303000, lbn=-2061, dbn=15492288,
    lastbn=-1, level=1, countp=0xcb2aab2c) at ../../ufs/ffs/ffs_inode.c:498
#4  0xc01cc528 in ffs_truncate (vp=0xcb3f1cc0, length=0, flags=0,
    cred=0xc21a8f84, p=0xcb254700) at ../../ufs/ffs/ffs_inode.c:315
#5  0xc01d8ca9 in ufs_setattr (ap=0xcb2aac98)
    at ../../ufs/ufs/ufs_vnops.c:493
#6  0xc01db11d in ufs_vnoperate (ap=0xcb2aac98)
    at ../../ufs/ufs/ufs_vnops.c:2300
#7  0xc01956f8 in nfsrv_setattr (nfsd=0xc21a8f00, slp=0xc20c4d00,
    procp=0xcb254700, mrq=0xcb2aae34) at vnode_if.h:275
#8  0xc01aec96 in nfssvc_nfsd (nsd=0xcb2aae94, argp=0x8072b64 "",
    p=0xcb254700) at ../../nfs/nfs_syscalls.c:652
#9  0xc01ae5e1 in nfssvc (p=0xcb254700, uap=0xcb2aaf94)
    at ../../nfs/nfs_syscalls.c:342
#10 0xc0204be3 in syscall (frame={tf_es = 39, tf_ds = -1078001625,
      tf_edi = 16, tf_esi = 4, tf_ebp = -1077944908, tf_isp = -886394908,
      tf_ebx = 0, tf_edx = 3, tf_ecx = 3, tf_eax = 155, tf_trapno = 12,
      tf_err = 2, tf_eip = 134519880, tf_cs = 31, tf_eflags = 662,
      tf_esp = -1077945308, tf_ss = 39}) at ../../i386/i386/trap.c:1100
#11 0xc01fa16c in Xint0x80_syscall ()
#12 0x80480e9 in ?? ()

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: ffs_blkfree: freeing free block (was Re: Panic (pmap))
In message <[EMAIL PROTECTED]>, Matthew Dillon writes:
>There must still be a bug in there somewhere, unrelated to softupdates.
>
>Try turning off the vfs.ffs.doreallocblks via sysctl and see if that
>stops the crashes, that will help narrow down where to search for the
>problem.

I think I've found it. Here's an easy way to repeat the problem to
start with:

	mount_mfs -T fd1440 none /mnt
	cd /mnt
	dd if=/dev/zero bs=4k of=test seek=1036 count=1
	dd if=/dev/zero bs=4k of=test1 count=1
	dd if=/dev/zero bs=4k of=test2
	rm test1
	dd if=/dev/zero bs=4k of=test seek=8000 count=1
	echo > test

It looks as if the problem is in ffs_balloc(), and occurs as follows:

 - ffs_balloc() is asked to allocate a doubly indirect block.
 - The first-level indirection block already exists.
 - The second-level indirection block does not exist, but is
   successfully allocated.
 - This block is linked into the first-level indirection block by
   the line:
	bap[indirs[i - 1].in_off] = nb;
 - Allocation of the data block fails.
 - All allocated blocks are then released, but there is still a link
   in the first-level indirection block to what is now a free block.

The fix should be relatively straightforward - either the code should
avoid linking new indirection blocks until all allocations succeed,
or it should back out the changes on failure.

Ian
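The sequence described above can be sketched with a toy model. This is
not the real ffs_balloc() code: struct toyfs, toy_alloc(), toy_balloc()
and toy_dangling() are all names invented for the illustration, and the
"disk" is just a small bitmap. The point is only to show how linking
the new indirect block before the data block allocation succeeds leaves
a pointer to a freed block, and how clearing the link on failure avoids
that.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 8	/* size of the toy "disk"; block 0 is reserved */
#define NPTRS   4	/* pointers held by the toy first-level indirect block */

/* Hypothetical toy filesystem state, not a real UFS structure. */
struct toyfs {
	bool in_use[NBLOCKS];	/* allocation bitmap */
	int  indir[NPTRS];	/* first-level indirect block; 0 means no link */
};

/* Return a newly allocated block number, or 0 if the disk is full. */
static int
toy_alloc(struct toyfs *fs)
{
	for (int b = 1; b < NBLOCKS; b++) {
		if (!fs->in_use[b]) {
			fs->in_use[b] = true;
			return (b);
		}
	}
	return (0);
}

/*
 * Allocate a second-level indirect block and one data block.
 * Returns the data block number, or 0 on failure.  With rollback
 * false, the link written into indir[] survives the failure, which
 * models the bug; with rollback true, the link is cleared again.
 */
static int
toy_balloc(struct toyfs *fs, int slot, bool rollback)
{
	int ind, data;

	if ((ind = toy_alloc(fs)) == 0)
		return (0);
	fs->indir[slot] = ind;		/* linked in too early */
	if ((data = toy_alloc(fs)) == 0) {
		fs->in_use[ind] = false;	/* release the indirect block */
		if (rollback)
			fs->indir[slot] = 0;	/* back out the link as well */
		return (0);
	}
	return (data);
}

/* True if the slot still points at a block that is marked free. */
static bool
toy_dangling(const struct toyfs *fs, int slot)
{
	int b = fs->indir[slot];

	return (b != 0 && !fs->in_use[b]);
}
```

With exactly one free block left, toy_balloc(&fs, 0, false) fails and
leaves toy_dangling(&fs, 0) true, which is the state a later truncate
would trip over; with rollback set to true the slot is clean again.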
Re: ffs_blkfree: freeing free block (was Re: Panic (pmap))
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>
>The fix should be relatively straightforward - either the code should
>avoid linking new indirection blocks until all allocations succeed,
>or it should back out the changes on failure.

Here's one patch that seems to fix the problem. It may not be the
best approach though?

Ian

--- ffs_balloc.c.orig	Thu Feb 24 00:44:32 2000
+++ ffs_balloc.c	Thu Feb 24 01:45:46 2000
@@ -71,7 +71,7 @@
 	int flags;
 	struct fs *fs;
 	ufs_daddr_t nb;
-	struct buf *bp, *nbp;
+	struct buf *bp, *nbp, *allocbp;
 	struct vnode *vp;
 	struct indir indirs[NIADDR + 2];
 	ufs_daddr_t newb, *bap, pref;
@@ -197,6 +197,7 @@
 	--num;
 	nb = ip->i_ib[indirs[0].in_off];
 	allocib = NULL;
+	allocbp = NULL;
 	allocblk = allociblk;
 	if (nb == 0) {
 		pref = ffs_blkpref(ip, lbn, 0, (ufs_daddr_t *)0);
@@ -272,6 +273,17 @@
 			}
 		}
 		bap[indirs[i - 1].in_off] = nb;
+		if (allocib == NULL) {
+			/*
+			 * Writing bp would link in the newly allocated
+			 * blocks; hold off in case of allocation failure
+			 * later.
+			 */
+			allocib = &bap[indirs[i - 1].in_off];
+			allocbp = bp;
+			continue;
+		}
+
 		/*
 		 * If required, write synchronously, otherwise use
 		 * delayed write.
@@ -316,8 +328,7 @@
 			bp->b_flags |= B_CLUSTEROK;
 			bdwrite(bp);
 		}
-		*ap->a_bpp = nbp;
-		return (0);
+		goto success;
 	}
 	brelse(bp);
 	if (flags & B_CLRBUF) {
@@ -330,6 +341,22 @@
 		nbp = getblk(vp, lbn, fs->fs_bsize, 0, 0);
 		nbp->b_blkno = fsbtodb(fs, nb);
 	}
+success:
+	if (allocbp != NULL) {
+		/*
+		 * It is safe to write allocbp now.
+		 *
+		 * If required, write synchronously, otherwise use
+		 * delayed write.
+		 */
+		if (flags & B_SYNC) {
+			bwrite(allocbp);
+		} else {
+			if (allocbp->b_bufsize == fs->fs_bsize)
+				allocbp->b_flags |= B_CLUSTEROK;
+			bdwrite(allocbp);
+		}
+	}
 	*ap->a_bpp = nbp;
 	return (0);
 fail:
@@ -349,8 +376,11 @@
 		ffs_blkfree(ip, *blkp, fs->fs_bsize);
 		deallocated += fs->fs_bsize;
 	}
-	if (allocib != NULL)
+	if (allocib != NULL) {
 		*allocib = 0;
+		if (allocbp != NULL)
+			brelse(allocbp);
+	}
 	if (deallocated) {
 #ifdef QUOTA
 		/*
Re: Best NIC for FBSD (was: Buffer Problems and hangs in 4.0-CURRENT..)
In message <[EMAIL PROTECTED]>, Mike Smith writes:
>> fxp0: The Intel driver is by far the highest performance model,
>> beats the 3com (second best) hands down with much lower CPU
>> overhead.
>
>Do you actually have any numbers to quantify this? There's nothing in
>the driver architecture nor any of my testing that would suggest this is
>actually the case at this point.

The FreeBSD fxp driver does a lot to reduce the number of transmit
interrupts; only 1/120 of transmitted packets result in interrupts.
See the code relating to FXP_CXINT_THRESH. Assuming an even balance
of transmitted and received packets, this should reduce the total
number of interrupts by nearly 50%. I don't know if drivers for other
cards do (or even can) use this approach.

Ian
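The "nearly 50%" figure can be checked with a little arithmetic. The
sketch below is only that: it assumes a baseline of one interrupt per
packet in each direction (an assumption made for the illustration, not
something measured on any driver), and the function name
fxp_interrupt_reduction() is invented here. Only the 1/120 ratio comes
from the message above.

```c
#include <assert.h>

/*
 * Fraction by which total interrupts drop when only one in every
 * `thresh` transmitted packets raises an interrupt.
 *
 * Assumed baseline: one interrupt per received packet and one per
 * transmitted packet.  Received packets are unaffected.
 */
static double
fxp_interrupt_reduction(double tx_pkts, double rx_pkts, double thresh)
{
	double before = tx_pkts + rx_pkts;
	double after = rx_pkts + tx_pkts / thresh;

	return (1.0 - after / before);
}
```

For equal transmit and receive counts and a threshold of 120 this gives
1/2 - 1/240, about 0.4958, i.e. just under a 50% reduction.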
Re: XFree86-3.3.5/kde-1.1.2/Current today.
In message <[EMAIL PROTECTED]>, Edwin Culp writes:
>
>libraries, I decided as a last resort to substitute the SiS-6326 that worked
>flawlessly on 3.3.4. I changed it for a Matrox G200 and ran xf86config and
>it worked perfectly with no other changes and everything else the same.
>Looks like the SiS-6326 has a strange problem with XFree86-3.3.5. I'll try
>some Options, in the XF86Config file tomorrow and then report it to
>XFree86.org. Does anyone else have one of these running with 3.3.5?

I had what sounds like similar problems with a SiS card (sorry, I
don't have the model number handy right now) and 3.3.5 on 3.3-STABLE
the other day - solid black patches where fonts etc should be. Turning
on the option "no_bitblt" did the trick, but you really notice the
speed reduction.

I just sent back the card and got an ATI Rage Pro instead - I'd never
used SiS cards before so I assumed they weren't properly supported by
the XFree people.

Ian
Re: Can't make installworld :(
In message <[EMAIL PROTECTED]>, Hosta s Red writes:
>I can't make installworld for some time with following message:
>
>vm/vm_object.h -> vm/vm_object.ph
>vm/vm_page.h -> vm/vm_page.ph
>vm/vm_pageout.h -> vm/vm_pageout.ph
>vm/vm_pager.h -> vm/vm_pager.ph
>vm/vm_param.h -> vm/vm_param.ph
>vm/vm_prot.h -> vm/vm_prot.ph
>vm/vm_zone.h -> vm/vm_zone.ph
>vm/vnode_pager.h -> vm/vnode_pager.ph
>*** Error code 1

I've seen this before. h2ph will return a non-zero exit status if it
failed to open _any_ of the files listed on the command line. This
will typically happen if you have a dangling symbolic link somewhere
in /usr/include. The error message indicating exactly which files
h2ph couldn't open will be somewhere among all the 'XX.h -> XX.ph'
messages.

Ian
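For anyone hunting for the offending link, a minimal sketch of the
check is below. The helper name is_dangling_symlink() is made up for
this example; it relies only on the standard lstat(2) and stat(2)
calls, and it is not part of h2ph or the installworld machinery.

```c
#define _XOPEN_SOURCE 700	/* for lstat() and S_ISLNK under strict -std */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return true if `path` is a symbolic link whose target cannot be
 * resolved.  lstat() inspects the link itself, while stat() follows
 * it, so a link that exists but fails stat() is dangling.
 */
static bool
is_dangling_symlink(const char *path)
{
	struct stat lsb, sb;

	if (lstat(path, &lsb) != 0 || !S_ISLNK(lsb.st_mode))
		return (false);
	return (stat(path, &sb) != 0);
}
```

Calling this on each entry found while walking /usr/include (for
example with fts(3) or nftw(3)) would list the files that h2ph is
likely to fail on.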
Re: panic on mount
In message <[EMAIL PROTECTED]>, John Baldwin writes:
>
>It looks like the mutex is really held since the mtx_assert before
>witness_unlock didn't trigger. You can try turning witness off for the time
>being as a workaround. I'm not sure why witness would be broken, however.

Revision 1.41 of sys/mutex.h seems to be the culprit. Before 1.41,
the defined(LOCK_DEBUG) and !defined(LOCK_DEBUG) cases were identical
except that with LOCK_DEBUG defined, the function versions of
_mtx_*lock_* were used. After 1.41, the !defined(LOCK_DEBUG) case
misses all the MPASS/KASSERT/LOCK_LOG/WITNESS bits.

A simple workaround that seems to stop the panics is below.

Ian

Index: mutex.h
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/sys/mutex.h,v
retrieving revision 1.41
diff -u -r1.41 mutex.h
--- mutex.h	2001/09/22 21:19:55	1.41
+++ mutex.h	2001/09/26 00:46:09
@@ -238,7 +238,7 @@
 #define mtx_unlock(m)		mtx_unlock_flags((m), 0)
 #define mtx_unlock_spin(m)	mtx_unlock_spin_flags((m), 0)
 
-#ifdef LOCK_DEBUG
+#if 1
 #define	mtx_lock_flags(m, opts)						\
 	_mtx_lock_flags((m), (opts), LOCK_FILE, LOCK_LINE)
 #define	mtx_unlock_flags(m, opts)					\
Re: mdmfs mount_mfs compatibility bug?
In message <[EMAIL PROTECTED]>, Jos Backus writes:
>>
>> This was fixed some time ago, I thought. Are you up to date?
>
>There was a commit to mdmfs.c in August.
>This is with yesterday's -current, sorry, should have mentioned that.

The "mount -t mfs" case doesn't work with mdmfs, because mount(8)
uses the filesystem name, not the mount_xxx program as argv[0]. I
had guessed this would be a problem when I read the commit message
for revision 1.7 of mdmfs.c, but then I forgot to mention it to Dima.

Here is a patch that should help - it makes mdmfs accept "mount_mfs"
or "mfs" to trigger compatibility mode instead of mount_*.

Ian

Index: mdmfs.8
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/mdmfs/mdmfs.8,v
retrieving revision 1.8
diff -u -r1.8 mdmfs.8
--- mdmfs.8	16 Aug 2001 07:43:16 -	1.8
+++ mdmfs.8	29 Sep 2001 23:50:29 -
@@ -304,9 +304,10 @@
 flag,
 or by starting
 .Nm
-with
-.Li mount_
-at the beginning of its name
+with the name
+.Li mount_mfs
+or
+.Li mfs
 (as returned by
 .Xr getprogname 3 ) .
 In this mode, only the options which would be accepted by
Index: mdmfs.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/mdmfs/mdmfs.c,v
retrieving revision 1.7
diff -u -r1.7 mdmfs.c
--- mdmfs.c	16 Aug 2001 02:40:29 -	1.7
+++ mdmfs.c	29 Sep 2001 22:58:05 -
@@ -116,8 +116,9 @@
 	newfs_arg = strdup("");
 	mount_arg = strdup("");
 
-	/* If we were started as mount_*, imply -C. */
-	if (strncmp(getprogname(), "mount_", 6) == 0)
+	/* If we were started as mount_mfs or mfs, imply -C. */
+	if (strcmp(getprogname(), "mount_mfs") == 0 ||
+	    strcmp(getprogname(), "mfs") == 0)
 		compat = true;
 
 	while ((ch = getopt(argc, argv,
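The behavioural difference between the old and new tests can be seen in
isolation with a small sketch. The two helper names below are invented
for the example; the real code calls getprogname() inline, as in the
patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old rule: any program name beginning with "mount_" implies -C. */
static bool
compat_old(const char *progname)
{
	return (strncmp(progname, "mount_", 6) == 0);
}

/* New rule: only the exact names "mount_mfs" and "mfs" imply -C. */
static bool
compat_new(const char *progname)
{
	return (strcmp(progname, "mount_mfs") == 0 ||
	    strcmp(progname, "mfs") == 0);
}
```

Under the old rule "mfs" (what mount(8) passes for "mount -t mfs") does
not select compatibility mode, while under the new rule it does; a name
such as "mount_md" goes the other way and now gets the mdmfs defaults.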
Re: mdmfs mount_mfs compatibility bug?
In message <[EMAIL PROTECTED]>, Dima Dorfman writes:
>
>The problem with this is that in a bikeshed far, far in the past, some
>people wanted to be able to call it "mount_md" instead of "mount_mfs".
>Of course, we could allow "mfs" and "md", but that seems rather ugly
>(what if someone wants "fish"?). I'd rather see mount(8) use
>mount_xxx, although if we think that would break something, your patch
>is probably the best solution.

I can't think of any good reason not to change mount(8), but I also
think that mdmfs only needs to support the weird mount_mfs defaults
when invoked with a name of "mount_mfs" or "mfs". People can call
it mount_fish if they like and it will work fine, just with the mdmfs
rather than mount_mfs defaults.

The non-compatibility defaults are better defaults anyway, so they
should probably be used in all cases except those that are necessary
for compatibility with mount_mfs.

Ian
modules/ed/../../dev/if_ed.c:40: opt_ed.h: No such file or directory
Apologies for this - I missed out a file in a commit earlier. Fixed
now. Any other (non-module) complaints about opt_ed.h can be cured
by rerunning config.

Ian
Re: Multiple NFS server problems with Solaris 8 clients
In message <[EMAIL PROTECTED]>, BSD User writes:
>Actually, upon instrumenting some code, it looks like RELEASE-4.4 gets it
>mostly right. It ejects a PROG_UNAVAIL call which causes the Solaris 8
>client to back off. The correct message would seem to be PROC_UNAVAIL,
>but I would take PROG_UNAVAIL if I could get -current to eject it.

I think PROG_UNAVAIL is correct; the packet trace that Thomas provided
shows an RPC request with a program ID of 100227 which is not the
NFS program ID.

Try the patch below. Peter's NFS revamp changed the semantics of
the nfsm_reply() macro, and nfsrv_noop() was not updated to match.
Previously nfsm_reply would set 'error' to 0 when nd->nd_flag did
not have ND_NFSV3 set, and much of the code that uses nfsrv_noop
to generate errors ensured that nd->nd_flag was zero.

Now nfsm_reply never sets 'error' to 0, so it needs to be done
explicitly. Server op functions must return 0 in order for a reply
to be sent to the client.

Ian

Index: nfs_serv.c
===================================================================
RCS file: /home/iedowse/CVS/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.107
diff -u -r1.107 nfs_serv.c
--- nfs_serv.c	2001/09/28 04:37:08	1.107
+++ nfs_serv.c	2001/10/25 16:19:33
@@ -4000,6 +4000,7 @@
 	else
 		error = EPROCUNAVAIL;
 	nfsm_reply(0);
+	error = 0;
 nfsmout:
 	return (error);
 }
Re: vmware fails on -current
In message <[EMAIL PROTECTED]>, "Georg-W. Koltermann" writes:
>I also tried to update /compat/linux/dev/vmnet1 to match the
>/dev/vmnet1, and that got me just a little bit farther. I now get
>"Could not get address for /dev/vmnet1: Invalid argument
>Failed to configure ethernet0." I added some printf's to
>linux_ioctl.c, and it seems the linux_ioctl_socket() gets a device
>name which is "", i.e. the empty string.

There was a discussion about this and workaround patches for RELENG_4
on -emulation. I'll try to organise with DES to get something
committed soon.

Ian
Re: vmware fails on -current
In message <[EMAIL PROTECTED]>, CHOI Junho writes:
>
>Hmm.. I have experienced another problem(-current of 19 Nov.) with
>vmware. When it runs it comes up with the following dialog:
>
> "Encountered an error while initializing the ethernet address.
> You probably have an old vnet driver. Try installing a newer version
> Failed to configure ethernet0"

Hi, could you try to get a ktrace of what it is doing just before
this happens? Run

	ktrace -i vmware

as root (you may need to copy your ~/.vmware to ~root first). Then
use "linux_kdump -n" (/usr/ports/devel/linux_kdump) and look for
any ioctls that it does immediately before giving that error message.

Ian
Re: vmware fails on -current
In message <[EMAIL PROTECTED]>, CHOI Junho writes:
>
>I'll try. Oh, I forgot to say I applied des's linux_ioctl patch.
>

Ah, that's different then. I assumed from the error that you had
revision 1.76 of linux_ioctl.c, but if that patch applied then you
don't. Try updating your sources again; revision 1.76 is des's patch
with a few problems fixed.

Ian
Missing stack frames in kgdb/ddb traces
I noticed recently two problems with gdb/ddb traces that involve
an interrupt frame (both of these are in i386-specific code, but
maybe similar issues exist on other architectures):

The first is that kgdb sometimes messes up a stack frame that
includes an interrupt, e.g. in the trace below, the cpu_idle() frame
is corrupted.

#7  0xc0325246 in siointr1 (com=0xc092a400) at machine/cpufunc.h:63
#8  0xc0325137 in siointr (arg=0xc092a400) at ../../../isa/sio.c:1859
#9  0x8 in ?? ()
#10 0xc01ff391 in idle_proc (dummy=0x0) at ../../../kern/kern_idle.c:99
#11 0xc01ff210 in fork_exit (callout=0xc01ff370 , arg=0x0,
    frame=0xc40ffd48) at ../../../kern/kern_fork.c:785

This is because gdb was never updated when cpl was removed from the
interrupt frame (ddb was changed in i386/i386/db_trace.c rev 1.37).
The following patch seems to fix it:

Index: gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c,v
retrieving revision 1.27
diff -u -r1.27 kvm-fbsd.c
--- gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c	19 Sep 2001 18:42:19 -	1.27
+++ gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c	7 Oct 2001 19:45:28 -
@@ -176,7 +176,7 @@
       return (read_memory_integer (fr->frame + 8 + oEIP, 4));
 
     case tf_interrupt:
-      return (read_memory_integer (fr->frame + 16 + oEIP, 4));
+      return (read_memory_integer (fr->frame + 12 + oEIP, 4));
 
     case tf_syscall:
       return (read_memory_integer (fr->frame + 8 + oEIP, 4));

Secondly, fast interrupts do not have an XresumeN style of symbol,
so neither gdb nor ddb treat their frames as interrupt frames. This
causes the frame listed as XfastintrN to gobble up the frame that
was executing at the time of the interrupt, which is especially
annoying when a serial console is being used to debug an infinite
loop in the kernel.

The following patch adds an XresumefastN to fast interrupt handlers,
which allows gdb and ddb to correctly see the missing frame. The
name Xresumefast is chosen because it involves no ddb or gdb changes
(they just check for a name beginning with "Xresume").

Any comments?

Ian

Index: sys/i386/isa/icu_vector.s
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/i386/isa/icu_vector.s,v
retrieving revision 1.29
diff -u -r1.29 icu_vector.s
--- sys/i386/isa/icu_vector.s	12 Sep 2001 08:37:34 -	1.29
+++ sys/i386/isa/icu_vector.s	7 Oct 2001 19:48:06 -
@@ -60,6 +60,7 @@
 	mov	%ax,%es ; \
 	mov	$KPSEL,%ax ; \
 	mov	%ax,%fs ; \
+__CONCAT(Xresumefast,irq_num): ; \
 	FAKE_MCOUNT((12+ACTUALLY_PUSHED)*4(%esp)) ; \
 	movl	PCPU(CURTHREAD),%ebx ; \
 	incl	TD_INTR_NESTING_LEVEL(%ebx) ; \
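The reason the name needs no debugger changes is just prefix matching.
The function below is a guess at the shape of such a test, written for
illustration only; is_resume_symbol() is an invented name and this is
not the code from ddb's db_trace.c or gdb's kvm-fbsd.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Treat a symbol as marking an interrupt frame if its name begins
 * with "Xresume".  Both "Xresume4" and "Xresumefast4" match, so the
 * new labels are picked up without a new rule.
 */
static bool
is_resume_symbol(const char *name)
{
	return (strncmp(name, "Xresume", 7) == 0);
}
```

A symbol such as "Xfastintr4" does not match, which is why fast
interrupt frames were previously not recognised.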
Re: Multiple NFS server problems with Solaris 8 clients
>
>The last one is a known problem. There is a (unfinished) patch available to
>solve this problem. Thomas Moestl <[EMAIL PROTECTED]> is still working on
>some issues of the patch. Please contact him if you like to know more.
>
>Here is the URL for the patch:
>
>http://home.teleport.ch/freebsd/userland/nfsd-loop.diff

That patch is a bit out of date, because Peter removed a big chunk
of kerberos code from nfsd since. I was actually just looking at
this problem again, so I include an updated version of Thomas's
patch below. This version also removes entries from the children[]
array when a slave nfsd dies to avoid the possibility of accidentally
killing unrelated processes.

The issue that remains open with the patch is that currently if a
slave nfsd dies, then all nfsds will shut down. This is because
nfssvc() in the master nfsd returns 0 when the master nfsd receives
a SIGCHLD. This behaviour is probably reasonable enough, but the
way it happens is a bit odd.

Thomas, I'll probably commit this within the next few days if you
have no objections, and if you don't get there before me. The exiting
behaviour can be resolved later if necessary.

Ian

Index: nfsd.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/nfsd/nfsd.c,v
retrieving revision 1.21
diff -u -r1.21 nfsd.c
--- nfsd.c	20 Sep 2001 02:18:06 -	1.21
+++ nfsd.c	14 Oct 2001 20:19:18 -
@@ -52,6 +52,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
@@ -64,6 +66,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -86,12 +89,16 @@
 int	nfsdcnt;		/* number of children */
 
 void	cleanup(int);
+void	child_cleanup(int);
 void	killchildren(void);
-void	nonfs (int);
-void	reapchild (int);
-int	setbindhost (struct addrinfo **ia, const char *bindhost, struct addrinfo hints);
-void	unregistration (void);
-void	usage (void);
+void	nfsd_exit(int);
+void	nonfs(int);
+void	reapchild(int);
+int	setbindhost(struct addrinfo **ia, const char *bindhost,
+	    struct addrinfo hints);
+void	start_server(int);
+void	unregistration(void);
+void	usage(void);
 
 /*
  * Nfs server daemon mostly just a user context for nfssvc()
@@ -126,13 +133,12 @@
 	fd_set ready, sockbits;
 	fd_set v4bits, v6bits;
 	int ch, connect_type_cnt, i, len, maxsock, msgsock;
-	int nfssvc_flag, on = 1, unregister, reregister, sock;
+	int on = 1, unregister, reregister, sock;
 	int tcp6sock, ip6flag, tcpflag, tcpsock;
-	int udpflag, ecode, s;
-	int bindhostc = 0, bindanyflag, rpcbreg, rpcbregcnt;
+	int udpflag, ecode, s, srvcnt;
+	int bindhostc, bindanyflag, rpcbreg, rpcbregcnt;
 	char **bindhost = NULL;
 	pid_t pid;
-	int error;
 
 	if (modfind("nfsserver") < 0) {
 		/* Not present in kernel, try loading it */
@@ -141,8 +147,8 @@
 	}
 
 	nfsdcnt = DEFNFSDCNT;
-	unregister = reregister = tcpflag = 0;
-	bindanyflag = udpflag;
+	unregister = reregister = tcpflag = maxsock = 0;
+	bindanyflag = udpflag = connect_type_cnt = bindhostc = 0;
 #define	GETOPT	"ah:n:rdtu"
 #define	USAGE	"[-ardtu] [-n num_servers] [-h bindip]"
 	while ((ch = getopt(argc, argv, GETOPT)) != -1)
@@ -313,8 +319,6 @@
 		daemon(0, 0);
 		(void)signal(SIGHUP, SIG_IGN);
 		(void)signal(SIGINT, SIG_IGN);
-		(void)signal(SIGSYS, nonfs);
-		(void)signal(SIGUSR1, cleanup);
 		/*
 		 * nfsd sits in the kernel most of the time. It needs
 		 * to ignore SIGTERM/SIGQUIT in order to stay alive as long
@@ -324,40 +328,31 @@
 		(void)signal(SIGTERM, SIG_IGN);
 		(void)signal(SIGQUIT, SIG_IGN);
 	}
+	(void)signal(SIGSYS, nonfs);
 	(void)signal(SIGCHLD, reapchild);
-	openlog("nfsd:", LOG_PID, LOG_DAEMON);
+	openlog("nfsd", LOG_PID, LOG_DAEMON);
-	for (i = 0; i < nfsdcnt; i++) {
+	/* If we use UDP only, we start the last server below. */
+	srvcnt = tcpflag ? nfsdcnt : nfsdcnt - 1;
+	for (i = 0; i < srvcnt; i++) {
 		switch ((pid = fork())) {
 		case -1:
 			syslog(LOG_ERR, "fork: %m");
-			killchildren();
-			exit (1);
+			nfsd_exit(1);
 		case 0:
 			break;
 		default:
 			children[i] = pid;
 			continue;
 		}
-
+		(void)signal(SIGUSR1, child_cleanup);
 		setproctitle("server");
-		nfssvc_flag = NFSSVC_NFSD;
-		nsd.nsd_nfsd = NULL;
-		while (nfssvc(nfssvc_flag, &nsd) < 0) {
-			if (errno) {
-				syslog(LOG_ERR, "n
Re: cvs commit: src/sys/kern subr_diskmbr.c
In message <[EMAIL PROTECTED]>, Peter Wemm writes:
>The problem is, that you **are** using fdisk tables, you have no choice.
>DD mode included a *broken* fdisk table that specified an illegal geometry.
...
>This is why it is called dangerous.

BTW, I presume you are aware of the way sysinstall creates DD MBRs;
it does not use the 5 sector slice 4 method, but sets up slice 1 to
cover the entire disk including the MBR, with c/h/s entries
corresponding to the real start and end of the disk, e.g.:

cylinders=3544 heads=191 sectors/track=53 (10123 blks/cyl)
...
The data for partition 1 is:
sysid 165,(FreeBSD/NetBSD/386BSD)
    start 0, size 35885168 (17522 Meg), flag 80 (active)
	beg: cyl 0/ head 0/ sector 1;
	end: cyl 1023/ head 190/ sector 53
The data for partition 2 is:
The data for partition 3 is:
The data for partition 4 is:

Otherwise the disk layout is the same as disklabel's DD. I suspect
that this approach is much less illegal than disklabel's MBRs although
I do remember seeing an HP PC that disliked it.

I wonder if a reasonable compromise is to make disklabel use this
system for DD disks instead of the bogus 5 sector slice? Obviously,
it should also somehow not install a partition table unless boot1
is being used as the MBR, and the fdisk -I method should be preferred.

Ian
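The c/h/s values in the fdisk output above follow from the geometry by
simple arithmetic. The sketch below reproduces them; struct chs and
lba_to_chs() are names made up for the example, and the clamping rule
(report cylinder 1023 with the last head and sector once the 10-bit
cylinder field overflows) is the common convention assumed here, not
something taken from the sysinstall or fdisk sources.

```c
#include <assert.h>

/* Hypothetical holder for a cylinder/head/sector triple. */
struct chs {
	unsigned long cyl;
	unsigned long head;
	unsigned long sec;	/* sectors are numbered from 1 */
};

/*
 * Convert a zero-based sector number to c/h/s for the given geometry.
 * An MBR entry only has 10 bits for the cylinder, so anything beyond
 * cylinder 1023 is clamped to the largest representable triple.
 */
static struct chs
lba_to_chs(unsigned long lba, unsigned long heads, unsigned long spt)
{
	struct chs r;
	unsigned long cyl = lba / (heads * spt);

	if (cyl > 1023) {
		r.cyl = 1023;
		r.head = heads - 1;
		r.sec = spt;
		return (r);
	}
	r.cyl = cyl;
	r.head = (lba / spt) % heads;
	r.sec = lba % spt + 1;
	return (r);
}
```

With heads=191 and sectors/track=53, a cylinder holds 191 * 53 = 10123
sectors, sector 0 maps to 0/0/1, and the last sector of the slice
(35885167) lies far past cylinder 1023, so it is reported as
1023/190/53, matching the "beg" and "end" lines above.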
Re: change to ZALLOC(9) man page
In message <[EMAIL PROTECTED]>, Julian Elischer writes:
>By my reading of the code I would like to make the following changes
>to the documentation for the zone(9) man page;

Yes! Please do. I must have read that page about 10 times and been
annoyed at its misleading information, but I never got around to
fixing it.

There's one spelling typo, s/mentionned/mentioned/, and maybe "type
stable" should be hyphenated.

Ian
mountd(8) leaving filesystems exported
There are quite a few assumptions in mountd(8) about the layout of
the per-filesystem mount(2) `data' struct, which make the code quite
ugly. It uses a union to ensure that it supplies a large enough
structure to mount(2), but regardless of the filesystem type, it
always initialises the UFS version.

One nasty bug is that the code for un-exporting filesystems checks
to see if the filesystem is among a list of supported types, but
the code for exporting doesn't. This list of supported filesystems
does not include ext2fs or hpfs, so you can successfully export
these filesystems, but they remain exported even when the /etc/exports
entry is removed and mountd is restarted or sent a SIGHUP, and no
errors are logged...

The patch below should address this issue by checking the same list
of filesystems in both cases, and adding ext2fs and hpfs to the
filesystem list. It also avoids the need to assume that all xxx_args
have the export_args in the same place by storing the offsets in a
table.

I am aware that there is work ongoing in the area of mount(2), so
maybe the patch is overkill at this time. Any comments?

Ian

Index: mountd.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/mountd/mountd.c,v
retrieving revision 1.59
diff -u -r1.59 mountd.c
--- mountd.c	20 Sep 2001 02:15:17 -	1.59
+++ mountd.c	15 Dec 2001 00:10:47 -
@@ -76,6 +76,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -157,6 +158,29 @@
 	nfsfh_t	fhr_fh;
 };
 
+/* Union of mount(2) `data' structs for supported filesystems. */
+union mountdata {
+	struct ufs_args ua;
+	struct iso_args ia;
+	struct msdosfs_args da;
+	struct ntfs_args na;
+};
+
+/* Find the offset into the mountdata union of a filesystem's export_args. */
+struct ea_off {
+	char *fsname;
+	int exportargs_off;
+} ea_offtable[] = {
+	{"ufs", offsetof(union mountdata, ua.export)},
+	{"ifs", offsetof(union mountdata, ua.export)},
+	{"ext2fs", offsetof(union mountdata, ua.export)},
+	{"cd9660", offsetof(union mountdata, ia.export)},
+	{"msdosfs", offsetof(union mountdata, da.export)},
+	{"ntfs", offsetof(union mountdata, na.export)},
+	{"hpfs", offsetof(union mountdata, ua.export)}, /* XXX */
+	{NULL, 0}
+};
+
 /* Global defs */
 char	*add_expdir __P((struct dirlist **, char *, int));
 void	add_dlist __P((struct dirlist **, struct dirlist *,
@@ -191,6 +215,7 @@
 void	huphandler(int sig);
 int	makemask(struct sockaddr_storage *ssp, int bitlen);
 void	mntsrv __P((struct svc_req *, SVCXPRT *));
+struct export_args *mountdata_to_eap __P((union mountdata *, struct statfs *));
 void	nextfield __P((char **, char **));
 void	out_of_mem __P((void));
 void	parsecred __P((char *, struct xucred *));
@@ -884,6 +909,8 @@
 void
 get_exportlist()
 {
+	union mountdata args;
+	struct export_args *eap;
 	struct exportlist *ep, *ep2;
 	struct grouplist *grp, *tgrp;
 	struct exportlist **epp;
@@ -918,26 +945,16 @@
 	/*
 	 * And delete exports that are in the kernel for all local
 	 * file systems.
-	 * XXX: Should know how to handle all local exportable file systems
-	 * instead of just "ufs".
 	 */
 	num = getmntinfo(&fsp, MNT_NOWAIT);
 	for (i = 0; i < num; i++) {
-		union {
-			struct ufs_args ua;
-			struct iso_args ia;
-			struct msdosfs_args da;
-			struct ntfs_args na;
-		} targs;
-
-		if (!strcmp(fsp->f_fstypename, "ufs") ||
-		    !strcmp(fsp->f_fstypename, "msdosfs") ||
-		    !strcmp(fsp->f_fstypename, "ntfs") ||
-		    !strcmp(fsp->f_fstypename, "cd9660")) {
-			targs.ua.fspec = NULL;
-			targs.ua.export.ex_flags = MNT_DELEXPORT;
+		eap = mountdata_to_eap(&args, fsp);
+		if (eap != NULL) {
+			/* This is a filesystem that supports NFS exports. */
+			bzero(&args, sizeof(args));
+			eap->ex_flags = MNT_DELEXPORT;
 			if (mount(fsp->f_fstypename, fsp->f_mntonname,
-			    fsp->f_flags | MNT_UPDATE, (caddr_t)&targs) < 0 &&
+			    fsp->f_flags | MNT_UPDATE, &args) < 0 &&
 			    errno != ENOENT)
 				syslog(LOG_ERR,
 				    "can't delete exports for %s: %m",
@@ -1711,23 +1728,23 @@
 	int dirplen;
 	struct statfs *fsb;
 {
+	union mountdata args;
 	struct statfs fsb1;
 	struct addrinfo *ai;
 	struct export_args *eap;
 	char *cp = NULL;
 	int done;
 	char savedc = '\0';
-	union {
-		struct ufs_args ua;
-		struct i
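The offset-table idea from the patch can be tried out in isolation.
Everything in the sketch below is a stand-in: the toy_* structs are
invented layouts rather than the real ufs_args/iso_args/msdosfs_args,
and toy_mountdata_to_eap() takes a filesystem name string instead of
the struct statfs used by mountdata_to_eap() in the patch. It only
shows how one lookup can locate export_args wherever each filesystem
happens to keep it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Invented stand-ins for the per-filesystem mount(2) data structs. */
struct toy_export_args { int ex_flags; };
struct toy_ufs_args { char *fspec; struct toy_export_args export; };
struct toy_iso_args { char *fspec; struct toy_export_args export; int flags; };
struct toy_msdos_args { char *fspec; int uid; struct toy_export_args export; };

union toy_mountdata {
	struct toy_ufs_args ua;
	struct toy_iso_args ia;
	struct toy_msdos_args da;
};

/* Where each filesystem type keeps its export args inside the union. */
static const struct {
	const char *fsname;
	size_t exportargs_off;
} toy_offtable[] = {
	{"ufs", offsetof(union toy_mountdata, ua.export)},
	{"ext2fs", offsetof(union toy_mountdata, ua.export)},
	{"cd9660", offsetof(union toy_mountdata, ia.export)},
	{"msdosfs", offsetof(union toy_mountdata, da.export)},
	{NULL, 0}
};

/*
 * Return a pointer to the export args for the named filesystem type,
 * or NULL if the type is not in the table (i.e. not exportable here).
 */
static struct toy_export_args *
toy_mountdata_to_eap(union toy_mountdata *md, const char *fstype)
{
	for (size_t i = 0; toy_offtable[i].fsname != NULL; i++) {
		if (strcmp(toy_offtable[i].fsname, fstype) == 0)
			return ((struct toy_export_args *)
			    ((char *)md + toy_offtable[i].exportargs_off));
	}
	return (NULL);
}
```

Because both the export and un-export paths would go through the same
lookup, a filesystem type is either handled in both directions or
rejected in both, which is the property the message above is after.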
Re: mountd(8) leaving filesystems exported
In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>> One nasty bug is that the code for un-exporting filesystems checks
>> to see if the filesystem is among a list of supported types, but
>> the code for exporting doesn't. This list of supported filesystems
>> does not include ext2fs or hpfs, so you can successfully export
>> these filesystems, but they remain exported even when the /etc/exports
>> entry is removed and mountd is restarted or sent a SIGHUP, and no
>> errors are logged...
>
>This is actually the wrong way to go about this.

I'll agree with this much anyway :-) Ignoring for now how the
exports are managed in the kernel, it is really bad that mountd
needs to know about individual filesystems in order to NFS export
them. The export interface also does not allow the export list to
be replaced atomically, so all of the exports fail briefly when
mountd reloads them on receipt of a SIGHUP.

There is apparently work ongoing to improve the mount(2) interface
(I forget who is doing this). Hopefully this should make it much
easier to arrange for mountd to change the export lists in a
filesystem-independent manner, even if exports are still managed
per-filesystem in the kernel.

However for this bug (ext2fs and hpfs filesystems cannot be un-exported
once they have been exported) I am just looking for a quick solution
for now, but I have already put some thought into improving the
mountd-kernel interface, which is something I really want to see
fixed.

Ian
Re: Netatalk broken in current? Lock order reversal?
In message <[EMAIL PROTECTED]>, Emiel Kollof writes:
>
>Oh, on another note, is someone working at that netatalk breakage? Who
>do I have to discipline for that? :-)

Could you try the following patch in src/sys/netatalk? The problem
was caused by the -fno-common compiler option that was added to the
kernel build flags recently. This compiles for me, but I haven't
checked that it actually works.

Ian

Index: ddp_input.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/netatalk/ddp_input.c,v
retrieving revision 1.12
diff -u -r1.12 ddp_input.c
--- ddp_input.c	13 Feb 2000 03:31:58 -	1.12
+++ ddp_input.c	16 Jan 2002 01:30:50 -
@@ -27,8 +27,6 @@
 static struct ddpstat	ddpstat;
 static struct route	forwro;
 
-const int atintrq1_present = 1, atintrq2_present = 1;
-
 static void     ddp_input(struct mbuf *, struct ifnet *, struct elaphdr *, int);
 
 /*
Index: ddp_usrreq.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/netatalk/ddp_usrreq.c,v
retrieving revision 1.22
diff -u -r1.22 ddp_usrreq.c
--- ddp_usrreq.c	17 Nov 2001 03:07:08 -	1.22
+++ ddp_usrreq.c	16 Jan 2002 01:32:34 -
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -547,6 +548,8 @@
 {
     atintrq1.ifq_maxlen = IFQ_MAXLEN;
     atintrq2.ifq_maxlen = IFQ_MAXLEN;
+    atintrq1_present = 1;
+    atintrq2_present = 1;
     mtx_init(&atintrq1.ifq_mtx, "at1_inq", MTX_DEF);
     mtx_init(&atintrq2.ifq_mtx, "at2_inq", MTX_DEF);
 }
Re: boot floppy problems...
In message, Mike Brancato writes:
>oh, well. They say something along the lines of
>"Disk error: lba is 0x9 (should be 0x10)"
>or similar. then it trys to boot the kernel twice using the loader, but
>fails with the path 0:fd(0,a)/kernel

Hmm, the error is actually "Disk error 0x9 (lba=0x10)". I think this is my fault. Error 9 is "data boundary error (attempted DMA across 64K boundary or >80h sectors)", so by changing the buffers to being static in revision 1.35 of boot2.c, I broke the guarantee that single transfers don't cross a 64k boundary, which is important for floppies :-( I'll fix this shortly. Thanks for pointing out the problem!

Ian
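
For readers unfamiliar with the 64K restriction mentioned above, here is a minimal sketch of the check involved. It is not the boot2.c code; the function name crosses_64k is hypothetical and the addresses are assumed to be physical addresses that fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from boot2.c): return 1 if the transfer
 * [addr, addr + len) crosses a 64K boundary, 0 otherwise.
 * A zero-length transfer never crosses.
 */
int
crosses_64k(uint32_t addr, size_t len)
{
	if (len == 0)
		return (0);
	/* Compare the 64K block of the first byte with that of the last. */
	return ((addr >> 16) != ((addr + (uint32_t)(len - 1)) >> 16));
}
```

A 512-byte sector buffer at 0xFE00 ends at 0xFFFF and is fine, while the same buffer at 0xFF00 runs into the next 64K block, which is the kind of transfer that would be expected to fail with error 9.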
Re: boot floppy problems...
In message, Mike Brancato writes:
>no problem.
>keep up the good work.
>
>mike

Ok, it's fixed now. If you'd like to try it, there's an updated version of the kern.flp from today's -CURRENT snapshot at:

http://www.maths.tcd.ie/~iedowse/FreeBSD/kern.flp

Ian
Re: Fw: Stop annoying message of lnc
In message <[EMAIL PROTECTED]>, Mike Smith writes:
>
>I don't quite understand Paul's reasoning, though; it's not actually
>useful to unload/reload parts of a device's bus attachment without
>unloading/reloading all the downstream parts of the driver.
>
>I think the fix should probably be committed and the driver turned into a
>single monolithic module.

Yes, Paul essentially agreed to my doing this as an interim measure until ifconfig is "fixed" to use the module file name rather than the module name when loading drivers. I'll commit the change in a few hours after I have tested that it works.

Ian
Re: growfs
In message <[EMAIL PROTECTED]>, Andrea Campi writes:
>
>Anyway, that was not my point. If I reboot into single-user, and am thus sure
>to have the / fs in a clean, consistent state, should I expect growfs to work
>in a safe way? If so, we should document it.

I think it is still unlikely to be completely safe. The kernel may panic if it finds inconsistencies in the filesystem, and I'm sure that growfs (temporarily) introduces some very serious inconsistencies while it is running. Also, when growfs completes, the kernel's idea of the filesystem is quite different from the parameters actually set on the disk.

If the kernel were to panic half-way through a growfs operation, or if growfs died, say because the kernel failed to fault in some pages from the growfs executable, you could end up with a very confused filesystem!

Ian
reboot(8) delay between SIGTERM and SIGKILL
I have noticed that reboot(8) sometimes appears not to wait long enough before sending the final SIGKILL to all processes. On a system that has a lot of processes swapped out, some processes such as the X server may get a SIGKILL before they have had a chance to perform their exit cleanup.

The patch below causes reboot to wait up to 60 seconds for paging activity to end before sending the SIGKILLs. It does this by monitoring the sysctl `vm.stats.vm.v_swappgsin', and extending the default 5-second delay if page-in operations are observed. On my laptop (64Mb, IDE disk) with a number of big apps running, it can take around 20 seconds for all the paging to die down after the SIGTERMs are sent.

I know the choice of sysctl to monitor is slightly arbitrary, but it seems to have the right overall effect. Does anyone have any objections to my committing this?

Ian

Index: reboot.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/reboot/reboot.c,v
retrieving revision 1.9
diff -u -r1.9 reboot.c
--- reboot.c	1999/11/21 21:52:40	1.9
+++ reboot.c	2001/03/19 17:01:37
@@ -47,6 +47,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -58,6 +59,7 @@
 #include
 
 void usage __P((void));
+u_int get_pageins __P((void));
 
 int dohalt;
 
@@ -152,13 +154,22 @@
 	/*
 	 * After the processes receive the signal, start the rest of the
 	 * buffers on their way. Wait 5 seconds between the SIGTERM and
-	 * the SIGKILL to give everybody a chance.
+	 * the SIGKILL to give everybody a chance. If there is a lot of
+	 * paging activity then wait longer, up to a maximum of approx
+	 * 60 seconds.
 	 */
 	sleep(2);
 	if (!nflag)
 		sync();
-	sleep(3);
+	for (i = 0; i < 20; i++) {
+		u_int old_pageins;
+		old_pageins = get_pageins();
+		sleep(3);
+		if (get_pageins() == old_pageins)
+			break;
+	}
+
 	for (i = 1;; ++i) {
 		if (kill(-1, SIGKILL) == -1) {
 			if (errno == ESRCH)
@@ -189,4 +200,19 @@
 	(void)fprintf(stderr, "usage: %s [-dnpq]\n", dohalt ?
 	    "halt" : "reboot");
 	exit(1);
+}
+
+u_int
+get_pageins()
+{
+	u_int pageins;
+	size_t len;
+
+	len = sizeof(pageins);
+	if (sysctlbyname("vm.stats.vm.v_swappgsin", &pageins, &len, NULL, 0)
+	    != 0) {
+		warnx("v_swappgsin");
+		return (0);
+	}
+	return pageins;
 }
fsdb broken in -current
The last set of changes to fsck_ffs moved the initialisation of dev_bsize to sblock_init(), but this is not called by fsdb(8), so fsdb dies almost immediately with a floating exception. I'm just going to commit the obvious fix, which is to have fsdb call sblock_init() also.

Ian
Re: kernel core
In message <[EMAIL PROTECTED]>, Warner Losh writes:
>:
>: Yes, but until such time as we do that we should warn people in UPDATING at
>: least.
>:
>
>OK, but you won't like the UPDATING entry.

The bug actually looks fairly simple to fix. ffs_reload() isn't checking if the new superblock fields are zero, so if an old fsck zeros them out between a read-only mount and a read-write remount, then we get a division by zero or something later.

Ian
Re: kernel core
In message <[EMAIL PROTECTED]>, John Baldwin writes:
>
>
>Fair enough, I guess ffs_reload() should just sanity check the values. Any
>takers?

You could try this (untested). I have to run now, but I can test it later as it's easy enough to reproduce.

Ian

Index: ffs_vfsops.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.146
diff -u -r1.146 ffs_vfsops.c
--- ffs_vfsops.c	2001/04/17 05:37:51	1.146
+++ ffs_vfsops.c	2001/04/23 22:15:55
@@ -427,6 +427,11 @@
 	brelse(bp);
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	ffs_oldfscompat(fs);
+	/* An old fsck may have clobbered these fields, so recheck them. */
+	if (fs->fs_avgfilesize <= 0)		/* XXX */
+		fs->fs_avgfilesize = AVFILESIZ;	/* XXX */
+	if (fs->fs_avgfpdir <= 0)		/* XXX */
+		fs->fs_avgfpdir = AFPDIR;	/* XXX */
 
 	/*
 	 * Step 3: re-read summary information from disk.
Re: kernel core
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>You could try this (untested). I have to run now, but I can test it
>later as it's easy enough to reproduce.

Almost, but I missed the fs_contigdirs field, which was the real culprit. An updated patch is below; this seems to stop the panics for me. I'll just run this by Kirk first, and commit it if he has no objections.

There probably does need to be something in UPDATING saying that after the dirpref changes have been used, running a pre-dirpref version of fsck may generate some serious-looking warnings that are actually harmless. I think some people were seeing:

VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE

Is that correct? And was a "fsck -b 32 /dev/xxx" required to fix it or did fsck correct the problem itself?

Ian

Index: ffs_vfsops.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.146
diff -u -r1.146 ffs_vfsops.c
--- ffs_vfsops.c	2001/04/17 05:37:51	1.146
+++ ffs_vfsops.c	2001/04/23 23:37:14
@@ -421,12 +421,18 @@
 	 */
 	newfs->fs_csp = fs->fs_csp;
 	newfs->fs_maxcluster = fs->fs_maxcluster;
+	newfs->fs_contigdirs = fs->fs_contigdirs;
 	bcopy(newfs, fs, (u_int)fs->fs_sbsize);
 	if (fs->fs_sbsize < SBSIZE)
 		bp->b_flags |= B_INVAL | B_NOCACHE;
 	brelse(bp);
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	ffs_oldfscompat(fs);
+	/* An old fsck may have zeroed these fields, so recheck them. */
+	if (fs->fs_avgfilesize <= 0)		/* XXX */
+		fs->fs_avgfilesize = AVFILESIZ;	/* XXX */
+	if (fs->fs_avgfpdir <= 0)		/* XXX */
+		fs->fs_avgfpdir = AFPDIR;	/* XXX */
 
 	/*
 	 * Step 3: re-read summary information from disk.
Re: Strangeness with newsyslog/wtmp
In message <[EMAIL PROTECTED]>, "Thomas D. Dean" writes:
>I notice that my /var/log/wtmp has strange renewal times. I don't
>know when it was not like this. newsyslog.conf is set to renew this
>once per week. What is causing this?

-rw-rw-r-- 1 root wheel  27 Apr 15 12:00 /var/log/wtmp.3.gz
-rw-rw-r-- 1 root wheel 244 Apr 13 15:52 /var/log/wtmp.4.gz
-rw-rw-r-- 1 root wheel 176 Apr  8 12:12 /var/log/wtmp.5.gz
-rw-rw-r-- 1 root wheel 148 Apr  3 10:51 /var/log/wtmp.6.gz
-rw-rw-r-- 1 root wheel 280 Mar 30 21:16 /var/log/wtmp.7.gz

Gzip by default preserves the last-modified time of a file when gzipping, so these times are actually the times at which the wtmp file was previously modified before being rotated. Try "ls -lc", which will show up the rotation time.

Ian
Re: wierdness with mountd
In message <[EMAIL PROTECTED]>, John Polstra writes:
>In article <[EMAIL PROTECTED]>,
>Matthew Jacob <[EMAIL PROTECTED]> wrote:
>> May 28 10:21:43 farrago mountd[217]: can't delete exports for /tmp
>> May 28 10:21:43 farrago mountd[217]: can't delete exports for /usr/obj
>
>I've been seeing this too, on a -current system from around May 5.

This sounds like there are stale entries in /var/db/mountdtab, but I'm not familiar enough with the purpose of mountdtab to know why this is happening. I'll look into this further over the next few days; for now maybe try cleaning out mountdtab manually?

Ian
Re: wierdness with mountd
In message <[EMAIL PROTECTED]>, John writes:
>Looking in /usr/src/sbin/mountd/mountd.c, under line 930
>shows the following:
>
>num = getmntinfo(&fsp, MNT_NOWAIT);
>
>and then runs through a loop 'num' times trying to
>delete any export for each entry.

Thanks, you're right - this has nothing to do with mountdtab or mounttab. The commit that caused these messages to appear is phk's centralisation of the kernel netexport structure:

REV:1.149 ffs_vfsops.c 2001/04/25 07:07:50 phk
Move the netexport structure from the fs-specific mountstructure to struct mount.
...

Doing a MNT_DELEXPORT mount used to be a no-op if there were no exports, but now it returns EINVAL. Maybe that should be changed to ENOENT or something, so that mountd can detect it as a 'normal' error? (untested patch below).

Ian

Index: sys/kern/vfs_export.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/kern/vfs_export.c,v
retrieving revision 1.310
diff -u -r1.310 vfs_export.c
--- sys/kern/vfs_export.c	2001/04/26 20:47:14	1.310
+++ sys/kern/vfs_export.c	2001/05/29 09:28:43
@@ -207,7 +207,7 @@
 	nep = mp->mnt_export;
 	if (argp->ex_flags & MNT_DELEXPORT) {
 		if (nep == NULL)
-			return (EINVAL);
+			return (ENOENT);
 		if (mp->mnt_flag & MNT_EXPUBLIC) {
 			vfs_setpublicfs(NULL, NULL, NULL);
 			mp->mnt_flag &= ~MNT_EXPUBLIC;
Index: sbin/mountd/mountd.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/mountd/mountd.c,v
retrieving revision 1.51
diff -u -r1.51 mountd.c
--- sbin/mountd/mountd.c	2001/05/25 08:14:02	1.51
+++ sbin/mountd/mountd.c	2001/05/29 09:31:43
@@ -903,6 +903,7 @@
 	struct xucred anon;
 	char *cp, *endcp, *dirp, *hst, *usr, *dom, savedc;
 	int len, has_host, exflags, got_nondir, dirplen, num, i, netgrp;
+	int error;
 
 	dirp = NULL;
 	dirplen = 0;
@@ -949,10 +950,11 @@
 		    !strcmp(fsp->f_fstypename, "cd9660")) {
 			targs.ua.fspec = NULL;
 			targs.ua.export.ex_flags = MNT_DELEXPORT;
-			if (mount(fsp->f_fstypename, fsp->f_mntonname,
-			    fsp->f_flags | MNT_UPDATE,
-			    (caddr_t)&targs) < 0)
-				syslog(LOG_ERR, "can't delete exports for %s",
+			error = mount(fsp->f_fstypename, fsp->f_mntonname,
+			    fsp->f_flags | MNT_UPDATE, (caddr_t)&targs);
+			if (error && error != ENOENT)
+				syslog(LOG_ERR,
+				    "can't delete exports for %s: %m",
 				    fsp->f_mntonname);
 		}
 		fsp++;
Re: wierdness with mountd
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>error? (untested patch below).

I braino'd that patch (error vs. errno), but I have just committed a working version that should stop the mountd warnings.

Ian
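
To illustrate the "error vs. errno" slip in general terms: a failing system call such as mount(2) returns -1 and leaves the reason in errno, so comparing the return value itself against ENOENT can never match. The sketch below is not the committed mountd fix; it uses open(2) as a stand-in, and the helper name try_open is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open a path read-only.  Returns 0 on
 * success, or the errno value describing the failure.  The system
 * call itself only returns -1 on failure; the reason is in errno,
 * not in the return value.
 */
int
try_open(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (errno);
	close(fd);
	return (0);
}
```

With this shape, a caller can treat ENOENT as a 'normal' outcome and log only the other errors, which is the behaviour the message above describes wanting from mountd.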
Re: cvs commit: src/sbin/fsck_ffs setup.c
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>iedowse 2001/05/29 13:45:09 PDT
>
> Modified files:
>   sbin/fsck_ffs setup.c
> Log:
> Ignore the new superblock fields fs_pendingblocks and fs_pendinginodes
> when comparing with the alternate superblock. These fields are used
> for temporary in-core information only. This should fix the "VALUES
> IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE" error from
> fsck_ffs that has been seen a lot recently.

Note that this will not fix the softupdates freelist corruption problem that people have been reporting. It seems that Kirk is away for at least another week, so if Tor's suggested fix for that works, then it should probably be committed in the meantime.

Ian
Re: dirpref and RELENG_4 fsck
In message <[EMAIL PROTECTED]>, Alfred Perlstein writes:
>> Was it determined that the fsck corruption problems which were seen
>> with fsck after the introduction of the dirpref changes do not affect
>> RELENG_4? I haven't seen any MFC of changes to the RELENG_4 fsck
>> code, and I'm kind of worried now that I've reverted my current system
>> back to RELENG_4 :-)
>
>Afaik the problem was that fsck would wipe certain stats info that
>dirpref would use, however I think the kernel detects absurd values
>and will reinit them.

Yes, there was a problem in the -current kernel that could cause crashes if fsck erased some dirpref-related values in the superblock.

However, the main issue affecting moving filesystems back and forth between RELENG_4 and -current is that fsck in RELENG_4 does not know about the new superblock fields (dirpref, pending*, snapshots?). It will detect a mismatch between the master and alternate superblocks and give an error. Once you fix that error (I think you just say yes to the "LOOK FOR ALTERNATE SUPERBLOCKS?" question), then everything should be fine.

One final annoyance is that using an alternate superblock will undo any changes made by tunefs, unless the '-A' flag had been used with tunefs originally. Typically this will result in soft-updates getting disabled.

RELENG_4's fsck could probably be updated to deal with this a bit better, but I don't think it can do the right thing if any snapshots exist, so the error may be a good thing.

Ian
Re: MSDOS filesystem mounting...
In message <[EMAIL PROTECTED]>, Jim Bryant writes:
>
> 6:20:34pm wahoo(113): mount_msdos /dev/da0s1 /ms-dog
>mount_msdos: vfsload(msdos): No such file or directory

Try "mount_msdosfs" instead of "mount_msdos". The latter is probably a stale binary left on your system from before the rename that took place last month.

Ian
Load average synchronisation and phantom loads
There are a few PRs and a number of messages in the mailing list archives that describe a problem where the load average occasionally remains at 1.0 or greater even though top(1) reports that the CPU is nearly 100% idle. The PRs I could find in a quick search are kern/21155, kern/23448 and kern/27334.

The most probable cause for this effect is a synchronisation between the load measurement and processes that periodically run for short amounts of time. The load average is based on samples of the number of running processes taken at exact 5-second intervals. If some other process regularly runs with a period that divides into 5 seconds, that process may always be seen as running even though it may only run for a tiny fraction of the available CPU time.

A very likely candidate process is bufdaemon; it sleeps for 1 second at a time, so if it happens to get scheduled in the same tick as the load measurement and before the load measurement, it will always be seen as running.

The patch below causes the samples of running processes to be somewhat randomised; instead of being taken every 5 seconds, the gap now varies in the range 4 to 6 seconds, so that synchronisation should no longer occur. Would there be any objections to my committing this?

Two comments on the patch:

- This patch removes the SSLEEP case in loadav(), because in the existing code, p->p_slptime has always just been incremented in schedcpu() so this case never made a difference. To keep the same load average behaviour when loadav() is called at different times, this case needs to be removed.

- The load average calculation now has really nothing to do with the VM system, so it could be moved elsewhere. I've just left it in vm_meter.c because that's where it's always been.

Ian

Index: vm/vm_meter.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/vm/vm_meter.c,v
retrieving revision 1.57
diff -u -r1.57 vm_meter.c
--- vm/vm_meter.c	2001/07/04 19:00:12	1.57
+++ vm/vm_meter.c	2001/07/15 20:54:38
@@ -53,8 +53,11 @@
 #include
 #include
 
+static void loadav_init(void);
+
 struct loadavg averunnable =
 	{ {0, 0, 0}, FSCALE };	/* load average, of runnable procs */
+static struct callout loadav_callout;
 
 struct vmmeter cnt;
 
@@ -75,19 +78,17 @@
  * 1, 5 and 15 minute intervals.
  */
 static void
-loadav(struct loadavg *avg)
+loadav(void *arg)
 {
 	int i, nrun;
+	struct loadavg *avg;
 	struct proc *p;
 
+	avg = (struct loadavg *)arg;
 	sx_slock(&allproc_lock);
-	for (nrun = 0, p = LIST_FIRST(&allproc); p != 0; p = LIST_NEXT(p, p_list)) {
+	for (nrun = 0, p = LIST_FIRST(&allproc); p != 0;
+	    p = LIST_NEXT(p, p_list)) {
 		switch (p->p_stat) {
-		case SSLEEP:
-			if (p->p_pri.pri_level > PZERO ||
-			    p->p_slptime != 0)
-				continue;
-			/* FALLTHROUGH */
 		case SRUN:
 			if ((p->p_flag & P_NOLOAD) != 0)
 				continue;
@@ -100,15 +101,24 @@
 	for (i = 0; i < 3; i++)
 		avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
 		    nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;
+
+	/*
+	 * Schedule the next update to occur in 5 seconds, but add a
+	 * random variation to help avoid synchronisation with
+	 * processes that run at regular intervals.
+	 */
+	callout_reset(&loadav_callout, hz * 4 + (int)(random() % (hz * 2)),
+	    loadav, arg);
 }
 
-void
-vmmeter()
+static void
+loadav_init()
 {
-
-	if (time_second % 5 == 0)
-		loadav(&averunnable);
+	callout_init(&loadav_callout, 0);
+	loadav(&averunnable);
 }
 
+SYSINIT(loadav, SI_SUB_PSEUDO, SI_ORDER_ANY, loadav_init, NULL)
+
 SYSCTL_UINT(_vm, VM_V_FREE_MIN,
 	v_free_min, CTLFLAG_RW, &cnt.v_free_min, 0, "");
 
Index: vm/vm_extern.h
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/vm/vm_extern.h,v
retrieving revision 1.47
diff -u -r1.47 vm_extern.h
--- vm/vm_extern.h	2000/03/13 10:47:24	1.47
+++ vm/vm_extern.h	2001/07/15 20:36:14
@@ -84,7 +84,6 @@
 int vm_mmap __P((vm_map_t, vm_offset_t *, vm_size_t, vm_prot_t, vm_prot_t, int, void *, vm_ooffset_t));
 vm_offset_t vm_page_alloc_contig __P((vm_offset_t, vm_offset_t, vm_offset_t, vm_offset_t));
 void vm_set_page_size __P((void));
-void vmmeter __P((void));
 struct vmspace *vmspace_alloc __P((vm_offset_t, vm_offset_t));
 struct vmspace *vmspace_fork __P((struct vmspace *));
 void vmspace_exec __P((struct proc *));
Index: kern/kern_synch.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/kern/kern_synch.c,v
retrieving revision 1.148
diff -u -r1.148 kern_synch.c
--- kern/kern_synch.c	2001/07/06 01:16:42	1.148
+++ kern/k
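
The synchronisation effect described in this message can be reproduced in a small userland simulation. This is only a sketch under stated assumptions, not kernel code: HZ, is_running() and count_hits() are hypothetical names, the "process" is runnable for exactly one tick per second (loosely modelled on the bufdaemon example), and rand() stands in for the kernel's random().

```c
#include <stdlib.h>

#define HZ 100	/* assumed tick rate for the simulation */

/* Hypothetical periodic process: runnable for one tick in every HZ ticks. */
static int
is_running(long tick)
{
	return (tick % HZ == 0);
}

/*
 * Take nsamples samples and return how many saw the process running.
 * If jitter is zero the gap is exactly 5 seconds; otherwise it is
 * drawn from the range [4 * HZ, 6 * HZ] ticks.
 */
long
count_hits(int nsamples, int jitter, unsigned seed)
{
	long tick = 0, hits = 0;
	int i;

	srand(seed);
	for (i = 0; i < nsamples; i++) {
		if (is_running(tick))
			hits++;
		tick += jitter ? 4 * HZ + rand() % (2 * HZ + 1) : 5 * HZ;
	}
	return (hits);
}
```

With a fixed 5-second gap every sample lands on a tick where the process is runnable, so it contributes a constant 1.0 to the load even though it uses about 1% of the CPU. With the randomised gap only a small fraction of samples should coincide with it.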
Re: disklabel broken again?
In message, Matthew Jacob writes:
>
>Sometime in the last few days, disklabel -Brw auto seems to have stopped
>working for me on alpha It used to be the thing of:
>Now I get:
>
>dd if=/dev/zero of=/dev/da5 bs=1024k count=10
>...
>disklabel -Brw da5 auto
>disklabel: No space left on device

I think this can happen when there is an existing label on the disk, but I forget the exact conditions. Try dd'ing a few k of zeros on to the disk and run the disklabel again?

Ian
Re: disklabel broken again?
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>In message <Pine.BSF.4.21.0107122138260.61694-10@beppo>, Matthew Jacob writes:
>>dd if=/dev/zero of=/dev/da5 bs=1024k count=10
>>...
>>disklabel -Brw da5 auto
>>disklabel: No space left on device
>
>I think this can happen when there is an existing label on the
>disk, but I forget the exact conditions. Try dd'ing a few k of
>zeros on to the disk and run the disklabel again?

Whoops, I'm not awake. Ignore that! :-)

Ian
Re: Load average synchronisation and phantom loads
In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>
>I think that is far too much variation. 5 seconds is hard-coded into
>the computation of the load average (constants in cexp[]), so even a
>variation of +-1 ticks breaks the computation slightly.

I have not changed the mean inter-sample time from 5 seconds (*), so is this really a problem? There will be a slight time-warping effect in the load calculation, but even for the shorter 5-minute timescale, this will average out to typically no more than a few percent (i.e. the "5 minutes" will instead normally be approx 4.8 to 5.2 minutes). Apart from a marginally more wobbly xload display, this should not make any real difference.

If the variation was much smaller than it is in the proposed patch, you could get a noticeable drifting in and out of phase with processes that have a regular run-pause pattern. Obviously this is a much bigger problem when the sample period is fixed like it is now, but I wanted to minimise the possibility of this effect while keeping the inter-update time "relatively" constant.

The alternative that I considered was to sample the processes once during every 5-second interval, but to place the sampling point randomly over the interval. That probably results in a better synchronisation-avoidance behaviour. However, to incorporate the sample into the load average requires either waiting until the end of the interval, or updating the load average at the time of sampling. The former introduces a new delay into the load average computation, and the latter results in a lot of very noticeable jitter on the inter-sample interval.

(*) Actually, I have changed the mean by 0.5 ticks, but that is a bug that I will fix. The "hz * 4 + random() % (hz * 2)" should be "hz * 4 + random() % (hz * 2 + 1)" instead.

>Not another SYSINIT (all SYSINITs are evil IMO). SI_SUB_PSEUDO is
>bogus here -- there are no pseudo ttys here. sched_setup() is a
>good place to do this initialization.

John Baldwin suggested moving the load average calculation into kern_synch.c, so it would certainly make sense to initialise it from sched_setup() then. This seems like a good idea to me; does that sound OK?

Ian
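
The 0.5-tick shift in the mean mentioned in the footnote can be checked with a few lines of arithmetic. The sketch below is illustrative only; twice_mean_gap() is a hypothetical helper that averages the gap "4 * hz + (r % n)" over every residue r = 0 .. n - 1, assuming the random source is uniform.

```c
/*
 * Hypothetical helper: return twice the mean gap (in ticks) of
 * 4 * hz + r, taken over every residue r = 0 .. n - 1 exactly once.
 * Doubling keeps the result an integer, because the mean can end
 * in .5.
 */
long
twice_mean_gap(int hz, int n)
{
	long sum = 0;
	int r;

	for (r = 0; r < n; r++)
		sum += 4L * hz + r;
	return (2 * sum / n);
}
```

For hz = 100, a modulus of hz * 2 gives a doubled mean of 999 ticks (499.5 ticks, half a tick short of 5 seconds), while hz * 2 + 1 gives 1000 (exactly 500 ticks).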
Re: Load average synchronisation and phantom loads
In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>On Tue, 17 Jul 2001, Ian Dowse wrote:
>> effect in the load calculation, but even for the shorter 5-minute
>> timescale, this will average out to typically no more than a few
>> percent (i.e the "5 minutes" will instead normally be approx 4.8
>> to 5.2 minutes). Apart from a marginally more wobbley xload display,
>> this should not make any real difference.
>
>It should average out to precisely the same as before if you haven't
>changed the mean (mean = average :-). The real difference may be
>small, but I think it is an unnecessary regression.

I meant the "5-minute average" that is computed; it will certainly not be precisely the same as before, though it will be similar.

>from 0 very fast. Even with a large variation, the drift might not be
>fast enough.

Actually, it's not too bad with a +-1 second variation, which is why I chose a value that large. If you plot 60 samples (60 is the number of 5-second intervals in the 5-minute load average timescale) you get a relatively good dispersion of points throughout the 5-second interval. Try pasting the following into gnuplot a few times:

plot [] [-2.5:2.5] \ "1){}select(undef,undef,undef,2.5)}' (5-second period, 50% duty cycle) then the interference pattern resulting from a +-1 tick variation has a period that is typically days long! Of course the interference pattern caused by the above script has an infinitely long period with the old load average calculation; it always causes an additional load of 1.0 even though the %CPU usage is approx 50%.

>> The alternative that I considered was to sample the processes once
>> during every 5-second interval, but to place the sampling point
>> randomly over the interval. That probably results in a better
>
>I rather like this. With immediate update, It's almost equivalent to
>your current method with a random variation of between -5 and 5 seconds
>instead of between -1 and 1 seconds. Your current method doesn't
>really reduce the jitter -- it just concentrates it into a smaller
>interval.

When I tried this approach (with immediate update), I didn't like the jumpiness of the load average. Instead of the relatively smooth decay that I'm used to, the way it sometimes changed twice in short succession and sometimes did not change for nearly 10 seconds was quite noticeable.

I'd be quite happy to go with the delayed version of this, though it does mean having two timer routines, and storing the `nrun' somewhere between samples and updates.

>hopefully rare. Use a (small) random variation to reduce phase effects
>for such processes. I think there are none in the kernel. I would try
>using the following magic numbers:
>
>sample interval = 5.02 seconds (approx) (not 5.01, so that the random
>                                         variation never gives a multiple
>                                         of 1.00)
>random variation = 0+-0.01 seconds (approx)
>cexp[] = adjusted for 5.02 instead of 5.00

See above. I really want to try and avoid _any_ significant synchronisation effects, not just those that are caused by the kernel or by applications that happen to have a run pattern with a period of N * 1.0 seconds.

Ian
Re: NFS client unable to recover from server crash
In message <[EMAIL PROTECTED]>, Maxim Sobolev writes:
>I found that after introduction of the new RPC NFS client is no longer
>able to recover from server crash (both cluent and server are 5-CURRENT
>systems). After a well known `nfs server not responding' message, client
>hangs and even though server comes back in a minute or two it doesn't
>recover and just sits in this state forvewer. All unmount requests gets
>stuck in the kernel, so as a processes that accessing files from that
>mount point. This doesn't looks like a right thing and obviously should
>be fixed before 5.0-RELEASE.

I've seen some similar effects, but I don't think it has anything to do with the new RPC code, as that only runs at mount time. It would be useful if you could use tcpdump to see if any requests are being transmitted, and if they are getting responses. Also try running kgdb on the client to get a kernel backtrace of the stuck processes. Is this a UDP or TCP based mount?

If you are feeling brave, you could also try the patch below. It is a selection of changes to the kernel NFS code that I have built up over the last few months. I don't think it could solve the hangs, but it should improve the chance of interruptible mounts accepting ^C while waiting, and (just added the other day) umount -f should work while the server is down even if processes are hung.

Ian

Index: nfs.h
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs.h,v
retrieving revision 1.59
diff -u -r1.59 nfs.h
--- nfs.h	2001/04/17 20:45:21	1.59
+++ nfs.h	2001/07/20 13:19:51
@@ -633,6 +633,7 @@
 			struct mbuf *));
 int	nfs_adv __P((struct mbuf **, caddr_t *, int, int));
 void	nfs_nhinit __P((void));
+void	nfs_nmcancelreqs __P((struct nfsmount *));
 void	nfs_timer __P((void*));
 int	nfsrv_dorec __P((struct nfssvc_sock *, struct nfsd *,
 			struct nfsrv_descript **));
Index: nfs_nqlease.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs_nqlease.c,v
retrieving revision 1.59
diff -u -r1.59 nfs_nqlease.c
--- nfs_nqlease.c	2001/05/01 08:13:14	1.59
+++ nfs_nqlease.c	2001/05/01 14:29:22
@@ -952,7 +952,9 @@
 }
 
 /*
- * Called for client side callbacks
+ * Called for client side callbacks.
+ * NB: We are responsible for freeing `mrep' in all cases, but note
+ * that anything that does a 'goto nfsmout' frees it for us.
  */
 int
 nqnfs_callback(nmp, mrep, md, dpos)
@@ -982,8 +984,10 @@
 	nfsd->nd_md = md;
 	nfsd->nd_dpos = dpos;
 	error = nfs_getreq(nfsd, &tnfsd, FALSE);
-	if (error)
+	if (error) {
+		m_freem(mrep);
 		return (error);
+	}
 	md = nfsd->nd_md;
 	dpos = nfsd->nd_dpos;
 	if (nfsd->nd_procnum != NQNFSPROC_EVICTED) {
Index: nfs_socket.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs_socket.c,v
retrieving revision 1.66
diff -u -r1.66 nfs_socket.c
--- nfs_socket.c	2001/05/01 08:13:14	1.66
+++ nfs_socket.c	2001/07/20 13:45:01
@@ -144,7 +144,8 @@
  */
 #define	NFS_CWNDSCALE	256
 #define	NFS_MAXCWND	(NFS_CWNDSCALE * 32)
-static int nfs_backoff[8] = { 2, 4, 8, 16, 32, 64, 128, 256, };
+#define	NFS_NBACKOFF	8
+static int nfs_backoff[NFS_NBACKOFF] = { 2, 4, 8, 16, 32, 64, 128, 256, };
 int nfsrtton = 0;
 struct nfsrtt nfsrtt;
 struct callout_handle nfs_timer_handle;
@@ -299,11 +300,17 @@
 		splx(s);
 	}
 	if (nmp->nm_flag & (NFSMNT_SOFT | NFSMNT_INT)) {
-		so->so_rcv.sb_timeo = (5 * hz);
-		so->so_snd.sb_timeo = (5 * hz);
+		so->so_rcv.sb_timeo = (2 * hz);
+		so->so_snd.sb_timeo = (2 * hz);
 	} else {
-		so->so_rcv.sb_timeo = 0;
-		so->so_snd.sb_timeo = 0;
+		/*
+		 * We would normally set the timeouts to 0 (never time out)
+		 * for non-interruptible mounts. However, nfs_nmcancelreqs()
+		 * can still prematurely terminate requests, so avoid
+		 * waiting forever.
+		 */
+		so->so_rcv.sb_timeo = 10 * hz;
+		so->so_snd.sb_timeo = 10 * hz;
 	}
 
 	/*
@@ -1400,10 +1407,18 @@
 	for (rep = nfs_reqq.tqh_first; rep != 0; rep = rep->r_chain.tqe_next) {
 		nmp = rep->r_nmp;
 		if (rep->r_mrep || (rep->r_flags & R_SOFTTERM))
-			continue;
-		if (nfs_sigintr(nmp, rep, rep->r_procp)) {
-			nfs_softterm(rep);
 			continue;
+		/*
+		 * Test for signals on interruptible mounts. We try to
+		 * maintain normal (uninterruptible) semantics while the
+		 * server is up, but respond quickly to signals when it
+
Re: NFS client unable to recover from server crash
In message <[EMAIL PROTECTED]>, Matt Dillon writes:
> Ian, please don't do this. The whole point of having an uninterruptable
> mount is so the client can survive a server reboot or network failure.
> Doing this destroys uninterruptable semantics.

Firstly, I have no intention of committing this patch anytime soon, but I think you are mistaken in what it does. Unless I messed up, it will never allow "uninterruptible" mounts to be interrupted by signals or to time out.

I set the socket timeout to 10 seconds, but that will have no effect because the code will simply loop around and retry again. It is nfs_sigintr() that detects signals, and it returns immediately unless the NFSMNT_INT mount flag is set. Similarly, the request only times out if rep->r_rexmit >= r_retry, but unless it is a "soft" nfs mount, r_rexmit is clamped at NFS_MAXREXMIT, and r_retry is set to NFS_MAXREXMIT + 1, so this can never happen.

The only effect of changing that timeout value (again assuming I have not misread the code) is to allow any request that does get marked R_SOFTTERM to time out within a finite period. For hard mounts, the _only_ way that this can happen is via the new nfs_nmcancelreqs() which is called when you do a forced unmount.

No, I haven't gone mad and decided to make all NFS mounts soft to "fix" all NFS problems :-)

Ian
Re: filesystem errors
In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>I'm tearing my hair out trying to find a filesystem error that's
>causing me a panic: ufsdirhash_checkblock: bad dir inode.
>
>When I run fsck from a single user boot, it finds no errors.
>
>When I run it on the same filesystem mounted, it finds errors: but, of
>course, it then can't correct them

[Kirk, I'm cc'ing you because here the dirhash code sanity checks found a directory entry with d_ino == 0 that was not at the start of a DIRBLKSIZ block. This doesn't happen normally, but it seems from this report that fsck does not correct this. Is it a basic filesystem assumption that d_ino == 0 can only happen at the start of a directory block, or is it something the code should tolerate?]

Interesting - this is an error reported by the UFS_DIRHASH code that you enabled in your kernel config. A sanity check that the dirhash code is performing is failing. These checks are designed to catch bugs in the dirhash code, but in this case I think it may be a bug that fsck is not finding this problem, or else my sanity tests are too strict.

A workaround is to turn off the sanity checks with:

	sysctl vfs.ufs.dirhash_docheck=0

or to remove UFS_DIRHASH from your kernel config. You could also try to find the directory that is causing the problems. Copy the following script to a file called dircheck.pl, and try running:

	chmod 755 dircheck.pl
	find / -fstype ufs -type d -print0 | xargs -0 ./dircheck.pl

That should show up any directories that would fail that dirhash sanity check - there will probably just be one or two that resulted from some old filesystem corruption.
Ian

#!/usr/local/bin/perl

while (defined($dir = shift)) {
	unless (open(DIR, "$dir")) {
		print STDERR "$dir: $!\n";
		next;
	}
	$b = 0;
	my(%dir) = ();
	while (sysread(DIR, $dat, 512) == 512) {
		$off = 0;
		while (length($dat) > 0) {
			($dir{'d_ino'}, $dir{'d_reclen'}, $dir{'d_type'},
			    $dir{'d_namlen'}) = unpack("LSCC", $dat);
			$dir{'d_name'} = substr($dat, 8, $dir{'d_namlen'});
			$minreclen = (8 + $dir{'d_namlen'} + 1 + 3) & (~3);
			$gapinfo = ($dir{'d_reclen'} == $minreclen) ? "" :
			    sprintf("[%d]", $dir{'d_reclen'} - $minreclen);
			if ($dir{'d_ino'} == 0 && $off != 0) {
				printf("%s off %d ino %d reclen 0x%x type 0%o"
				    . " namelen %d name '%s' %s\n",
				    $dir, $off, $dir{'d_ino'},
				    $dir{'d_reclen'}, $dir{'d_type'},
				    $dir{'d_namlen'}, $dir{'d_name'},
				    $gapinfo);
			}
			if ($dir{'d_reclen'} > length($dat)) {
				die "reclen too long!\n";
			}
			$dat = substr($dat, $dir{'d_reclen'});
			$off += $dir{'d_reclen'};
		}
		$b++;
	}
}
Re: filesystem errors
In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>
>The only result it generated was
>
>/usr/home/mdharnois off 120 ino 0 reclen 0x188 type 010 namelen 14 name '.fetchmail.pid' [368]
>
>and that file is destroyed and recreated every couple of minutes.

It's the directory (/usr/home/mdharnois), not the file, that is the problem. If you recreate the directory:

	cd /usr/home
	mv mdharnois mdharnois.old
	mkdir mdharnois
	chown mdharnois:mdharnois mdharnois	# (or whatever)
	mv mdharnois.old/* mdharnois/
	mv mdharnois.old/.[a-zA-Z0-9]* mdharnois/
	rmdir mdharnois.old

this problem should go away permanently. Even just creating loads of files in the existing directory might be enough to reuse the bit of the directory that has d_ino == 0. Running

	./dircheck.pl /usr/home/mdharnois

will check if there is still a problem.

However, I'd like to know if this is something that fsck should detect and correct automatically. It is an odd case, because the ffs filesystem code never creates directory entries like this, but I think it will not object to them if it finds them. This kind of ambiguity is probably a bad thing.

Ian
SIGCHLD changes causing fault on nofault entry panics
The panics in exit1() that have been reported on -stable appear to be caused by these commits:

	REV:1.92.2.4  kern_exit.c  2001/07/25 17:21:46  dillon
	REV:1.72.2.7  kern_sig.c   2001/07/25 17:21:46  dillon

	MFC kern_exit.c 1.131, kern_sig.c 1.125 - bring SIGCHLD SIG_IGN signal
	handling in line with other operating systems.

These probably correspond to similar panics seen in -current, but I haven't checked the details. In the vmcore I just got, the panic occurred in the following fragment in exit1(), when dereferencing p_sigacts (which is p_procsig->ps_sigacts). I guess there is a race here if the parent is exiting or something?

+	if ((p->p_pptr->p_procsig->ps_flag & PS_NOCLDWAIT)
+	    || p->p_pptr->p_sigacts->ps_sigact[_SIG_IDX(SIGCHLD)] == SIG_IGN) {

Matt, I will just back out these changes from RELENG_4 shortly until the issue is resolved. The change was non-essential and quite contained, so it's probably better than waiting for a fix.

Ian
Re: filesystem errors
In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>
>I don't have sufficient technical knowledge to know which of you is
>right; I would just ask that filesystem corruption caused by
>restarting from a hung system not cause a panic.

I removed the extra sanity check yesterday, so if you have revision 1.3 of ufs_dirhash.c you won't see that panic again. I didn't realise that fsck actually causes these directory entries, but just the fact that it leaves them intact meant that the sanity check was bad.

Ian
Re: rpc.umtall dumps core on each startup/shutdown
In message <[EMAIL PROTECTED]>, Maxim Sobolev writes:
>I found that the rpc.umntall program from time to time starts dumping a core
>at each startup/shutdown. Removal of /var/db/mounttab helps for certain

It seems to be a bug in the rpc library (thank $deity for libefence when tracking down such bugs :-). The rpcbind client code in libc keeps a cache of DNS lookups, but it is missing a strdup() when it copies a string from the cache.

Investigating this has shown up a few bugs I introduced to rpc.umntall in my last set of changes, so I'll fix those too.

Ian
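The class of bug described above (handing out a pointer into a cache instead of a private copy) can be illustrated with a short C sketch. The struct and function names here are hypothetical and are not the actual libc rpcbind client code; the point is only that the caller should receive its own strdup()'d string, so that freeing it later cannot corrupt the cache.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry; field names are illustrative only. */
struct sketch_cache_entry {
	char *ac_host;	/* lookup key */
	char *ac_uaddr;	/* cached result, owned by the cache */
};

/*
 * Return a private copy of the cached string. The caller owns the
 * result and must free() it; the cache keeps its own copy intact.
 */
static char *
sketch_cache_get_uaddr(const struct sketch_cache_entry *e)
{
	if (e == NULL || e->ac_uaddr == NULL)
		return (NULL);
	return (strdup(e->ac_uaddr));
}
```

Returning e->ac_uaddr directly would compile and appear to work, but the first caller to free() the result would leave the cache holding a dangling pointer, which is the kind of error that libefence makes visible.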
Re: Panic on 8/10 -current: sleeping process owns a mutex
In message <[EMAIL PROTECTED]>, Doug Barton writes:
>Immediately prior to the crash I was getting a lot of these on the console:
>
>Aug 12 01:00:52 Master /boot/kernel/kernel:
>/usr/local/src/sys/kern/kern_synch.c:377: sleeping with "mountlist" locked
>from /usr/local/src/sys/kern/vfs_syscalls.c:548

This should be fixed by revision 1.198 of vfs_syscalls.c. It could only occur during unmount(), which is why it didn't show up more often:

iedowse     2001/08/20 12:16:31 PDT

  Modified files:
    sys/kern             vfs_syscalls.c
  Log:
  Avoid sleeping while holding a mutex in dounmount(). This problem
  has existed for a long time, but I made it worse a few months ago
  by adding calls to VFS_ROOT() and checkdirs() in revision 1.179.

  Also, remove the LK_REENABLE flag in the lockmgr() call; this flag
  has been ignored by the lockmgr code for 4 years. This was the only
  remaining mention of it apart from its definition.

  Reviewed by:	jhb

Ian
Re: Panic with latest current/UFS_DIRHASH
In message <[EMAIL PROTECTED]>, Ollivier Robert writes:
>
>The interesting thing is that I also get that with my old 17th Jul.
>kernel... except that the panic message is
>
>"ufsdirhash_checkblock: bad dir inode"
>
>It is always in the following part of installworld:

That's interesting - the "bad dir inode" bit in particular. I'll look into this in more detail later. My first guess is that there is a logic flaw in the dirhash code that only triggers when dirhash comes across a directory entry that has had its inode zeroed by fsck.

The kernel filesystem code only ever places unused directory entries at the start of a directory block (free space that is not at the start of a block is merged into an existing entry). However, fsck can mark any entry as unused, resulting in the unfortunate situation that fsck can put the filesystem into a state that cannot be produced by any combination of kernel filesystem operations. That introduces quite some potential for obscure bugs that only occur after an fsck run...

Ian
fsck setting d_ino == 0 (was Re: filesystem errors)
In message <[EMAIL PROTECTED]>, Kirk McKusick writes:
>FFS will never set a directory ino == 0 at a location other
>than the first entry in a directory, but fsck will do so to
>get rid of an unwanted entry. The readdir routines know to
>skip over an ino == 0 entry no matter where in the directory
>it is found, so applications will never see such entries.
>It would be a fair amount of work to change fsck to `do the
>right thing', as the checking code is given only the current
>entry with which to work. I am of the opinion that you
>should simply accept that mid-directory block ino == 0 is
>acceptable rather than trying to `fix' the problem.

Bleh, well I guess not too surprisingly, there is a case in ufs_direnter() (ufs_lookup.c) where the kernel does the wrong thing when a mid-block entry has d_ino == 0. The result can be serious directory corruption, and the bug has been there since the Lite/2 merge:

# fetch http://www.maths.tcd.ie/~iedowse/FreeBSD/dirbug_img.gz
Receiving dirbug_img.gz (6745 bytes): 100%
6745 bytes transferred in 0.0 seconds (4.69 MBps)
# gunzip dirbug_img.gz
# mdconfig -a -t vnode -f dirbug_img
md0
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
20 files, 1 used, 2638 free (14 frags, 328 blocks, 0.5% fragmentation)
# mount /dev/md0 /mnt
# touch /mnt/ff12
# umount /mnt
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED I=2 OWNER=root MODE=40755
SIZE=512 MTIME=Aug 21 22:28 2001
DIR=/

SALVAGE? [yn]

The bug is that when compressing directory blocks, the code trusts the DIRSIZ() macro to calculate the amount of data to be bcopy'd when moving a directory entry. If d_ino is zero, DIRSIZ() cannot be trusted, so random bytes in unused portions of the directory determine how much gets copied.
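To make the DIRSIZ() point concrete, here is a minimal userland sketch, assuming the new-format directory entry layout (an 8-byte header followed by the NUL-terminated name padded to a 4-byte boundary). The sketch_* names are hypothetical and the macro is a simplified stand-in for the kernel's DIRSIZ(); it only shows why the size must be treated as zero when d_ino == 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct direct (new directory format). */
struct sketch_direct {
	uint32_t d_ino;		/* inode number; 0 means "unused" */
	uint16_t d_reclen;	/* length of this record */
	uint8_t  d_type;	/* file type */
	uint8_t  d_namlen;	/* length of the name in d_name */
	char     d_name[255 + 1];
};

/* Header plus the NUL-terminated name, rounded up to 4 bytes. */
#define SKETCH_DIRSIZ(dp) \
	(offsetof(struct sketch_direct, d_name) + \
	    (((size_t)(dp)->d_namlen + 1 + 3) & ~(size_t)3))

/*
 * Bytes that must be preserved when an entry is moved. For an unused
 * entry d_namlen may hold stale garbage, so nothing is copied.
 */
static size_t
sketch_used_size(const struct sketch_direct *ep)
{
	return (ep->d_ino != 0 ? SKETCH_DIRSIZ(ep) : 0);
}
```

For an unused entry whose stale d_namlen happens to be 200, the raw macro would report 212 bytes to copy, while sketch_used_size() reports 0.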
I think it is very unlikely in practice for the value returned by DIRSIZ() to be harmful, but fsck certainly doesn't check it so this bug can be triggered after other types of corruption have been repaired by fsck.

I just found this while looking for a dirhash bug - the dirhash code didn't check for d_ino == 0 when compressing directories, so it would freak when it couldn't find the entry to move. The patch below should fix both these issues, and it makes it clearer that DIRSIZ() is not used when d_ino == 0.

Any comments welcome. The patch is a bit larger than it needs to be, but that directory compression code is so hard to understand that I think it is worth clarifying it slightly :-)

Ian

Index: ufs_lookup.c
===================================================================
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c	2001/08/18 03:08:48	1.52
+++ ufs_lookup.c	2001/08/21 23:59:09
@@ -869,26 +869,38 @@
 	 * dp->i_offset + dp->i_count would yield the space.
 	 */
 	ep = (struct direct *)dirbuf;
-	dsize = DIRSIZ(OFSFMT(dvp), ep);
+	dsize = ep->d_ino ? DIRSIZ(OFSFMT(dvp), ep) : 0;
 	spacefree = ep->d_reclen - dsize;
 	for (loc = ep->d_reclen; loc < dp->i_count; ) {
 		nep = (struct direct *)(dirbuf + loc);
-		if (ep->d_ino) {
-			/* trim the existing slot */
-			ep->d_reclen = dsize;
-			ep = (struct direct *)((char *)ep + dsize);
-		} else {
-			/* overwrite; nothing there; header is ours */
-			spacefree += dsize;
+
+		/* Trim the existing slot (NB: dsize may be zero). */
+		ep->d_reclen = dsize;
+		ep = (struct direct *)((char *)ep + dsize);
+
+		loc += nep->d_reclen;
+		if (nep->d_ino == 0) {
+			/*
+			 * A mid-block unused entry. Such entries are
+			 * never created by the kernel, but fsck_ffs
+			 * can create them (and it doesn't fix them).
+			 *
+			 * Add up the free space, and initialise the
+			 * relocated entry since we don't bcopy it.
+			 */
+			spacefree += nep->d_reclen;
+			ep->d_ino = 0;
+			dsize = 0;
+			continue;
 		}
 		dsize = DIRSIZ(OFSFMT(dvp), nep);
 		spacefree += nep->d_reclen - dsize;
 #ifdef UFS_DIRHASH
 		if (dp->i_dirhash != NULL)
-			uf
Re: Panic with latest current/UFS_DIRHASH
In message <[EMAIL PROTECTED]>, Ollivier Robert writes:
>According to Ollivier Robert:
>> Just upgraded my laptop to the latest current and during installworld, got
>> this panic:
>>
>> panic: ufsdirhash_findslot: 'ka_JP.Shift_JIS' not found

Thanks for the bug report - see my other mail to -current for further details, but the quick answer is that dirhash has a bug that is triggered by the odd directory entries that fsck sometimes leaves behind. This short patch should fix it:

Ian

Index: ufs_lookup.c
===================================================================
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c	2001/08/18 03:08:48	1.52
+++ ufs_lookup.c	2001/08/22 00:27:17
@@ -884,7 +884,7 @@
 		dsize = DIRSIZ(OFSFMT(dvp), nep);
 		spacefree += nep->d_reclen - dsize;
 #ifdef UFS_DIRHASH
-		if (dp->i_dirhash != NULL)
+		if (dp->i_dirhash != NULL && nep->d_ino)
 			ufsdirhash_move(dp, nep, dp->i_offset + loc,
 			    dp->i_offset + ((char *)ep - dirbuf));
 #endif
Re: Kernel Panic from Yesterday's CVSup
In message <[EMAIL PROTECTED]>, John Baldwin writes:
>> malloc(48,c0238100,0,c65feb80,0) at malloc+0x2a
>> exit1(c65feb80,0,0,c6623f78,c01fc852) at exit1+0x1b1
>> kthread_suspend(0,c0279a40,0,c022d1ec,a2) at kthread_suspend
>> ithd_loop(0,c6623fa8) at ithd_loop+0x56
>> fork_exit(c01fc7fc,0,c6623fa8) at fork_exit+0x8
>> fork_trampoline() at fork_trampoline+0x8
>> db> witness_list
>> "Giant" (0xc0279a40) locked at ../../i386/isa/ithread.c:162
>
>Erm, ithd_loop() doesn't call kthread_suspend(). *sigh*. Something
>else is rather messed up here I'm afraid.

Note that the return address into kthread_suspend is kthread_suspend+0x0. Since the call to exit1() in kthread_exit is the very last operation in kthread_exit, you'd expect the return address on the stack to be at the start of the next function...

Ian
Re: panic: lockmgr: draining against myself
In message <[EMAIL PROTECTED]>, ryan beasley writes:
>
>panic: lockmgr: draining against myself

I've just checked in revision 1.426 of vfs_subr.c which may solve this, but I was not able to reproduce it myself. Could you or anybody else who has seen this panic try the above revision to see if it helps? Note that you will need HEAD rather than RELENG_5_0 to get this change.

There is probably a better approach to solve the VOP_INACTIVE recursion problem than the one I used though - I think maybe having a vnode flag that remembers whether a VOP_INACTIVE call is necessary would be more general than the VI_DOINGINACT flag.

Ian
Re: Problem with umount and fsid?
In message <[EMAIL PROTECTED]>, Nate Lawson writes:
>I get an error when umounting a FAT filesystem on a USB flash drive. It
>appears the device is properly unmounted. Is this a case that needs to be
>fixed in our fsid code? It happens every time I unmount this device.
>laptop# umount /thumb
>umount: unmount of /thumb failed: No such file or directory
>umount: retrying using path instead of file system ID
>laptop# mount | grep da0
>laptop#

Thanks for the report - in theory this should only occur if you have a kernel from before July 1st but a newer userland. Assuming that's not the case, I must have overlooked something. Could you update to the latest sbin/mount, and then post the output of:

	mount -v | grep /thumb
	truss umount /thumb

Thanks,

Ian
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: PLIP transmit timeouts -- any solutions?
In message <[EMAIL PROTECTED]>, Christopher Nehren writes:
>
>I currently have a PLIP link to an old laptop running Linux (I tried to
>install FreeBSD, but it freezes at the USB detection -- yes, I tried

Try the following patch. I can't remember if all the changes in this are necessary, but I think I found it fixed problems when interoperating with a Linux-like PLIP implementation.

If I remember correctly, the PLIP implementation I saw used the data bits that came in the very first read that had the correct handshake signal, whereas FreeBSD readers do one extra read after the handshake to ensure that the signal is stable (i.e. that implementation used an "unsafe" read and a "safe" write, whereas FreeBSD's uses a "safe" read and an "unsafe" write). This patch causes both read and write to be "safe". The removal of the use of ctxmitl[] seems to be unnecessary.

Ian

Index: if_plip.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/ppbus/if_plip.c,v
retrieving revision 1.28
diff -u -r1.28 if_plip.c
--- if_plip.c	4 Mar 2003 23:19:54 -0000	1.28
+++ if_plip.c	12 Mar 2003 07:09:43 -0000
@@ -409,12 +409,14 @@
 static __inline int
 clpoutbyte (u_char byte, int spin, device_t ppbus)
 {
-	ppb_wdtr(ppbus, ctxmitl[byte]);
+	ppb_wdtr(ppbus, byte & 0xf);
+	ppb_wdtr(ppbus, (byte & 0xf) | 0x10);
 	while (ppb_rstr(ppbus) & CLPIP_SHAKE)
 		if (--spin == 0) {
 			return 1;
 		}
-	ppb_wdtr(ppbus, ctxmith[byte]);
+	ppb_wdtr(ppbus, ((byte & 0xf0) >> 4) | 0x10);
+	ppb_wdtr(ppbus, ((byte & 0xf0) >> 4));
 	while (!(ppb_rstr(ppbus) & CLPIP_SHAKE))
 		if (--spin == 0) {
 			return 1;
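As an illustration of the "safe" write in the patch above, the following sketch computes the four values that the patched clpoutbyte() writes to the data register for one byte: each nibble is first presented and only then is the 0x10 handshake bit changed. The function and macro names are hypothetical, and the hardware access (ppb_wdtr(), the spin loops waiting on CLPIP_SHAKE) is deliberately left out.

```c
#include <stdint.h>

#define SKETCH_STROBE 0x10	/* handshake bit used in the patch */

/*
 * Fill out[] with the data-register writes for one byte, in order.
 * In the real driver a wait for the peer's acknowledgement sits
 * between out[1] and out[2], and again after out[3].
 */
static void
sketch_plip_byte(uint8_t byte, uint8_t out[4])
{
	uint8_t lo = byte & 0x0f;
	uint8_t hi = (byte & 0xf0) >> 4;

	out[0] = lo;			/* low nibble, strobe still clear */
	out[1] = lo | SKETCH_STROBE;	/* same data, now raise strobe */
	out[2] = hi | SKETCH_STROBE;	/* high nibble, strobe still set */
	out[3] = hi;			/* same data, now drop strobe */
}
```

Because the data bits are already stable when the handshake bit changes, a peer that samples on the first read showing the right handshake state still sees the correct nibble.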
Re: Fixing -pthreads (Re: ports and -current)
In message <[EMAIL PROTECTED]>, Daniel Eischen writes:
>On Wed, 24 Sep 2003, Scott Long wrote:
>> PTHREAD_LIBS is a great tool for the /usr/ports mechanism, but doesn't
>> mean anything outside of that.
>
>That just meant it makes it easier to maintain ports so that
>they are PTHREAD_LIBS compliant (they would break when linked).
>I know it has no bearing on 3rd party stuff.

Just to throw one further approach out on the table, below is a patch that makes gcc read from a file to determine what library to associate with the -pthread flag. It's a hack of course, and probably neither correct nor optimal. If you want to make -pthread mean libkse, create an /etc/pthread.libs that looks like:

	-lc_r:		-lkse
	-lc_r_p:	-lkse_p

I haven't been following the discussion in any detail - this is just another possibility that is worth mentioning as it could retain compatibility for users that want -pthread to mean use the default thread library.

Ian

Index: gcc.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/contrib/gcc/gcc.c,v
retrieving revision 1.36
diff -u -r1.36 gcc.c
--- gcc.c	11 Jul 2003 04:45:39 -0000	1.36
+++ gcc.c	24 Sep 2003 15:37:14 -0000
@@ -331,6 +331,7 @@
 
 static const char *if_exists_spec_function PARAMS ((int, const char **));
 static const char *if_exists_else_spec_function PARAMS ((int, const char **));
+static const char *thread_lib_override_spec_function PARAMS ((int, const char **));
 
 /* The Specs Language
 
@@ -1440,6 +1441,7 @@
 {
   { "if-exists",		if_exists_spec_function },
   { "if-exists-else",		if_exists_else_spec_function },
+  { "thread-lib-override",	thread_lib_override_spec_function },
   { 0, 0 }
 };
 
@@ -7335,4 +7337,46 @@
     return argv[0];
 
   return argv[1];
+}
+
+/* thread-lib-override built-in spec function.
+
+   Override a thread library according to /etc/pthread.libs  */
+
+static const char *
+thread_lib_override_spec_function (argc, argv)
+     int argc;
+     const char **argv;
+{
+  static char buf[256];
+  FILE *fp;
+  int n;
+
+  /* Must have exactly one argument.  */
+  if (argc != 1)
+    return NULL;
+
+  fp = fopen ("/etc/pthread.libs", "r");
+  if (fp == NULL)
+    return argv[0];
+
+  while (fgets (buf, sizeof(buf), fp) != NULL)
+    {
+      n = strlen (buf);
+      while (n > 0 && buf[n - 1] == '\n')
+        buf[--n] = '\0';
+      if (buf[0] == '#' || buf[0] == '\0')
+        continue;
+      n = strlen (argv[0]);
+      if (strncmp (buf, argv[0], n) != 0 || n >= sizeof (buf) || buf[n] != ':')
+        continue;
+      n++;
+      while (buf[n] != '\0' && isspace ((unsigned char)buf[n]))
+        n++;
+      fclose (fp);
+
+      return &buf[n];
+    }
+  fclose (fp);
+  return argv[0];
 }
Index: config/freebsd-spec.h
===================================================================
RCS file: /dump/FreeBSD-CVS/src/contrib/gcc/config/freebsd-spec.h,v
retrieving revision 1.14
diff -u -r1.14 freebsd-spec.h
--- config/freebsd-spec.h	21 Sep 2003 07:59:16 -0000	1.14
+++ config/freebsd-spec.h	24 Sep 2003 15:38:11 -0000
@@ -160,8 +160,8 @@
 #if __FreeBSD_version >= 500016
 #define FBSD_LIB_SPEC "\
   %{!shared: \
-    %{!pg: %{pthread:-lc_r} -lc} \
-    %{pg:  %{pthread:-lc_r_p} -lc_p} \
+    %{!pg: %{pthread:%:thread-lib-override(-lc_r)} -lc}\
+    %{pg:  %{pthread:%:thread-lib-override(-lc_r_p)} -lc_p}\
   }"
 #else
 #define FBSD_LIB_SPEC "\
Re: INPCB panic....
In message <[EMAIL PROTECTED]>, Sam Leffler writes:
>On Monday 10 November 2003 11:37 am, Larry Rosenman wrote:
>> I removed my wi0 card (with DHCLIENT running), and got the following panic
>> on a -CURRENT from yesterday:
>
>Thanks. Working on it...

FYI, I've been using the following patch locally which seems to trigger the printf sometimes when wi0 is ejected. Without the patch, it used to dereference a stale struct ifnet and crash. I have an approx 1 week old kernel, so this particular problem may have been fixed already.

Ian

Index: in_pcb.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/netinet/in_pcb.c,v
retrieving revision 1.125
diff -u -r1.125 in_pcb.c
--- in_pcb.c	1 Nov 2003 07:30:07 -0000	1.125
+++ in_pcb.c	3 Nov 2003 00:52:41 -0000
@@ -564,10 +564,12 @@
 	 * destination, in case of sharing the cache with IPv6.
 	 */
 	ro = &inp->inp_route;
-	if (ro->ro_rt &&
-	    (ro->ro_dst.sa_family != AF_INET ||
-	     satosin(&ro->ro_dst)->sin_addr.s_addr != faddr.s_addr ||
-	     inp->inp_socket->so_options & SO_DONTROUTE)) {
+	if (ro->ro_rt && ((ro->ro_rt->rt_flags & RTF_UP) == 0 ||
+	    ro->ro_dst.sa_family != AF_INET ||
+	    satosin(&ro->ro_dst)->sin_addr.s_addr != faddr.s_addr ||
+	    inp->inp_socket->so_options & SO_DONTROUTE)) {
+		if ((ro->ro_rt->rt_flags & RTF_UP) == 0)
+			printf("clearing non-RTF_UP route\n");
 		RTFREE(ro->ro_rt);
 		ro->ro_rt = (struct rtentry *)0;
 	}
Re: cvs commit: src/sbin/umount umount.c
In message <[EMAIL PROTECTED]>, Rudolf Cejka writes:
>> If the unmount by file system ID fails, don't warn before retrying
>> a non-fsid unmount if the file system ID is all zeros. This is a
...
>Hello and thanks for fixing this! I had a plan to report this, but you
>were faster :o) I'm interested in this area - please, can you tell, what
>do you plan to do in your more complete fix?
>
>When I looked at this issue, I thought about some things:
>
>* Why is f_fsid zeroed for non-root users at all? Is there any reason?

As I understand it, the main reason for hiding file system IDs from non-root users is because file system IDs are used as part of NFS file handles on an NFS server, so hiding them makes it harder to guess a valid file handle. If you know the file system ID and an inode number, then you would only need to guess the 32-bit inode generation number.

OpenBSD started zeroing out file system IDs for non-root users a long time ago, and while FreeBSD mostly followed suit, I think it was only with Kirk's 64-bit statfs changes a few days ago that we have started doing this consistently (we had missed getfsstat() before).

I was planning to return a filesystem ID of {st_dev, 0} to non-root users, where st_dev is the device number that is already returned by the stat(2) system call. This requires a few changes, because currently st_dev comes from va_fsid in struct vattr, which is not directly accessible at the time a file system is mounted. Since many userland applications depend on st_dev being persistent and unique, I think it makes more sense to have it as part of struct mount instead of struct vattr.

Some additional changes are required to guarantee the uniqueness of st_dev's and file system IDs (including {st_dev, 0} ones), and then unmount(2) needs to accept these user-visible IDs. In fact, maybe {st_dev, 0} could be returned to root too, but that might possibly break some NFS-related utilities.
>* There are small typos in umount.c:

Thanks - fixed locally, but there's no urgency to commit before 5.2.

>* Do you understand, why there is line in umount.c:376
> getmntentry(NULL, NULL, &sfs->f_fsid, REMOVE)
> ? I'm not sure, but if it is needed for some reason,
> I think that there should be used different getmntentry() according
> to the used unmount() method, like in this patch:

I think umount(8) first gets a list of all mounted file systems and then uses that list to resolve a mountpoint or device node into a struct statfs. When unmounting all file systems, it needs to ignore any file systems that it has already unmounted, or it might attempt to unmount the same file system twice. If the unmount call fails, it should still do the REMOVE operation so that it will at least attempt an unmount on each file system.

You're right that this will not work correctly with zeroed file system IDs (it worked before Kirk's commit last week, but wasn't supposed to). In practice can it ever make things worse than the uniqueness problems caused by non-root users not having any file system ID? I can't think of any examples offhand.

>* /usr/src/sbin/mount/mount.c: If user uses mount -v, it prints false
> zeroed fsids - isn't it better to print just non-zero fsids, so that
> nobody is "mystified"? I have created two patches - I do not know
> which do you consider as a better:

Yes, I guess now that getfsstat(2) also zeros the IDs for non-root, there isn't much point in printing them.

Ian
Re: kernel panic trying to utilize a da(4)/umass(4) device with ohci(4)
In message <[EMAIL PROTECTED]>, "Brian F. Feldman" writes:
>Jeez, it's been broken a year and it's almost 5.2-RELEASE now. Does anyone
>have ANY leads on these problems? I know precisely nothing about how my USB
>hardware is supposed to work, but this OHCI+EHCI stuff definitely doesn't,
>and it's really not uncommon at all. Is it unbroken in NetBSD currently?

I had some success with this patch:

Index: usb_mem.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/usb_mem.c,v
retrieving revision 1.5
diff -u -r1.5 usb_mem.c
--- usb_mem.c	4 Oct 2003 22:13:21 -0000	1.5
+++ usb_mem.c	27 Oct 2003 15:39:03 -0000
@@ -142,7 +142,8 @@
 	s = splusb();
 	/* First check the free list. */
 	for (p = LIST_FIRST(&usb_blk_freelist); p; p = LIST_NEXT(p, next)) {
-		if (p->tag == tag && p->size >= size && p->align >= align) {
+		if (p->tag == tag && p->size >= size && p->size < size * 2 &&
+		    p->align >= align) {
 			LIST_REMOVE(p, next);
 			usb_blk_nfree--;
 			splx(s);

It seems that since the conversion to busdma, the USB code can end up attempting to use contigmalloc() to allocate multi-page regions from an interrupt thread(!). The above doesn't fix that; it just prevents successful large (e.g. 64k) contiguous allocations from being wasted when a much smaller amount of space is needed. With the above, I was able to use ohci+ehci fairly reliably on a Soekris box with a large USB2 disk attached via a cardbus USB2 adaptor.
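The free-list test that the usb_mem.c patch introduces can be read as a small predicate. The sketch below uses a hypothetical function name and omits the busdma tag comparison that the real code also performs; it only shows the effect of the added `p->size < size * 2` bound.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Would a cached free block be handed out for this request?
 * A block qualifies only if it is big enough, less than twice the
 * requested size, and at least as strictly aligned as requested.
 */
static bool
sketch_blk_fits(size_t blk_size, size_t blk_align, size_t size, size_t align)
{
	return (blk_size >= size && blk_size < size * 2 &&
	    blk_align >= align);
}
```

Under this rule a cached 64k block is no longer consumed by a 512-byte request, so it stays available for a later request that actually needs a large contiguous region. Note that the bound is strict, so a block of exactly twice the requested size is also skipped.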
I also have a few other local patches that may help too - some of them are below:

Ian

Index: ohci.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/ohci.c,v
retrieving revision 1.132
diff -u -r1.132 ohci.c
--- ohci.c	24 Aug 2003 17:55:54 -0000	1.132
+++ ohci.c	21 Sep 2003 15:28:27 -0000
@@ -1405,12 +1405,13 @@
 			if (std->flags & OHCI_ADD_LEN)
 				xfer->actlen += len;
 			if (std->flags & OHCI_CALL_DONE) {
+				ohci_free_std(sc, std); /* XXX */
 				xfer->status = USBD_NORMAL_COMPLETION;
 				s = splusb();
 				usb_transfer_complete(xfer);
 				splx(s);
-			}
-			ohci_free_std(sc, std);
+			} else
+				ohci_free_std(sc, std);
 		} else {
 			/*
 			 * Endpoint is halted. First unlink all the TDs
@@ -2246,6 +2247,7 @@
 		usb_uncallout(xfer->timeout_handle, ohci_timeout, xfer);
 		usb_transfer_complete(xfer);
 		splx(s);
+		return;
 	}
 
 	if (xfer->device->bus->intr_context || !curproc)
Index: usbdi.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.82
diff -u -r1.82 usbdi.c
--- usbdi.c	24 Aug 2003 17:55:55 -0000	1.82
+++ usbdi.c	21 Sep 2003 15:28:29 -0000
@@ -751,6 +751,7 @@
 			    pipe, xfer, pipe->methods));
 		/* Make the HC abort it (and invoke the callback). */
 		pipe->methods->abort(xfer);
+		KASSERT(SIMPLEQ_FIRST(&pipe->queue) != xfer, ("usbd_ar_pipe"));
 		/* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
 	}
 	pipe->aborting = 0;
@@ -763,8 +764,9 @@
 {
 	usbd_pipe_handle pipe = xfer->pipe;
 	usb_dma_t *dmap = &xfer->dmabuf;
+	usbd_status status;
 	int repeat = pipe->repeat;
-	int polling;
+	int polling, xfer_flags;
 
 	SPLUSBCHECK;
 
@@ -835,30 +837,33 @@
 		xfer->status = USBD_SHORT_XFER;
 	}
 
-	if (xfer->callback)
-		xfer->callback(xfer, xfer->priv, xfer->status);
-
-#ifdef DIAGNOSTIC
-	if (pipe->methods->done != NULL)
+	/* Copy any xfer fields in case the xfer goes away in the callback. */
+	status = xfer->status;
+	xfer_flags = xfer->flags;
+	/*
+	 * For repeat operations, call the callback first, as the xfer
+	 * will not go away and the "done" method may modify it. Otherwise
+	 * reverse the order in case the callback wants to free or reuse
+	 * the xfer.
+	 */
+	if (repeat) {
+		if (xfer->callback)
+			xfer->callback(xfer, xfer->priv, status);
 		pipe->methods->done(xfer);
-	else
-		printf("usb_transfer_complete: pipe->methods->done == NULL\n");
-#else
-	pipe->methods->done(xfer);
-#endif
-
-	if ((xfer->flags & USBD_SYNCHRONOUS) && !polling)
-		wakeup(xfer);
+	} else {
+
Re: NFS -current
In message <[EMAIL PROTECTED]>, Patric Mrawek writes:
>On several clients (-DP1, -DP2, 4-stable) mounting a nfs-share
>(mount_nfs -i -U -3 server:/nfs /mnt) and then copying data from that
>share to the local disk (find -x -d /mnt | cpio -pdumv /local) results
>in lost NFS-mount.
>
>client kernel: nfs server server:/nfs: not responding 10 > 9

I'm not sure what you mean by a "lost" mount. Do all further accesses to the filesystem hang?

It is normal enough to get the above 'not responding' errors occasionally on a busy fileserver, but only if they are almost immediately followed by 'is alive again' messages.

If the filesystem stops working and doesn't recover, could you run

	tcpdump -nepX -s 1600 udp port 2049

when it hangs and record a few packets?

Ian
Re: NFS -current
In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>Ian Dowse wrote:
>> It is normal enough to get the above 'not responding' errors
>> occasionally on a busy fileserver, but only if they are almost
>> immediately followed by 'is alive again' messages.
>
>This is arguably a bug in the FreeBSD UDP packet reassembly code
...

Actually, I was referring here to an effect that occurs when the time taken by the server to complete requests varies in a particular way. The NFS client may observe a large number of requests all answered within a few milliseconds, so it starts using short timeouts. Then for some reason (usually a long list of outstanding disk-intensive operations), the server takes a few seconds to complete the next request. Within this time, the client repeatedly times out, retransmits the request, backs off and repeats, and in extreme cases it is possible for the client to reach the limit that triggers the "not responding" warning.

Ian
Re: msgbuf cksum mismatch (read 933e3, calc 93fbe)
In message <[EMAIL PROTECTED]>, Paolo Pisati writes: > >What does it mean? > >It's the first row in my today's kernel. You can safely ignore it. Some BIOSes don't clear the RAM during a reboot, so when booting up, FreeBSD attempts to pick up the kernel message buffer from before the reboot (this can be very handy if the reboot was caused by a panic). The above message indicates that the message buffer from the last boot initially appeared to be intact, but its checksum didn't match the contents, so it was cleared. I guess the message could be changed to cause less alarm, or it could be hidden behind bootverbose; I just thought it useful to indicate that the previous message buffer was mostly there in case somebody who really needs it preserved wants to disable the check. The behaviour here could possibly also be made a loader.conf tunable, but I didn't test whether tunables can be used that early in the boot process. Ian ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "[EMAIL PROTECTED]"
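The check described above can be sketched roughly as follows. This is a minimal illustration, not the kernel's msgbuf code: the structure layout, the magic value, the byte-sum checksum, and every `toy_` name are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAGIC 0x4d534742u	/* hypothetical marker value */

enum toy_result { TOY_NO_BUFFER, TOY_KEPT, TOY_CKSUM_MISMATCH };

/* Hypothetical layout of a message buffer that may survive a reboot. */
struct toy_msgbuf {
	unsigned int magic;
	unsigned int cksum;
	char data[64];
};

/* Simple byte sum over the data area (an assumed checksum scheme). */
static unsigned int
toy_cksum(const struct toy_msgbuf *mb)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < sizeof(mb->data); i++)
		sum += (unsigned char)mb->data[i];
	return (sum);
}

/* Store a message and bring the checksum up to date. */
void
toy_store(struct toy_msgbuf *mb, const char *text)
{
	assert(mb != NULL && text != NULL);
	memset(mb->data, 0, sizeof(mb->data));
	strncpy(mb->data, text, sizeof(mb->data) - 1);
	mb->magic = TOY_MAGIC;
	mb->cksum = toy_cksum(mb);
}

/* At "boot": keep the old contents only if magic and checksum agree. */
enum toy_result
toy_reinit(struct toy_msgbuf *mb)
{
	enum toy_result res;

	if (mb->magic != TOY_MAGIC)
		res = TOY_NO_BUFFER;		/* nothing recognisable */
	else if (mb->cksum != toy_cksum(mb))
		res = TOY_CKSUM_MISMATCH;	/* looked intact, but is not */
	else
		return (TOY_KEPT);
	/* Start again with an empty, valid buffer. */
	memset(mb->data, 0, sizeof(mb->data));
	mb->magic = TOY_MAGIC;
	mb->cksum = 0;
	return (res);
}
```

The `TOY_CKSUM_MISMATCH` case corresponds to the boot message in question: the buffer header looked valid, but the contents no longer matched, so the buffer was cleared.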
Re: Heads up: checking in change to ata-card.c
In message <[EMAIL PROTECTED]>, "M. Warner Losh" writes: >Here's a better patch, based on wpaul's input. Bill, can you try it >and see if it works for you? If so, it would be better to commit this >one. If not, I'll work with you to fix it. FYI, I have a no-name ("PCMCIA"/"CD-ROM") drive that also requires failure of the second IO range to be made non-fatal. How about just deleting the `else' clause as in the patch below? It seems that this can only affect CD-ROM drives that were otherwise not working, so it should be fairly safe. Ian Index: ata-card.c === RCS file: /dump/FreeBSD-CVS/src/sys/dev/ata/ata-card.c,v retrieving revision 1.14 diff -u -r1.14 ata-card.c --- ata-card.c 17 Jun 2003 12:33:53 - 1.14 +++ ata-card.c 26 Jun 2003 23:00:01 - @@ -131,10 +131,6 @@ start + ATA_ALTOFFSET, ATA_ALTIOSIZE); } } -else { - bus_release_resource(dev, SYS_RES_IOPORT, rid, io); - return ENXIO; -} /* allocate the altport range */ rid = ATA_ALTADDR_RID;
Re: -current panic in suser_cred()
In message <[EMAIL PROTECTED]>, Wesley Morgan writes: >At some point between 20 Jun and (by my best guess) 22 Jun there has been >a problem introduced somewhere... How much more vague can you get? :)... >#12 0xc025dab5 in chkiq (ip=0xc3a5c400, change=4294967295, cred=0x0, >flags=0) >#13 0xc025b57f in ufs_inactive (ap=0xdb467be0) >at ../../../ufs/ufs/ufs_inode.c:132 The UFS2 changes or something else probably broke quotas. Try removing "options QUOTA" from your kernel config for now. Ian
Re: cvs commit: src/sys/i386/i386 pmap.c
In message <[EMAIL PROTECTED]>, "Andrew R. Reiter" writes: >"Too many pages were prefaulted in pmap_object_init_pt, thus > the wrong physical page was entered in the pmap for the virtual > address where the .dynamic section was supposed to be." > > Submitted by: tegge Pointy hat to: iedowse Sorry for the breakage, and thanks to Tor for tracking it down. Somehow my testing (mainly in a netbooted environment) didn't show this up, and I failed to spot the bug even when I re-read the patches after the breakage reports appeared yesterday. Ian
Re: KSE status report
In message <[EMAIL PROTECTED]>, Julian Elischer writes: >The big problem at the moment is that something in the >source tree as a whole, and probably something that came in with KSE >is stopping us from successfully compiling a working libc_r. >(a bit ironic really). Is the new (elm)->field.tqe_next = (void *)-1; in TAILQ_REMOVE a likely candidate? That could easily tickle old bugs in other code. The libc_r code does use a lot of TAILQ macros. Ian
Re: additional queue macro
In message <[EMAIL PROTECTED]>, Jonathan Lemon writes: >Essentially, this provides a traversal of the tailq that is safe >from element removal, while being simple to drop in to those sections >of the code that need updating, as evidenced in the patch below. Note that this of course is not "safe from element removal" in general; it is just safe when you remove any element other than the next element, whereas TAILQ_FOREACH is safe when you remove any element other than the current one. For example it would not be safe to call a callback that could potentially remove arbitrary elements. It may be clearer in this case just to expand the macro in the code so that it is more obvious what assumptions can be made. Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
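As a sketch of what expanding the traversal by hand might look like, the example below walks a tail queue and removes the current element while relying on a saved next pointer. The `struct item` type and the helper names are made up for illustration; only the `TAILQ_*` macros come from `<sys/queue.h>`.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical list element used only for this illustration. */
struct item {
	int value;
	TAILQ_ENTRY(item) link;
};
TAILQ_HEAD(itemhead, item);

/* Append a new element; returns 0 on success, -1 if malloc fails. */
int
append_item(struct itemhead *head, int value)
{
	struct item *it;

	if ((it = malloc(sizeof(*it))) == NULL)
		return (-1);
	it->value = value;
	TAILQ_INSERT_TAIL(head, it, link);
	return (0);
}

/* Remove and free every element with an odd value; returns the count. */
int
remove_odd(struct itemhead *head)
{
	struct item *it, *next;
	int removed = 0;

	assert(head != NULL);
	for (it = TAILQ_FIRST(head); it != NULL; it = next) {
		/*
		 * Save the successor before `it' can go away. This is only
		 * safe because nothing in the loop body can remove `next'.
		 */
		next = TAILQ_NEXT(it, link);
		if (it->value % 2 != 0) {
			TAILQ_REMOVE(head, it, link);
			free(it);
			removed++;
		}
	}
	return (removed);
}
```

Written out like this, the assumption is visible at the point of use: the loop body may remove the current element, but anything that could remove an arbitrary element (a callback, for instance) would leave `next` dangling.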
Re: dump(8) is hosed
In message <[EMAIL PROTECTED]>, Don Lewis writes: > >I was finally able to reproduce this by creating a large file >before doing the dump. Dump(8) is *very* hosed. The UFS2 import broke >its ability to follow multiple levels of indirect blocks. Thanks for tracking this down! Note that the code was using the static pointers to avoid having to malloc and free blocks on every call. Keeping an array of NIADDR pointers and using `ind_level' as the index is an alternative (patch below) - I doubt the performance difference is noticeable but it avoids having to remember the free() before each return. I'll commit your printf format changes first anyway - thanks! Ian Index: traverse.c === RCS file: /dump/FreeBSD-CVS/src/sbin/dump/traverse.c,v retrieving revision 1.19 diff -u -r1.19 traverse.c --- traverse.c 21 Jun 2002 06:17:57 - 1.19 +++ traverse.c 7 Jul 2002 10:44:55 - @@ -275,10 +275,13 @@ { int ret = 0; int i; - static caddr_t idblk; + static caddr_t idblks[NIADDR]; + caddr_t idblk; - if (idblk == NULL && (idblk = malloc(sblock->fs_bsize)) == NULL) + if (idblks[ind_level] == NULL && + (idblks[ind_level] = malloc(sblock->fs_bsize)) == NULL) quit("dirindir: cannot allocate indirect memory.\n"); + idblk = idblks[ind_level]; bread(fsbtodb(sblock, blkno), idblk, (int)sblock->fs_bsize); if (ind_level <= 0) { for (i = 0; *filesize > 0 && i < NINDIR(sblock); i++) { @@ -501,10 +505,13 @@ dmpindir(ino_t ino, ufs2_daddr_t blk, int ind_level, off_t *size) { int i, cnt; - static caddr_t idblk; + static caddr_t idblks[NIADDR]; + caddr_t idblk; - if (idblk == NULL && (idblk = malloc(sblock->fs_bsize)) == NULL) + if (idblks[ind_level] == NULL && + (idblks[ind_level] = malloc(sblock->fs_bsize)) == NULL) quit("dmpindir: cannot allocate indirect memory.\n"); + idblk = idblks[ind_level]; if (blk != 0) bread(fsbtodb(sblock, blk), idblk, (int) sblock->fs_bsize); else
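To show why one buffer per indirection level matters, here is a small self-contained model. It is not dump(8) code: the in-memory "disk", the block layout, and all `toy_` names are invented for the example. Each level reads its block into its own lazily allocated buffer, so a recursive call cannot overwrite the block the caller is still iterating over.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NLEVELS	3	/* stand-in for NIADDR */
#define TOY_NPTR	4	/* entries per block */
#define TOY_NBLK	8	/* blocks on the toy disk */

/* Toy disk: level-0 blocks hold data, higher levels hold block numbers. */
static int toy_disk[TOY_NBLK][TOY_NPTR];

void
toy_set_block(int blkno, int a, int b, int c, int d)
{
	assert(blkno >= 0 && blkno < TOY_NBLK);
	toy_disk[blkno][0] = a;
	toy_disk[blkno][1] = b;
	toy_disk[blkno][2] = c;
	toy_disk[blkno][3] = d;
}

/* Sum all data reachable from blkno, which sits at the given level. */
long
toy_walk(int blkno, int level)
{
	static int *bufs[TOY_NLEVELS];	/* one buffer per level, kept */
	int *buf;
	long total = 0;
	int i;

	assert(level >= 0 && level < TOY_NLEVELS);
	assert(blkno >= 0 && blkno < TOY_NBLK);
	if (bufs[level] == NULL &&
	    (bufs[level] = malloc(sizeof(toy_disk[0]))) == NULL)
		abort();
	buf = bufs[level];
	memcpy(buf, toy_disk[blkno], sizeof(toy_disk[0]));	/* "bread" */
	for (i = 0; i < TOY_NPTR; i++) {
		if (level == 0)
			total += buf[i];
		else if (buf[i] != 0)	/* 0 means "no block" */
			total += toy_walk(buf[i], level - 1);
	}
	return (total);
}
```

With a single shared static buffer, the recursive call at level 1 would replace the list of block numbers with the child's data while the loop was still running; indexing by level keeps the no-free convenience of static storage without that aliasing.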
Re: -CURRENT trashes disk label
In message <[EMAIL PROTECTED]>, David Schultz writes: >I just made world on -CURRENT (cvsup a few hours ago), booted >using a new GENERIC kernel and ran mergemaster. Before I >installed world, I mounted the root partition for my more stable >development environment (4.6-RELEASE) to copy my firewall rules >over. In summary: ># fsck /dev/ad1s1a >** /dev/ad1s1a >BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE >/dev/ad1s1a: INCOMPLETE LABEL: type 4.2BSD fsize 0, frag 0, cpg 0, size 524288 You just need to "fsck -b16 /dev/ad1s1a", or alternatively upgrade to the latest -STABLE fsck, which fixes this issue. There are a few new superblock fields in use in -CURRENT that trigger some unnecessary fsck sanity checks. The other thing that causes scary-looking errors when moving disks back and forth between -CURRENT and -STABLE is when the snapshot used by -CURRENT's fsck gets left behind if you reboot during the background fsck and then boot -STABLE. Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: bdwrite: buffer is not busy
In message <[EMAIL PROTECTED]>, "Andrey A. Chernov" writes: >I see this panic constantly during last month or two, UP machine, no >softupdates. Anybody else saw it too? Any ideas? The "buffer is not busy" panic is usually a secondary panic that occurs while trying to sync the disks after a different panic. If possible, try to get the first panic message, or ideally a stack trace. I think (but I've never checked for sure) that the "buffer is not busy" panics occur because of the following code in lockmgr(), combined with later sanity checks: if (panicstr != NULL) { mtx_unlock(lkp->lk_interlock); return (0); } This basically causes all lockmgr locks to be unconditionally and immediately granted after a panic without actually marking the lock as locked. Not surprisingly, this causes any lock state sanity checks later to fail. The original intention was probably to avoid deadlocking while syncing the disks, but a virtually guaranteed secondary panic isn't helpful either. It might be worth checking if a "return (EBUSY);" or a "lkp->lk_flags |= LK_HAVE_EXCL; lkp->lk_lockholder = pid;" in here would do better. The alternative is to make "kern.sync_on_panic=0" the default. Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
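The suspected interaction can be modelled in a few lines. This is a toy, not lockmgr(): the lock structure, the `mark_on_panic` switch, and the `toy_` names are assumptions made to contrast "grant without marking" against "grant and mark as held".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock: just a held flag and the holder's pid. */
struct toy_lock {
	int held;
	int holder;
};

/* Non-NULL once the system has "panicked", like panicstr. */
static const char *toy_panicstr;

void
toy_set_panic(const char *msg)
{
	toy_panicstr = msg;
}

/*
 * Acquire the lock. After a panic the request always succeeds at once.
 * mark_on_panic == 0 models granting without recording any state;
 * mark_on_panic == 1 models the alternative of recording the holder.
 */
int
toy_lock_acquire(struct toy_lock *lk, int pid, int mark_on_panic)
{
	assert(lk != NULL);
	if (toy_panicstr != NULL) {
		if (mark_on_panic) {
			lk->held = 1;
			lk->holder = pid;
		}
		return (0);
	}
	if (lk->held)
		return (-1);	/* toy model: fail instead of sleeping */
	lk->held = 1;
	lk->holder = pid;
	return (0);
}

/* The kind of later sanity check that "buffer is not busy" represents. */
int
toy_is_busy(const struct toy_lock *lk)
{
	return (lk->held);
}
```

In the unmarked case the caller is told it holds the lock, yet a later `toy_is_busy()` check reports otherwise, which is the shape of the secondary panic described above.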
Re: NEWCARD support for Linksys Ethernet broken?
In message <[EMAIL PROTECTED]>, "M. Warner Losh" writes: >Used to work for me, but something seems to have busted it in recent >versions of the kernel. So NEWCARD appears to be broken for ata, sio >and ed. Wonderful. These all used to work at one point in the past. I think the "ed" problem is that without pccardd, the 0x8 flag is no longer being passed to the driver, so it doesn't try probing it as a Linksys card (I haven't checked for sure, but that would be consistent with it being detected as an NE2000). Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: do we still need ufs/ffs/ffs_softdep_stub.c ?
In message <[EMAIL PROTECTED]>, Luigi Rizzo writes: >Hi, >just got the following panic with today's -current sources and >an oldish config file (one not having "options SOFTUPDATES"): >panic(c026ecc1,c66e1b94,c01ff565,c1cda000,0) at panic+0x7c >softdep_slowdown(c1cda000,0,0,,2) at softdep_slowdown+0xd >ffs_truncate(c1cda000,0,0,c00,0) at ffs_truncate+0x81 >so the question is, do we still need ffs_softdep_stub.c ? In any >case, getting an explicit panic does not really sound right... The bug is in ffs_truncate() - it should not be calling softdep functions on non-softdep filesystems. The panic is there to catch exactly this kind of bug. I think the following patch should fix it. Ian Index: ffs_inode.c === RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_inode.c,v retrieving revision 1.81 diff -u -r1.81 ffs_inode.c --- ffs_inode.c 19 Jul 2002 07:29:38 - 1.81 +++ ffs_inode.c 3 Aug 2002 11:05:43 - @@ -173,7 +173,7 @@ * soft updates below. */ needextclean = 0; - softdepslowdown = softdep_slowdown(ovp); + softdepslowdown = DOINGSOFTDEP(ovp) && softdep_slowdown(ovp); extblocks = 0; datablocks = DIP(oip, i_blocks); if (fs->fs_magic == FS_UFS2_MAGIC && oip->i_din2->di_extsize > 0) { To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Is anyone else having trouble with dump(8) on -current?
[replying to an old message] In message <[EMAIL PROTECTED]>, Alexander Leidinger writes: >On 7 Mai, Benjamin Lewis wrote: >> | DUMP: slave couldn't reopen disk: Interrupted system call > >Try the attached patch. I also have a similar patch for restore. I don't >like the patch, I think I should use SA_RESTART with sigaction(), so >think about this patch as a proof of concept (if it solves your >problem). I was just looking at PR bin/18319 when I remembered this message. I believe many of the changes in your patch are not necessary, as read(2) will restart after a signal by default. How about just fixing the open call that actually triggers the reported error? I suspect that many of the other cases are either impossible or extremely unlikely in practice. Could someone who can reproduce the "couldn't reopen disk" error try the following? Ian Index: tape.c === RCS file: /dump/FreeBSD-CVS/src/sbin/dump/tape.c,v retrieving revision 1.22 diff -u -r1.22 tape.c --- tape.c 8 Jul 2002 00:29:23 - 1.22 +++ tape.c 9 Aug 2002 22:28:45 - @@ -740,8 +740,11 @@ * Need our own seek pointer. */ (void) close(diskfd); - if ((diskfd = open(disk, O_RDONLY)) < 0) + while ((diskfd = open(disk, O_RDONLY)) < 0) { + if (errno == EINTR) + continue; quit("slave couldn't reopen disk: %s\n", strerror(errno)); + } /* * Need the pid of the next slave in the loop... 
Re: Is anyone else having trouble with dump(8) on -current?
In message <[EMAIL PROTECTED]>, Alexander Leidinger writes: >Have a look at Message-ID: <[EMAIL PROTECTED]> >(should be in the archive of audit). Ah, I had forgotten about that -audit thread. >Short: open shouldn't be able to return EINTR in practice... > >My assumptions: > - Bruce hasn't made a mistake > - something broke in the kernel (either for a "short" period of > time, or it's still broken), so we should look for the real > problem instead I had a quick look yesterday, and I found a PCATCH tsleep call in diskopen(), though I do not know if this is the one that affects dump. Does open(2) need to loop on ERESTART? Currently it just maps ERESTART to EINTR and returns the error. We should fix this broken dump behaviour anyway. I don't think it matters too much for now whether it is fixed in userland or the kernel, as it will only affect the tiny set of applications that receive signals while opening a disk device at the same time as another open on the same device is occurring (I think). Ian
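For the userland side of that choice, a generic retry wrapper might look like the sketch below. The name `open_retry` is made up for this example and is not part of dump(8) or libc; it simply repeats open(2) for as long as the call fails with EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: call open(2) again whenever it is interrupted
 * by a signal. Any other failure is returned to the caller with errno
 * left as open(2) set it. No mode argument is passed, so this is only
 * suitable for flags that do not include O_CREAT.
 */
int
open_retry(const char *path, int flags)
{
	int fd;

	assert(path != NULL);
	do {
		fd = open(path, flags);
	} while (fd < 0 && errno == EINTR);
	return (fd);
}
```

A caller would use it exactly like open(2), for example `diskfd = open_retry(disk, O_RDONLY)`, and still has to handle the remaining error cases itself.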
Re: Is anyone else having trouble with dump(8) on -current?
In message <[EMAIL PROTECTED]>, Bruce Evans writes: > >I don't know how open() of a disk device can be interrupted by a signal >in practice. Most disk operations don't check for signals. Does the PCATCH tsleep in diskopen() that I mentioned seem a likely candidate? Anyway, below is a simple program that reproduces the EINTR error fairly reliably for me when run on disk devices. Ian

/* Header names were lost in the archive; these are the ones the code needs. */
#include <sys/types.h>
#include <err.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Empty handler: the signal only needs to interrupt open(2). */
void
handler(int sig)
{
}

int
main(int argc, char **argv)
{
	int fd, i;

	if (argc < 2)
		errx(1, "Usage: %s device", argv[0]);
	/* Four forks leave 16 processes in one process group. */
	fork();
	fork();
	fork();
	fork();
	signal(SIGUSR1, handler);
	sleep(1);
	for (i = 0; i < 200; i++) {
		/* Signal the whole group, then race to open the device. */
		killpg(0, SIGUSR1);
		if ((fd = open(argv[1], O_RDONLY)) < 0)
			err(1, "%s", argv[1]);
		close(fd);
	}
	return 0;
}
Re: fsck cannot find superblock
In message <[EMAIL PROTECTED]>, Bruce Evans writes: >> * drop support for 4K block sizes completely, but this breaks >> backwards compatibility > >I use patches like the following for the sanity checks: I think there may be other problems that are triggered by using <8k blocks on -current too. Last time I tried 4k blocks (pre-UFS2), the snapshot code would cause a panic when trying to allocate a single 4k block to fit the 8k superblock (the machine then got stuck in a reboot-fsck-panic cycle until interrupted and manually fsck'd). Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself
In message <[EMAIL PROTECTED]>, Don Lewis writes: > >A potentially better solution just occurred to me. It looks like it >would be better if vrele() waited to decrement v_usecount until *after* >the call to VOP_INACTIVE() (and after the call to VI_LOCK()). If that >were done, nfs_inactive() wouldn't need to muck with v_usecount at all. I've looked at this before; I think some filesystems (ufs anyway) depend on v_usecount being 0 when VOP_INACTIVE() is called. The patch I have had lying around for quite a while is below. It adds a vnode flag to avoid recursion into the last-reference handling code in vrele/vput, which is the real problem. It also guarantees that a vnode will not be recycled during VOP_INACTIVE(), so the nfs code no longer needs to hold an extra reference in the first place. The flag manipulation code got a bit messy after Jeff's vnode flag splitting work, so the patch could probably be improved. Whatever way this is done, we should try to avoid adding more hacks to the nfs_inactive() code anyway. Ian Index: sys/vnode.h === RCS file: /home/iedowse/CVS/src/sys/sys/vnode.h,v retrieving revision 1.206 diff -u -r1.206 vnode.h --- sys/vnode.h 1 Sep 2002 20:37:21 - 1.206 +++ sys/vnode.h 11 Sep 2002 11:06:46 - @@ -220,6 +220,7 @@ #defineVI_DOOMED 0x0080 /* This vnode is being recycled */ #defineVI_FREE 0x0100 /* This vnode is on the freelist */ #defineVI_OBJDIRTY 0x0400 /* object might be dirty */ +#defineVI_INACTIVE 0x0800 /* VOP_INACTIVE is in progress */ /* * XXX VI_ONWORKLST could be replaced with a check for NULL list elements * in v_synclist. 
@@ -377,14 +378,14 @@ /* Requires interlock */ #defineVSHOULDFREE(vp) \ - (!((vp)->v_iflag & (VI_FREE|VI_DOOMED)) && \ + (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_INACTIVE)) && \ !(vp)->v_holdcnt && !(vp)->v_usecount && \ (!(vp)->v_object || \ !((vp)->v_object->ref_count || (vp)->v_object->resident_page_count))) /* Requires interlock */ #define VMIGHTFREE(vp) \ - (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK)) && \ + (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK|VI_INACTIVE)) && \ LIST_EMPTY(&(vp)->v_cache_src) && !(vp)->v_usecount) /* Requires interlock */ Index: nfsclient/nfs_node.c === RCS file: /home/iedowse/CVS/src/sys/nfsclient/nfs_node.c,v retrieving revision 1.55 diff -u -r1.55 nfs_node.c --- nfsclient/nfs_node.c11 Jul 2002 17:54:58 - 1.55 +++ nfsclient/nfs_node.c11 Sep 2002 11:06:46 - @@ -289,21 +289,7 @@ } else sp = NULL; if (sp) { - /* -* We need a reference to keep the vnode from being -* recycled by getnewvnode while we do the I/O -* associated with discarding the buffers unless we -* are being forcibly unmounted in which case we already -* have our own reference. 
-*/ - if (ap->a_vp->v_usecount > 0) - (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1); - else if (vget(ap->a_vp, 0, td)) - panic("nfs_inactive: lost vnode"); - else { - (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1); - vrele(ap->a_vp); - } + (void)nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1); /* * Remove the silly file that was rename'd earlier */ Index: kern/vfs_subr.c === RCS file: /home/iedowse/CVS/src/sys/kern/vfs_subr.c,v retrieving revision 1.401 diff -u -r1.401 vfs_subr.c --- kern/vfs_subr.c 5 Sep 2002 20:46:19 - 1.401 +++ kern/vfs_subr.c 11 Sep 2002 11:06:46 - @@ -829,7 +829,8 @@ for (count = 0; count < freevnodes; count++) { vp = TAILQ_FIRST(&vnode_free_list); - KASSERT(vp->v_usecount == 0, + KASSERT(vp->v_usecount == 0 && + (vp->v_iflag & VI_INACTIVE) == 0, ("getnewvnode: free vnode isn't")); TAILQ_REMOVE(&vnode_free_list, vp, v_freelist); @@ -1980,8 +1981,8 @@ KASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1, ("vrele: missed vn_close")); - if (vp->v_usecount > 1) { - + if (vp->v_usecount > 1 || + ((vp->v_iflag & VI_INACTIVE) && vp->v_usecount == 1)) { vp->v_usecount--; VI_UNLOCK(vp); @@ -1991,13 +1992,20 @@ if (vp->v_usecount == 1) { vp->v_usecount--; /* -* We must call VOP_INACTIVE with the node l
Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself
In message <[EMAIL PROTECTED]>, Don Lewis writes: >After looking at ufs_inactive(), I'd like to add a fourth proposal And I've just remembered a fifth :-) I think the old BSD code had both an `open' count and a reference count. The open count is a count of the real users of the vnode (it is what ufs_inactive really wants to compare against 0), and the reference count is just for places that you don't want the vnode to be recycled or destroyed. This was probably lost when the encumbered BSD sources were rewritten. At the time I was looking at it last, I remember thinking that the open count would allow vrele/vput to keep the reference count at 1 during the VOP_INACTIVE() call, which is what you were proposing. It would also allow us to fix the problem of many places not matching each VOP_OPEN() with a VOP_CLOSE(). I suspect we could clean up a lot of related problems if the open count was brought back. Ian To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself
In message <[EMAIL PROTECTED]>, Terry Lambert writes: >Ian Dowse wrote: >> And I've just remembered a fifth :-) I think the old BSD code had >> both an `open' count and a reference count. The open count is a >> count of the real users of the vnode (it is what ufs_inactive really >> wants to compare against 0), and the reference count is just for >> places that you don't want the vnode to be recycled or destroyed. >> This was probably lost when the encumbered BSD sources were rewritten. > >No, this went away with the vnode locking changes; it was in the >4.4 code, for sure. I think references are the correct thing here, >and SunOS seems to agree, since that's how they implement, too. 8-). I seem to have mis-remembered the details anyway :-) It doesn't look as if there ever was the `open' count that I mentioned above. Maybe I was just thinking that it would be a good way to solve the issues of matching VOP_CLOSEs with VOP_OPENs, since there are many cases in the kernel that do not guarantee to do a VOP_CLOSE for each VOP_OPEN that was performed. Handling the dropping of a last reference is always tricky to get right when complex operations can be performed from the reference dropping function (especially where that includes adding and then removing references multiple times). It's even harder to do it in a way that continues to catch missing or extra references caused by bugs in the functions called when the reference count hits zero. For example, if you hold the reference count at 1 while calling the cleanup function, it allows that function to safely add and drop references, but if that cleanup function has a bug that drops one too many references then you end up recursing instead of detecting it as a negative reference count. I've found in some other code that it works reasonably well to leave the reference count at zero, but set a flag to stop further 1->0 transitions from retriggering the cleanup. Obviously other approaches will work too. Ian
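The flag-based approach mentioned at the end of the message can be sketched as below. This is an illustration only, not the vnode code and not the patch from earlier in the thread: the object type, the `TOY_INACTIVE` flag, and the function names are all hypothetical.

```c
#include <assert.h>

#define TOY_INACTIVE	0x1	/* cleanup is in progress */

/* Hypothetical reference-counted object. */
struct toy_obj {
	int refs;
	int flags;
	int cleanups;	/* how many times the cleanup has run */
};

void toy_rele(struct toy_obj *o);

void
toy_ref(struct toy_obj *o)
{
	o->refs++;
}

/* Last-reference cleanup; it takes and drops a temporary reference. */
static void
toy_cleanup(struct toy_obj *o)
{
	o->cleanups++;
	toy_ref(o);
	toy_rele(o);	/* 1->0 again, but must not re-enter cleanup */
}

void
toy_rele(struct toy_obj *o)
{
	/* A release too many is still caught here, not hidden. */
	assert(o->refs > 0);
	o->refs--;
	if (o->refs > 0 || (o->flags & TOY_INACTIVE))
		return;
	/* The count stays at zero while the flag blocks recursion. */
	o->flags |= TOY_INACTIVE;
	toy_cleanup(o);
	o->flags &= ~TOY_INACTIVE;
}
```

Because the count really is zero during the cleanup, a buggy cleanup that drops one reference too many trips the assertion in `toy_rele()` rather than silently recursing, which is the property the message is after.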
Re: NFS hang on rmdir(2) with 5.0-current client, server
In message <[EMAIL PROTECTED]>, Robert Watson writes: >> > It looks like the client is basically hung waiting for an RPC response. >> > I'd be glad to provide more debugging information if someone can point me >> > in the right direction. >> >> I haven't seen this problem with a 5.0-CURRENT client and a >> 4.7-PRERELEASE server, so as near as I can tell the client side isn't >> totally hosed. > >Interesting observation: rm on files, and rmdir on empty directories >doesn't trigger it, but attempt to rmdir on a non-empty directory does. This is an NFSv2 mount, and I think the problem is specific to NFSv2. Something like the following patch should fix it. I probably missed this in revision 1.112 when fixing some similar issues in other server op functions. See the commit message for that revision for further details. Ian Index: nfs_serv.c === RCS file: /dump/FreeBSD-CVS/src/sys/nfsserver/nfs_serv.c,v retrieving revision 1.123 diff -u -r1.123 nfs_serv.c --- nfs_serv.c 25 Sep 2002 02:39:39 - 1.123 +++ nfs_serv.c 3 Oct 2002 08:30:49 - @@ -2905,10 +2905,9 @@ if (dirp) diraft_ret = VOP_GETATTR(dirp, &diraft, cred, td); nfsm_reply(NFSX_WCCDATA(v3)); - if (v3) { + error = 0; + if (v3) nfsm_srvwcc_data(dirfor_ret, &dirfor, diraft_ret, &diraft); - error = 0; - } /* fall through */ nfsmout:
Re: [ GEOM tests ] disklabel warnings and vinum drives lost
In message <[EMAIL PROTECTED]>, Robert Watson writes: >However, here's a patch that makes Vinum use namei() to rely on devfs to >locate requested devices instead of parsing the device name and guessing >the device number (incorrectly with GEOM). Unfortunately, I almost >immediately run into a divide by zero due to a zero sector size. Jeff >Roberson mentioned to me he had a fix for this bug that he sent to Greg, >but that was never committed. The divide by zero problem seems to be caused by an interaction between two bugs. First, GEOM refuses to return the sector size because the flags passed to d_open in vinum's open_drive() do not include FREAD. Then vinum clobbers the ioctl's non-zero error code by calling close_drive() from init_drive(), so the latter ends up returning zero even though it failed. The next failure I get is: Can't write config to /dev/da1s1d, error 45 (EOPNOTSUPP) Ian
Re: [ GEOM tests ] disklabel warnings and vinum drives lost
[CCs trimmed] >The divide by zero problem seems to be caused by an interaction >between two bugs: GEOM refuses to return the sector size because ... >The next failure I get is: > > Can't write config to /dev/da1s1d, error 45 (EOPNOTSUPP) This turns out to be vinum doing a DIOCWLABEL to make the label writable before writing its configuration, but GEOM does not support that ioctl I presume. It should be safe to ignore these DIOCWLABEL ioctl return values as the actual writing of the vinum data should give a suitable error if making the label writable fails and is important. The patch below is Robert's patch with all 3 other issues fixed, and together, this seems to be enough to make vinum work again. Ian Index: vinumio.c === RCS file: /dump/FreeBSD-CVS/src/sys/dev/vinum/vinumio.c,v retrieving revision 1.75 diff -u -r1.75 vinumio.c --- vinumio.c 21 Aug 2002 23:39:51 - 1.75 +++ vinumio.c 5 Oct 2002 02:40:18 - @@ -50,92 +50,25 @@ int open_drive(struct drive *drive, struct thread *td, int verbose) { -int devmajor; /* major devs for disk device */ -int devminor; /* minor devs for disk device */ -int unit; -char *dname; +struct nameidata nd; struct cdevsw *dsw;/* pointer to cdevsw entry */ +int error; -if (bcmp(drive->devicename, "/dev/", 5)) /* device name doesn't start with /dev */ - return ENOENT; /* give up */ if (drive->flags & VF_OPEN)/* open already, */ return EBUSY; /* don't do it again */ -/* - * Yes, Bruce, I know this is horrible, but we - * don't have a root filesystem when we first - * try to do this. If you can come up with a - * better solution, I'd really like it. I'm - * just putting it in now to add ammuntion to - * moving the system to devfs. 
- */ -dname = &drive->devicename[5]; -drive->dev = NULL; /* no device yet */ - -/* Find the device */ -if (bcmp(dname, "ad", 2) == 0) /* IDE disk */ - devmajor = 116; -else if (bcmp(dname, "wd", 2) == 0)/* IDE disk */ - devmajor = 3; -else if (bcmp(dname, "da", 2) == 0) - devmajor = 13; -else if (bcmp(dname, "vn", 2) == 0) - devmajor = 43; -else if (bcmp(dname, "md", 2) == 0) - devmajor = 95; -else if (bcmp(dname, "ar", 2) == 0) -devmajor = 157; -else if (bcmp(dname, "amrd", 4) == 0) { - devmajor = 133; - dname += 2; -} else if (bcmp(dname, "mlxd", 4) == 0) { - devmajor = 131; - dname += 2; -} else if (bcmp(dname, "idad", 4) == 0) { - devmajor = 109; - dname += 2; -} else if (bcmp(dname, "twed", 4) == 0) { /* 3ware raid */ - devmajor = 147; - dname += 2; -} else - return ENODEV; -dname += 2;/* point past */ - -/* - * Found the device. We can expect one of - * two formats for the rest: a unit number, - * then either a partition letter for the - * compatiblity partition (e.g. h) or a - * slice ID and partition (e.g. s2e). - * Create a minor number for each of them. 
- */ -unit = 0; -while ((*dname >= '0') /* unit number */ -&&(*dname <= '9')) { - unit = unit * 10 + *dname - '0'; - dname++; -} - -if (*dname == 's') { /* slice */ - if (((dname[1] < '1') || (dname[1] > '4')) /* invalid slice */ - ||((dname[2] < 'a') || (dname[2] > 'h'))) /* or invalid partition */ - return ENODEV; - devminor = ((unit & 31) << 3) /* unit */ - +(dname[2] - 'a') /* partition */ - +((dname[1] - '0' + 1) << 16) /* slice */ - +((unit & ~31) << 16); /* high-order unit bits */ -} else { /* compatibility partition */ - if ((*dname < 'a') || (*dname > 'h')) /* or invalid partition */ - return ENODEV; - devminor = (*dname - 'a') /* partition */ - +((unit & 31) << 3) /* unit */ - +((unit & ~31) << 16); /* high-order unit bits */ +NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, drive->devicename, +curthread); +error = namei(&nd); +if (error) + return (error); +if (!vn_isdisk(nd.ni_vp, &error)) { + NDFREE(&nd, 0); + return (error);
Re: [ GEOM tests ] disklabel warnings and vinum drives lost
In message <[EMAIL PROTECTED]>, Poul-Henning Kamp writes: >Make that _three_ bugs: vinum opens devices directly at the cdevsw >level, bypassing in the process the vnodes and specfs. Here is a patch that makes it use vn_open/vn_close/VOP_IOCTL, bringing it much closer to the way ccd(4) does things. I have only lightly tested this so far - I saw one problem where a md(4) vnode-backed device got stuck in mddestroy(), but I haven't tracked down if that is related (the md vnode was just a file on a vinum-backed filesystem). Ian Index: vinumio.c === RCS file: /dump/FreeBSD-CVS/src/sys/dev/vinum/vinumio.c,v retrieving revision 1.76 diff -u -r1.76 vinumio.c --- vinumio.c 5 Oct 2002 03:44:00 - 1.76 +++ vinumio.c 5 Oct 2002 14:12:56 - @@ -51,33 +51,26 @@ open_drive(struct drive *drive, struct thread *td, int verbose) { struct nameidata nd; -struct cdevsw *dsw;/* pointer to cdevsw entry */ -int error; +int flags; if (drive->flags & VF_OPEN)/* open already, */ return EBUSY; /* don't do it again */ -NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, drive->devicename, -curthread); -error = namei(&nd); -if (error) - return (error); -if (!vn_isdisk(nd.ni_vp, &error)) { - NDFREE(&nd, 0); - return (error); -} -drive->dev = udev2dev(nd.ni_vp->v_rdev->si_udev, 0); -NDFREE(&nd, 0); - -if (drive->dev == NULL)/* didn't find anything */ - return ENODEV; - -drive->dev->si_iosize_max = DFLTPHYS; -dsw = devsw(drive->dev); -if (dsw == NULL) - drive->lasterror = ENOENT; -else - drive->lasterror = (dsw->d_open) (drive->dev, FWRITE | FREAD, 0, NULL); +drive->devvp = NULL; +NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, drive->devicename, td); +flags = FREAD | FWRITE; +drive->lasterror = vn_open(&nd, &flags, 0); +if (drive->lasterror == 0) { + (void) vn_isdisk(nd.ni_vp, &drive->lasterror); + if (drive->lasterror != 0 && vrefcnt(nd.ni_vp) > 1) + drive->lasterror = EBUSY; + VOP_UNLOCK(nd.ni_vp, 0, td); + NDFREE(&nd, NDF_ONLY_PNBUF); + if (drive->lasterror == 0) + drive->devvp = nd.ni_vp; + else + 
(void) vn_close(nd.ni_vp, flags, td->td_ucred, td); +} if (drive->lasterror != 0) { /* failed */ drive->state = drive_down; /* just force it down */ @@ -85,8 +78,11 @@ log(LOG_WARNING, "vinum open_drive %s: failed with error %d\n", drive->devicename, drive->lasterror); -} else +} else { + drive->dev = vn_todev(drive->devvp); + drive->dev->si_iosize_max = DFLTPHYS; drive->flags |= VF_OPEN;/* we're open now */ +} return drive->lasterror; } @@ -145,6 +141,9 @@ int init_drive(struct drive *drive, int verbose) { +struct thread *td; + +td = curthread; if (drive->devicename[0] != '/') { drive->lasterror = EINVAL; log(LOG_ERR, "vinum: Can't open drive without drive name\n"); @@ -154,17 +153,17 @@ if (drive->lasterror) return drive->lasterror; -drive->lasterror = (*devsw(drive->dev)->d_ioctl) (drive->dev, +drive->lasterror = VOP_IOCTL(drive->devvp, DIOCGSECTORSIZE, (caddr_t) & drive->sectorsize, FREAD, - curthread); + td->td_ucred, td); if (drive->lasterror == 0) - drive->lasterror = (*devsw(drive->dev)->d_ioctl) (drive->dev, + drive->lasterror = VOP_IOCTL(drive->devvp, DIOCGMEDIASIZE, (caddr_t) & drive->mediasize, FREAD, - curthread); + td->td_ucred, td); if (drive->lasterror) { if (verbose) log(LOG_WARNING, @@ -211,14 +210,16 @@ void close_locked_drive(struct drive *drive) { +struct thread *td; int error; +td = curthread; /* * If we can't access the drive, we can't flush * the queues, which spec_close() will try to * do. Get rid of them here first. */ -error = (*devsw(drive->dev)->d_close) (drive->dev, 0, 0, NULL); +error = vn_close(drive->devvp, FREAD | FWRITE, td->td_ucred, td); drive->flags &= ~VF_OPEN; /* no longer open */ if (drive->lasterror == 0) drive->lasterror = error; @@ -561,11 +562,13 @@ int error; int written_config;/* set when we first write the config to disk */ int driveno; +struct thread *td; struct drive *drive; /* point to current drive info */ struct vinum_hdr *vhdr;
Re: mozilla vs linux emulation in -current?
In message <[EMAIL PROTECTED]>, Peter Wemm writes:
>Has anybody else noticed this in -current? Mozilla hangs for a minute or
>so at regular intervals..
>16:07:31.896548 216.145.52.172.20167 > 0.0.0.0.16001: S 1175926117:1175926117(

Sounds like something I may have broken... Need to sleep now, but
you could try the following. I think in_pcbconnect() used to do
some evil stuff where it would modify the supplied sockaddr,
tcp_connect was depending on this, and I failed to notice it
(in_pcbconnect maps a destination address of INADDR_ANY into a local
IP, but we were throwing away the modified version). Commit if it
works, and I'll look properly tomorrow. Sorry for the breakage.

Ian

Index: tcp_usrreq.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/netinet/tcp_usrreq.c,v
retrieving revision 1.83
diff -u -r1.83 tcp_usrreq.c
--- tcp_usrreq.c	21 Oct 2002 13:55:50 -0000	1.83
+++ tcp_usrreq.c	24 Oct 2002 01:27:27 -0000
@@ -876,14 +876,14 @@
 		if (oinp != inp && (otp = intotcpcb(oinp)) != NULL &&
 		otp->t_state == TCPS_TIME_WAIT &&
 		    (ticks - otp->t_starttime) < tcp_msl &&
-		    (otp->t_flags & TF_RCVD_CC))
+		    (otp->t_flags & TF_RCVD_CC)) {
+			inp->inp_faddr = oinp->inp_faddr;
+			inp->inp_fport = oinp->inp_fport;
 			otp = tcp_close(otp);
-		else
+		} else
 			return EADDRINUSE;
 	}
 	inp->inp_laddr = laddr;
-	inp->inp_faddr = sin->sin_addr;
-	inp->inp_fport = sin->sin_port;
 	in_pcbrehash(inp);
 
 	/* Compute window scaling to request. */

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
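The failure mode described above (a connect routine that rewrites an INADDR_ANY destination, and a caller that keeps using its own unmodified copy) can be shown with a toy model. Everything here is made up for illustration: struct endpoint, struct toy_pcb, resolve_destination() and toy_connect() are hypothetical and are not the netinet code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ADDR_ANY 0u		/* stands in for INADDR_ANY */

struct endpoint {
	uint32_t addr;
	uint16_t port;
};

struct toy_pcb {
	struct endpoint local;
	struct endpoint foreign;
};

/*
 * Hypothetical helper: return the destination that will really be
 * used.  A wildcard destination is mapped to the local address.
 */
struct endpoint
resolve_destination(struct endpoint requested, uint32_t local_addr)
{
	assert(local_addr != TOY_ADDR_ANY);
	if (requested.addr == TOY_ADDR_ANY)
		requested.addr = local_addr;
	return (requested);
}

/*
 * Record the *resolved* destination in the pcb.  Storing the
 * caller's original request instead would leave 0.0.0.0 as the
 * foreign address, which is the symptom in the tcpdump line.
 */
void
toy_connect(struct toy_pcb *pcb, struct endpoint requested,
    uint32_t local_addr)
{
	pcb->local.addr = local_addr;
	pcb->foreign = resolve_destination(requested, local_addr);
}
```

With a local address of 127.0.0.1, a request for 0.0.0.0 port 23 ends up with 127.0.0.1 port 23 as the foreign endpoint, while a request for any other address is stored unchanged.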
Re: mozilla vs linux emulation in -current?
In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>IP, but we were throwing away the modified version). Commit if it
>works, and I'll look properly tomorrow. Sorry for the breakage.

With the one compile error fixed, this seemed to make `telnet 0.0.0.0'
work again, so I went ahead and checked it in.

Ian
Re: Crash again
In message <[EMAIL PROTECTED]>, Vallo Kallaste writes:
>The same kernel compiled for purposes of smbfs debugging crashed
>again. I had X, make -j2 running and no smbfs mounts. For what it's
>worth, the system did hang hard (no interrupts) some minutes before
>the aforementioned crash and I had to reboot by hand.

From the trace, we are recursing on ufs_inactive() because it needs
to grab and release a vnode reference from within vput()->VOP_INACTIVE(),
so the second vput() causes another call to VOP_INACTIVE. This looks
like something the VINACTIVE patch I posted a while ago would fix:

	http://www.maths.tcd.ie/~iedowse/FreeBSD/vinactive.diff

(Sorry, I haven't updated it, so it probably needs manual merging)
See also the comments by Don Lewis on this list ("Re: nfs_inactive()
bug? -> panic: lockmgr:").

Kirk, is this vput() recursion expected? If so, something like the
patch above should ensure that the second vput() does not call
VOP_INACTIVE() again.

Ian

>(kgdb) where
>#0  doadump () at ../../../kern/kern_shutdown.c:223
>#1  0xc0236e8a in boot (howto=260) at ../../../kern/kern_shutdown.c:355
>#2  0xc0237147 in panic () at ../../../kern/kern_shutdown.c:508
>#3  0xc027e982 in bwrite (bp=0xce3e92dc) at ../../../kern/vfs_bio.c:796
>#4  0xc027f039 in bawrite (bp=0x0) at ../../../kern/vfs_bio.c:1085
>#5  0xc033b16f in ffs_fsync (ap=0xd795958c) at ../../../ufs/ffs/ffs_vnops.c:236
>#6  0xc033a499 in ffs_sync (mp=0xc4057400, waitfor=2, cred=0xc13c3e80,
>    td=0xc0436be0) at vnode_if.h:612
>#7  0xc02939e8 in sync (td=0xc0436be0, uap=0x0)
>    at ../../../kern/vfs_syscalls.c:130
>#8  0xc0236a6b in boot (howto=256) at ../../../kern/kern_shutdown.c:264
>#9  0xc0237147 in panic () at ../../../kern/kern_shutdown.c:508
>#10 0xc0228f80 in lockmgr (lkp=0xc45fd43c, flags=65543, interlkp=0xc45fd378,
>    td=0xc3eb70d0) at ../../../kern/kern_lock.c:433
>#11 0xc028640c in vop_stdlock (ap=0x0) at ../../../kern/vfs_default.c:279
>#12 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#13 0xc0290d6a in vclean (vp=0xc45fd378, flags=8, td=0xc3eb70d0)
>    at vnode_if.h:990
>#14 0xc029142a in vgonel (vp=0xc45fd378, td=0x0)
>    at ../../../kern/vfs_subr.c:2665
>#15 0xc0291310 in vrecycle (vp=0xc45fd378, inter_lkp=0x0, td=0x0)
>    at ../../../kern/vfs_subr.c:2620
>#16 0xc03413fc in ufs_inactive (ap=0x0) at ../../../ufs/ufs/ufs_inode.c:133
>#17 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#18 0xc0290420 in vput (vp=0xc45fd378) at vnode_if.h:930
>#19 0xc033212d in handle_workitem_freeblocks (freeblks=0xc4d07500, flags=0)
>    at ../../../ufs/ffs/ffs_softdep.c:2494
>#20 0xc03315f4 in softdep_setup_freeblocks (ip=0xc4d4c500, length=0,
>    flags=2048) at ../../../ufs/ffs/ffs_softdep.c:2077
>#21 0xc0327938 in ffs_truncate (vp=0xc45fd378, length=0, flags=3072, cred=0x0,
>    td=0xc3eb70d0) at ../../../ufs/ffs/ffs_inode.c:271
>#22 0xc03412c8 in ufs_inactive (ap=0x0) at ../../../ufs/ufs/ufs_inode.c:100
>#23 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#24 0xc0290420 in vput (vp=0xc45fd378) at vnode_if.h:930
>#25 0xc03336c2 in handle_workitem_remove (dirrem=0xc4567140, xp=0x0)
>    at ../../../ufs/ffs/ffs_softdep.c:3324
>#26 0xc032f0ed in process_worklist_item (matchmnt=0x0, flags=0)
>    at ../../../ufs/ffs/ffs_softdep.c:727
>#27 0xc032eea0 in softdep_process_worklist (matchmnt=0x0)
>    at ../../../ufs/ffs/ffs_softdep.c:624
>#28 0xc028f411 in sched_sync () at ../../../kern/vfs_subr.c:1739
>#29 0xc0222e14 in fork_exit (callout=0xc028f010, arg=0x0,
>    frame=0x0) at ../../../kern/kern_fork.c:860
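The recursion in frames #24 to #16 can be modelled in a few lines. This is a toy, not the contents of vinactive.diff: struct toy_vnode, toy_vput() and toy_inactive() are hypothetical names, and the inactive_running flag only illustrates the general idea of marking a vnode while its inactive routine runs so that a nested release does not call it again.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vnode {
	int  usecount;		/* active references */
	bool work_pending;	/* cleanup still to do, e.g. truncation */
	bool inactive_running;	/* plays the role of a VINACTIVE-style flag */
	int  inactive_calls;	/* how often the inactive routine ran */
};

void toy_vput(struct toy_vnode *vp, bool use_guard);

/*
 * Inactive routine: its cleanup takes and drops a temporary
 * reference, as ufs_inactive() does via the softdep code in the trace.
 */
void
toy_inactive(struct toy_vnode *vp, bool use_guard)
{
	vp->inactive_calls++;
	if (vp->work_pending) {
		vp->work_pending = false;
		vp->usecount++;			/* temporary reference */
		toy_vput(vp, use_guard);	/* may re-enter us */
	}
}

/* Drop a reference; on the last one, run the inactive routine. */
void
toy_vput(struct toy_vnode *vp, bool use_guard)
{
	assert(vp->usecount > 0);
	if (--vp->usecount > 0)
		return;
	if (use_guard && vp->inactive_running)
		return;		/* already being deactivated: do not recurse */
	vp->inactive_running = true;
	toy_inactive(vp, use_guard);
	vp->inactive_running = false;
}
```

Without the guard, dropping the last reference runs the inactive routine twice (once nested inside the other); with it, the nested toy_vput() returns early and the routine runs once.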
Re: -CURRENT running really slow under vmware2
In message <[EMAIL PROTECTED]>, Jim Pirzyk writes:
>I would think we need to at least patch current for this case. Enclosed
>is a possible implementation. Comments?

I think I tried this before, and putting the option in opt_cpu.h does
not work, because not all files that include atomic.h will include
opt_cpu.h. The other options referenced in atomic.h are all in
opt_global.h, so CPU_DISABLE_CMPXCHG needs to go there too (note
that the instruction is called cmpxchg, not cmpxfhg BTW).

Ian
ELCR in PCI-ISA bridge not getting set on resume
Since starting to use -current with ACPI on a Sony C1 laptop, I
noticed that after resume, occasionally IRQ 9 would get stuck and
not deliver any interrupts. IRQ 9 is shared by sound, USB and the
pccard slot.

It turned out that something was not saving the ELCR edge/level
control registers in the PCI-ISA bridge, so on resume IRQ 9 was
configured in edge-triggered mode, making interrupt loss inevitable.

The patch below makes the "isab" driver save and restore the ELCR
around suspends on the Intel 82371AB. Any comments on whether this
is the right way or the right place to solve the problem?

Ian

Index: isa_pci.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/pci/isa_pci.c,v
retrieving revision 1.6
diff -u -r1.6 isa_pci.c
--- isa_pci.c	21 Dec 2001 01:28:46 -0000	1.6
+++ isa_pci.c	29 Oct 2002 01:01:33 -0000
@@ -37,21 +37,36 @@
 #include
 #include
 #include
+#include
+#include
 #include
+#include
 
 #include
 #include
 
+#define ELCR_IOADDR     0x4d0   /* Interrupt Edge/Level Control Registers */
+#define ELCR_IOLEN      2
+
+struct isab_softc {
+    struct resource *elcr_res;
+    u_char saved_elcr[ELCR_IOLEN];
+};
+
 static int     isab_probe(device_t dev);
 static int     isab_attach(device_t dev);
+static int     isab_detach(device_t dev);
+static int     isab_resume(device_t dev);
+static int     isab_suspend(device_t dev);
 
 static device_method_t isab_methods[] = {
     /* Device interface */
     DEVMETHOD(device_probe,             isab_probe),
     DEVMETHOD(device_attach,            isab_attach),
+    DEVMETHOD(device_detach,            isab_detach),
     DEVMETHOD(device_shutdown,          bus_generic_shutdown),
-    DEVMETHOD(device_suspend,           bus_generic_suspend),
-    DEVMETHOD(device_resume,            bus_generic_resume),
+    DEVMETHOD(device_suspend,           isab_suspend),
+    DEVMETHOD(device_resume,            isab_resume),
 
     /* Bus interface */
     DEVMETHOD(bus_print_child,          bus_generic_print_child),
@@ -68,7 +83,7 @@
 static driver_t isab_driver = {
     "isab",
     isab_methods,
-    0,
+    sizeof(struct isab_softc),
 };
 
 static devclass_t isab_devclass;
@@ -143,14 +158,82 @@
 isab_attach(device_t dev)
 {
     device_t child;
+    struct isab_softc *sc = device_get_softc(dev);
+    int error, rid;
 
     /*
      * Attach an ISA bus. Note that we can only have one ISA bus.
      */
     child = device_add_child(dev, "isa", 0);
-    if (child != NULL)
-        return(bus_generic_attach(dev));
+    if (child != NULL) {
+        error = bus_generic_attach(dev);
+        if (error)
+            return (error);
+    }
+
+    switch (pci_get_devid(dev)) {
+    case 0x71108086: /* Intel 82371AB */
+        /*
+         * Sometimes the ELCR (Edge/Level Control Register) is not restored
+         * correctly on resume by the BIOS, so we handle it ourselves.
+         */
+        rid = 0;
+        bus_set_resource(dev, SYS_RES_IOPORT, rid, ELCR_IOADDR, ELCR_IOLEN);
+        sc->elcr_res = bus_alloc_resource(dev, SYS_RES_IOPORT, &rid, 0, ~0, 1,
+            RF_ACTIVE);
+        if (sc->elcr_res == NULL)
+            device_printf(dev, "failed to allocate ELCR resource\n");
+        break;
+    }
 
     return(0);
 }
 
+static int
+isab_detach(device_t dev)
+{
+    struct isab_softc *sc = device_get_softc(dev);
+
+    if (sc->elcr_res != NULL)
+        bus_release_resource(dev, SYS_RES_IOPORT, 0, sc->elcr_res);
+
+    return (bus_generic_detach(dev));
+}
+
+static int
+isab_suspend(device_t dev)
+{
+    struct isab_softc *sc = device_get_softc(dev);
+    bus_space_tag_t bst;
+    bus_space_handle_t bsh;
+    int i;
+
+    /* Save the ELCR if required. */
+    if (sc->elcr_res != NULL) {
+        bst = rman_get_bustag(sc->elcr_res);
+        bsh = rman_get_bushandle(sc->elcr_res);
+        for (i = 0; i < ELCR_IOLEN; i++)
+            sc->saved_elcr[i] = bus_space_read_1(bst, bsh, i);
+    }
+
+    return (bus_generic_suspend(dev));
+}
+
+static int
+isab_resume(device_t dev)
+{
+    struct isab_softc *sc = device_get_softc(dev);
+    bus_space_tag_t bst;
+    bus_space_handle_t bsh;
+    int i;
+
+    /* Restore the ELCR if required. */
+    if (sc->elcr_res != NULL) {
+        bst = rman_get_bustag(sc->elcr_res);
+        bsh = rman_get_bushandle(sc->elcr_res);
+        for (i = 0; i < ELCR_IOLEN; i++)
+            bus_space_write_1(bst, bsh, i, sc->saved_elcr[i]);
+    }
+
+    return (bus_generic_resume(dev));
+}
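The save/restore logic can be exercised without hardware by replacing the two I/O ports with a byte array. This is a sketch under assumptions: struct toy_bridge and the toy_* functions are hypothetical, and toy_power_loss() simply models a BIOS that brings the registers back as all zeroes (every IRQ edge-triggered). The bit layout used by toy_irq_is_level() is the usual one: port 0x4d0 covers IRQ 0-7, port 0x4d1 covers IRQ 8-15, and a set bit means level-triggered.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ELCR_IOLEN	2

/* Hypothetical stand-in for the bridge and its softc. */
struct toy_bridge {
	uint8_t elcr[ELCR_IOLEN];	/* live registers (0x4d0, 0x4d1) */
	uint8_t saved_elcr[ELCR_IOLEN];	/* copy taken at suspend time */
};

/* Save the live registers, as isab_suspend() does in the patch. */
void
toy_suspend(struct toy_bridge *b)
{
	memcpy(b->saved_elcr, b->elcr, ELCR_IOLEN);
}

/* Assumed failure mode: registers come back zeroed after resume. */
void
toy_power_loss(struct toy_bridge *b)
{
	memset(b->elcr, 0, ELCR_IOLEN);
}

/* Write the saved copy back, as isab_resume() does in the patch. */
void
toy_resume(struct toy_bridge *b)
{
	memcpy(b->elcr, b->saved_elcr, ELCR_IOLEN);
}

/* Non-zero if the given IRQ (0-15) is configured level-triggered. */
int
toy_irq_is_level(const struct toy_bridge *b, int irq)
{
	assert(irq >= 0 && irq < 8 * ELCR_IOLEN);
	return ((b->elcr[irq / 8] >> (irq % 8)) & 1);
}
```

For IRQ 9 the relevant bit is bit 1 of the second register, so a value of 0x02 at 0x4d1 before suspend has to be 0x02 again after resume for the shared interrupt to keep working.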
Re: kern/42417 cannot probe Olympus digital camera, "C-1"
In message <[EMAIL PROTECTED]>, Nate Lawson writes:
>I looked at the change and it seems good. Can someone more familiar with
>the USB system verify this?

Done - I have a C-1 here, so I was able to test it - obviously I
haven't accessed the camera from -current in a while!

Ian
Re: [acpi-jp 1925] Re: acpid implementation?
In message <[EMAIL PROTECTED]>, Takanori Watanabe writes:
>It is obious there will be good if we have a way to catch power
>event from userland.
>
>I have some ideas to implement it.
>One way is implement with kqueue(2) and /dev/acpi to
>get power events. This way does not require daemons
>to wait the event exclusively. Each process that requires
>to get power event can handle it by the interface.
>I wrote the experimental code a years ago.

I've been using the following far-from-ideal patch for a while now
- it just supplies binary integers to /dev/acpi whenever the sleep
state changes. The choice of encoding of data is stupid, and the
acpiread() doesn't do blocking - I just use it in a script like

	while :; do
		sleep 5
		acpidat="`wc -c < /dev/acpi`"
		if [ "$acpidat" -gt 0 ]; then
			killall -HUP moused
		fi
	done

to send a SIGHUP to moused after a resume, which seems to be necessary
on my Vaio C1.

Ian

Index: acpi.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/acpica/acpi.c,v
retrieving revision 1.80
diff -u -r1.80 acpi.c
--- acpi.c	31 Oct 2002 20:23:41 -0000	1.80
+++ acpi.c	9 Nov 2002 20:20:10 -0000
@@ -32,6 +32,7 @@
 #include "opt_acpi.h"
 #include
 #include
+#include
 #include
 #include
 #include
@@ -42,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include
@@ -69,16 +71,18 @@
 
 static d_open_t         acpiopen;
 static d_close_t        acpiclose;
+static d_read_t         acpiread;
 static d_ioctl_t        acpiioctl;
+static d_poll_t         acpipoll;
 
 #define CDEV_MAJOR 152
 static struct cdevsw acpi_cdevsw = {
     acpiopen,
     acpiclose,
-    noread,
+    acpiread,
     nowrite,
     acpiioctl,
-    nopoll,
+    acpipoll,
     nommap,
     nostrategy,
     "acpi",
@@ -1327,6 +1331,9 @@
         }
 
         sc->acpi_sstate = state;
+        if (sc->acpi_usereventq_len < ACPI_USER_EVENTQ_LEN)
+            sc->acpi_usereventq[sc->acpi_usereventq_len++] = state;
+        selwakeup(&sc->acpi_selp);
         sc->acpi_sleep_disabled = 1;
 
         /*
@@ -1375,6 +1382,9 @@
         AcpiLeaveSleepState((UINT8)state);
         DEVICE_RESUME(root_bus);
         sc->acpi_sstate = ACPI_STATE_S0;
+        if (sc->acpi_usereventq_len < ACPI_USER_EVENTQ_LEN)
+            sc->acpi_usereventq[sc->acpi_usereventq_len++] = ACPI_STATE_S0;
+        selwakeup(&sc->acpi_selp);
         acpi_enable_fixed_events(sc);
         break;
 
@@ -1808,6 +1818,35 @@
     return(0);
 }
 
+int
+acpiread(dev_t dev, struct uio *uio, int flag)
+{
+    struct acpi_softc   *sc;
+    int                 bytes, error, events, i;
+
+    ACPI_LOCK;
+
+    sc = dev->si_drv1;
+
+    error = 0;
+    if (uio->uio_resid >= sizeof(int) && sc->acpi_usereventq_len > 0) {
+        events = sc->acpi_usereventq_len;
+        if (events > uio->uio_resid / sizeof(int))
+            events = uio->uio_resid / sizeof(int);
+        bytes = events * sizeof(int);
+        error = uiomove((caddr_t)sc->acpi_usereventq, bytes, uio);
+        if (!error) {
+            for (i = 0; i < sc->acpi_usereventq_len - events; i++)
+                sc->acpi_usereventq[i] = sc->acpi_usereventq[i + events];
+            sc->acpi_usereventq_len -= events;
+        }
+    }
+
+    ACPI_UNLOCK;
+
+    return (error);
+}
+
 static int
 acpiioctl(dev_t dev, u_long cmd, caddr_t addr, int flag, d_thread_t *td)
 {
@@ -1871,6 +1910,25 @@
 out:
     ACPI_UNLOCK;
     return(error);
+}
+
+static int
+acpipoll(dev_t dev, int events, d_thread_t *td)
+{
+    struct acpi_softc   *sc;
+    int                 revents;
+
+    ACPI_LOCK;
+    sc = dev->si_drv1;
+
+    revents = events & (POLLOUT | POLLWRNORM);
+    if ((events & (POLLIN | POLLRDNORM)) && sc->acpi_usereventq_len > 0) {
+        revents |= (POLLIN | POLLRDNORM);
+        selrecord(td, &sc->acpi_selp);
+    }
+
+    ACPI_UNLOCK;
+    return (revents);
 }
 
 static int
Index: acpivar.h
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/acpica/acpivar.h,v
retrieving revision 1.37
diff -u -r1.37 acpivar.h
--- acpivar.h	31 Oct 2002 17:58:38 -0000	1.37
+++ acpivar.h	9 Nov 2002 20:20:10 -0000
@@ -30,6 +30,7 @@
 
 #include "bus_if.h"
 #include
+#include
 #include
 #if __FreeBSD_version >= 500000
 #include
@@ -50,6 +51,11 @@
     int                 acpi_enabled;
     int                 acpi_sstate;
     int                 acpi_sleep_disabled;
+
+#define ACPI_USER_EVENTQ_LEN 4
+    int                 acpi_usereventq[ACPI_USER_EVENTQ_LEN];
+    int                 acpi_usereventq_len;
+    struct selinfo      acpi_selp;
 
     struct sysctl_ctx_list       acpi_sysctl_ctx;
     struct sysctl_oid           *acpi_sysctl_tree;
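The queue handling in the patch (append while there is room, drop otherwise, and shift the remainder down after a partial read) is easy to check in isolation. A minimal userland sketch follows, assuming a plain int array in place of the softc fields; struct eventq, eventq_push() and eventq_read() are hypothetical names and the uiomove() step is replaced by memcpy().

```c
#include <assert.h>
#include <string.h>

#define EVENTQ_LEN	4	/* same size as ACPI_USER_EVENTQ_LEN */

struct eventq {
	int ev[EVENTQ_LEN];
	int len;
};

/* Append one sleep-state event; returns 0 if the queue was full. */
int
eventq_push(struct eventq *q, int state)
{
	assert(q->len >= 0 && q->len <= EVENTQ_LEN);
	if (q->len >= EVENTQ_LEN)
		return (0);		/* event is dropped, as in the patch */
	q->ev[q->len++] = state;
	return (1);
}

/*
 * Copy out up to max events in arrival order, then shift whatever
 * is left to the front of the queue.  Returns the number copied;
 * like the patch, it never blocks.
 */
int
eventq_read(struct eventq *q, int *out, int max)
{
	int n;

	n = (q->len < max) ? q->len : max;
	if (n <= 0)
		return (0);
	memcpy(out, q->ev, (size_t)n * sizeof(int));
	memmove(q->ev, q->ev + n, (size_t)(q->len - n) * sizeof(int));
	q->len -= n;
	return (n);
}
```

A reader that asks for fewer events than are queued gets the oldest ones first and finds the rest still waiting on the next call, which matches what the shell loop above relies on when it merely counts the bytes returned.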
Re: [acpi-jp 1933] Re: acpid implementation?
In message <[EMAIL PROTECTED]>, Takanori Watanabe writes:
>==
>Next way is that make /dev/acpictl node that can open
>exclusively and catch the power event by it, like apmd.
>==
>
>This way requires that the event reading proceess should
>be only one, so we need another device node to read event.

Yes, exactly - I think that your suggestion of extending /dev/devctl
to support device-specific events to be handled by devd is a much
nicer solution though.

>options PSM_HOOKRESUME #hook the system resume event, useful
>#for some laptops
>options PSM_RESETAFTERSUSPEND #reset the device at the resume event
>
>will resolve your problem without the patch.

Cool, thanks. I didn't know those options existed - I'll try them
out next time I reboot.

Ian
Re: MD broken in current
In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>Better fix mddestroy(). I don't know why it hangs ... I guess it is
>because it is called before initialization is completed in mdinit(),
>and there aren't enough state checks in mddestroy().

I think moving the line

	tsleep(sc, PRIBIO, "mdwait", 0);

to just after the following `if' statement may do the trick. If the
wakeup() from mddestroy() comes in before md_kthread() gets to the
main loop, then it would be missed. I think jhb posted a better way
of synchronising with kthreads during their destruction, but I
haven't found the time to look into that yet.

Ian
Re: MD broken in current
In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>On Wed, 27 Nov 2002, Ian Dowse wrote:
>> I think moving the line
>>
>> tsleep(sc, PRIBIO, "mdwait", 0);
>>
>> to just after the following `if' statement may do the trick. If the
>
>Wouldn't Giant locking prevent races here? There is no locking in
>sight for the ioctl, but ioctl() holds Giant.

Yes, but mddestroy() assumes that the kthread is waiting in the
"mdwait" tsleep() when it calls wakeup(). That won't be true if the
kthread has not yet had a chance to run, or if it is waiting to
acquire Giant before entering the main loop (or if anything it calls
drops Giant).

Moving the check of the MD_SHUTDOWN flag to before the tsleep()
should catch all of these cases, and Giant ensures that the wakeup()
does not occur after the flag is tested but before the tsleep().

Ian
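The ordering argument can be checked with a small single-threaded model. It is only a model under stated assumptions: struct toy_kthread and the toy_* functions are hypothetical, each toy_run() call stands for "the scheduler lets the thread run until it blocks or exits", and the atomicity that Giant provides between testing the flag and going to sleep is simply taken for granted because nothing else runs in between.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_pc { AT_TOP, IN_SLEEP };

struct toy_kthread {
	bool shutdown;		/* plays the role of MD_SHUTDOWN */
	bool woken;		/* a wakeup reached a sleeping thread */
	bool exited;
	enum toy_pc pc;		/* where the thread currently is */
};

/* wakeup() has no memory: it only affects a thread already asleep. */
void
toy_wakeup(struct toy_kthread *t)
{
	if (t->pc == IN_SLEEP)
		t->woken = true;
}

/* What the destroy path does: set the flag, then wake the thread. */
void
toy_destroy(struct toy_kthread *t)
{
	t->shutdown = true;
	toy_wakeup(t);
}

/*
 * Run the thread until it blocks or exits.  check_first selects
 * "test the flag, then sleep"; otherwise it is "sleep, then test".
 */
void
toy_run(struct toy_kthread *t, bool check_first)
{
	while (!t->exited) {
		if (t->pc == IN_SLEEP) {
			if (!t->woken)
				return;		/* still blocked */
			t->woken = false;
			t->pc = AT_TOP;
			if (!check_first && t->shutdown)
				t->exited = true;
			continue;
		}
		if (check_first && t->shutdown) {
			t->exited = true;
			continue;
		}
		t->pc = IN_SLEEP;		/* go to sleep */
	}
}
```

If toy_destroy() is called before the thread has run at all, the sleep-first variant goes to sleep afterwards and never exits, while the check-first variant sees the flag immediately. When the thread is already asleep at destroy time, both variants exit.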
Re: MD broken in current
In message <[EMAIL PROTECTED]>, Hiten Pandya writes:
>Is anyone planning to take this task, because, I think its important
>that it is fixed. Or should it be put on the 5.0-todo list? If not, we
>should put it in the BUGS section of mdconfig/ or the md(4) manual page.
>IMO.

I've tested the fix, and I'm just waiting for re@ approval to commit
it.

Ian
Re: unkillable process - 'mdconfig -t vnode' on small file
In message <[EMAIL PROTECTED]>, Michal Mertl writes:
>Subject says it all.

Fixed in md.c revision 1.74 - this was discussed here a few days
ago, but I was just waiting for approval to commit the fix.

Ian
panic: ata_dmasetup: transfer active on this device!
Hi Søren,

I get the above panic every few days when resuming, especially if
the disk was active while the laptop was suspending - it's easy to
reproduce by starting some disk-intensive activity and then hitting
the suspend button. I see that IWASAKI-san posted patches for this
a few months ago - do you have any plans to incorporate his work?

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=814727+0+archive/2002/freebsd-current/20020908.freebsd-current
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=822137+0+archive/2002/freebsd-current/20020908.freebsd-current

	ata0: resetting devices .. done
	ata1: resetting devices .. done
	panic: ata_dmasetup: transfer active on this device!

Ian
Re: backgroud fsck is still locking up system (fwd)
In message <[EMAIL PROTECTED]>, Kirk McKusick writes:
>Adding a two minute delay before starting background fsck
>sounds like a very good idea to me. Please send me your
>suggested change.

BTW, I've been using a fsck_ffs modification for a while now that
does something like the disabled kernel I/O slowdown, but from
userland. It seems to help quite a lot in leaving some disk bandwidth
for other processes. Waiting a while before starting the fsck seems
like a good idea anyway though.

Patch below (I think I posted an earlier version of this before).

Ian

Index: fsutil.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/fsck_ffs/fsutil.c,v
retrieving revision 1.19
diff -u -r1.19 fsutil.c
--- fsutil.c	27 Nov 2002 02:18:57 -0000	1.19
+++ fsutil.c	4 Dec 2002 02:16:28 -0000
@@ -40,6 +40,7 @@
 #endif /* not lint */
 
 #include
+#include
 #include
 #include
 #include
@@ -62,7 +63,13 @@
 
 #include "fsck.h"
 
+static void slowio_start(void);
+static void slowio_end(void);
+
 long diskreads, totalreads;	/* Disk cache statistics */
+struct timeval slowio_starttime;
+int slowio_delay_usec = 10000;	/* Initial IO delay for background fsck */
+int slowio_pollcnt;
 
 int
 ftypeok(union dinode *dp)
@@ -350,10 +357,15 @@
 
 	offset = blk;
 	offset *= dev_bsize;
+	if (bkgrdflag)
+		slowio_start();
 	if (lseek(fd, offset, 0) < 0)
 		rwerror("SEEK BLK", blk);
-	else if (read(fd, buf, (int)size) == size)
+	else if (read(fd, buf, (int)size) == size) {
+		if (bkgrdflag)
+			slowio_end();
 		return (0);
+	}
 	rwerror("READ BLK", blk);
 	if (lseek(fd, offset, 0) < 0)
 		rwerror("SEEK BLK", blk);
@@ -463,6 +475,39 @@
 	idesc.id_blkno = blkno;
 	idesc.id_numfrags = frags;
 	(void)pass4check(&idesc);
+}
+
+/* Slow down IO so as to leave some disk bandwidth for other processes */
+void
+slowio_start()
+{
+
+	/* Delay one in every 8 operations by 16 times the average IO delay */
+	slowio_pollcnt = (slowio_pollcnt + 1) & 7;
+	if (slowio_pollcnt == 0) {
+		usleep(slowio_delay_usec * 16);
+		gettimeofday(&slowio_starttime, NULL);
+	}
+}
+
+void
+slowio_end()
+{
+	struct timeval tv;
+	int delay_usec;
+
+	if (slowio_pollcnt != 0)
+		return;
+
+	/* Update the slowdown interval. */
+	gettimeofday(&tv, NULL);
+	delay_usec = (tv.tv_sec - slowio_starttime.tv_sec) * 1000000 +
+	    (tv.tv_usec - slowio_starttime.tv_usec);
+	if (delay_usec < 64)
+		delay_usec = 64;
+	if (delay_usec > 1000000)
+		delay_usec = 1000000;
+	slowio_delay_usec = (slowio_delay_usec * 63 + delay_usec) >> 6;
 }
 
 /*
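The arithmetic in slowio_end() is a clamped running average in which each new sample carries a weight of 1/64. A small standalone sketch makes the numbers easy to check; slowio_update() and elapsed_usec() are hypothetical helper names, and the clamp limits here (64 microseconds and one second) are assumptions chosen for the example.

```c
#include <assert.h>

#define SLOWIO_MIN_USEC	64		/* assumed lower clamp */
#define SLOWIO_MAX_USEC	1000000		/* assumed upper clamp (1 second) */

/* Microseconds between two (seconds, microseconds) timestamps. */
long
elapsed_usec(long sec0, long usec0, long sec1, long usec1)
{
	return ((sec1 - sec0) * 1000000L + (usec1 - usec0));
}

/*
 * Fold one measured I/O time into the running average:
 * new = (63 * old + sample) / 64, with the sample clamped first
 * so that a single outlier cannot dominate.
 */
int
slowio_update(int avg_usec, long sample_usec)
{
	assert(avg_usec >= 0 && avg_usec <= SLOWIO_MAX_USEC);
	if (sample_usec < SLOWIO_MIN_USEC)
		sample_usec = SLOWIO_MIN_USEC;
	if (sample_usec > SLOWIO_MAX_USEC)
		sample_usec = SLOWIO_MAX_USEC;
	return ((int)((avg_usec * 63L + sample_usec) >> 6));
}
```

Because of the 63:1 weighting the average moves slowly: starting from 10000 microseconds, one very fast read only lowers it to 9844, so the pause inserted before every eighth read changes gradually rather than jumping with each measurement.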
Re: Revision 1.88 of kern_linker.c breaks module loading for diskless
In message <[EMAIL PROTECTED]>, Harti Brandt writes:
>the check for rootdev != NODEV introduced in rev 1.88 breaks loading of
>kernel modules from an NFS mounted root in diskless configurations.
>Dropping in gdb and printing rootdev shows -1 which is, I assume, NODEV.

Ah, that would explain a problem I saw recently on a netbooted box
where kldload only worked with full module paths. Could `rootvnode'
be checked for NULL instead?

Ian