Re: Found the startup panic - ccd ( patch included )
This looks correct to me. I originally had the BUF_KERNPROC macro just changing the target buffer, but then changed it to follow chains in the hopes of reducing the instances of its use. The way that you have used it below is much clearer and should definitely be put in. Also I hope that your fix to breadn will clear up Greg's problem with NFS.

Kirk

=-=-=-=-=-=-=-=

To: Kirk McKusick <[EMAIL PROTECTED]>
Cc: Julian Elischer <[EMAIL PROTECTED]>, Matthew Dillon <[EMAIL PROTECTED]>, Alan Cox <[EMAIL PROTECTED]>, Mike Smith <[EMAIL PROTECTED]>, "John S. Dyson" <[EMAIL PROTECTED]>, [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], Greg Lehey <[EMAIL PROTECTED]>
Subject: Re: Found the startup panic - ccd ( patch included )
In-reply-to: Your message of "Mon, 28 Jun 1999 04:54:07 MST." <[EMAIL PROTECTED]>
Date: Tue, 29 Jun 1999 00:57:38 +0800
From: Peter Wemm <[EMAIL PROTECTED]>
Message-Id: <[EMAIL PROTECTED]>

Kirk McKusick wrote:
[..]

I've been tinkering around for a while and think I have at least a partial fix for the remaining problems.

Certain places use B_CALL and have biodone() from the b_iodone routine, so we can't reliably use B_ASYNC as an indicator of needing to reassign to LK_KERNPROC. We have to do it on a case-by-case basis.

It's easier to do cluster_head processing at the point it's gathered rather than in BUF_KERNPROC().

vfs_cluster.c is confusing, but I think I've figured out how to get it right. I'm not 100% sure about checking for B_CALL in both cases prior to VOP_STRATEGY(), and maybe reqbp needs to be considered for the first read in cluster_read().

Also, I think the inline BUF_*() routines/macros would be better as routines in something like vfs_bio.c as the internals cause problems with prototypes etc.

I think this patch fixes the remaining panics in pageouts and clustering.
Index: kern/vfs_bio.c
===
RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.218
diff -u -r1.218 vfs_bio.c
--- vfs_bio.c	1999/06/28 15:32:10	1.218
+++ vfs_bio.c	1999/06/28 16:48:53
@@ -517,7 +517,8 @@
 	if (curproc != NULL)
 		curproc->p_stats->p_ru.ru_oublock++;
 	splx(s);
-	BUF_KERNPROC(bp);
+	if (oldflags & B_ASYNC)
+		BUF_KERNPROC(bp);
 	VOP_STRATEGY(bp->b_vp, bp);
 
 	/*
Index: kern/vfs_cluster.c
===
RCS file: /home/ncvs/src/sys/kern/vfs_cluster.c,v
retrieving revision 1.84
diff -u -r1.84 vfs_cluster.c
--- vfs_cluster.c	1999/06/26 02:46:08	1.84
+++ vfs_cluster.c	1999/06/28 16:48:53
@@ -252,7 +252,8 @@
 		if ((bp->b_flags & B_CLUSTER) == 0)
 			vfs_busy_pages(bp, 0);
 		bp->b_flags &= ~(B_ERROR|B_INVAL);
-		BUF_KERNPROC(bp);
+		if (bp->b_flags & (B_ASYNC|B_CALL))
+			BUF_KERNPROC(bp);
 		error = VOP_STRATEGY(vp, bp);
 		curproc->p_stats->p_ru.ru_inblock++;
 	}
@@ -286,7 +287,8 @@
 			if ((rbp->b_flags & B_CLUSTER) == 0)
 				vfs_busy_pages(rbp, 0);
 			rbp->b_flags &= ~(B_ERROR|B_INVAL);
-			BUF_KERNPROC(rbp);
+			if (rbp->b_flags & (B_ASYNC|B_CALL))
+				BUF_KERNPROC(rbp);
 			(void) VOP_STRATEGY(vp, rbp);
 			curproc->p_stats->p_ru.ru_inblock++;
 		}
@@ -414,6 +416,11 @@
 					break;
 				}
 			}
+			/*
+			 * XXX fbp from caller may not be B_ASYNC, but we are going
+			 * to biodone() it in cluster_callback() anyway
+			 */
+			BUF_KERNPROC(tbp);
 			TAILQ_INSERT_TAIL(&bp->b_cluster.cluster_head,
 				tbp, b_cluster.cluster_entry);
 			for (j = 0; j < tbp->b_npages; j += 1) {
@@ -788,6 +795,7 @@
 			reassignbuf(tbp, tbp->b_vp);/* put on clean list */
 			++tbp->b_vp->v_numoutput;
 			splx(s);
+			BUF_KERNPROC(tbp);
 			TAILQ_INSERT_TAIL(&bp->b_cluster.cluster_head,
 				tbp, b_cluster.cluster_entry);
 		}
Index: sys/buf.h
===
RCS file: /home/ncvs/src/sys/sys/buf.h,v
retrieving revision 1.73
diff -u -r1.73 buf.h
--- buf.h	1999/06/27 11:40:03	1.73
+++ buf.h	1999/06/28 16:48:53
@@ -315,17 +315,8 @@
 static __inline void BUF_KERNPR
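
The ownership hand-off discussed in this message can be illustrated outside the kernel. What follows is only a minimal user-space sketch, assuming a lock that records its holder and a placeholder owner that any context may release. Every name in it (struct sketch_buf, SK_ASYNC, SK_CALL, SK_KERNPROC and the sk_* functions) is hypothetical and stands in for, but is not, the kernel's struct buf, B_ASYNC, B_CALL, LK_KERNPROC or BUF_KERNPROC().

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's B_ASYNC / B_CALL flags. */
#define SK_ASYNC 0x1
#define SK_CALL  0x2

/* Hypothetical owner id meaning "held on behalf of the kernel". */
#define SK_KERNPROC (-2)

struct sketch_buf {
    int flags;      /* SK_ASYNC and/or SK_CALL */
    int lock_owner; /* pid of the holder, or SK_KERNPROC */
    int locked;     /* non-zero while the lock is held */
};

/* Lock the buffer on behalf of process `pid`. */
static void sk_lock(struct sketch_buf *bp, int pid)
{
    bp->locked = 1;
    bp->lock_owner = pid;
}

/* Analogue of BUF_KERNPROC(): disown the lock so any context may release it. */
static void sk_kernproc(struct sketch_buf *bp)
{
    bp->lock_owner = SK_KERNPROC;
}

/* Unlock; returns 0 on success, -1 where the kernel would panic. */
static int sk_unlock(struct sketch_buf *bp, int pid)
{
    if (!bp->locked)
        return -1;
    if (bp->lock_owner != SK_KERNPROC && bp->lock_owner != pid)
        return -1;
    bp->locked = 0;
    return 0;
}

/*
 * Analogue of the patched call sites: hand the lock off only when the
 * completion will run outside the issuing process.
 */
static void sk_start_io(struct sketch_buf *bp)
{
    if (bp->flags & (SK_ASYNC | SK_CALL))
        sk_kernproc(bp);
    /* The kernel would call VOP_STRATEGY() at this point. */
}
```

With this model, a buffer started with SK_ASYNC set can be unlocked by a different pid (the completion path), while a synchronous buffer can only be unlocked by the pid that locked it, which is the distinction the conditional in the patch is drawing.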
Re: lockmanager panic
Please be sure that you are running with vm/swap_pager.c at version 1.120 or later. In particular, you should have two calls to the macro BUF_KERNPROC in that file. If you are missing those two calls, you will get the panic.

If you do have those two calls in that code, then (and *only* then) try the following patch to see if it helps. It is making use of BUF_KERNPROC for cases in which it is not intended, but if it gets around your current problem, then it gives a good indication of what to look for as a real fix.

Kirk McKusick

Index: vm_pager.c
===
RCS file: /usr/ncvs/src/sys/vm/vm_pager.c,v
retrieving revision 1.51
diff -c -r1.51 vm_pager.c
*** vm_pager.c	1999/07/05 12:50:54	1.51
--- vm_pager.c	1999/07/20 06:33:59
***************
*** 550,555 ****
--- 550,556 ----
  	nbp->b_flags = B_CALL | (bp->b_flags & B_ORDERED) | flags;
  	nbp->b_rcred = nbp->b_wcred = proc0.p_ucred;
  	nbp->b_iodone = vm_pager_chain_iodone;
+ 	BUF_KERNPROC(nbp);
  
  	crhold(nbp->b_rcred);
  	crhold(nbp->b_wcred);

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: lockmanager panic
I have been unable to produce the panic on my test machine. My current stumbling block is that I do not know how to create the `module' files, specifically /modules/vn.ko. My previous patch is *not* a valid solution, but it does indicate to me that the problem probably lies in a missing BUF_KERNPROC in the vnode driver before it starts an async I/O.

Unfortunately, I am departing in 6.5 hours on a six week vacation, and I have no intention of taking my laptop (or any other high tech computer gear beyond my scuba diving computer) with me. Peter proved quite adept at finding the missing BUF_KERNPROC in the paging code, so I am hopeful that he can apply his wizardry here as well. If the problem has not been resolved by the time of my return (August 30th), I promise to track it down then.

Kirk McKusick
Re: The eventual fate of BLOCK devices.
I would like to take a step back from the debate for a moment and ask the bigger question: How many real-world applications actually use the block device interface? I know of none whatsoever. All the filesystem utilities go out of their way to avoid the block device and use the raw interface. Does anyone on this list know of any programs that need/want the block interface? If there are none, or only very obscure ones, then it seems pointless to waste any kernel code supporting them. Indeed it will clean up a good deal of code to get rid of them.

Kirk McKusick
Re: Mounting one FS on more than one system
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Mounting one FS on more than one system
In-reply-to: Your message of "Sat, 04 Dec 1999 10:10:20 MST." <[EMAIL PROTECTED]>
Date: Sat, 04 Dec 1999 13:53:26 -0800
From: Mike Smith <[EMAIL PROTECTED]>

(moved to -current where there are more eyes that are interested)

> P.S. Mike, at comdex I spoke to you some about clustering two
> computers and one RAID array, remember? You mentioned that
> someone had pursued that avenue some, perhaps not to a working
> solution, but I don't remember who. Can you (or anyone else) point
> me to the guilty parties? We would like to pick up the work and run
> with it for a while.

Yup, I remember. I also remember going through my stack of business cards wondering whether I remembered to get one from you; obviously not. 8)

The sticking issue that we discussed was allowing more than one system to mount a given filesystem; I seemed to recall that Kirk has spoken about this before, and there may be some folks here (or Kirk himself, also copied) who may have some more input on the topic.

Once this is resolved, everything else is (relatively!) straightforward...

--
\\ Give a man a fish, and you feed him for a day. \\ Mike Smith
\\ Tell him he should learn how to fish himself, \\ [EMAIL PROTECTED]
\\ and he'll hate you for a lifetime. \\ [EMAIL PROTECTED]

Mounting on more than one system is generally problematical unless you are willing to have all systems read-only. The problem is cache coherence between the machines. If one changes a block, the other machines will not see it. Basically, this is why we have the NFS filesystem. That lets a disk be mounted on one machine, but shared out to others. If you wanted to write a protocol that would allow for multiple machines, then you would need to have some central coordinator running some sort of coherency protocol with a complexity akin to that of NFS.

Kirk McKusick
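
The cache coherence problem described in the reply above can be reproduced in miniature. This is a toy model, not filesystem code: two "hosts" share one in-memory "disk", and each keeps a private block cache with no coordination. All of the names (struct disk, struct host, host_read, host_write) are made up for the illustration.

```c
#include <string.h>

#define NBLOCKS 4
#define BLKSZ   16

/* The shared disk: exactly one copy of every block. */
struct disk {
    char blk[NBLOCKS][BLKSZ];
};

/* A host with a private buffer cache that nobody else can invalidate. */
struct host {
    struct disk *d;
    char cache[NBLOCKS][BLKSZ];
    int  valid[NBLOCKS];
};

static void host_init(struct host *h, struct disk *d)
{
    h->d = d;
    memset(h->valid, 0, sizeof(h->valid));
}

/* Read through the cache; only the first read of a block touches the disk. */
static const char *host_read(struct host *h, int n)
{
    if (!h->valid[n]) {
        memcpy(h->cache[n], h->d->blk[n], BLKSZ);
        h->valid[n] = 1;
    }
    return h->cache[n];
}

/* Write through to the disk, updating only the writer's own cache. */
static void host_write(struct host *h, int n, const char *data)
{
    strncpy(h->cache[n], data, BLKSZ - 1);
    h->cache[n][BLKSZ - 1] = '\0';
    h->valid[n] = 1;
    memcpy(h->d->blk[n], h->cache[n], BLKSZ);
}
```

If host A writes "old" to block 0, host B reads it, and host A then writes "new", the disk holds "new" but host B's next host_read() still returns "old". Closing that gap is the job of the coherency protocol the reply refers to; making every mount read-only sidesteps it because nothing ever changes underneath a cache.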
Re: Mounting one FS on more than one system
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: Mike Smith <[EMAIL PROTECTED]>, [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Mounting one FS on more than one system
In-reply-to: Your message of "Sat, 04 Dec 1999 12:44:43 PST." <[EMAIL PROTECTED]>
Date: Sun, 05 Dec 1999 09:44:32 +1000
From: George Michaelson <[EMAIL PROTECTED]>

Let me start by saying that I do not have the definitive answers to all your questions. I'll take a crack at some of them.

Allowing for cache writeback delays, is the speed of direct-to-shared-disk fast enough that using NFS as an "abstraction" layer would be faster than any network extant?

The gains come from being able to read data directly from the disk rather than transferring it across the network. However, the cost of maintaining cache coherency would be at least as difficult and bandwidth consuming as a distributed filesystem.

Would it be as fast? would the effort to make this work exceed the cost of making real networks exist?

As the network speed approaches the disk speed, the gains would diminish.

It would seem that there might be opportunities to do 'cut through' in the coding for known-private files after open (ok, inode allocation/extension has problems) to optimize them to at-worst 'disk+bits' instead of NFS costs.

The problem is in identifying when private goes to shared. Also as you point out, new block and inode allocations have to be centrally coordinated.

If one party mounts -r the FS (eg news spool) then does this reduce the complexity? eg /usr mounted read-mostly for a bunch of tightly coupled boxes.

If any machine can write, then all the other machines have to have some way of keeping their caches consistent with the machine that did the modification.

If some other protocol is used for interlock, does this make mmap shares across clusters faster?

Mmap sharing across machines is going to be slow. I have never been a fan of distributed shared memory as a programming model, and this does not look like a way of making it run any faster.

-George
--
George Michaelson | DSTC Pty Ltd
Email: [EMAIL PROTECTED]| University of Qld 4072
Phone: +61 7 3365 4310| Australia
Fax: +61 7 3365 4311| http://www.dstc.edu.au
Re: bioops
To: [EMAIL PROTECTED]
Subject: HEADSUP: bioops patch.
From: Poul-Henning Kamp <[EMAIL PROTECTED]>
Date: Wed, 14 Jun 2000 22:29:32 +0200

This patch virtualizes & untangles the bioops operations vector.

Background:

The bioops operation vector is a list of OO-like operations which can be performed on struct buf. They are used by the softupdates code to handle dependencies.

Ideally struct buf should have had a real OO like operations vector like vnodes have it, and struct bioops is the first step towards that.

One of the reasons we should have OO-like struct buf, is that as long as bwrite(bp) "knows" that the buffer is backed by a disk device, we cannot use the UFS layer on top of a storage manager which isn't based on disk-devices: When UFS modifies a directory inode, it will call bwrite(bp) on the buffer with the data. This would not work if the backing were based on malloc(9) or anonymous swap-backed VM objects for instance.

In other words: this is the main reason why we don't have a decent tmpfs in FreeBSD.

Instead of just assuming that it works on a disk, bwrite(bp) should do a "bp->b_ops->bwrite(bp)" so that each buffer could have its own private idea of how to write itself, depending on what backing it has.

So in order to move bioops closer to become a bp->b_ops, this patch takes two entries out of bioops: the "sync" and the "fsync" items and virtualizes the rest of the elements a bit.

The "sync" item is called only from the syncer, and is a call to the softupdates code to do what it needs to do for periodic syncing.

The real way of doing that would be to have an event-handler for this since other code could need to be part of the sync traffic, raid5 private parity caches could be one example. I have not done this yet, since currently softupdates is the only client.

The fsync item really doesn't belong in the fsync system call, it belongs in ffs_fsync, and has been moved there.

If it had been possible to put the fsync call in ffs_fsync, I would have done that. Unfortunately, it is not possible and will hang or panic the kernel if you put it there. The problem is that ffs_fsync syncs out the data blocks of the associated file. The bioops call to soft updates requests that any names associated with the file being sync'ed be sync'ed to disk as well. That is a necessary semantic of the system call fsync. However, it is not needed by most clients of VOP_FSYNC.

Because the sync'ing of the name requires a walk up the filesystem tree from the inode in question to the root of the filesystem, the locking protocol requires that the nodes lower in the tree be unlocked before locking nodes that are higher. This means that the vnode being fsync'ed must be briefly unlocked while its containing parent is locked. If the vnode being fsync'ed is a directory, this creates a window where another process can sneak in and make changes, which leads to panics, two entries with the same name, etc. This window is not a problem for the fsync call because it is not creating a new name, but it is a problem if VOP_FSYNC is called in the open, link, mkdir, etc. paths (as it will be, for example, if a block allocation fails due to the filesystem being full).

Thus there are two choices: put the code back as it was, or change the VOP_FSYNC call interface to add a flags value that indicates whether the name needs to be synced out as well as the data. I chose the former as I did not want to disrupt a widely used interface.

To give the right behaviour when SOFTUPDATES is not compiled in, stubs for both of these functions have been added to ffs_softdep_stub.c

Finally all the checks to see if the bioops vector is populated have been centralized in in-line functions, thereby paving the road for the global bioops to become bp->b_ops.

Comments, reviews, tests please

Poul-Henning

Index: contrib/softupdates/ffs_softdep.c
===
RCS file: /home/ncvs/src/sys/contrib/softupdates/ffs_softdep.c,v
retrieving revision 1.64
diff -u -r1.64 ffs_softdep.c
--- contrib/softupdates/ffs_softdep.c	2000/05/26 02:01:59	1.64
+++ contrib/softupdates/ffs_softdep.c	2000/06/14 19:26:46
@@ -222,8 +222,6 @@
 	softdep_disk_io_initiation,		/* io_start */
 	softdep_disk_write_complete,		/* io_complete */
 	softdep_deallocate_dependencies,	/* io_deallocate */
-	softdep_fsync,				/* io_fsync */
-	softdep_process_worklist,		/* io_sync */
 	softdep_move_dependencies,		/* io_movedeps */
 	softdep_count_dependencies,		/* io_countdeps */
 };
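
The "bp->b_ops->bwrite(bp)" idea from the message above can be sketched in a few lines of ordinary C. This is an illustration of per-object dispatch only, under the assumption of a one-entry operations vector; struct xbuf, struct xbuf_ops, xbwrite() and the two backing stores are invented for the example and do not correspond to the real struct buf or struct bioops layouts.

```c
#include <stddef.h>
#include <string.h>

struct xbuf;

/* Hypothetical per-buffer operations vector (the "bp->b_ops" idea). */
struct xbuf_ops {
    int (*bwrite)(struct xbuf *bp);
};

struct xbuf {
    const struct xbuf_ops *b_ops; /* how this buffer writes itself */
    const char *data;             /* payload to be written */
    size_t len;                   /* payload length in bytes */
    void *backing;                /* store-specific state */
};

/* Backing store 1: plain memory, standing in for a malloc(9)-style backing. */
struct mem_store {
    char bytes[64];
    size_t used;
};

static int mem_bwrite(struct xbuf *bp)
{
    struct mem_store *m = bp->backing;

    if (bp->len > sizeof(m->bytes))
        return -1;
    memcpy(m->bytes, bp->data, bp->len);
    m->used = bp->len;
    return 0;
}

/* Backing store 2: a fake disk that only counts the writes issued to it. */
struct fake_disk {
    int writes;
};

static int disk_bwrite(struct xbuf *bp)
{
    struct fake_disk *d = bp->backing;

    d->writes++;
    return 0;
}

static const struct xbuf_ops mem_ops  = { mem_bwrite };
static const struct xbuf_ops disk_ops = { disk_bwrite };

/* The generic entry point no longer assumes a disk behind the buffer. */
static int xbwrite(struct xbuf *bp)
{
    return bp->b_ops->bwrite(bp);
}
```

A caller sets b_ops to &mem_ops or &disk_ops when the buffer is created and thereafter calls only xbwrite(); the code that modifies the buffer never needs to know which kind of storage sits behind it, which is the property the message argues a tmpfs would need.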
Re: Panic with userquota(softupdates?)
From: Kevin Day <[EMAIL PROTECTED]>
Subject: Panic with userquota(softupdates?)
To: [EMAIL PROTECTED]
Date: Fri, 16 Jun 2000 18:55:01 -0500 (CDT)
Cc: [EMAIL PROTECTED]

I keep getting panics in dqget(ufs_quota.c), with a -current from a couple of days ago. I think this might be softupdates related, since I can't make it happen with softupdates turned off, although it's quite possible that it has nothing to do with it.

Does anyone have any idea what might be causing this? Any other information that might be useful here?

-- Kevin

I have just committed a change to sys/contrib/softupdates/ffs_softdep.c (delta 1.68) which corrects a panic in the kernel when quotas and soft updates are used together. While the specific problem that I fixed appears somewhat different than the one you are reporting, it may be related. I suggest that you update to the above delta and see if it solves your problem. If your problem persists, let me know. As always, if you can give a specific set of inputs which trigger the problem, that is always helpful in tracking it down.

Kirk McKusick
Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c
Date: Thu, 22 Jun 2000 11:54:26 +0200
From: Adrian Chadd <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c

On Thu, Jun 22, 2000, Brad Knowles wrote:
> At 10:30 AM +0200 2000/6/22, Adrian Chadd wrote:
>
> > I like this. Would anyone object if this was brought over
> > from NetBSD ?
>
> If you're asking for a vote, you've got mine.

Hmm, Kirk has valid points for leaving a softupdates filesystem identified by tunefs and not a mount option. Kirk, do you still want to keep things that way ?

Adrian

Yes, I do want it kept as a tunefs option.

Date: Thu, 22 Jun 2000 13:31:29 +0200
From: Stefan Esser <[EMAIL PROTECTED]>
To: Adrian Chadd <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], Stefan Esser <[EMAIL PROTECTED]>
Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c

...

I do remember the discussion that led to the requirement to enable soft-updates with tunefs -n.

But I do not remember, why the soft-updates state could not be just set in the local copy of the super-block and flushed to disk when the file system is marked dirty ?

Just before a clean file system is to be mounted R/W, it is obviously safe to modify the soft-updates state. The file system must have been cleaned before, or the R/W mount will not be possible (extra logic can prevent the modification of the MNT_SOFTDEP bit if a mount of a non-clean partition is forced, in order to preserve the soft-updates state for the next fsck run).

If the kernel was compiled without soft-updates, it may be the right thing to keep MNT_SOFTDEP cleared, to not mislead FSCK ...

Did I miss something obvious ?

Regards, STefan

Your above proposal would work, though that is not how NetBSD implemented it. I feel that it is a lot of extra mechanism for very little gain. Administrators generally make a one-time decision to run soft updates on a filesystem. It is not the sort of thing that they want to change on a regular basis. It is possible to run tunefs on a filesystem that is mounted read-only, so it is no more difficult to use tunefs than it is to make it a mount-time option (i.e., they still have to down-grade to read-only, set the option, then upgrade).

Finally, I expect that soft updates will eventually just be defaulted to `on' when a filesystem is built, and in a few rare instances an administrator will want to turn it off. I do not want to have an option that needs to be added to nearly every fstab entry to get the default behavior. Plus it is just one more bit of trivia that new system administrators need to learn to make their systems run well. The more of those details that need not be learned because they just do the right thing, the better.

Kirk McKusick
Re: Panic: bqrelse: multiple refs
Date: Tue, 25 Jul 2000 11:47:03 -0400 (EDT)
From: Brian Fundakowski Feldman <[EMAIL PROTECTED]>
To: Ollivier Robert <[EMAIL PROTECTED]>
Cc: "FreeBSD Current Users' list" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Panic: lockmgr: pid 5, not exclusive lock holder 0 unlocking
In-Reply-To: <[EMAIL PROTECTED]>

On Tue, 25 Jul 2000, Ollivier Robert wrote:

> According to Brian Fundakowski Feldman:
> > Actually, I'm pretty certain this is the fix:
>
> Well it won't panic but isn't it putting the problem under the carpet?
> I agree the panic seems to be here temporarily but...

No, I'm really certain this isn't the case. You see, struct buf has a b_lock that until recently was a plain, exclusive lockmgr lock. In Kirk's last round of changes, he converted b_lock to be LK_CANRECURSE, which means that the lock, while still an exclusive lock, may be relocked multiple times by the same caller.

The panics are plain wrong. What's left is to determine what is the proper thing to do in each of these cases, which I'm certain that many people already know (you see, I'm still a bit green ;). What I am _almost_ sure about is that the right thing is just to remove one of the locks and let it get freed back up the call chain.

I'm almost certain this is the case because if you are grabbing exclusive locks and recursing upon them, your call chain is the only consumer and in a recursive-locking-callchain, you will have multiple symmetric lock and unlock pairs. Anything else horribly complicates things, and this makes me a good 95% certain that this is exactly the right fix, not that it's sweeping any true bugs under the carpet.

Allowing recursive locks is pretty much the only way to solve many of the problems here because it's simply not possible to support all code paths without allowing for this recursion. The code would either be horribly complicated or non-functional.

I'm certain Kirk may be able to back me up here. It seems that the cleanup is meant to make the locks recursive mostly to facilitate correct/proper call chains, and that's consistent with my understanding at least :) Indeed, if you look at the comment in brelse() from the delta, you will see that the intention of allowing this very situation to occur and simply BUF_UNLOCK() was planned for and the panic()s were for debugging during the previous time that b_locks weren't LK_CANRECURSE.

As always, take what I say with a grain of salt since I'm definitely not a VFS guru in any manner; I just happen to think I understand this one :)

> --
> Ollivier ROBERT -=- Eurocontrol EEC/ITM -=- [EMAIL PROTECTED]
> The Postman hits! The Postman hits! You have new mail.

--
Brian Fundakowski Feldman \ FreeBSD: The Power to Serve! / [EMAIL PROTECTED]`--'

The above explanation is correct. When I made the change to allow recursive buffer locks, I should have removed that panic (but forgot that I had put it in there, sigh). I have just made the change on freefall. Sorry for the problems caused by that change.

Kirk McKusick
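
The "symmetric lock and unlock pairs" argument in this thread can be made concrete with a small model of a recursive exclusive lock. It is a sketch of the general technique only, assuming a single owner id and a depth counter; struct rlock, rlock_acquire() and rlock_release() are hypothetical names and this is not the lockmgr implementation or its LK_CANRECURSE code path.

```c
/* Hypothetical recursive exclusive lock: one owner, counted re-entry. */
struct rlock {
    int owner; /* holder's id, or -1 when the lock is free */
    int depth; /* how many times the holder has acquired it */
};

#define RLOCK_INIT { -1, 0 }

/* Returns 0 on success, -1 if another owner holds the lock. */
static int rlock_acquire(struct rlock *l, int who)
{
    if (l->depth > 0 && l->owner != who)
        return -1; /* a real lock would sleep here instead */
    l->owner = who;
    l->depth++;
    return 0;
}

/* Returns 0 on success, -1 where a kernel would panic. */
static int rlock_release(struct rlock *l, int who)
{
    if (l->depth == 0 || l->owner != who)
        return -1;
    if (--l->depth == 0)
        l->owner = -1; /* last matching release frees the lock */
    return 0;
}
```

In this model a second acquire by the same owner simply raises the depth, and each release lowers it, so a release that finds the depth still above zero is the expected case in a recursive call chain rather than an error. That is the situation in which, per the messages above, the leftover panic fired.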
Re: Unconnected files problem
I have (finally) found and fixed this problem. You need to get version 1.107 or later of /sys/ufs/ffs/ffs_softdep.c (2002/02/07).

Kirk McKusick

=-=-=-=-=-=

Date: Tue, 28 Aug 2001 14:02:24 +0200
From: Ollivier Robert <[EMAIL PROTECTED]>
To: "FreeBSD Current Users' list" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Unconnected files problem

I have a script that generates an index for all my mail messages (using glimpse). Sometimes, the disk is full because it has some rather big temporary files (and I have a lot of mail).

It seems that we may have a softupdate-related (that's a guess from me) problem because some of these temporary files end up as unconnected to any directory but link count is still one and they still take space. The last time fsck ran on the filesystem, it gave me back more than 6 (!!) fragments (cf. the following):

-=-=-
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: Reclaimed: 0 directories, 22 files, 60424 fragments
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: 10295 files, 387087 used, 73408 free (1048 frags, 9045 blocks, 0.2% fragmentation)
-=-=-

lsof doesn't show them so they're not open by any process. The mtime of the files is exactly when the glimpseindex command is run.

We know that SU has some issues when a filesystem is full but this is quite a problem because as you can see below, I'm losing a lot of space till the next reboot...

UNREF FILE I=1081 OWNER=roberto MODE=100600
SIZE=523 MTIME=Aug 28 00:46 2001
CLEAR? no

UNREF FILE I=18498 OWNER=roberto MODE=100600
SIZE=230665 MTIME=Aug 26 08:05 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=18508 OWNER=roberto MODE=100600
SIZE=11225707 MTIME=Aug 23 20:02 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=18530 OWNER=roberto MODE=100600
SIZE=28322748 MTIME=Aug 24 20:09 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=18573 OWNER=roberto MODE=100600
SIZE=28326193 MTIME=Aug 25 20:09 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=18575 OWNER=roberto MODE=100600
SIZE=18684173 MTIME=Aug 24 20:08 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=19204 OWNER=roberto MODE=100600
SIZE=13771800 MTIME=Aug 26 08:05 2001
RECONNECT? no

CLEAR? no

UNREF FILE I=19353 OWNER=roberto MODE=100600
SIZE=18679309 MTIME=Aug 25 20:08 2001
RECONNECT? no

CLEAR? no

** Phase 5 - Check Cyl groups
10223 files, 446324 used, 74595 free (1019 frags, 9197 blocks, 0.2% fragmentation)

fsdb (inum: 2)> inode 19353
current inode: regular file
I=19353 MODE=100600 SIZE=18679309
MTIME=Aug 25 20:08:18 2001 [0 nsec]
CTIME=Aug 25 20:08:18 2001 [0 nsec]
ATIME=Aug 25 20:08:11 2001 [0 nsec]
OWNER=roberto GRP=staff LINKCNT=1 FLAGS=0 BLKCNT=8ec0 GEN=4c2a6c10

--
Ollivier ROBERT -=- Eurocontrol EEC/ITM -=- [EMAIL PROTECTED]
FreeBSD caerdonn.eurocontrol.fr 5.0-CURRENT #46: Wed Jan 3 15:52:00 CET 2001
Re: vm page panic
Date: Sun, 25 Mar 2001 11:20:17 +0200
From: Jeroen Ruigrok/Asmodai <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>, Peter Wemm <[EMAIL PROTECTED]>, Paul Saab <[EMAIL PROTECTED]>, Matt Dillon <[EMAIL PROTECTED]>, Soeren Schmidt <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: vm page panic

Hi guys,

ok, sources cvsupped yesterday afternoon, just before my ffs_alloc.c commit [which I did, obviously, add myself locally].

Box had been running for a while when all of a sudden it got into a panic:

vm_page_alloc: free/cache page 0xc0776fa4 was dirty

a trace in ddb shows:

allocbuf()
getblk()
ffs_balloc()
ffs_write()
vn_rdwr()
elf_coredump()
coredump()

Unfortunately my ata controller didn't get reprobed [just was hanging there] so I couldn't get a crashdump. =( [HPT366]

So consider this a heads-up, since you might encounter this.

Extra info: devfs running, / is normal FFS
/tmp, /var, /usr, /storage all soft-updated.

--
Jeroen Ruigrok van der Werven/Asmodai .oUo. asmodai@[wxs.nl|freebsd.org]
Documentation nutter/C-rated Coder BSD: Technical excellence at its best
D78D D0AD 244D 1D12 C9CA 7152 035C 1138 546A B867
Pleasure's a sin, and sometimes sin's a pleasure...

The latest round of changes to ffs_alloc.c adds code which is only ever used by background fsck, which is not yet being used. So, it seems very unlikely that your panic has been triggered by these changes.

Kirk
Re: fsdb broken in -current
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: fsdb broken in -current
Date: Mon, 23 Apr 2001 22:23:48 +0100
From: Ian Dowse <[EMAIL PROTECTED]>

The last set of changes to fsck_ffs moved the initialisation of dev_bsize to sblock_init(), but this is not called by fsdb(8) so fsdb dies almost immediately with a floating exception. I'm just going to commit the obvious fix, which is to have fsdb call sblock_init() also.

Ian

Right you are. Sorry I missed that. It did not occur to me to verify fsdb.

Kirk
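
The failure mode reported here is easy to reproduce in isolation. The sketch below assumes the floating exception came from an integer division by a still-zero dev_bsize (on most Unix platforms that raises SIGFPE, which the shell reports as "Floating exception"); the message does not say so explicitly, so treat that as an inference. The sk_ names and the 512-byte sector size are invented for the example and are not the fsck_ffs code.

```c
/* Hypothetical stand-in for fsck_ffs's dev_bsize global; zero until set. */
static long sk_dev_bsize;

/* Stand-in for sblock_init(); 512 is an assumed sector size. */
static void sk_sblock_init(void)
{
    sk_dev_bsize = 512;
}

/*
 * Convert a byte offset to a device block number.
 * Returns -1 when initialisation was skipped; an unguarded
 * offset / sk_dev_bsize would divide by zero in that case.
 */
static long sk_offset_to_block(long offset)
{
    if (sk_dev_bsize == 0)
        return -1;
    return offset / sk_dev_bsize;
}
```

A second program that shares such helpers but has its own main() has to call the initialiser itself, which is what the committed fix does for fsdb. The guard in sk_offset_to_block() is an addition for the sketch, showing one way to turn the crash into a detectable error.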
Re: worklist_remove panic
I have checked in revision 1.99 to ffs_softdep.c which builds on the
change in revision 1.98 by [EMAIL PROTECTED]. The symptom being treated
in 1.98 was to avoid freeing a pagedep dependency if there was still a
newdirblk dependency referencing it. That change is correct and no
longer prints the warning message ``handle_written_filepage: active
pagedep'' when it occurs. The other part of revision 1.98 was to panic
with ``deallocate_dependencies: active pagedep'' when a newdirblk
dependency was encountered during a file truncation. This fix removes
that panic and replaces it with code to find and delete the newdirblk
dependency so that the truncation can succeed. This delta should clear
up the recent problems that folks have been having with soft updates.

	Kirk McKusick

=-=-=-=-=

To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: worklist_remove panic
From: Dag-Erling Smorgrav <[EMAIL PROTECTED]>
Date: 26 May 2001 21:25:32 +0200

No dump (dumps seem to have been broken for about a month now), but a
stacktrace from DDB:

kernel: type 12 trap, code=0
Stopped at      worklist_remove+0x1c:   cmpw    $0,0xa(%ecx)
db> trace
worklist_remove(deadc0de) at worklist_remove+0x1c
free_diradd(deadc0de) at free_diradd+0x26
free_newdirblk(c2e45cd0) at free_newdirblk+0x32
handle_written_inodeblock(c287b200,c6323480) at handle_written_inodeblock+0x2b2
softdep_disk_write_complete(c6323480) at softdep_disk_write_complete+0x6a
bufdone(c6323480,cf2c7f54,c014de93,c6323480,c258b280) at bufdone+0x101
bufdonebio(c6323480) at bufdonebio+0xe
ad_interrupt(c2c5f940,c2564300,cf2c7f7c,c01ba6e4,c258b280) at ad_interrupt+0x3ef
ata_intr(c258b280) at ata_intr+0xae
ithread_loop(c258b200,cf2c7fa8) at ithread_loop+0x424
fork_exit(c01ba2c0,c258b200,cf2c7fa8) at fork_exit+0xf4
fork_trampoline() at fork_trampoline+0x8
db> panic
panic: from debugger
Debugger("panic")
Stopped at      worklist_remove+0x1c:   cmpw    $0,0xa(%ecx)
db> panic: from debugger
Uptime: 1d0h12m13s

dumping to dev ad0b, offset 131104
dump ata0: resetting devices .. panic: witness_restore: lock (sleep mutex) Giant not locked
Uptime: 1d0h12m13s
Dump already in progress, bailing...
Automatic reboot in 15 seconds - press a key on the console to abort

des@des ~% gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd".
(kgdb) exec-file /boot/kernel/kernel
(kgdb) symbol-file /sys/compile/DES/kernel.debug
Reading symbols from /sys/compile/DES/kernel.debug...done.
(kgdb) l *(worklist_remove+0x1c)
0xc0261750 is in worklist_remove (../../ufs/ffs/ffs_softdep.c:432).
427             struct worklist *item;
428     {
429
430             if (lk.lkt_held == -1)
431                     panic("worklist_remove: lock not held");
432             if ((item->wk_state & ONWORKLIST) == 0) {
433                     FREE_LOCK(&lk);
434                     panic("worklist_remove: not on list");
435             }
436             item->wk_state &= ~ONWORKLIST;
(kgdb) l *(free_diradd+0x26)
0xc02640fa is in free_diradd (../../ufs/ffs/ffs_softdep.c:2601).
2596    #ifdef DEBUG
2597            if (lk.lkt_held == -1)
2598                    panic("free_diradd: lock not held");
2599    #endif
2600            WORKLIST_REMOVE(&dap->da_list);
2601            LIST_REMOVE(dap, da_pdlist);
2602            if ((dap->da_state & DIRCHG) == 0) {
2603                    pagedep = dap->da_pagedep;
2604            } else {
2605                    dirrem = dap->da_previous;
(kgdb) l *(free_newdirblk+0x32)
0xc026345e is in free_newdirblk (../../ufs/ffs/ffs_softdep.c:2033).
2028             */
2029            pagedep = newdirblk->db_pagedep;
2030            pagedep->pd_state &= ~NEWBLOCK;
2031            if ((pagedep->pd_state & ONWORKLIST) == 0)
2032                    while ((dap = LIST_FIRST(&pagedep->pd_pendinghd)) != NULL)
2033                            free_diradd(dap);
2034            /*
2035             * If no dependencies remain, the pagedep will be freed.
2036             */
2037            for (i = 0; i < DAHASHSZ; i++)

After this panic, fsck complained of bad superblocks on all file
systems.

By the way, fsck is intolerably slow these days: more than twenty minutes for 'fsck -y' of a 5.5 GB filesystem (roughly 380,000 files) on a recent and far from sluggish IBM IDE drive. Most (nearly all) of that time is spent in phase 2. DES -- Dag-Erling Smorgrav - [EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
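A note on reading the trace above: the deadc0de argument values are what one would expect if the pointers had been loaded from memory that was already freed, since debugging kernels commonly overwrite freed memory with a recognizable fill pattern. The following is only a minimal user-space sketch of that idea. The names struct item, poison_item() and looks_poisoned() are hypothetical and made up for this illustration; none of them come from ffs_softdep.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON 0xdeadc0deU	/* fill pattern, chosen to match the trace */

/* Hypothetical stand-in for a dependency structure on a work list. */
struct item {
	struct item *next;
	uint32_t state;
};

/*
 * Overwrite an object with the poison pattern, one 32-bit word at a
 * time.  A real allocator would do this when the object is freed; here
 * the object stays allocated so the demo can inspect it safely.
 */
static void
poison_item(struct item *it)
{
	const uint32_t word = POISON;
	unsigned char *p = (unsigned char *)it;
	size_t off;

	for (off = 0; off + sizeof(word) <= sizeof(*it); off += sizeof(word))
		memcpy(p + off, &word, sizeof(word));
}

/* Return 1 if the low 32 bits of a pointer value equal the pattern. */
static int
looks_poisoned(const void *p)
{
	return (((uintptr_t)p & 0xffffffffU) == POISON);
}
```

A pointer read out of a poisoned object, such as the next field here, then shows up as deadc0de in a debugger, which is a strong hint of a use-after-free rather than of random corruption.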
Re: filesystem errors
To: Michael Harnois <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: filesystem errors
In-Reply-To: Your message of "Wed, 25 Jul 2001 23:14:16 CDT." <[EMAIL PROTECTED]>
Date: Thu, 26 Jul 2001 15:14:09 +0100
From: Ian Dowse <[EMAIL PROTECTED]>

In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>I'm tearing my hair out trying to find a filesystem error that's
>causing me a panic: ufsdirhash_checkblock: bad dir inode.
>
>When I run fsck from a single user boot, it finds no errors.
>
>When I run it on the same filesystem mounted, it finds errors: but, of
>course, it then can't correct them

[Kirk, I'm cc'ing you because here the dirhash code sanity checks found
a directory entry with d_ino == 0 that was not at the start of a
DIRBLKSIZ block. This doesn't happen normally, but it seems from this
report that fsck does not correct this. Is it a basic filesystem
assumption that d_ino == 0 can only happen at the start of a directory
block, or is it something the code should tolerate?]

	FFS will never set a directory ino == 0 at a location other
	than the first entry in a directory, but fsck will do so to
	get rid of an unwanted entry. The readdir routines know to
	skip over an ino == 0 entry no matter where in the directory
	it is found, so applications will never see such entries.
	It would be a fair amount of work to change fsck to `do the
	right thing', as the checking code is given only the current
	entry with which to work. I am of the opinion that you
	should simply accept that mid-directory block ino == 0 is
	acceptable rather than trying to `fix' the problem.

Interesting - this is an error reported by the UFS_DIRHASH code that
you enabled in your kernel config. A sanity check that the dirhash code
is performing is failing. These checks are designed to catch bugs in
the dirhash code, but in this case I think it may be a bug that fsck is
not finding this problem, or else my sanity tests are too strict.

A workaround is to turn off the sanity checks with:

	sysctl vfs.ufs.dirhash_docheck=0

or to remove UFS_DIRHASH from your kernel config. You could also try
to find the directory that is causing the problems. Copy the following
script to a file called dircheck.pl, and try running:

	chmod 755 dircheck.pl
	find / -fstype ufs -type d -print0 | xargs -0 ./dircheck.pl

That should show up any directories that would fail that dirhash sanity
check - there will probably just be one or two that resulted from some
old filesystem corruption.

Ian

#!/usr/local/bin/perl

while (defined($dir = shift)) {
	unless (open(DIR, "$dir")) {
		print STDERR "$dir: $!\n";
		next;
	}

	$b = 0;
	my(%dir) = ();
	while (sysread(DIR, $dat, 512) == 512) {
		$off = 0;
		while (length($dat) > 0) {
			($dir{'d_ino'}, $dir{'d_reclen'}, $dir{'d_type'},
			    $dir{'d_namlen'}) = unpack("LSCC", $dat);
			$dir{'d_name'} = substr($dat, 8, $dir{'d_namlen'});
			$minreclen = (8 + $dir{'d_namlen'} + 1 + 3) & (~3);
			$gapinfo = ($dir{'d_reclen'} == $minreclen) ? "" :
			    sprintf("[%d]", $dir{'d_reclen'} - $minreclen);

			if ($dir{'d_ino'} == 0 && $off != 0) {
				printf("%s off %d ino %d reclen 0x%x type 0%o"
				    . " namelen %d name '%s' %s\n",
				    $dir, $off, $dir{'d_ino'}, $dir{'d_reclen'},
				    $dir{'d_type'}, $dir{'d_namlen'},
				    $dir{'d_name'}, $gapinfo);
			}
			if ($dir{'d_reclen'} > length($dat)) {
				die "reclen too long!\n";
			}

			$dat = substr($dat, $dir{'d_reclen'});
			$off += $dir{'d_reclen'};
		}
		$b++;
	}
}
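For readers who prefer C, the check the Perl script performs on each 512-byte directory block can be sketched roughly as follows. This is an illustration only, not the kernel's ufsdirhash_checkblock(): struct dirhdr, put_entry() and count_midblock_zero_ino() are hypothetical names, the header layout simply mirrors the script's unpack("LSCC") in host byte order, and the extra bounds checks are an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRBLKSIZ 512

/* Fixed 8-byte header at the start of each directory entry. */
struct dirhdr {
	uint32_t d_ino;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
};

/* Hypothetical helper: write one entry header plus name into a block. */
static void
put_entry(unsigned char *blk, size_t off, uint32_t ino, uint16_t reclen,
    const char *name)
{
	struct dirhdr h;

	h.d_ino = ino;
	h.d_reclen = reclen;
	h.d_type = 0;
	h.d_namlen = (uint8_t)strlen(name);
	memcpy(blk + off, &h, sizeof(h));
	memcpy(blk + off + sizeof(h), name, h.d_namlen);
}

/*
 * Walk one directory block and count entries that have d_ino == 0 at
 * an offset other than 0.  Returns -1 if a record length is too small
 * or runs past the end of the block.
 */
static int
count_midblock_zero_ino(const unsigned char *blk)
{
	size_t off = 0;
	int hits = 0;

	while (off < DIRBLKSIZ) {
		struct dirhdr h;

		if (DIRBLKSIZ - off < sizeof(h))
			return (-1);
		memcpy(&h, blk + off, sizeof(h));
		if (h.d_reclen < sizeof(h) || h.d_reclen > DIRBLKSIZ - off)
			return (-1);
		if (h.d_ino == 0 && off != 0)
			hits++;
		off += h.d_reclen;
	}
	return (hits);
}
```

Rejecting a record length smaller than the header also avoids looping forever on a zero d_reclen, which the script as posted does not guard against.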
Re: fsck setting d_ino == 0 (was Re: filesystem errors)
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED], [EMAIL PROTECTED], Ollivier Robert <[EMAIL PROTECTED]>, Mikhail Teterin <[EMAIL PROTECTED]>
Subject: fsck setting d_ino == 0 (was Re: filesystem errors)
In-Reply-To: Your message of "Sat, 28 Jul 2001 12:48:54 PDT." <[EMAIL PROTECTED]>
Date: Wed, 22 Aug 2001 01:21:03 +0100
From: Ian Dowse <[EMAIL PROTECTED]>

In message <[EMAIL PROTECTED]>, Kirk McKusick writes:
>FFS will never set a directory ino == 0 at a location other
>than the first entry in a directory, but fsck will do so to
>get rid of an unwanted entry. The readdir routines know to
>skip over an ino == 0 entry no matter where in the directory
>it is found, so applications will never see such entries.
>It would be a fair amount of work to change fsck to `do the
>right thing', as the checking code is given only the current
>entry with which to work. I am of the opinion that you
>should simply accept that mid-directory block ino == 0 is
>acceptable rather than trying to `fix' the problem.

Bleh, well I guess not too surprisingly, there is a case in
ufs_direnter() (ufs_lookup.c) where the kernel does the wrong thing
when a mid-block entry has d_ino == 0. The result can be serious
directory corruption, and the bug has been there since the Lite/2
merge:

# fetch http://www.maths.tcd.ie/~iedowse/FreeBSD/dirbug_img.gz
Receiving dirbug_img.gz (6745 bytes): 100%
6745 bytes transferred in 0.0 seconds (4.69 MBps)
# gunzip dirbug_img.gz
# mdconfig -a -t vnode -f dirbug_img
md0
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
20 files, 1 used, 2638 free (14 frags, 328 blocks, 0.5% fragmentation)
# mount /dev/md0 /mnt
# touch /mnt/ff12
# umount /mnt
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED  I=2  OWNER=root MODE=40755
SIZE=512 MTIME=Aug 21 22:28 2001
DIR=/

SALVAGE? [yn]

The bug is that when compressing directory blocks, the code trusts the
DIRSIZ() macro to calculate the amount of data to be bcopy'd when
moving a directory entry. If d_ino is zero, DIRSIZ() cannot be trusted,
so random bytes in unused portions of the directory determine how much
gets copied. I think it is very unlikely in practice for the value
returned by DIRSIZ() to be harmful, but fsck certainly doesn't check it
so this bug can be triggered after other types of corruption have been
repaired by fsck.

I just found this while looking for a dirhash bug - the dirhash code
didn't check for d_ino == 0 when compressing directories, so it would
freak when it couldn't find the entry to move. The patch below should
fix both these issues, and it makes it clearer that DIRSIZ() is not
used when d_ino == 0.

Any comments welcome. The patch is a bit larger than it needs to be,
but that directory compression code is so hard to understand that I
think it is worth clarifying it slightly :-)

Ian

	The compaction code started out deeply nested and highly
	conditional. I was very happy to get it down to one for
	loop with single nested conditionals. That being said, it
	is still pretty hard to follow. Anyway, I agree with your
	change. It is amazing to me that that bug has been present
	since the day the code was written (1983) and has not been
	noticed until now.

	Kirk

Index: ufs_lookup.c
===
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c	2001/08/18 03:08:48	1.52
+++ ufs_lookup.c	2001/08/21 23:59:09
@@ -869,26 +869,38 @@
 	 * dp->i_offset + dp->i_count would yield the space.
 	 */
 	ep = (struct direct *)dirbuf;
-	dsize = DIRSIZ(OFSFMT(dvp), ep);
+	dsize = ep->d_ino ? DIRSIZ(OFSFMT(dvp), ep) : 0;
 	spacefree = ep->d_reclen - dsize;
 	for (loc = ep-
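The visible part of the patch above changes one thing: an entry with d_ino == 0 is treated as occupying no space, instead of asking DIRSIZ() to interpret whatever happens to be in its d_namlen field. A minimal stand-alone sketch of that rule is given here. It assumes the new-format entry layout (8-byte header, name padded to a 4-byte boundary); struct dirhdr, dirsiz(), entry_dsize() and entry_spacefree() are hypothetical names for this illustration and are not the kernel's macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed 8-byte header at the start of each directory entry. */
struct dirhdr {
	uint32_t d_ino;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
};

/* Space a live entry needs: header plus NUL-terminated name, rounded up to 4. */
static size_t
dirsiz(uint8_t namlen)
{
	return (8 + (((size_t)namlen + 1 + 3) & ~(size_t)3));
}

/* An unused entry (d_ino == 0) holds no data, whatever d_namlen says. */
static size_t
entry_dsize(const struct dirhdr *ep)
{
	return (ep->d_ino != 0 ? dirsiz(ep->d_namlen) : 0);
}

/* Bytes in this record that compaction may reclaim. */
static size_t
entry_spacefree(const struct dirhdr *ep)
{
	return (ep->d_reclen - entry_dsize(ep));
}
```

With this rule a dead entry whose d_namlen contains garbage still reports its whole record as free, so stale bytes never decide how much data gets moved.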
Re: Bad commit?
Date: Wed, 15 Nov 2000 11:47:07 -0700
From: Warner Losh <[EMAIL PROTECTED]>
Subject: Bad commit?
Sender: [EMAIL PROTECTED]

As near as I can tell on my laptop, the following change causes panics
with kernel page faults. With it, my laptop panics every time on boot
(although in slightly different places for my two different kernels)
and without it I'm rock solid. Has anybody else seen this?

Warner

mckusick    2000/11/14 12:46:02 PST

  Modified files:
    sys/sys              rman.h
    sys/kern             subr_bus.c subr_rman.c
  Log:
  In preparation for deprecating CIRCLEQ macros in favor of TAILQ
  macros which provide the same functionality and are a bit more
  efficient, convert use of CIRCLEQ's in resource manager to TAILQ's.

  Approved by: Garrett Wollman <[EMAIL PROTECTED]>

  Revision  Changes    Path
  1.13      +3 -3      src/sys/sys/rman.h
  1.83      +3 -5      src/sys/kern/subr_bus.c
  1.14      +30 -35    src/sys/kern/subr_rman.c

I have checked in revision 1.15 for subr_rman.c which should fix the
problems being experienced with version 1.14. If you continue to
experience problems with version 1.15, please let me know.

	Kirk McKusick
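The thread does not say what was wrong in revision 1.14, so the following is only a general illustration of what a CIRCLEQ to TAILQ conversion has to get right: a CIRCLEQ traversal ends when the iterator points back at the list head, whereas a TAILQ traversal ends at NULL, so every end-of-list test has to change along with the macros. The sketch uses the standard <sys/queue.h> TAILQ macros; struct res and sum_starts() are hypothetical names and not code from subr_rman.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a resource record kept on a list. */
struct res {
	int start;
	TAILQ_ENTRY(res) link;
};
TAILQ_HEAD(res_head, res);

/*
 * Walk the list and add up the start fields.  The loop stops when the
 * next pointer is NULL, which is the TAILQ end-of-list convention.
 */
static int
sum_starts(struct res_head *head)
{
	struct res *r;
	int sum = 0;

	for (r = TAILQ_FIRST(head); r != NULL; r = TAILQ_NEXT(r, link))
		sum += r->start;
	return (sum);
}
```

A leftover comparison against the head pointer would never be true for a TAILQ, and the loop would run on to dereference NULL.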
Re: softupdate panic, anyone seen this? (fwd)
Jeffrey Hsu and I just came to the same conclusion about the splbio
additions earlier this week. I had assumed that Jeffrey had put in
these changes already. Anyway, the two of you need to coordinate
getting the changes put in so that you do not collide.

	Kirk McKusick

=-=-=-=-=-=-=-=

Date: Fri, 19 Feb 1999 02:43:20 -0800 (PST)
From: Matthew Dillon
To: Julian Elischer
Cc: Kirk McKusick, Jake, Don Lewis, curr...@freebsd.org
Subject: Re: softupdate panic, anyone seen this? (fwd)
References:

    This may or may not be related. In tracking down the sched_sync()
    panic I found two bugs.

    First, a couple of places where the worklist was not being
    protected at splbio(). I'm not 100% sure that this is a problem
    but the code is complex enough that it's just too dangerous not
    to do it.

    Second, a double LIST_REMOVE() was being performed in the case
    where VOP_FSYNC() would fail to sync all the dirty pages. This
    can occur legally for both NFS and filesystems with SOFTUPDATES
    set.

    I'd appreciate it if someone could verify the double LIST_REMOVE()
    bug. vn_syncer_add_to_worklist() already removes the vn from the
    list ( assuming the VONWORKLIST v_flag is set, which it should be
    in this case ).

						-Matt
						Matthew Dillon

Index: kern/vfs_subr.c
===
RCS file: /home/ncvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.186
diff -u -r1.186 vfs_subr.c
--- vfs_subr.c	1999/02/04 18:25:39	1.186
+++ vfs_subr.c	1999/02/19 10:40:17
@@ -881,10 +881,8 @@
 /*
  * Add an item to the syncer work queue.
  */
-void
-vn_syncer_add_to_worklist(vp, delay)
-	struct vnode *vp;
-	int delay;
+static void
+vn_syncer_add_to_worklist(struct vnode *vp, int delay)
 {
 	int s, slot;
 
@@ -928,7 +926,8 @@
 		starttime = time_second;
 
 		/*
-		 * Push files whose dirty time has expired.
+		 * Push files whose dirty time has expired. Be careful
+		 * of interrupt race on slp queue.
 		 */
 		s = splbio();
 		slp = &syncer_workitem_pending[syncer_delayno];
@@ -941,16 +940,20 @@
 			vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, p);
 			(void) VOP_FSYNC(vp, p->p_ucred, MNT_LAZY, p);
 			VOP_UNLOCK(vp, 0, p);
+			s = splbio();
 			if (LIST_FIRST(slp) == vp) {
 				if (TAILQ_EMPTY(&vp->v_dirtyblkhd) &&
 				    vp->v_type != VBLK)
-					panic("sched_sync: fsync failed");
+					panic("sched_sync: fsync failed vp %p type %d tag %d", vp, vp->v_type, vp->v_tag);
 				/*
 				 * Move ourselves to the back of the sync list.
+				 * vn_syncer_*worklist() will remove and re-add
+				 * the node.
 				 */
-				LIST_REMOVE(vp, v_synclist);
+				/*LIST_REMOVE(vp, v_synclist);*/
 				vn_syncer_add_to_worklist(vp, syncdelay);
 			}
+			splx(s);
 		}
 
 		/*
@@ -2841,6 +2844,8 @@
 
 /*
  * The syncer vnode is no longer needed and is being decommissioned.
+ *
+ * Modifications to the worklist must be protected at splbio().
  */
 static int
 sync_reclaim(ap)
@@ -2849,12 +2854,15 @@
 	} */ *ap;
 {
 	struct vnode *vp = ap->a_vp;
+	int s;
 
+	s = splbio();
 	vp->v_mount->mnt_syncer = NULL;
 	if (vp->v_flag & VONWORKLST) {
 		LIST_REMOVE(vp, v_synclist);
 		vp->v_flag &= ~VONWORKLST;
 	}
+	splx(s);
 
 	return (0);
 }
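The double LIST_REMOVE() described above can be avoided by letting a single flag decide whether a node is currently queued, which is the role VONWORKLST plays in the patch. The following user-space sketch shows that pattern with the standard <sys/queue.h> LIST macros. It is an illustration under assumptions, not the kernel code: struct node, ONWORKLST and the three functions are hypothetical names, and the interrupt protection (splbio) that the patch adds is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define ONWORKLST 0x1		/* hypothetical flag, modelled on VONWORKLST */

struct node {
	int flag;
	LIST_ENTRY(node) link;
};
LIST_HEAD(node_head, node);

/*
 * Queue a node, or re-queue it if it is already on a list.  The
 * function removes the node itself when the flag is set, so callers
 * must not call LIST_REMOVE() first.
 */
static void
add_to_worklist(struct node_head *head, struct node *n)
{
	if (n->flag & ONWORKLST)
		LIST_REMOVE(n, link);
	LIST_INSERT_HEAD(head, n, link);
	n->flag |= ONWORKLST;
}

/* Remove a node at most once; a second call is a harmless no-op. */
static void
remove_from_worklist(struct node *n)
{
	if (n->flag & ONWORKLST) {
		LIST_REMOVE(n, link);
		n->flag &= ~ONWORKLST;
	}
}

/* Count the nodes currently on the list. */
static int
worklist_length(struct node_head *head)
{
	struct node *n;
	int len = 0;

	for (n = LIST_FIRST(head); n != NULL; n = LIST_NEXT(n, link))
		len++;
	return (len);
}
```

Keeping the unlink and the flag update together in one place is what makes the caller-side LIST_REMOVE() in sched_sync() redundant, and dangerous.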
Re: dump -L and privilege
Date: Fri, 17 Jan 2003 09:08:09 +0900
From: Jun Kuriyama <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: Current <[EMAIL PROTECTED]>
Subject: dump -L and privilege
X-ASK-Info: Confirmed by User

I'm trying to use dump -L option to dump with snapshot on
-current/RELENG_5_0 family.

I found dump -L needs writable permission to the device (that's
reasonable because it *writes* snapshot file). But when I try to dump
by operator group, it's impossible to dump with -L option (target
device has root:operator and crw-r-).

This behavior is understandable. But in actual backup operations, what
should we do? I'd like to hear what you thought in design.

(1) Do dump as root with -L option.
(2) Do chmod g+w for device.
(3) Other ideas?

--
Jun Kuriyama <[EMAIL PROTECTED]> // IMG SRC, Inc.
             <[EMAIL PROTECTED]> // FreeBSD Project

Sorry for the slow reply. I am just back from several weeks of travel
and am trying to get caught up on my email.

You have raised an important point here. By default (that is when
vfs.usermount == 0) only root is allowed to do mounts. Since dump -L
needs to do a snapshot, that can only be done by a root process. I see
two possible solutions to the problem. The first would be to change the
default for vfs.usermount == 1 and then have dump -L create the
snapshot in a directory owned by "operator" (or by whatever user runs
the dumps). Then the snapshot could be created, used, and deleted by
that user. The other alternative would be to create a setuid-to-root
program that would take a snapshot and chown it to the user that does
dumps. This setuid program could then be invoked by dump -L to create a
snapshot for it. I favor the first approach, but there may be good
security issues of which I am unaware that make that a bad choice.
Perhaps we could get someone like Robert Watson to comment on these
choices.

	Kirk McKusick
Re: dump -L and privilege
Date: Fri, 31 Jan 2003 02:24:00 +0200
From: Giorgos Keramidas <[EMAIL PROTECTED]>
To: Garrett Wollman <[EMAIL PROTECTED]>
Cc: Kirk McKusick <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: dump -L and privilege
X-ASK-Info: Confirmed by User

On 2003-01-30 15:52, Garrett Wollman <[EMAIL PROTECTED]> wrote:
> < Kirk McKusick <[EMAIL PROTECTED]> said:
> > The other alternative would be to
> > create a setuid-to-root program that would take a snapshot and
> > chown it to the user that does dumps.
>
> I think this would actually be a useful feature for more than just
> dumps. I might want to allow some users (say, those in group
> `operator') to be able to create snapshots on their own, without
> allowing arbitrary mounting privileges.

Do normal permissions apply for the files included in a snapshot?

It would be horrible from a security standpoint if any user could use
a setuid program to snapshot filesystems, mount the snapshot to places
of their own, and read random files from the mounted snapshot.

- Giorgos

By default snapshots are mode 400 owned by root, so normal users
cannot access them. The setuid program is proposing to make them mode
440 group operator which would let anyone in the operator group read
them. This is the same level of permission given to disks, so is
neither more nor less secure than regular disks. If the snapshot is
mounted, then the same filesystem permissions are enforced as would be
enforced for the mounted disk except that the mount must be done
read-only, so nothing in the snapshot can be moved, deleted, or
changed.

	Kirk McKusick
Re: dump -L and privilege
From: Jun Kuriyama <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: dump -L and privilege
In-Reply-To: <[EMAIL PROTECTED]>
X-ASK-Info: Whitelist match

Is this enough?

-r-sr-x---  1 root  operator  5750 Jan 31 22:13 mksnap_ffs

o Should use filesystem device name rather than mountpoint?
o Should use the group of device rather than "operator"?

--
Jun Kuriyama <[EMAIL PROTECTED]> // IMG SRC, Inc.
             <[EMAIL PROTECTED]> // FreeBSD Project

The mount command needs the mountpoint, not the device name, so the
device name would only be needed if we want to use the group of the
device rather than operator. I argue that we should use operator
rather than the group of the device because the purpose of this
command is to allow the dump program, run by people in the operator
group, to take snapshots.

At any rate, I have cleaned up the program and provided a Makefile and
manual page (see below). The only semantic change that I made to your
program was to do the `chown' before doing the `chmod' so as not to
open a brief hole that would allow members of the default (wheel)
group to get read access to the snapshot.

	Kirk McKusick

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering "sh file".  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#	mksnap_ffs/Makefile
#	mksnap_ffs/mksnap_ffs.8
#	mksnap_ffs/mksnap_ffs.c
#
mkdir mksnap_ffs
echo x - mksnap_ffs/Makefile
sed 's/^X//' >mksnap_ffs/Makefile << 'END-of-mksnap_ffs/Makefile'
X#	$FreeBSD$
X
XPROG=	mksnap_ffs
XMAN=	mksnap_ffs.8
X
X.if defined(NOSUID)
XBINMODE=550
X.else
XBINMODE=4550
XBINOWN=root
X.endif
XBINGRP=operator
X
X.include
END-of-mksnap_ffs/Makefile
echo x - mksnap_ffs/mksnap_ffs.8
sed 's/^X//' >mksnap_ffs/mksnap_ffs.8 << 'END-of-mksnap_ffs/mksnap_ffs.8'
X.\"
X.\" Copyright (c) 2003 Networks Associates Technology, Inc.
X.\" All rights reserved.
X.\"
X.\" This software was developed for the FreeBSD Project by Marshall
X.\" Kirk McKusick and Network Associates Laboratories, the Security
X.\" Research Division of Network Associates, Inc. under DARPA/SPAWAR
X.\" contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA CHATS
X.\" research program.
X.\"
X.\" Redistribution and use in source and binary forms, with or without
X.\" modification, are permitted provided that the following conditions
X.\" are met:
X.\" 1. Redistributions of source code must retain the above copyright
X.\"    notice, this list of conditions and the following disclaimer.
X.\" 2. Redistributions in binary form must reproduce the above copyright
X.\"    notice, this list of conditions and the following disclaimer in the
X.\"    documentation and/or other materials provided with the distribution.
X.\" 3. The names of the authors may not be used to endorse or promote
X.\"    products derived from this software without specific prior written
X.\"    permission.
X.\"
X.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
X.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
X.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
X.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
X.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
X.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
X.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
X.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
X.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
X.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
X.\" SUCH DAMAGE.
X.\"
X.\" $FreeBSD$
X.\"
X.Dd January 19, 2003
X.Dt MKSNAP_FFS 8
X.Os
X.Sh NAME
X.Nm mksnap_ffs
X.Nd take a filesystem snapshot
X.Sh SYNOPSIS
X.Nm
X.Ar mountpoint
X.Ar snapshot_name
X.Sh DESCRIPTION
XThe
X.Nm
Xcommand creates a snapshot named
X.Ar snapshot_name
Xon the filesystem mounted at
X.Ar mountpoint .
XThe
X.Ar snapshot_name
Xargument must be contained within the filesystem mounted at
X.Ar mountpoint .
X.Pp
XThe group ownership of the file is set to
X.Dq operator ;
Xthe owner of the file remains
X.Dq root .
XThe mode of the snapshot is set to be readable by the owner
Xor members of the
X.Dq operator
Xgroup.
X.Sh SEE ALSO
X.Xr chmod 2 ,
X.Xr chown 8 ,
X.Xr mount_ffs 8
X.Sh HISTOR
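The ordering point made in the message above (chown before chmod) can be shown in a few lines. If the mode were widened first, the file would briefly be group-readable while it still carried its original group; changing the group first closes that window. The following is only a sketch of the ordering using the standard chown(2) and chmod(2) calls. It is not the mksnap_ffs source, and grant_group_read() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Make an owner-only file readable by one specific group.
 * Step 1 changes the group while the file is still mode 0400, so no
 * group can read it yet.  Step 2 then adds group read permission,
 * which at that point can only benefit the intended group.
 * Returns 0 on success and -1 on failure (errno is set by the call).
 */
static int
grant_group_read(const char *path, gid_t gid)
{
	if (chown(path, (uid_t)-1, gid) != 0)	/* keep owner, set group */
		return (-1);
	if (chmod(path, S_IRUSR | S_IRGRP) != 0)	/* mode 0440 */
		return (-1);
	return (0);
}
```

A caller would pass the snapshot path and the gid of the group that runs the dumps; in the thread that group is operator.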
Re: INVARIANTS-related fs panic on alpha
I have tried running my test machine out of filesystem space
(repeatedly) and have not been able to get this panic. I will keep
running that test in the hopes that it will show up. In the meantime,
if you can come up with an example that reliably triggers it, that
would be most helpful.

	Kirk McKusick

=-=-=-=-=-=

Date: Fri, 14 Feb 2003 15:54:13 -0800
From: Kris Kennaway <[EMAIL PROTECTED]>
To: Kris Kennaway <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: INVARIANTS-related fs panic on alpha

On Sat, Jan 25, 2003 at 12:12:34AM -0800, Kris Kennaway wrote:
> One of the alpha package clients panicked with this. It was under
> very high load at the time (25 simultaneous package builds):
>
> fatal kernel trap:
>
>     trap entry     = 0x2 (memory management fault)
>     faulting va    = 0xdeadc0dedeadc0e6
>     type           = access violation
>     cause          = store instruction
>     pc             = 0xfc53453c
>     ra             = 0xfc53b2a8
>     sp             = 0xfe001da15b30
>     curthread      = 0xfc003e33b930
>         pid = 3, comm = g_up
>
> Stopped at      add_to_worklist+0xac:   stq     a0,0x8(t0)      <0xdeadc0dedeadc0e6>
> db> trace
> add_to_worklist() at add_to_worklist+0xac
> handle_written_inodeblock() at handle_written_inodeblock+0x5e8
> softdep_disk_write_complete() at softdep_disk_write_complete+0xac
> bufdone() at bufdone+0x19c
> bufdonebio() at bufdonebio+0x1c
> biodone() at biodone+0x28
> g_dev_done() at g_dev_done+0xd8
> biodone() at biodone+0x28
> g_io_schedule_up() at g_io_schedule_up+0x4c
> g_up_procbody() at g_up_procbody+0x9c
> fork_exit() at fork_exit+0x100
> exception_return() at exception_return
> --- root of call graph ---
> db>

I'm still getting this (on i386 and alpha). I believe it is related to
a filesystem becoming full. Can someone please investigate?

Kris
Re: strange dump/restore behaviour
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: strange dump/restore behaviour
From: Dag-Erling Smorgrav <[EMAIL PROTECTED]>
Date: Thu, 09 Jan 2003 16:41:10 +0100

This happened while copying data over to a new disk (mounted on /mnt
and /mnt/usr; the original disk has only one partition). The machine
was in single-user mode, but / was mounted read-write due to restore's
insistence on placing temporary files in /tmp (I found out later that
it respects TMPDIR, though the man page doesn't mention it).

root@dsa /mnt# dump -0Laf- / | restore -rf-
  DUMP: Date of this level 0 dump: Thu Jan  9 16:11:42 2003
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/da0a (/) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1838856 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
warning: ./usr: File exists
expected next file 4, got 3
[...]

I can imagine that the file that caused the warning message was one of
restore's temporary files, but a) I've never seen this before, and b)
isn't -L supposed to prevent just that?

DES
--
Dag-Erling Smorgrav - [EMAIL PROTECTED]

Sorry for the slow response. I tend to get behind on my freebsd.org
email.

The warning comes about because you had already created /mnt/usr.
Since you were doing a full restore, you are getting a warning that the
usr directory already exists when restore tries to create it. It
complains again about finding an already existing inode (3, which was
presumably the usr directory in the original dump). Neither of these
is a problem, and neither affected your restore.

	Kirk McKusick
Re: Reboot(8) when fsck_ufs is running ?
Date: Sat, 15 Feb 2003 00:50:01 +0100 (CET)
From: Martin Blapp <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: Kirk McKusick <[EMAIL PROTECTED]>
Subject: Reboot(8) when fsck_ufs is running ?

Hi all,

I don't know what the behaviour should be, but when I try to reboot a
box which has fsck_ufs running, it doesn't reboot and I have to
powercycle it. Looks also like it just hangs.

Do you experience the same at your side ? Shouldn't we abort the
fsck_ufs and reboot ?

Martin

Assuming that you are running fsck_ufs as part of a background fsck,
the problem is probably that the fsck_ufs is in the midst of creating
a snapshot. At the moment, snapshot creation is not interruptible, so
the reboot is waiting for it to finish. I am presently investigating a
bug which causes snapshots of filesystems bigger than about 250Gb to
hang the kernel due to buffer starvation.

	Kirk McKusick
Re: BOOT2_UFS=UFS1_ONLY works for today's current
From: David Syphers <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
Subject: Re: BOOT2_UFS=UFS1_ONLY works for today's current
Date: Sun, 23 Feb 2003 14:49:52 -0600
Cc: [EMAIL PROTECTED]

On Sunday 23 February 2003 11:10 am, Richard Arends wrote:
> On Sun, 23 Feb 2003, David Syphers wrote:
> > I added BOOT2_UFS=UFS2_ONLY to my make.conf, and my buildworld still
> > dies in boot2. I'm trying to upgrade from a Feb. 19 -current
> > (because it's crashing all the time, and I need to enable debugging
> > stuff). Is there a fix, or would other information be helpful?
>
> Same problem over here. I reverted back the last commit on
> /usr/src/sys/ufs/ffs/fs.h in my source tree and that "fixed" the
> build. Of course, this is a workaround !!

Okay, I've verified that the problem is due to rev. 1.39 of
/usr/src/sys/ufs/ffs/fs.h. Peter Wemm pointed out that the problem is
not the commit, but gcc's bad handling of 64-bit operations.
Nonetheless, this commit does break world for a lot of people... is
there some official solution? The make.conf line only works for UFS1 -
if it's set to UFS2, buildworld still fails. (Am I correct in assuming
a 5.0-R install defaults to UFS2?)

-David

--
http://www.seektruth.org
Astronomy and Astrophysics Center
The University of Chicago

I have committed the following "fix" which reverts to using the
previous broken version of cgbase in ufsread.c. It will work fine
provided that your filesystem is smaller than 1.5Tb.

	Kirk McKusick

Index: ufsread.c
===
RCS file: /usr/ncvs/src/sys/boot/common/ufsread.c,v
retrieving revision 1.9
diff -c -r1.9 ufsread.c
*** ufsread.c	2002/12/14 19:39:44	1.9
--- ufsread.c	2003/02/24 04:44:50
***************
*** 28,33 ****
--- 28,35 ----
  #include
  #include
  
+ #undef cgbase
+ #define cgbase(fs, c)	((ufs2_daddr_t)((fs)->fs_fpg * (c)))
  
  /*
   * We use 4k `virtual' blocks for filesystem data, whatever the actual
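Why the reverted cgbase only works below a size limit can be seen from where the cast sits in the macro above: the multiplication is carried out at the 32-bit width of its operands, and only the already-truncated result is widened. The sketch below contrasts that with widening one operand first. It deliberately uses unsigned 32-bit arithmetic so that the wraparound is well defined in C (the on-disk fs_fpg field is signed, where overflow is undefined), and the function names and the numbers in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cast applied after the multiply, as in the reverted macro: the
 * product is computed in 32 bits and wraps modulo 2^32 before it is
 * widened to 64 bits.
 */
static int64_t
cgbase_narrow(uint32_t frags_per_group, uint32_t cg)
{
	return ((int64_t)(frags_per_group * cg));
}

/*
 * Cast applied to an operand before the multiply: the whole product
 * is computed in 64 bits.
 */
static int64_t
cgbase_wide(uint32_t frags_per_group, uint32_t cg)
{
	return ((int64_t)frags_per_group * cg);
}
```

For small cylinder group numbers the two agree, which is why the workaround is safe on smaller filesystems. With 94088 fragments per group, for example, group 50000 starts at fragment 4704400000, but the narrow version returns 409432704.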
Re: How a full fsck screwed up my SU+J filesystem
> Date: Wed, 1 Dec 2010 16:27:48 +0200
> From: Kostik Belousov
> To: Peter Holm
> Cc: Garrett Cooper,
>     Marshall Kirk McKusick, curr...@freebsd.org
> Subject: Re: How a full fsck screwed up my SU+J filesystem
>
> On Wed, Dec 01, 2010 at 12:00:08PM +0100, Peter Holm wrote:
> > On Wed, Dec 01, 2010 at 01:28:06AM -0800, Garrett Cooper wrote:
> > >
> > > So... I was doing a portmaster -af today because vlc stopped playing
> > > audio (for some reason ... I kind of went on a pkg_cutleaves rampage
> > > and probably deinstalled too much stuff), and the machine hardlocked
> > > during an upgrade. I did a soft reboot and saw messages along the
> > > lines of "your journal and filesystem mount time mismatched; running
> > > a full fsck". I figured "ok, sure..." and let it do it's thing.
> > > Problem was that it pruned a lot of stuff from my /usr partition --
> > > including the .sujournal !!! So now it's stuck at Mounting local
> > > file systems: stating:
> > >
> > > Failed to find journal. Use tunefs to create one
> > > Failed to start journal: 2
> > >
> > > (I assume the 2 means ENOENT). All of the above were printf(9)'s
> > > from the kernel.
> > >
> > > Now the machine won't continue in multiuser mode (doesn't respond
> > > to interrupts, no panic, etc). Going into ddb, I don't see anything
> > > in info_threads (just a bunch of references to sched_switch, a few
> > > to fork_trampoline, cpustop_handler, and kdb_enter). I'm going to
> > > try and massage the machine back to life from single user mode, but
> > > the fact that this died in this way (i.e. .sujournal getting nuked
> > > by a full fsck) is a bit disheartening for SU+J :(... It would be
> > > nice if at least the fsck aborted before going and nuking the
> > > journal :/... (or at the very least if the file wasn't removable --
> > > i.e. SF_NOUNLINK).
> > >
> > > Here's to hoping I can resuscitate the filesystem...
> > >
> > > Thanks,
> > > -Garrett
> >
> > Thank you for reporting this.
> >
> > I was able to reproduce the problem by:
> >
> > tunefs -j enable /dev/md5a
> > mount /dev/md5a /mnt
> > chflags 0 /mnt/.sujournal
> > rm -f /mnt/.sujournal
> > umount /mnt
> > mount /dev/md5a /mnt
> >
> > The mount(1) is now stuck in mntref.
> >
> > http://people.freebsd.org/~pho/stress/log/kostik404.txt
> >
> > A sequence of "tunefs -j disable" + "tunefs -j enable" should get
> > you going.
>
> The action is of the category "do not do it then" for sure.
>
> The problem in kostik404 is due to ffs_mount() did not cleaned up
> the vnodes instantiated during the mount. Activating softdep journal
> instantiates at least root vnode, and a journal vnode, if found. The
> following patch fixed it for me.
>
> diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
> index 94951e4..72f40da 100644
> --- a/sys/ufs/ffs/ffs_vfsops.c
> +++ b/sys/ufs/ffs/ffs_vfsops.c
> @@ -928,6 +928,7 @@ ffs_mountfs(devvp, mp, td)
>  	if ((fs->fs_flags & FS_DOSOFTDEP) &&
>  	    (error = softdep_mount(devvp, mp, fs, cred)) != 0) {
>  		free(fs->fs_csp, M_UFSMNT);
> +		ffs_flushfiles(mp, FORCECLOSE, td);
>  		goto out;
>  	}
>  	if (fs->fs_snapinum[0] != 0)
>

Thanks all: Garrett for the report, Peter for the way to reproduce the
problem, and Kostik for a fix. I have copied Jeff so that he can
confirm that Kostik's fix is the appropriate thing to do. And I will
take a look at fsck to see if I can make it a bit more paranoid about
removing .sujournal.

	Kirk McKusick
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
mksnap_ffs, snapshot issues, again
Robert Watson forwarded your posting to me as I am not as current on current as I should be.

-- Forwarded message --
> Date: Mon, 18 Aug 2003 22:38:47 +0200
> From: "[iso-8859-2] Branko F. Gračnar" <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: mksnap_ffs, snapshot issues, again
>
> I have 900G array on a promise sx6000 controller
>
> This is freshly formatted filesystem (newfs -L export -O 2 -U -g 48000 -i 2048 -m 0
> -o space /dev/pst0s2d)
>
> # df -i /export
> /dev/pst0s2d 778742004 216194 778525810 0% 2 445159292 0% /export
>
> # mount | grep export
> /dev/pst0s2d on /export (ufs, local, soft-updates)
>
> let's try to create a snapshot of empty filesystem
>
> # cd /export
> # mksnap_ffs /export aaa.snap
>
> ... after 30 minutes ... snapshot was not created (!!! On a empty
> filesystem !!!)... Ok, long snapshot creation would be fine if it
> would not hang all processes, which would like to do something on
> /export (ls /export for example.). Filesystem cannot be unmounted.
> mksnap_ffs process cannot be killed. Reboot and foreground fsck
> helps.
>
> This is 5.1-RELEASE (without patches, with custom kernel -> just picked up generic
> kernel and removed uneeded stuff.)
>
> Any ideas, why is this happening? As i mentioned before, this prevents background
> fsck to make his job done (machine hangs.)
>
>
> I would really like to solve this issue
>
> Brane

Discussion -

Paul Saab kindly arranged a machine (tank.freebsd.org) with a 2Tb disk array on it for me to test. I enclose a copy of the `sysctl kern' output at the end of this message. I first ran my own test which involved creating a default configuration filesystem, taking a snapshot, and removing the snapshot. The scripted result is below. It shows that it takes 48 minutes to create the snapshot and 15 minutes to remove it. But importantly, it shows that the filesystem is only locked down and inaccessible for 0.042 seconds of that 48 minutes.
The problem is that the 77,000 indirect blocks needed by the snapshot do not fit in the 300 kernel buffers allotted to it. So, every indirect block needs to be read and written approximately three times. Just to be sure that there was not something weird about your configuration, I also ran the same set of tests using your newfs parameters. Other than creating more cylinder groups the result (e.g., running time) was about the same. But, to get to the problem that you are having with accessing your filesystem. The problem is that although the filesystem is only locked briefly, the snapshot file is locked for the entire 48 minutes. Thus, if you touch the snapshot file (by for example doing a "stat" on it), then the process doing the stat will hang for 48 minutes. The next process to try and touch the snapshot will lock /export while it waits for the lock on the snapshot to clear. And at that point you are hosed for 48 minutes on all access to /export :-( So, I think that the best solution for you would be to try creating a hidden directory for the snapshot file, e.g., create a /export/.snap directory mode 700 owned by root, then create the snapshot as say /export/.snap/snap1. This way, it will be out of the way of all snoopy programs except those walking the filetree as root. Kirk McKusick Results of my test - Script started on Fri Aug 22 17:18:34 2003 tank# newfs /dev/twed0 /dev/twed0: 2097152.0MB (4294967292 sectors) block size 16384, fragment size 2048 using 11413 cylinder groups of 183.77MB, 11761 blks, 23552 inodes. 
super-block backups (for fsck -b #) at: 160, 376512, 752864, 1129216, 1505568, 1881920, 2258272, 2634624, 3010976, 3387328, 3763680, 4140032, 4516384, 4892736, 5269088, 5645440, 6021792, 6398144, 6774496, 7150848, 7527200, 7903552, 8279904, 8656256, 9032608, 9408960, 9785312, 10161664, 10538016, 10914368, 11290720, 11667072, 12043424, 12419776, 12796128, 13172480, 13548832, 13925184, 14301536, 14677888, 15054240, 15430592, 15806944, 16183296, 16559648, 16936000, 17312352, 17688704, 18065056, 18441408, 18817760, 19194112, 19570464, 19946816, 20323168, 20699520, 21075872, 21452224, 21828576, 22204928, 22581280, < etc, etc, etc > 4283638624, 4284014976, 4284391328, 4284767680, 4285144032, 4285520384, 4285896736, 4286273088, 4286649440, 4287025792, 4287402144, 4287778496, 4288154848, 4288531200, 4288907552, 4289283904, 4289660256, 4290036608, 4290412960, 4290789312, 4291165664, 4291542016, 4291918368, 4292294720, 4292671072, 4293047424, 4293423776, 4293800128, 4294176480, 4294552832, 4294929184 tank# dumpfs /dev/twed0 | head -22 magic 19540119 (UFS2) timeSat Aug 23 01:18:55 2003 superblock location 65536 id [ 3f47236f d612c37d ] ncg 11413 size1073741823 blo
Re: mksnap_ffs, snapshot issues, again
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: "[iso-8859-2] Branko F. Gračnar" <[EMAIL PROTECTED]>, Paul Saab <[EMAIL PROTECTED]>, Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: mksnap_ffs, snapshot issues, again
From: "Poul-Henning Kamp" <[EMAIL PROTECTED]>
In-Reply-To: Your message of "Sat, 23 Aug 2003 01:32:38 PDT."
Date: Sat, 23 Aug 2003 11:01:28 +0200
X-ASK-Info: Whitelist match

In message <[EMAIL PROTECTED]>, Kirk McKusick writes:

>But, to get to the problem that you are having with accessing your
>filesystem. The problem is that although the filesystem is only
>locked briefly, the snapshot file is locked for the entire 48 minutes.
>Thus, if you touch the snapshot file (by for example doing a "stat"
>on it), then the process doing the stat will hang for 48 minutes.

Isn't there some way we can loosen this aspect up? Either by having stat know about it and return approximate info or simply by failing?

(I presume that making the sleep interruptible would break all sorts of standards)

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe

The race to the root problem in general could be largely solved by changing lookup (VOP_LOOKUP really) to release the lock that it holds on the directory before blocking on the next component in the case where it is doing a lookup without intent to create. If we did this, then a single locked node would have lookups pile up on itself, but could not cascade to the root. A related change would be to do an interruptible locking request on the node so that if one did an `ls -l foo' where foo was say a locked snapshot, it would be possible to interrupt it.

~Kirk
Re: file system (UFS2) consistancy after -current crash? (fwd)
Date: Fri, 03 Oct 2003 05:03:34 -0600
From: Aaron Wohl <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: file system (UFS2) consistancy after -current crash?

After crashes recently I've been getting soft updates inconsistencies. Directories in which a file has recently been renamed have neither the old file nor the new file. fsck -y recovers the inode and drops it in lost+found. I was under the impression that atomic rename() synced all the way to the disk before returning? Does having soft updates enabled or disabled have any bearing on this?

The disks themselves are a RAID5 on an Adaptec 5400S. We have had some problems recently with aac (the 5400S driver) related crashes that we have been working on with Scott Long. I was wondering if maybe rename is only syncing as far as the RAID controller memory?

The problem that we have been having with many of the RAID systems is that they give an I/O completion interrupt after they copy the change into their memory, but before the I/O is completed to the disk. Since the filesystem uses the I/O completion interrupt as an indication that the change is on disk, it proceeds to the next step. If the RAID ultimately fails to get the data to the disk, inconsistencies arise. This problem can arise whether or not soft updates are being used, but because soft updates makes individual changes over a longer time period (potentially up to a minute rather than the few milliseconds of 2-3 synchronous writes), it is more likely to be apparent after a crash. None of this is helped by a journalling filesystem, as the RAID lies about writing the log, so you may not have it available to do a rollback after a crash.

As we discovered with IDE disks, disabling the "write cache enable" feature causes a massive performance hit, so in practice that does not seem like a viable strategy. What does work is to use tag-queueing. Unfortunately tag-queueing is found primarily in SCSI systems, though it is starting to show up in the high-end IDE disks.
Kirk McKusick
Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)
I have been able to reproduce your hang on my system and your suggested fix does prevent it. I am going to run some more buffer starvation-type tests on it this week and if they do not cause other problems, I will put in your suggested fix.

Kirk McKusick

=-=-=-=-=-=

To: [EMAIL PROTECTED]
From: Brian Fundakowski Feldman <[EMAIL PROTECTED]>
Mime-Version: 1.0
Date: Thu, 16 Oct 2003 15:32:58 -0400
Cc: [EMAIL PROTECTED]
Subject: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)

I'm having problems where the entire system is locking up when using a MD UFS+SoftUpdates partition. I can simply dd if=/dev/zero of=/mnt/foo and in a couple tries it will lock up.

When it locks up, buf_daemon (or if that is patched against, syncer) is calling waitrunningbufspace() from a non-B_ASYNC buf call. Because of this, the md(4) ("md0") thread is stuck in "ufs" waiting to receive a lock on the vnode that one of the syncer/flusher daemons has locked, waiting for bufspace to run down. The user program causing the problem is still stuck in "wdrain" because it's also waiting for waitrunningbufspace() to return. In short, everything wants to try to reduce the amount of outstanding buffer space, but nothing moves forward while GEOM/md(4)/what have you are waiting for the daemons to let go of the vnode so they can write out data.

Does this scenario make sense? I have fixed it here using the following very simple patch, which disables the implicit waitrunningbufspace() calls so the daemons can't get stuck there.

diff -r1.412 vfs_bio.c
73a74,75
> static struct proc *bufdaemonproc;
>
889c891,893
< 	waitrunningbufspace();
---
> 	if (curthread->td_proc != bufdaemonproc &&
> 	    curthread->td_proc != updateproc)
> 		waitrunningbufspace();
2038,2039d2041
<
< static struct proc *bufdaemonproc;

--
Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
<> [EMAIL PROTECTED] \ The Power to Serve! \
Opinions expressed are my own.
\,,\
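The shape of Brian's fix can be shown outside the kernel with a short sketch. Everything named here (`struct proc_stub`, `must_throttle()`, the three example processes and the numeric limits) is hypothetical and chosen only for illustration; the one idea taken from the patch is that the flusher daemons are exempted from waiting, because they are the processes that make the outstanding write space drain.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct proc; only its identity matters here. */
struct proc_stub { const char *name; };

static struct proc_stub bufdaemon = { "bufdaemon" };
static struct proc_stub syncer    = { "syncer" };
static struct proc_stub user_dd   = { "dd" };

/*
 * Decide whether the current writer should block until the amount of
 * in-flight write data ("running") drops below the high-water mark.
 * The two daemons never block here, so they can always make progress
 * and free buffer space for everyone else.
 */
bool must_throttle(const struct proc_stub *cur, long running, long hiwat)
{
    if (cur == &bufdaemon || cur == &syncer)
        return false;
    return running > hiwat;
}
```

An ordinary writer such as `user_dd` is still throttled when `running` exceeds `hiwat`, so the limit keeps its purpose for everything except the daemons.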
Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)
> To: Kirk McKusick <[EMAIL PROTECTED]>
> From: "Brian F. Feldman" <[EMAIL PROTECTED]>
> Date: Thu, 23 Oct 2003 15:46:53 -0400
> Cc: [EMAIL PROTECTED]
> Subject: Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)
>
> Kirk McKusick <[EMAIL PROTECTED]> wrote:
> > I have been able to reproduce your hang on my system and your suggested
> > fix does prevent it. I am going to run some more buffer starvation-type
> > tests on it this week and if they do not cause other problems, I will
> > put in your suggested fix.
>
> Thanks, Kirk; seems everyone who's been able to reproduce it can't do so
> anymore when the syncers are disallowed from waiting on runningbufspace
> (a couple extra people testing it that haven't spoken up on the list).
>
> --
> Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
> <> [EMAIL PROTECTED] \ The Power to Serve! \
> Opinions expressed are my own. \,,\

I have put in your suggested patch to avoid the runningbufspace related lock-ups with md(4)/UFS/SU.

Kirk McKusick

=-=-=-=-=-=

From: Kirk McKusick <[EMAIL PROTECTED]>
Date: Mon, 3 Nov 2003 22:30:01 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sys/kern vfs_bio.c
X-FreeBSD-CVS-Branch: HEAD
Sender: [EMAIL PROTECTED]

mckusick    2003/11/03 22:30:01 PST

  FreeBSD src repository

  Modified files:
    sys/kern             vfs_bio.c
  Log:
  Allow the bufdaemon and update daemon processes to skip the
  waitrunningbufspace() calls so that they are always able to proceed
  and clean up buffer space.

  Submitted by: Brian Fundakowski Feldman <[EMAIL PROTECTED]>

  Revision  Changes    Path
  1.420     +9 -5      src/sys/kern/vfs_bio.c
HEADS-UP new statfs structure
The statfs structure was updated on Nov 11th with 64-bit fields to allow accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call.

Kirk McKusick
Re: HEADS-UP new statfs structure
> Date: Fri, 14 Nov 2003 08:33:06 +
> From: Matt Smith <[EMAIL PROTECTED]>
> To: Marco Wertejuk <[EMAIL PROTECTED]>
> Cc: Kirk McKusick <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> Subject: Re: HEADS-UP new statfs structure
> X-ASK-Info: Whitelist match
>
> Marco Wertejuk wrote:
> > Just for a short note: cfsd (ports/security/cfs) should be
> > recompiled as well after those statfs changes.
>
> And mail/postfix and devel/gnomevfs2 (the ones I've found so far)
>
> postfix did this every time it received a mail until I recompiled it:
>
> pid 4049 (smtpd), uid 1003: exited on signal 11
>
> And gnomevfs was something I saw in another headsup. There are bound to
> be others, I'm just keeping an eye on my /var/log/messages to see if
> anything else sig 11 or 12's! So far so good though.
>
> Matt.

This is why we make this change now so that it will be in place for the masses when 5.2 is released :-)

Kirk McKusick
Re: -current lockup (how to diagnose?)
I am guessing that some of the recent locking changes are causing the problem. Unfortunately I am on the road now through Jan 4th, so will not be in a position to look at it. Hopefully one of the folks working on getting the SMP pushed down through the filesystem (Jeff Roberson, John Baldwin, or Alan Cox) will have some idea what broke recently. I would try looking at which process holds the buffer lock that the find is trying to get. You can usually unravel the chain of locks to eventually find what pair of events led to the deadlock. It definitely helps to have DEBUG_LOCKS compiled into your kernel.

Kirk McKusick
Re: panic: initiate_write_inodeblock_ufs2: already started
This error happens if things are not properly locked. As per my previous message, I am not able to look at it now, but am hoping that resolving some of the other races will solve this as well.

Kirk McKusick
Re: Disappearing/Reappearing Files... (fwd)
The changes that I added to soft updates two days ago only kick in when the soft dependency memory limit is hit. This certainly should not be happening at system startup, and on any machine with more than 64Mb of memory, almost never. I did make a couple of minor textual changes to other parts of the code which should not have had any effect, but just in case they did, I have put them back to their previous state in today's delta. I would appreciate your trying out the current delta (1.27) and seeing if the problem persists. If it does, please try out the version before I did my recent rework (1.24). If that version has the problem as well, then I believe that some other change is triggering the problem, as 1.24 represents a version that has been in production for half a year without trouble. Kirk McKusick =-=-=-=-=-=-=-= Date: Sun, 9 May 1999 01:39:33 -0700 (PDT) From: Julian Elischer To: mckus...@mckusick.com Subject: Re: Disappearing/Reappearing Files... (fwd) FYI also some other people are commenting on odd behaviour where a created file doesn't show up for a while... almost as if the readdir() is returning the 'backed out' version of the directory data. julian -- Forwarded message -- Date: Sat, 8 May 1999 14:11:10 -0700 (PDT) From: John Polstra To: curr...@freebsd.org Subject: Re: Disappearing/Reappearing Files... In article <199905082048.naa34...@vashon.polstra.com>, John Polstra wrote: > > I'm seeing something possibly related (possibly not) on an Alpha with > this morning's -current. First I was getting unaligned accesses and > core dumps from the "cp" in /etc/rc that updates the /etc/motd file. > (I added "set -v" to /etc/rc to catch it.) But I could do the copy by > hand once the system was up. 
Now on the latest reboot I got this from
> it:
>
> + cp /tmp/_motd /etc/motd
> + chmod 644 /etc/motd
> chmod: : No such file or directory
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
>
> (Hmm, why didn't the filename come out in chmod's error message?)
>
> I'm running with soft-updates but I'll try turning them off.

I tried about 10 reboots, half with and half without soft-updates enabled on the various filesystems. With soft-updates disabled, I didn't see the above problem at all. With soft-updates enabled, I saw it most of the time but not always.

John
--
John Polstra j...@polstra.com
John D. Polstra & Co., Inc. Seattle, Washington USA
"Self-interest is the aphrodisiac of belief." -- James V. DeLong

To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-current" in the body of the message
HEADS UP: 64-bit quotas going in to head today
Dag-Erling Smørgrav and I have been working on updating the FFS quota system to support both traditional 32-bit and new 64-bit quotas (for those of you who want to put 2+Tb quotas on your users). By default quotas are not compiled into the kernel. To include them in your kernel configuration you need to specify:

options QUOTA			# Enable FFS quotas

If you are already running with the current 32-bit quotas, they should continue to work just as they have in the past. If you wish to convert to using 64-bit quotas, use `quotacheck -c 64'; if you wish to revert from 64-bit quotas back to 32-bit quotas, use `quotacheck -c 32'.

There is a new library of functions to simplify the use of the quota system; do `man quotafile' for details. If your application is currently using quotactl(2) directly, it is highly recommended that you convert your application to use the quotafile interface. Note that existing binaries will continue to work.

The new quota system has been heavily tested; however, wider use inevitably finds new issues. If you encounter any problems with quotas please email me directly as well as posting on current, as I all too often miss list email and emailing me directly will ensure the quickest response.

Special thanks to John Kozubik of rsync.net for getting me interested in pursuing 64-bit quota support and for funding part of my development time on this project.

Kirk McKusick
Re: 4.8-RC / 5-CURRENT UFS1 interoperability problem
Date: Thu, 6 Mar 2003 17:21:00 +0300 (MSK)
From: Maxim Konovalov <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: 4.8-RC / 5-CURRENT UFS1 interoperability problem

Hello,

In short, there is a problem using the same UFS1 filesystem under -stable and -current. Please look at the attached typescript for details. I also noticed wrong superblock information:

[EMAIL PROTECTED] ~]$ df /spare
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s2a  22520288 -125476 20844144    -1%    /spare

Is it a known bug?

--
Maxim Konovalov, [EMAIL PROTECTED], [EMAIL PROTECTED]

Executive summary: you need to run `fsck -f -p' whenever you switch between a 4.X (stable) and a 5.X (current) kernel. The reason is that the UFS1 superblock summary information is maintained in different parts of the superblock on these two systems. Neither system maintains the summary information used by the other. There is no risk of trashing your filesystem if you fail to run the fsck, but the information reported by `df' will be wrong until you run the fsck.
Kirk McKusick

=-=-=-=-=-=-=

golf# uname -a
FreeBSD golf.macomnet.net 4.8-PRERELEASE FreeBSD 4.8-PRERELEASE #19: Thu Feb 27 13:33:49 GMT 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386
golf# fsck /dev/ad0s2a
** /dev/ad0s2a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
3 files, 3 used, 5630069 free (21 frags, 703756 blocks, 0.0% fragmentation)
golf# mount /dev/ad0s2a /mnt
golf# mount | grep mnt
/dev/ad0s2a on /mnt (ufs, local, soft-updates)
golf# exit
exit

- clean reboot

golf# uname -a
FreeBSD golf.macomnet.net 5.0-CURRENT FreeBSD 5.0-CURRENT #6: Wed Feb 19 10:01:22 MSK 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GOLF5 i386
golf# fsck /dev/ad0s2a
** /dev/ad0s2a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? [yn] y

SUMMARY BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y

94155 files, 220230 used, 5409842 free (15210 frags, 674329 blocks, 0.3% fragmentation)

* FILE SYSTEM WAS MODIFIED *
golf# mount /dev/ad0s2a /mnt
golf# mount | grep mnt
/dev/ad0s2a on /mnt (ufs, local, nodev, noexec, nosuid, soft-updates)
golf# exit
Re: you should probably track current@ these days...
Date: Sat, 22 Jun 2002 07:49:17 -0700 (PDT)
From: David Wolfskill <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: UFS2, superblocks, and UFS compatability

I had noted phk's message about Kirk's "commitatron" being readied for action; I was, however, slightly surprised by some of the results of yesterday's (daily) build of -CURRENT.

I had expected (and a quick, ex post facto, look at Kirk's commit message confirms) that the intent was to maintain compatibility with UFS1.

What surprised me was that after building yesterday's -CURRENT successfully (actually, without incident at all), I poked around a bit, then re-booted to -STABLE. (I'm tracking both -STABLE and -CURRENT on the same physical disk, using different slices. Although I build -CURRENT daily, and do some "reality checks," I still do the bulk of the work with the machines in question -- such as anything that updates the CVS repository -- in -STABLE. This probably reflects a rather conservative nature on my part.)

The "surprise" was that -STABLE's fsck was rather expressively unhappy about the file systems that had been mounted while running -CURRENT. In particular, STABLE's fsck claimed that the superblock for each such file system was corrupt.

I found (empirically) that running fsck (and allowing it to recover the superblock from the backup superblock at 32) for each of these file systems, then rebooting, made STABLE much less unhappy. :-}

The -CURRENT code appears to be able to use a UFS1 file system OK -- I was able to boot back to -CURRENT again in preparation for building today's -CURRENT -- but it appears to me (and I haven't looked at the code to verify this) that something in the superblock is getting updated in a way that isn't completely compatible with UFS1, at least if the file system is updated.

If this is intended, mention of it in UPDATING might be useful. If it's not, I'll be happy to help narrow down where things go awry and test the results of (proposed) patches.
(Whether they are patches to -CURRENT, -STABLE, or both.)

Cheers,
david
--
David H. Wolfskill [EMAIL PROTECTED]
Trying to support or use Microsoft products makes about as much sense as painting the outside of a house with watercolors.

My hope was that you would be able to switch painlessly between new and old systems. To make this work, I made a change to fsck on April 7th:

RCS file: /usr/ncvs/src/sbin/fsck_ffs/setup.c,v:
revision 1.30
date: 2002/04/07 05:16:33; author: mckusick; state: Exp; lines: +25 -61
When checking the alternate superblock, we used to copy any fields that might have changed, then did a byte-by-byte comparison with the alternate. If any unused fields got used, they had to be added to the exception list. Such changes caused too many false alarms. So, I have changed the comparison algorithm to compare a selected set of fields that are not expected to change. This new algorithm causes far fewer false hits and still does a good job of detecting problems when they have really occurred. In particular, this change should ease the transition to kernels supporting UFS2 which make some significant changes to the superblock.

Sponsored by: DARPA, NAI Labs

This was supposed to get MFC'ed back to 4.X, though I am not sure if that ever happened. Because of the breakup of fsck into fsck and fsck_ffs I am not sure how one goes back and makes changes to what used to be fsck/setup.c.

Anyway, if your fsck_ffs is running with a copy of setup.c that predates this change, then it will bitch about the superblock being corrupted and recover by using the first alternate. You can avoid the bitching by using `fsck -b16 ...' to override the integrity check. Given that you have had the problem, I expect that others will as well, so I will make a note in the UPDATING notes to suggest the use of `fsck -b16 ...' when going back to using filesystems on 4.X systems.
Kirk McKusick
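The two comparison strategies described in the setup.c log message above can be contrasted with a small sketch. `struct sb_stub` and its field names are hypothetical and far simpler than the real superblock; the point is only to show why a byte-by-byte check raises a false alarm as soon as a previously unused field is put to use, while a comparison of selected stable fields does not.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified superblock (not the real struct fs). */
struct sb_stub {
    int32_t magic;        /* fixed when the filesystem is created */
    int32_t bsize;        /* fixed */
    int32_t fsize;        /* fixed */
    int32_t ncg;          /* fixed */
    int64_t free_blocks;  /* changes during normal use */
    int64_t mtime;        /* changes during normal use */
    uint8_t spare[32];    /* unused today, may be claimed by a newer kernel */
};

/* Old style: copy over the fields known to change, then compare every byte. */
bool same_bytewise(const struct sb_stub *primary, const struct sb_stub *alt)
{
    struct sb_stub tmp = *alt;

    tmp.free_blocks = primary->free_blocks;
    tmp.mtime = primary->mtime;
    return memcmp(primary, &tmp, sizeof(tmp)) == 0;
}

/* New style: compare only fields that are not expected to change. */
bool same_selected(const struct sb_stub *primary, const struct sb_stub *alt)
{
    return primary->magic == alt->magic &&
           primary->bsize == alt->bsize &&
           primary->fsize == alt->fsize &&
           primary->ncg == alt->ncg;
}
```

With these definitions, setting one byte of `spare` in the primary makes `same_bytewise()` report a mismatch while `same_selected()` still accepts the pair; a change to `ncg` is rejected by both.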
Re: UFS2 related message?
The problem with running out of inodes is now fixed and checked in on freefall. You need to pick up /sys/ufs/ffs/ffs_alloc.c revision 1.94 or later. For those that care, the log entry describing the problem:

revision 1.94
date: 2002/06/22 21:24:58; author: mckusick; state: Exp; lines: +2 -2
This patch fixes a problem whereby filesystems that ran out of inodes in a cylinder group would fail to check for free inodes in other cylinder groups. This bug was introduced in the UFS2 code merge two days ago.

An inode is allocated by calling ffs_valloc which calls ffs_hashalloc to do the filesystem scan. Ffs_hashalloc walks around the cylinder groups calling its passed allocator (ffs_nodealloccg in this case) until the allocator returns a non-zero result. The bug is that ffs_hashalloc expects the passed allocator function to return a 64-bit ufs2_daddr_t. When allocating inodes, it calls ffs_nodealloccg which was returning a 32-bit ino_t. The ffs_hashalloc code checked a 64-bit return value and usually found random non-zero bits in the high 32 bits so decided that the allocation had succeeded (in this case in the only cylinder group that it checked). When the result was passed back to ffs_valloc it looked at only the bottom 32 bits, saw zero and declared the system out of inodes. But ffs_hashalloc had really only checked one cylinder group.

The fix is to change ffs_nodealloccg to return 64-bit results.

Sponsored by: DARPA & NAI Labs.
Submitted by: Poul-Henning Kamp <[EMAIL PROTECTED]>
Reviewed by: Maxime Henrion <[EMAIL PROTECTED]>
----

Kirk McKusick

=-=-=-=-=-=

To: [EMAIL PROTECTED]
Subject: UFS2 related message?
Date: Sat, 22 Jun 2002 20:45:01 +0900
From: Munehiro Matsuda <[EMAIL PROTECTED]>

Hello all,

After the import of UFS2 patch into -current, I get the following messages.
pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes
pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes
pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes

Is it related to UFS2 in any way?

FYI, here's what I got with my disks.

% df -i
Filesystem  1K-blocks    Used   Avail Capacity  iused   ifree %iused  Mounted on
/dev/ad0s2a    254063   91341  142397    39%     2615   60871    4%   /
devfs               1       1       0   100%        0       0  100%   /dev
/dev/ad0s3e   7185161 4473874 2136475    68%   227116 1574354   13%   /home
/dev/ad0s2f   2787666 1668475  896178    65%   176288  522078   25%   /usr
/dev/ad0s2e    254063   10456  223282     4%     1653   61833    3%   /var
procfs              4       4       0   100%        1       0  100%   /proc
linprocfs           4       4       0   100%        1       0  100%   /usr/compat/linux/proc
/dev/ad0s1    3663652 2542176 1121476    69%        0       0  100%   /dos
%

Thanks in advance,
Haro
=--
Munehiro (haro) Matsuda
Business Incubation Dept., Kubota Corp.
1-3 Nihonbashi-Muromachi 3-Chome
Chuo-ku Tokyo 103-8310, Japan
Tel: +81-3-3245-3318 Fax: +81-3-3245-3315
Email: [EMAIL PROTECTED]
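The width mismatch explained in the revision 1.94 log entry can be modelled in portable C. This is only a model: calling a function through a mismatched type is not something well-defined C can do directly, so the leftover upper bits are passed in explicitly as `stale_high`. The names `observed_result()` and `groups_examined()` and the sample values are hypothetical.

```c
#include <stdint.h>

/*
 * Model of what the scanning code saw: the allocator produced a 32-bit
 * result, but the caller read 64 bits, so the upper half held whatever
 * happened to be there. `stale_high` stands in for those leftover bits.
 */
uint64_t observed_result(uint32_t real_result, uint32_t stale_high)
{
    return ((uint64_t)stale_high << 32) | real_result;
}

/*
 * Walk the cylinder groups and stop at the first non-zero 64-bit result,
 * as the log entry describes. Returns how many groups were examined and
 * stores the 64-bit value that ended the scan in *out.
 */
int groups_examined(const uint32_t *per_group, int ncg, uint32_t stale_high,
                    uint64_t *out)
{
    int i;

    for (i = 0; i < ncg; i++) {
        uint64_t r = observed_result(per_group[i], stale_high);
        if (r != 0) {
            *out = r;
            return i + 1;
        }
    }
    *out = 0;
    return ncg;
}
```

With per-group results of 0, 0 and 1234, non-zero stale bits end the scan after the first group, and truncating the stored value to 32 bits gives 0, which is the "out of inodes" symptom. With the stale bits at zero, which is what a true 64-bit return amounts to in this model, the scan reaches the third group and finds 1234.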
Re: -current panic in suser_cred()
I have put a fix in /sys/ufs/ufs/ufs_inode.c (on freefall) which should solve this panic. Kirk McKusick =-=-=-= From: Wesley Morgan <[EMAIL PROTECTED]> Date: Mon, 24 Jun 2002 18:04:07 -0400 (EDT) Subject: -current panic in suser_cred() To: <[EMAIL PROTECTED]> At some point between 20 Jun and (by my best guest) 22 Jun there has been a problem introduced somewhere... How much more vague can you get? :)... File creation works fine, but attempting to rm causes a panic. config and dmesg (of a non-panicking kernel) are attached, panic message and gdb stuff below... Hope it's enough info to get a fix in the works! Fatal trap 12: page fault while in kernel mode fault virtual address = 0x4 fault code = supervisor read, page not present instruction pointer = 0x8:0xc019249c stack pointer = 0x10:0xdb467b4c frame pointer = 0x10:0xdb467b50 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 145 (rm) panic: from debugger #0 0xc019614b in doadump () #1 0xc01965db in boot (howto=260) #2 0xc01967fb in panic () #3 0xc0139452 in db_panic () #4 0xc01393d2 in db_command (last_cmdp=0xc02fd2e0, cmd_table=0xc02fd100, aux_cmd_tablep=0xc02f4c7c, aux_cmd_tablep_end=0xc02f4c80) #5 0xc01394e6 in db_command_loop () #6 0xc013c07a in db_trap (type=12, code=0) at ../../../ddb/db_trap.c:76 #7 0xc0298dfe in kdb_trap (type=12, code=0, regs=0xdb467b0c) at ../../../i386/i386/db_interface.c:214 #8 0xc02a9153 in trap_fatal (frame=0xdb467b0c, eva=4) #9 0xc02a8e62 in trap_pfault (frame=0xdb467b0c, usermode=0, eva=4) #10 0xc02a885a in trap (frame= {tf_fs = -1013055464, tf_es = 196624, tf_ds = 16, tf_edi = -1, tf_esi = -1012546560, tf_ebp = -616137904, tf_isp = -616137928, tf_ebx = 0, tf_edx = 0, tf_ecx = -1012854016, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1072094052, tf_cs = 8, tf_eflags = 66050, tf_esp = -1012854016, tf_ss = -616137864})at ../../../i386/i386/trap.c:659 --- begin interesting stuff --- 
#11 0xc019249c in suser_cred (cred=0x0, flag=0)
#12 0xc025dab5 in chkiq (ip=0xc3a5c400, change=4294967295, cred=0x0, flags=0)
#13 0xc025b57f in ufs_inactive (ap=0xdb467be0) at ../../../ufs/ufs/ufs_inode.c:132
#14 0xc0263a08 in ufs_vnoperate (ap=0xdb467be0)
#15 0xc01e01e5 in vput (vp=0xc3a59c00)
#16 0xc01e77c4 in unlink (td=0xc393c41c, uap=0xdb467d10)
#17 0xc02a948a in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077936696, tf_esi = 0, tf_ebp = -1077936776, tf_isp = -616137356, tf_ebx = -1077936553, tf_edx = -1077936508, tf_ecx = 47, tf_eax = 10, tf_trapno = 12, tf_err = 2, tf_eip = 134524795, tf_cs = 31, tf_eflags = 582, tf_esp = -1077936916, tf_ss = 47})
#18 0xc029a57d in syscall_with_err_pushed () at {standard input}:128
#19 0x0804839a in ?? ()
#20 0x08048145 in ?? ()

--=_20020624180407_17367
Content-Type: application/octet-stream; name="CATALYST"
Content-Disposition: attachment; filename="CATALYST"
Content-Transfer-Encoding: base64
Re: Stupid UFS2 questions...
Date: Fri, 18 Oct 2002 23:06:53 +0200 (CEST)
From: BOUWSMA Beery <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Stupid UFS2 questions...

[IPv6-only address above; strip the obvious for IPv4-only mail replies]

In trying to track down a panic I had while mounting a newly-created UFS2 filesystem, I noted that the `newfs' k0de had changed somewhat from -stable to -current. Specifically, that which determines the value of `sbsize' which I'm guessing should be no larger than 8192 else mounts cause panics.

Here are the relevant lines from the last time I built -stable (mkfs.c):

	sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
	if (sblock.fs_sbsize > SBSIZE)
		sblock.fs_sbsize = SBSIZE;

If I'm not mistaken, this will give an upper limit of effectively 8192 to fs_sbsize, which does not appear to be the case with -current, as seen in the RCS file just CVSup'ed (sbin/newfs/mkfs.c,v):

		errx(31, "calloc failed");
	sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
	sblock.fs_minfree = minfree;
	sblock.fs_maxbpg = maxbpg;

There is no other reference to sbsize in the HEAD branch.

Now, as soon as I patched the build I did half a month ago as follows:

	if (fscs == NULL)
		errx(31, "calloc failed");
	sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
	/* XXX HACKHACKHACK */
	if (sblock.fs_sbsize > SBLOCKSIZE)
		sblock.fs_sbsize = SBLOCKSIZE;
	sblock.fs_minfree = minfree;

that is, to match how -stable does this, I can create a filesystem with fragment sizes larger than 8192 bytes (UFS2) which I can successfully mount under -current, which, without this hack, would panic my machine. `dumpfs' shows the value for sbsize no larger than 8192, while for the problem filesystems it was >8192, as large as the fragment size.

Thus the question: Is this the Right Thing[tm] to do?

Your fix is exactly the right thing to do. I have put it into -current.
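The clamp discussed above can be tried in isolation. What follows is only a minimal stand-alone sketch, not the mkfs.c code: `roundup_to_frag` and `clamp_sbsize` are hypothetical helper names, the rounding is meant to mimic what fragroundup() does for a power-of-two fragment size, and SBLOCKSIZE is taken to be 8192 as the message suggests.

```c
#include <stddef.h>

#define SBLOCKSIZE 8192	/* upper bound on fs_sbsize, per the message above */

/*
 * Round size up to a multiple of the fragment size.
 * Assumes fsize is a power of two (hypothetical stand-in for fragroundup()).
 */
static size_t
roundup_to_frag(size_t size, size_t fsize)
{
	return ((size + fsize - 1) & ~(fsize - 1));
}

/*
 * Compute a superblock size from the size of the superblock structure
 * and the fragment size, never letting it exceed SBLOCKSIZE.
 */
static size_t
clamp_sbsize(size_t fs_struct_size, size_t fsize)
{
	size_t sbsize = roundup_to_frag(fs_struct_size, fsize);

	if (sbsize > SBLOCKSIZE)
		sbsize = SBLOCKSIZE;
	return (sbsize);
}
```

With a structure of, say, 1376 bytes (a size picked for illustration), a 2048-byte fragment gives an sbsize of 2048, while a 16384-byte fragment would round up to 16384 without the clamp and is held to 8192 with it.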
Second question: I have a drive where I first tried to create an ill-fated UFS2 filesystem, because of the above panic which I had not yet researched, so I gave up and created a UFS1 filesystem thereupon, and filled it up.

It *seems* that I can mount this disk under -current and probably access the UFS1 files within, but what was really weird was the `df' output from this disk. Said disk is 100% full under -stable, but -current claims it is 0% full.

Sorry I don't have the actual outputs from this command, but is it possible that the presence of the UFS2 superblock is confusing -current when there's a UFS1 superblock and filesystem present, and if -current is looking first for a UFS2 superblock and finding one, is it possible to tell `mount' that I really want a UFS1 filesystem mount, and any remnants of UFS2 should be ignored?

According to ufs/ffs/fs.h, the UFS1 superblock is at 8k while UFS2 is 64k from the front, so apparently the UFS2 superblock that I initially created still remains and confuses `df' and perhaps other things that I haven't tried yet, as it didn't get wiped when I created the UFS1 filesystem. So it seems. Which makes one wonder, if there are three superblocks at three locations present, which to believe? And how to nuke the unwanted one(s)?

Insight appreciated. Thanks.

barry bouwsma

In general you can move UFS1 filesystems back and forth between -stable and -current. However, you must run an `fsck -f -p' using the local version (e.g., the -stable fsck on a -stable system, and the -current fsck on a -current system) before using the UFS1 filesystem. The reason is that -stable and -current record free block information in different parts of the superblock (32-bit counters for -stable and 64-bit counters for -current) and do not maintain the alternate counter locations. The local fsck will recalculate and correct the counters that the local system uses.
Unless your blocksize was bigger than 64K, you would have overwritten the UFS2 superblock with UFS1 inode blocks when you created the UFS1 filesystem. I had originally put code in to stomp out all other possible superblock locations when creating a filesystem in newfs, but got in trouble as I ended up stomping on boot block information that UFS2 filesystems place where the UFS1 superblock used to reside. So, I deleted that code. Hope this helps. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
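The "which superblock to believe" question can be illustrated with a small probe over an in-memory disk image. This is only a sketch under stated assumptions, not the kernel's or fsck's logic: the 8k and 64k offsets come from the thread, but the probe order (UFS2 first), the magic values, and especially MAGIC_OFF (the position of the magic field inside the superblock) are assumptions that should be checked against ufs/ffs/fs.h. `magic_at`, `find_superblock` and `plant_magic` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SBLOCK_UFS1	8192		/* UFS1 superblock offset (8k, per the thread) */
#define SBLOCK_UFS2	65536		/* UFS2 superblock offset (64k, per the thread) */
#define MAGIC_OFF	1372		/* ASSUMED offset of the magic field; check fs.h */
#define UFS1_MAGIC	0x011954u	/* assumed value of FS_UFS1_MAGIC */
#define UFS2_MAGIC	0x19540119u	/* assumed value of FS_UFS2_MAGIC */

/* Read the host-endian 32-bit magic of a superblock at byte offset sboff. */
static uint32_t
magic_at(const unsigned char *img, size_t len, size_t sboff)
{
	uint32_t magic = 0;

	if (sboff + MAGIC_OFF + sizeof(magic) <= len)
		memcpy(&magic, img + sboff + MAGIC_OFF, sizeof(magic));
	return (magic);
}

/*
 * Probe the UFS2 location first, then the UFS1 location (assumed order).
 * Returns the offset of the first location with a matching magic, or -1.
 */
static long
find_superblock(const unsigned char *img, size_t len)
{
	if (magic_at(img, len, SBLOCK_UFS2) == UFS2_MAGIC)
		return (SBLOCK_UFS2);
	if (magic_at(img, len, SBLOCK_UFS1) == UFS1_MAGIC)
		return (SBLOCK_UFS1);
	return (-1);
}

/* Helper for experiments: plant a magic number in an image. */
static void
plant_magic(unsigned char *img, size_t sboff, uint32_t magic)
{
	memcpy(img + sboff + MAGIC_OFF, &magic, sizeof(magic));
}
```

With this order, an image that carries a valid UFS1 magic at 8k and a stale UFS2 magic at 64k is reported as UFS2, which is the confusion the question describes. As noted above, creating the UFS1 filesystem would normally overwrite the 64k location unless the block size was bigger than 64K.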
Update to UFS2 Superblock Format
On Tuesday Nov 26th I plan to make an update to the UFS2 superblock. It will not affect UFS1 filesystems so should be generally transparent to most -current users. For those using UFS2 filesystems, the new kernel will update the superblock to the new format the first time that your UFS2 filesystem is mounted read-write. Once updated it will not be able to be mounted by older kernels unless the `zapsb' program (see below) is run to revert it to the old format.

The only really noticeable problem arises when you are booting from a UFS2 root partition. Here, you must follow these steps:

	1) boot new kernel
	2) mount -u /
	3) install new bootstrap

Once the new kernel has converted the filesystem format for the root partition, the old bootstrap will no longer recognize it, so if you do not have a new bootstrap, you will no longer be able to boot from it. Note that you cannot update to the new bootstrap until the filesystem has been converted as the new bootstrap will not recognize the old superblock format. Again, this change will only affect you if you are using a UFS2 filesystem as your root filesystem.

The changes that I plan to apply can be viewed at:

	http://www.freebsd.org/~mckusick/UFS2_update.diffs

The program `zapsb.c' that reverts a UFS2 filesystem to its previous state can be found at:

	http://www.freebsd.org/~mckusick/zapsb.c

If this change is going to cause you undue hardship, please send me mail ([EMAIL PROTECTED]).

	Kirk McKusick
Re: Update to UFS2 Superblock Format
Some of these fields could usefully be made unsigned, others not (for example fs_pendingblocks and fs_pendinginodes). So just going through and making everything unsigned is not the right approach. I will make a pass through and consider changing some of these fields once the tree opens back up, but not at this point in time when we are trying to keep changes to a minimum and do not have time for extensive testing.

	Kirk McKusick

=-=-=-=-=

Date: Sun, 24 Nov 2002 21:28:38 -0800 (PST)
From: Julian Elischer <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED], Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: Update to UFS2 Superblock Format
In-Reply-To: <[EMAIL PROTECTED]>
X-ASK-Info: Whitelist match

I do have one question re: UFS2, not specifically about this change however..

I notice that the fields of the disk structure are signed. Wouldn't it make more sense at this early stage to declare them as unsigned? For example take this snippet from struct fs:

	int64_t	 fs_size;		/* number of blocks in fs */
	int64_t	 fs_dsize;		/* number of data blocks in fs */
	ufs2_daddr_t fs_csaddr;		/* blk addr of cyl grp summary area */
	int64_t	 fs_pendingblocks;	/* blocks in process of being freed */
	int32_t	 fs_pendinginodes;	/* inodes in process of being freed */
	int32_t	 fs_snapinum[FSMAXSNAP];/* list of snapshot inode numbers */
	int32_t	 fs_avgfilesize;	/* expected average file size */
	int32_t	 fs_avgfpdir;		/* expected # of files per directory */
	int32_t	 fs_save_cgsize;	/* save real cg size to use fs_bsize */
	int32_t	 fs_sparecon32[27];	/* reserved for future constants */
	int32_t	 fs_contigsumsize;	/* size of cluster summary array */
	int32_t	 fs_maxsymlinklen;	/* max length of an internal symlink */
	int32_t	 fs_old_inodefmt;	/* format of on-disk inodes */
	u_int64_t fs_maxfilesize;	/* maximum representable file size */
	int64_t	 fs_qbmask;		/* ~fs_bmask for use with 64-bit size */
	int64_t	 fs_qfmask;		/* ~fs_fmask for use with 64-bit size */
	int32_t	 fs_state;		/* validate fs_clean field */
	int32_t	 fs_old_postblformat;	/* format of positional layout tables */
	int32_t	 fs_old_nrpos;		/* number of rotational positions */

How can any of these values be meaningfully -ve? Making them signed just gives fsck a harder time to check the values. (as we saw this week).

I have run a system with many of these made unsigned and it made no difference to the system. It was binary compatible too, i.e. it mounted existing filesystems with no problems.

julian
Re: Update to UFS2 Superblock Format
Date: Mon, 25 Nov 2002 01:08:30 -0800 (PST) From: Julian Elischer <[EMAIL PROTECTED]> To: Kirk McKusick <[EMAIL PROTECTED]> cc: [EMAIL PROTECTED], Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED] Subject: Re: Update to UFS2 Superblock Format In-Reply-To: <[EMAIL PROTECTED]> X-ASK-Info: Whitelist match On Sun, 24 Nov 2002, Kirk McKusick wrote: > Some of these fields could usefully be made unsigned others not > (for example fs_pendingblocks and fs_pendinginodes). So just > going through and making everything unsigned is not the right > approach. I will make a pass through and consider changing some > of these fields once the tree opens back up, but not at this > point in time when we are trying to keep changes to a minimum > and do not have time for extensive testing. > > Kirk McKusick I'm not in a hurry.. It's just something that I thought should be considered. "eventually". BTW how can fs_pendingblocks and fs_pendinginodes be -ve? In theory they should never go negative. But if an inconsistency occurs (for example a crash and remount before background fsck has run) the accounting can get out of whack and the numbers go negative. We check for this happening and take corrective action. If they were changed to unsigned, we would miss the negative transition and instead suddenly think that we had a huge amount of pending space to free. So this is an example where changing them to unsigned would break existing code. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
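The point about missing the negative transition can be shown with a small stand-alone sketch. It is not the FFS code: `adjust_pending_signed` and `adjust_pending_unsigned` are hypothetical names, and resetting the count to zero is only an assumed stand-in for whatever corrective action the kernel really takes.

```c
#include <stdint.h>

/*
 * Apply delta to a signed pending-block count.
 * Returns 1 if the count went negative (the inconsistency is detected
 * and the count reset to 0, an assumed corrective action), else 0.
 */
static int
adjust_pending_signed(int64_t *pending, int64_t delta)
{
	*pending += delta;
	if (*pending < 0) {
		*pending = 0;
		return (1);
	}
	return (0);
}

/*
 * Same update on an unsigned count. There is no "< 0" test to make:
 * an underflow silently wraps around to a huge value.
 */
static void
adjust_pending_unsigned(uint64_t *pending, int64_t delta)
{
	*pending += (uint64_t)delta;
}
```

Starting from 3 pending blocks and freeing 5, the signed version reports the problem and ends at 0, while the unsigned version ends at 2^64 - 2, which looks like a huge amount of pending space to free.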
Re: UFS-2 partition destroyed by change
If I understand correctly, you ran a new fsck on a UFS-2 that had not yet been mounted by your new kernel. Thus you had a version of fsck that expected a converted UFS-2 filesystem that you had not yet converted (by mounting with the new kernel). I would have expected it to fail with a bad superblock magic number. Had you mounted it before running the new fsck, all would have been well. I am at a loss to explain why fsck did not gag and refuse to check it though.

	Kirk McKusick

=-=-=-=-=

Date: Thu, 28 Nov 2002 06:01:36 +0000 (GMT)
From: Daniel Flickinger <[EMAIL PROTECTED]>
To: FreeBSD-CURRENT <[EMAIL PROTECTED]>
Cc: Kirk McKusick <[EMAIL PROTECTED]>
Subject: UFS-2 partition destroyed by change
X-ASK-Info: Confirmed by User

I only had one UFS-2 partition, the backup root partition on da1a. After McKusick's notice of change:

	Message-ID: <[EMAIL PROTECTED]>

for 26 Nov, I installed the kernel and world sliced at 1200 GMT 27 Nov.

As a matter of principle, I _always_ run fsck -y from single user at reboot of a new world (which means every day now) even though I have not had a crash --pardon me for too many years of BSD, but habits stick.

da1a was shredded; only lost+found:

	p1:da1a #535-> ll lost+found/
	total 8
	0 br-xrw--wT  1 root  wheel    0,   0 Jan  1  1970 #00455
	0 br-xrw--wT  1 root  wheel    0,   0 Jan  1  1970 #00561
	0 br-xr-xr-x  1 root  wheel    0,   0 Jan  1  1970 #00813
	0 br-xr-x--t  1 root  wheel    0,   0 Jan  1  1970 #00865
	8 d-wSr-x--T  2 root  wheel      8192 Jan  1  1970 #01031

The directory is empty. No pipers for Last Post, but a rather good sendoff of 80MB to bit heaven. No other partition even whimpered and nothing really lost since it was a duplicate of da0a.

I was about to convert the remaining 9 partitions to UFS-2 when I read Kirk's notice and decided to wait. I'll rebuild the da1a partition with UFS-2 (new and improved version?) and see what happens tomorrow morning with the 1200 GMT 28 Nov slice.
If I do a 'disklabel -B da1' (I have a pair of dangerously dedicated 9G 160 SCSIs), I presume that /boot/mbr is now the correct "new" UFS-2 boot record?

My intention is to convert all partitions one-by-one, except da0a, to UFS-2, and then 'boot -s' from da1 and 'dd' da0a since the disks are siamese twins.

--
Sanity is the Playground for the Unimaginative
Re: UFS-2 partition destroyed by change
I appreciate the offer to go through the whole upgrade process again, but I don't think it is necessary. If there were going to be many anguished folks that had to go through it, I would have played out all the scenarios and made sure they worked. The point of doing this change now was to fix problems with UFS2 before most people had deployed it. From here on out, I promise not to introduce major breakage :-)

	Kirk McKusick
Trashed Disk Labels
If you have updated your kernel sources on or after Nov 27th, and are running with ufs/ffs/ffs_vfsops.c version 1.197, this message applies to you. I have had a report of a disk label getting trashed after booting up to a kernel with the new UFS2 superblock format. I have just checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that explicitly checks to make sure that it will not trash your disk label. I highly recommend that you update to this version, even if you are only running with UFS1 filesystems. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Trashed Disk Labels
Date: Fri, 29 Nov 2002 14:53:06 -0500 (EST)
From: Wesley Morgan <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: Re: Trashed Disk Labels
X-ASK-Info: Confirmed by User

On Fri, 29 Nov 2002, Kirk McKusick wrote:

> I have had a report of a disk label getting trashed after booting
> up to a kernel with the new UFS2 superblock format. I have just
> checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that
> explicitly checks to make sure that it will not trash your disk
> label. I highly recommend that you update to this version, even if
> you are only running with UFS1 filesystems.
>
> Kirk McKusick

Great! Any tools available to extract my var/db/pkg dirs from this image of my trashed UFS2 filesystem? :>

What seems to work is to boot from CD-ROM, use disklabel -r -w auto to reinstall the default disklabel, then disklabel -B to put back the bootstrap. At that point your existing filesystems should all come back. This of course assumes that you used the original default partition sizes. If not, you will need to figure them out and edit up an appropriate disk label.

	Kirk McKusick
Re: Trashed Disk Labels
Date: Sat, 30 Nov 2002 17:43:53 +1100 (EST) From: Bruce Evans <[EMAIL PROTECTED]> X-X-Sender: [EMAIL PROTECTED] To: Kirk McKusick <[EMAIL PROTECTED]> cc: [EMAIL PROTECTED] Subject: Re: Trashed Disk Labels In-Reply-To: <[EMAIL PROTECTED]> X-ASK-Info: Whitelist match On Fri, 29 Nov 2002, Kirk McKusick wrote: > I have had a report of a disk label getting trashed after booting > up to a kernel with the new UFS2 superblock format. I have just > checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that > explicitly checks to make sure that it will not trash your disk > label. I highly recommend that you update to this version, even if > you are only running with UFS1 filesystems. Labels should be write protected, but this seems to have been broken by GEOM. Bruce Disk labels certainly used to be write protected. Not sure when that stopped, but it certainly would have been useful in this recent context. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Update to UFS2 Superblock Format
You will have to ask Poul-Henning Kamp, but I do not believe that he has yet put together a bootstrap for the i386 platform that can boot from a UFS2 filesystem. As such, I believe that you are required to have a UFS1 root on the i386 at this time. I have copied Poul-Henning Kamp so that he can correct me if I am incorrect on this point.

	Kirk McKusick

=-=-=-=-=-=

Date: Fri, 29 Nov 2002 22:57:12 -0800
To: Kirk McKusick <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
From: Manfred Antar <[EMAIL PROTECTED]>
Subject: Re: Update to UFS2 Superblock Format
Cc: Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
In-Reply-To: <[EMAIL PROTECTED]>
X-ASK-Info: Confirmed by User

At 09:11 PM 11/24/2002 -0800, Kirk McKusick wrote:
>On Tuesday Nov 26th I plan to make an update to the UFS2
>superblock. It will not affect UFS1 filesystems so should
>be generally transparent to most -current users. For those
>using UFS2 filesystems, the new kernel will update the
>superblock to the new format the first time that your UFS2
>filesystem is mounted read-write. Once updated it will not
>be able to be mounted by older kernels unless the `zapsb'
>program (see below) is run to revert it to the old format.
>
>The only really noticeable problem arises when you are booting
>from a UFS2 root partition. Here, you must follow the following
>steps:
>
>1) boot new kernel
>2) mount -u /
>3) install new bootstrap
>
>Once the new kernel has converted the filesystem format for the
>root partition, the old bootstrap will no longer recognize it, so
>if you do not have a new bootstrap, you will no longer be able to
>boot from it. Note that you cannot update to the new bootstrap
>until the filesystem has been converted as the new bootstrap will
>not recognize the old superblock format. Again, this change will
>only affect you if you are using a UFS2 filesystem as your root
>filesystem.
>
>The changes that I plan to apply can be viewed at:
>
>http://www.freebsd.org/~mckusick/UFS2_update.diffs
>
>The program `zapsb.c' that reverts a UFS2 filesystem to its
>previous state can be found at:
>
>http://www.freebsd.org/~mckusick/zapsb.c
>
>If this change is going to cause you undue hardship, please
>send me mail ([EMAIL PROTECTED]).
>
>Kirk McKusick

Kirk

With a kernel and system current as of Thurs night, I did a dump of the / , /var , /usr filesystems. I did a disklabel -B da0s1. I did a make release and booted off the cdrom, went into the fixit mode and did newfs -O2 on:

	/dev/da0s1a (root)
	/dev/da0s1e (/var)
	/dev/da0s1f (/usr)

I then did a restore of the file systems. When I reboot, somehow the bootstrap bypasses /boot/loader. Here is what I see on the screen:

	/boot.config -P
	Invalid format

	>>FreeBSD/i386/UFS1 BOOT
	Default: 0:da(0,a)/kernel
	boot:
	WARNING: loader(8) metadata is missing!

I have a current kernel in the / directory so it boots that and I get to the mountroot> prompt and do:

	mountroot> ufs:da0s1a

I guess what I need to know is how to install the UFS2 bootblocks.

Thanks
Manfred
==
|| [EMAIL PROTECTED] ||
|| Ph. (415) 681-6235 ||
==
Re: Update to UFS2 Superblock Format
Date: Fri, 29 Nov 2002 23:16:51 -0800
To: Kirk McKusick <[EMAIL PROTECTED]>
From: Manfred Antar <[EMAIL PROTECTED]>
Subject: Re: Update to UFS2 Superblock Format
Cc: [EMAIL PROTECTED], Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED], Poul-Henning Kamp <[EMAIL PROTECTED]>
In-Reply-To: <[EMAIL PROTECTED]>
X-ASK-Info: Whitelist match

At 11:11 PM 11/29/2002 -0800, Kirk McKusick wrote:
>You will have to ask Poul-Henning Kamp, but I do not believe that
>he has yet put together a bootstrap for the i386 platform that can
>boot from a UFS2 filesystem. As such, I believe that you are
>required to have a UFS1 root on the i386 at this time. I have
>copied Poul-Henning Kamp so that he can correct me if I am incorrect
>on this point.
>
>Kirk McKusick

Ah, no wonder. I tried editing the /sys/boot/i386/boot2/Makefile to enable the UFS2 bootblock but then disklabel complained that boot2 was too big. I will have to revert to UFS1.

Thanks
Manfred
==
|| [EMAIL PROTECTED] ||
|| Ph. (415) 681-6235 ||
==

You have hit upon the exact problem. UFS2 has a much bigger area reserved for the boot block, but the programs that set up disk labels and boot blocks don't know about it yet so assume that they have to cram into the much smaller UFS1 boot-block area.

	Kirk McKusick
newfs chokes, cores, & dies if inode density too high; patch attached
Date: Fri, 1 Nov 2002 00:43:38 +
From: Ceri Davies <[EMAIL PROTECTED]>
To: David Wolfskill <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: newfs chokes, cores, & dies if inode density too high; patch attached
In-Reply-To: <[EMAIL PROTECTED]>

I don't have time to test this right now, but see also PR bin/30959.

Ceri
--
you can't see when light's so strong
you can't see when light is gone

Better late than never, this bug has been fixed.

From: Kirk McKusick <[EMAIL PROTECTED]>
Date: Sat, 30 Nov 2002 10:28:26 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sbin/newfs mkfs.c newfs.c

mckusick    2002/11/30 10:28:26 PST

  Modified files:
    sbin/newfs           mkfs.c newfs.c
  Log:
  Add some more checks to newfs so that it will not build filesystems
  that the kernel will refuse to mount. Specifically it now enforces
  the MAXBSIZE blocksize limit. This update also fixes a problem where
  newfs could segment fault if the selected fragment size was too large.

  PR:		bin/30959
  Submitted by:	Ceri Davies <[EMAIL PROTECTED]>
  Sponsored by:	DARPA & NAI Labs.

  Revision  Changes    Path
  1.66      +24 -14    src/sbin/newfs/mkfs.c
  1.66      +5 -1      src/sbin/newfs/newfs.c
Re: UFS Snapshot deadlock
Your deadlock should now be fixed.

	Kirk McKusick

=-=-=-=-=

From: Kirk McKusick <[EMAIL PROTECTED]>
Date: Fri, 29 Nov 2002 23:27:12 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sys/ufs/ffs ffs_snapshot.c
X-FreeBSD-CVS-Branch: HEAD

mckusick    2002/11/29 23:27:12 PST

  Modified files:
    sys/ufs/ffs          ffs_snapshot.c
  Log:
  Fix two deadlocks in snapshots:

  1) Release the snapshot file lock while suspending the system.
     Otherwise a process trying to read the lock may block on its
     containing directory preventing the suspension from completing.
     Thanks to Sean Kelly <[EMAIL PROTECTED]> for finding this deadlock.

  2) Replace some bdwrite's with bawrite's so as not to fill all the
     buffers with dirty data. The buffers could not be cleaned as the
     snapshot vnode was locked hence the system could deadlock when
     making snapshots of really massive filesystems. Thanks to
     Hidetoshi Shimokawa <[EMAIL PROTECTED]> for figuring this out.

  Sponsored by:	DARPA & NAI Labs.

  Revision  Changes    Path
  1.51      +7 -2      src/sys/ufs/ffs/ffs_snapshot.c

=-=-=-=-=-=

Date: Wed, 30 Oct 2002 03:57:52 -0600
From: Sean Kelly <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: UFS Snapshot deadlock

While playing with UFS snapshots on a UFS2 filesystem I mounted specifically for this purpose, I encountered a little problem. It seems I have processes deadlocked on each other.

Steps to repeat:

	/# mount /dev/ad2a /mnt ; cd /mnt
	/dev/ad2a on /mnt (ufs, local, soft-updates, multilabel) # UFS2
	/mnt# cd /mnt; mount -u -o snapshot /mnt/snapshot /mnt
	*switch vtys*
	/# cd /mnt; ls -l
	*ls deadlocks*
	*I get bored and ^C the mount on the other vty about 30 minutes later*
	/mnt# ls
	*this ls deadlocks too*

For the record, /mnt was a new filesystem. It had *nothing* in it. No directories or anything.
So now, I've got these:

	 UID   PID  PPID CPU PRI NI  VSZ  RSS MWCHAN STAT TT    TIME COMMAND
	   0  1133   669   0  -4  0  692  548 ufs    D+   v1 0:00.00 ls
	1001   939   856   0  -4  0  696  560 ufs    D+   v2 0:00.00 ls -l
	   0   937     1   0  -4  0  560  336 ufs    D    v1 0:00.65 mount -u -o snapshot /mnt/snapshot /mnt

Now for some numbers.

db> trace 937
mi_switch(c71aab60,50,c03375c6,c7,c03ad2f8) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,140,600,e6,3a9) at acquire+0xa7
lockmgr(c75098dc,1010002,c7509818,c71aab60,e5b076a8) at lockmgr+0x2f7
vop_stdlock(e5b076c4,e5b076e0,c021e306,e5b076c4,0) at vop_stdlock+0x2c
ufs_vnoperate(e5b076c4,0,c033dd28,e5b076e0,c01ba4a5) at ufs_vnoperate+0x18
vn_lock(c7509818,10002,c71aab60,815,c7509818) at vn_lock+0xd6
vget(c7509818,2,c71aab60,470,0) at vget+0xd6
ffs_sync(c74c5400,1,c726a780,c71aab60,c74f1000) at ffs_sync+0x126
vfs_write_suspend(c74c5400,c74ffcb8,d351f08c,1,c2c06e80) at vfs_write_suspend+0x70
ffs_snapshot(c74c5400,bfbffd1d,70,c033990d,252) at ffs_snapshot+0xa48
ffs_mount(c74c5400,c745ce80,bfbff000,e5b07bf0,c71aab60) at ffs_mount+0x548
vfs_mount(c71aab60,c6d2b780,c745ce80,101,bfbff000) at vfs_mount+0x85e
mount(c71aab60,e5b07d14,c03590ba,409,4) at mount+0xb8
syscall(2f,2f,2f,bfbfeffc,bfbff9f4) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d

db> trace 939
mi_switch(c74260d0,50,c03375c6,c7,1cc) at mi_switch+0x158
msleep(c74ffd7c,c03a9688,50,c034f732,0) at msleep+0x3b4
acquire(c74ffd7c,140,600,e6,3ab) at acquire+0xa7
lockmgr(c74ffd7c,1010002,c74ffcb8,c74260d0,e5bfd83c) at lockmgr+0x2f7
vop_stdlock(e5bfd858,e5bfd874,c021e306,e5bfd858,246) at vop_stdlock+0x2c
ufs_vnoperate(e5bfd858,246,0,c74f1000,0) at ufs_vnoperate+0x18
vn_lock(c74ffcb8,10002,c74260d0,7f,3) at vn_lock+0xd6
vget(c74ffcb8,10002,c74260d0,7f,c74260d0) at vget+0xd6
ufs_ihashget(c74cce00,3,2,e5bfd98c,e5bfd8f0) at ufs_ihashget+0xd2
ffs_vget(c74c5400,3,2,e5bfd98c,e5bfd994) at ffs_vget+0x44
ufs_lookup(e5bfdac0,e5bfdafc,c0207a24,e5bfdac0,e5bfdc3c) at ufs_lookup+0xdae
ufs_vnoperate(e5bfdac0,e5bfdc3c,e5bfdc50,3ab,c74260d0) at ufs_vnoperate+0x18
vfs_cache_lookup(e5bfdb70,e5bfdb9c,c020bd39,e5bfdb70,c7509818) at vfs_cache_lookup+0x2e4
ufs_vnoperate(e5bfdb70,c7509818,e5bfdc50,e5bfdb5c,c74260d0) at ufs_vnoperate+0x18
lookup(e5bfdc28,0,c033d6ad,a4,c74260d0) at lookup+0x309
namei(e5bfdc28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
lstat(c74260d0,e5bfdd14,c03590ba,409,2) at lstat+0x52
syscall(2f,2f,2f,80d3200,80d1040) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x805838b, esp = 0xbfbff3dc, ebp = 0xbfbff468 ---

db> trace 1133
mi_switch(c6d31680,50,c03375c6,c7,2) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,140,600,e6,46d) at acquire+0xa7
lockmgr(c75098dc,1030002,c7509818,c6d31680,e3887ad0) at lockmgr+0x2f7
vop_stdlock(e3887aec,e3887b08,c021e306,e3887aec,0) at vop_stdlock+0x2
Re: corrupted UFS2 label after ffs_vfsops.c,v 1.198
Date: Sat, 30 Nov 2002 00:44:10 +0100 (CET) From: Michael Reifenberger <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Cc: FreeBSD-Current <[EMAIL PROTECTED]> Subject: corrupted UFS2 label after ffs_vfsops.c,v 1.198 Hi, after cvsupping a kernel with the mentioned version of ffs_vfsops.c I tried to upgrade my kernel from a some weeks aged -current. After that I'm no longer able to mount or fsck a UFS2 formatted disk. My dmesg is attached. Trying fsck_ffs /dev/da0s1a gives: (nihil)(root) # fsck_ffs /dev/da0s1a ** /dev/da0s1a Cannot find file system superblock LOOK FOR ALTERNATE SUPERBLOCKS? [yn] y Fließkommafehler (floating point error in german) Any possible alternate superblock given with -b gives a fp-error also. How to resolve this? Bye! Michael Reifenberger ^.*Plaut.*$, IT, R/3 Basis, GPS Once you have upgraded your fsck to the current version, it will only check converted UFS2 filesystems. To convert your UFS2 filesystem, simply mount it with your new kernel. Once you have done that, you will be able to unmount it and run the new fsck. Similarly, if you have an older kernel (vintage last four months of -current) then it will back-convert your UFS2 filesystems every time you run it and thus you will have to forward convert before fsck will run on it again. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
Date: Thu, 5 Dec 2002 15:22:27 -0800 (PST) From: Archie Cobbs <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Subject: backgroud fsck is still locking up system Just rebuilt -current this morning. Background fsck is still causing a "soft lockup". I thought the conclusion was we were going to disable it for 5.0. Not trying to rush anyone, just pointing out that this still needs to be done.. -Archie __ Archie Cobbs*Packet Design*http://www.packetdesign.com What do you mean by background fsck causing a "soft lockup"? Is it failing? Is it deadlocking the system? Do you have a specific test case that shows the problem? Needless to say it is working fine on my system and on my regression tests. The only problem that I am having with 5.0 as of last night is getting login to work on my console. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
Does the background fsck process continue to run, or does the whole system come to a halt? If the fsck process continues to run, what happens when it eventually finishes? Is the system still dead, or does it come back to life? If the system does not come back to life can you get me the output of `ps axl'? If not, can you break into the debugger and get a ps output? (You will need to have the DDB option specified in your config file). Kirk McKusick =-=-=-=-=-= From: Archie Cobbs <[EMAIL PROTECTED]> Subject: Re: backgroud fsck is still locking up system (fwd) In-Reply-To: <[EMAIL PROTECTED]> To: Kirk McKusick <[EMAIL PROTECTED]> Date: Thu, 5 Dec 2002 16:22:20 -0800 (PST) CC: Archie Cobbs <[EMAIL PROTECTED]>, Robert Watson <[EMAIL PROTECTED]>, [EMAIL PROTECTED] X-ASK-Info: Confirmed by User Kirk McKusick wrote: > Just rebuilt -current this morning. Background fsck is still > causing a "soft lockup". I thought the conclusion was we were > going to disable it for 5.0. > > What do you mean by background fsck causing a "soft lockup"? > Is it failing? Is it deadlocking the system? Do you have a > specific test case that shows the problem? Needless to say > it is working fine on my system and on my regression tests. > The only problem that I am having with 5.0 as of last night > is getting login to work on my console. What happens is that at first I can login, but the system seems slow. I then got as far as running 'top' but it never refreshed its display and subsequently all keystrokes were ignored. Changing virtual terminals still works OK, but they are effectively dead too. I'm imagining processes getting stuck on some lock one by one. Top did get as far as showing the background fsck process, which had a priority of -6 or something. The previous time it didn't even spit out a login prompt, but that may just be due to experimental noise. For me, it appears easy to reproduce... 1. Boot -current system 2. Pull the power cable out 3. Put the power cable back in 4. 
Let the box boot; it notes background fsck
5. Log in and try to do something

I can give you more details about my system separately if you like.

Thanks,
-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
From: Archie Cobbs <[EMAIL PROTECTED]>
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: <[EMAIL PROTECTED]>
To: Nate Lawson <[EMAIL PROTECTED]>
Date: Fri, 6 Dec 2002 10:57:13 -0800 (PST)
CC: Kirk McKusick <[EMAIL PROTECTED]>, Archie Cobbs <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Nate Lawson wrote:
> > Does the background fsck process continue to run, or does the whole
> > system come to a halt? If the fsck process continues to run, what
> > happens when it eventually finishes? Is the system still dead, or
> > does it come back to life? If the system does not come back to life
> > can you get me the output of `ps axl'? If not, can you break into
> > the debugger and get a ps output? (You will need to have the DDB
> > option specified in your config file).
>
> Sorry for butting in. I think Archie is referring to bg fsck gaining
> an unfair share of cpu due to it running due to IO completions. Last I
> heard, we were waiting until after 5.0 to experiment with scheduler
> changes to make it more fair. I have not seen any hard locks or other
> problems with bg fsck after your commit.

I'm actually seeing something different. The box becomes unresponsive
(except for virtual console changes and CTRL-ALT-ESC) but there's no
disk activity. It never recovers.

Reproduced it again just now. After pulling the plug and rebooting I
didn't touch the box. It booted normally, started background fsck, and
the HDD light was blinking as expected. After about 10 seconds, rather
suddenly the HDD light stopped blinking. At this point it was pretty
dead. Broke into the debugger and it showed a similar 'ps' output to
what I previously posted.

-Archie

Your ps shows fsck_ufs and the syncer process both blocked on "nbufbs".
That means the system has blocked them from running because it feels
that there are too many dirty buffers. What you are probably
experiencing is that you have a relatively small-memory machine, which
has a rather low threshold for blocking on dirty buffers. All the dirty
buffers in your system are held by the indirect blocks of the snapshot,
and thus the bufdaemon cannot push them out. That task can only be done
by the syncer, which is also blocked. Could you please run the
following commands on your system and send me the results:

	sysctl vfs.lodirtybuffers
	sysctl vfs.hidirtybuffers
	sysctl vfs.numdirtybuffers

both before and after the lockup. If you cannot run these commands
after the lockup, the global variable names are:

	lodirtybuffers
	hidirtybuffers
	numdirtybuffers

If my hypothesis is correct, that will let me tweak the thresholds on
dirty buffers to get a solution.

	Kirk McKusick
Re: UFS1 created by 5.0 is incompatible with 4.0's?
Date: Fri, 6 Dec 2002 18:06:03 +0200
From: Ruslan Ermilov <[EMAIL PROTECTED]>
To: Petr Holub <[EMAIL PROTECTED]>, Matt Dillon <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: UFS1 created by 5.0 is incompatible with 4.0's?
X-ASK-Info: Whitelist match

On Fri, Dec 06, 2002 at 01:59:11PM +0100, Petr Holub wrote:
> Hi,
>
> > While testing the 4.0 -> 5.0 upgrade path, I've created (under
> > 5.0) a UFS1 partition and installed 4.0 onto it. After booting
> > the 4.0 from it, kernel complained about ``numdirs is zero, try
> > using an alternate superblock'' for / partition -- I've tried
> > what it suggests (by fsck -b 32, etc.) but the result was always
> > the same -- the file system was marked dirty and only read-only
> > usable. After rebooting in 5.0, this file system was similarly
> > unusable. Is this a bug or a feature?
>
> I've discussed this issue with Poul-Henning Kamp. You need fsck
> from at least 4.7.
>
Is this handled by fsck/setup.c,v 1.17.2.4 commit?

: revision 1.17.2.4
: date: 2002/06/24 05:10:41; author: dillon; state: Exp; lines: +26 -56
: MFC 1.30. Check only the fields we know should be the same between the
: primary and alternate superblocks, so fsck doesn't barf on new features
: added to UFS in later releases.
:
: Submitted by: mckusick

Cheers,
--
Ruslan Ermilov		Sysadmin and DBA,
[EMAIL PROTECTED]	Sunbay Software AG,
[EMAIL PROTECTED]	FreeBSD committer,
+380.652.512.251	Simferopol, Ukraine

http://www.FreeBSD.org	The Power To Serve
http://www.oracle.com	Enabling The Information Age

If the 1.17.2.4 commit does not solve your problem, try the
following patch that I made to the 5.0 fsck. If it solves
your problem, then it should probably be MFC'ed.

	Kirk McKusick

Index: sbin/fsck_ffs/setup.c
===
RCS file: /usr/ncvs/src/sbin/fsck_ffs/setup.c,v
retrieving revision 1.41
diff -c -r1.41 setup.c
*** setup.c	2002/11/27 02:18:57	1.41
--- setup.c	2002/12/04 23:13:18
***************
*** 258,269 ****
  		    (unsigned)(sizeof(struct inostatlist) * (sblock.fs_ncg)));
  		goto badsb;
  	}
! 	numdirs = sblock.fs_cstotal.cs_ndir;
  	dirhash = numdirs;
- 	if (numdirs == 0) {
- 		printf("numdirs is zero, try using an alternate superblock\n");
- 		goto badsb;
- 	}
  	inplast = 0;
  	listmax = numdirs + 10;
  	inpsort = (struct inoinfo **)calloc((unsigned)listmax,
--- 258,265 ----
  		    (unsigned)(sizeof(struct inostatlist) * (sblock.fs_ncg)));
  		goto badsb;
  	}
! 	numdirs = MAX(sblock.fs_cstotal.cs_ndir, 128);
  	dirhash = numdirs;
  	inplast = 0;
  	listmax = numdirs + 10;
  	inpsort = (struct inoinfo **)calloc((unsigned)listmax,
Re: backgroud fsck is still locking up system (fwd)
The loss of files under soft updates is possible if your editor fails to fsync the new file before unlinking the old file. The `vi' editor always does an `fsync' after writing the new copy and before removing the old copy. I have not checked with other editors such as emacs to see if they properly use fsync. Note that there is also a vulnerability without soft updates, it is just that the window of vulnerability is shorter. So, editors should always do fsync's, it is just more critical if you are using soft updates (or journalling for that matter). The main reason for not using soft updates on the root filesystem was because of the delay between removing files and having the space show up. The result was that world installs on the root filesystem often failed if the root was nearly full (as is so often the case). That problem has now been fixed in 5.0 with a callback to soft updates if a filesystem full error is about to be generated. When called back, soft updates expedites the freeing of space so that the new allocation can succeed. So, the primary reason for not using soft updates on the root is now fixed. If however, mainline editors are not doing fsync's, then there is still a good reason not to use soft updates on the root filesystem. Kirk McKusick =-=-=-=-= From: Archie Cobbs <[EMAIL PROTECTED]> Subject: Re: backgroud fsck is still locking up system (fwd) In-Reply-To: <[EMAIL PROTECTED]> To: Dan Nelson <[EMAIL PROTECTED]> Date: Fri, 6 Dec 2002 11:28:52 -0800 (PST) CC: [EMAIL PROTECTED], [EMAIL PROTECTED] X-ASK-Info: Whitelist match Dan Nelson wrote: > > Why does softupdates not get enabled on / , by default on the > > install? > > Softupdates updates on-disk structures in the background, and > background fsck cannot relink unreferenced files into lost+found, so > you run the risk of losing both the original and backup copies of > important files in case of a sudden reboot. Imagine you edited > /etc/rc.conf, saved it, and 5 seconds later the system panic'ed. 
> Because the default metadata flush time is 28 seconds, there's a pretty > good chance that neither the new file or the original is in /etc after > a reboot. I got bit by this three times before I learned my lesson. I I don't understand this.. presumably vi updates the file contents by opening and writing into the file; why would this cause the file's directory entry to disappear? On the other hand, if you do "mv rc.conf.new rc.conf" then you are supposedly guaranteed that the file exists in some form; see rename(2). In any case, you seem to be implying that with respect to modifying files just before a system crash: (a) Softupdates is more 'dangerous' than non-softupdates (b) Background fsck is more 'dangerous' than normal fsck Is this really true? I thought if anything the reverse of (a) would be true. -Archie __ Archie Cobbs * Packet Design * http://www.packetdesign.com To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
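The ordering Kirk describes above (write the new copy, fsync it, and only then replace the old name) can be sketched in C roughly as follows. This is a minimal illustration and not code taken from vi or any other editor; the function name `replace_file` and the ".new" temporary suffix are assumptions made for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace `path` with `data` so that a crash should leave either the
 * old or the new contents reachable under `path`.
 * Returns 0 on success, -1 on error.
 * Hypothetical helper: the name and the ".new" suffix are illustrative.
 */
int
replace_file(const char *path, const char *data, size_t len)
{
	char tmp[1024];
	int fd;

	if (snprintf(tmp, sizeof(tmp), "%s.new", path) >= (int)sizeof(tmp))
		return (-1);
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);
	/* Write the new copy, then force it to stable storage. */
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
		close(fd);
		unlink(tmp);
		return (-1);
	}
	if (close(fd) != 0) {
		unlink(tmp);
		return (-1);
	}
	/* Only now take over the old name; see rename(2). */
	if (rename(tmp, path) != 0) {
		unlink(tmp);
		return (-1);
	}
	return (0);
}
```

A caller would use it as `replace_file("/etc/rc.conf", buf, buflen)`. The point of the sketch is only the ordering: the fsync happens before the old name stops referring to the old data, which is the step the thread says matters more under soft updates.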
Re: backgroud fsck is still locking up system (fwd)
From: Archie Cobbs <[EMAIL PROTECTED]>
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
Date: Fri, 6 Dec 2002 13:01:20 -0800 (PST)
CC: Archie Cobbs <[EMAIL PROTECTED]>, Nate Lawson <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Kirk McKusick wrote:
> by the syncer who is also blocked. Could you please run the following
> command on your system and send me the results:
>
> 	sysctl vfs.lodirtybuffers
> 	sysctl vfs.hidirtybuffers
> 	sysctl vfs.numdirtybuffers
>
> both before and after the lockup. If you cannot run this command after
> the lockup, the global variable names are:
>
> 	lodirtybuffers
> 	hidirtybuffers
> 	numdirtybuffers

Before (system running normally):

	vfs.lodirtybuffers: 126
	vfs.hidirtybuffers: 252
	vfs.numdirtybuffers: 0

After:

	vfs.lodirtybuffers: 126
	vfs.hidirtybuffers: 252
	vfs.numdirtybuffers: 445

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

OK, it looks like my hypothesis was right: the problem is having a
small number of buffers and running out of them. I enclose below a
patch which should check for the problem arising and help to mitigate
it. I would appreciate you dropping it into your kernel and seeing if
it solves your problem. The fix is not ideal; it is merely meant to
show whether this approach solves the problem. If it does, I will
figure out how to do it properly. Thanks for your help.
	Kirk McKusick

Index: sys/buf.h
===
RCS file: /usr/ncvs/src/sys/sys/buf.h,v
retrieving revision 1.138
diff -c -r1.138 buf.h
*** sys/buf.h	2002/08/30 04:04:37	1.138
--- sys/buf.h	2002/12/06 21:44:25
***************
*** 468,473 ****
--- 468,474 ----
  caddr_t	kern_vfs_bio_buffer_alloc(caddr_t v, long physmem_est);
  void	bufinit(void);
  void	bwillwrite(void);
+ int	checkdirtybufs(struct vnode *);
  int	buf_dirty_count_severe(void);
  void	bremfree(struct buf *);
  int	bread(struct vnode *, daddr_t, int, struct ucred *, struct buf **);
Index: kern/vfs_bio.c
===
RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.342
diff -c -r1.342 vfs_bio.c
*** kern/vfs_bio.c	2002/11/23 19:10:30	1.342
--- kern/vfs_bio.c	2002/12/06 21:44:35
***************
*** 1114,1119 ****
--- 1114,1137 ----
  }
  
  /*
+  * Check to see if a vnode holds too many dirty buffers. If it does,
+  * flush it.
+  */
+ int
+ checkdirtybufs(struct vnode *vp)
+ {
+ 	struct buf *bp;
+ 	int dirtycnt = 0, error = 0;
+ 	struct thread *td = curthread;
+ 
+ 	TAILQ_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs)
+ 		dirtycnt++;
+ 	if (dirtycnt > lodirtybuffers)
+ 		error = VOP_FSYNC(vp, td->td_ucred, MNT_NOWAIT, td);
+ 	return (error);
+ }
+ 
+ /*
   * Return true if we have too many dirty buffers.
   */
  int
Index: ufs/ffs/ffs_balloc.c
===
RCS file: /usr/ncvs/src/sys/ufs/ffs/ffs_balloc.c,v
retrieving revision 1.39
diff -c -r1.39 ffs_balloc.c
*** ufs/ffs/ffs_balloc.c	2002/10/22 01:14:25	1.39
--- ufs/ffs/ffs_balloc.c	2002/12/06 21:49:56
***************
*** 295,300 ****
--- 295,301 ----
  				if (bp->b_bufsize == fs->fs_bsize)
  					bp->b_flags |= B_CLUSTEROK;
  				bdwrite(bp);
+ 				checkdirtybufs(vp);
  			}
  		}
  		/*
***************
*** 335,340 ****
--- 336,342 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  		*bpp = nbp;
  		return (0);
***************
*** 756,761 ****
--- 758,764 ----
  				if (bp->b_bufsize == fs->fs_bsize)
  					bp->b_flags |= B_CLUSTEROK;
  				bdwrite(bp);
+ 				checkdirtybufs(vp);
  			}
  		}
  		/*
***************
*** 796,801 ****
--- 799,805 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  		*bpp = nbp;
  		return (0);
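The idea behind checkdirtybufs() in the patch above (count one vnode's dirty buffers after each delayed write and flush once the count passes lodirtybuffers) can be tried outside the kernel with a small model. This is only a userland sketch: `model_buf`, `model_vnode`, and the `model_*` functions are hypothetical stand-ins, the "flush" simply empties the list instead of calling VOP_FSYNC, and the threshold of 126 is the vfs.lodirtybuffers value Archie reported.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct buf and struct vnode. */
struct model_buf {
	struct model_buf *next;
};

struct model_vnode {
	struct model_buf *dirty;	/* head of this vnode's dirty list */
	int flushes;			/* how many times we "fsynced" it */
};

/* vfs.lodirtybuffers as reported in the thread. */
static const int lodirtybuffers = 126;

/* Walk the dirty list, as TAILQ_FOREACH does in the patch. */
int
model_dirty_count(const struct model_vnode *vp)
{
	const struct model_buf *bp;
	int n = 0;

	for (bp = vp->dirty; bp != NULL; bp = bp->next)
		n++;
	return (n);
}

/* Flush the vnode if it holds more than lodirtybuffers dirty buffers. */
int
model_checkdirtybufs(struct model_vnode *vp)
{
	if (model_dirty_count(vp) <= lodirtybuffers)
		return (0);
	vp->dirty = NULL;	/* stands in for VOP_FSYNC(..., MNT_NOWAIT, ...) */
	vp->flushes++;
	return (1);
}

/* Mirrors the call sites: bdwrite(bp); checkdirtybufs(vp); */
void
model_bdwrite(struct model_vnode *vp, struct model_buf *bp)
{
	bp->next = vp->dirty;
	vp->dirty = bp;
	(void)model_checkdirtybufs(vp);
}
```

In this model a single vnode can never hold more than 127 dirty buffers at once, so the 445 dirty buffers seen during the lockup could not all pile up behind one snapshot. The model also makes the cost visible: every delayed write walks the whole dirty list, which may be part of why the patch is described as not ideal.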
Re: backgroud fsck is still locking up system (fwd)
I suggest that we drag Thomas-Henning von Kamptz into this
discussion as he was one of the main authors of growfs. He
is copied on my reply.

	Kirk McKusick

=-=-=-=-=-=

From: Archie Cobbs <[EMAIL PROTECTED]>
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: <[EMAIL PROTECTED]>
To: Julian Elischer <[EMAIL PROTECTED]>
Date: Fri, 6 Dec 2002 14:52:24 -0800 (PST)
CC: [EMAIL PROTECTED], [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Julian Elischer wrote:
> most systems follow / with their swap region..
>
> you can boot from fixit, or picoBSD floppy
> and use disklabel -e to extend the root partition
> then you can use growfs to add the new space to your root fs.

Hmm.. I tried that and it didn't seem to work. The disklabel change
was successful, but growfs didn't seem to expand the root partition
any.. df(1) still shows it as 50M.

I ran growfs after booting single user mode but before mounting
any disks.. perhaps that caused it to not work.

Since that didn't work, I booted a 4.7-REL fixit floppy and tried
to run growfs from there, but then that growfs core dumped:

    Program terminated with signal 11, Segmentation fault.
    #0  0x804c089 in updclst (block=-874) at growfs.c:2335
    2335            setbit(cg_clustersfree(&acg), block);
    (gdb) list
    2330                    return;
    2331            }
    2332            /*
    2333             * update cluster allocation map
    2334             */
    2335            setbit(cg_clustersfree(&acg), block);
    2336
    (gdb) where
    #0  0x804c089 in updclst (block=-874) at growfs.c:2335
    #1  0x8049584 in updjcg (cylno=2, utime=1039185218, fsi=4, fso=3, Nflag=0)
        at growfs.c:862
    #2  0x8048280 in growfs (fsi=4, fso=3, Nflag=0) at growfs.c:219
    #3  0x804beb2 in main (argc=2, argv=0xbfbff7a4) at growfs.c:2213
    #4  0x8048135 in _start ()

Notice "block=-874" which indicates something is weird or corrupted.

So now I've got extra space in the partition which (apparently)
is not being used and I can't seem to get at it (see below).
Plus I have a sneaking suspicion that I've screwed up something,
but there's nothing in the growfs man page that indicates what
I did was wrong.

FYI, this is a test machine so it's OK if it gets hosed.

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

$ disklabel ad0s1
# /dev/ad0s1c:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 1860
sectors/unit: 29896902
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0		# milliseconds
track-to-track seek: 0	# milliseconds
drivedata: 0

8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:   204800        0    4.2BSD     1024  8192 32768	# (Cyl.    0 - 12*)
  b:   164608   204800      swap			# (Cyl.   12*- 22*)
  c: 29896902        0    unused        0     0		# (Cyl.    0 - 1860*)
  e:    40960   369408    4.2BSD     1024  8192    16	# (Cyl.   22*- 25*)
  f: 29486534   410368    4.2BSD     1024  8192    16	# (Cyl.   25*- 1860*)

$ df
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a     49583   36751     8866    81%    /
devfs               1       1        0   100%    /dev
/dev/ad0s1f  14289643 2794938 10351534    21%    /usr
/dev/ad0s1e     19815    3555    14675    20%    /var
procfs              4       4        0   100%    /proc
Re: backgroud fsck is still locking up system (fwd)
From: Archie Cobbs <[EMAIL PROTECTED]> Subject: Re: backgroud fsck is still locking up system (fwd) In-Reply-To: <[EMAIL PROTECTED]> To: Kirk McKusick <[EMAIL PROTECTED]> Date: Fri, 6 Dec 2002 15:23:36 -0800 (PST) CC: Archie Cobbs <[EMAIL PROTECTED]>, Nate Lawson <[EMAIL PROTECTED]>, [EMAIL PROTECTED] X-ASK-Info: Whitelist match Kirk McKusick wrote: > OK, it looks like my hypothesis on having a small number of buffers > and running out of them is the problem. I enclose below a patch which > should check for the problem arising and help to mitigate it. I > would appreciate you dropping it into your kernel and seeing if > it solves your problem. The fix is not ideal, but merely to see > if it solves this problem. If it does, I will figure out how to > do it properly. Thanks for your help. Yep, that fixes it. Now I just get the usual sluggishness while the background fsck runs (which is not too bad), but it eventually finishes and then all is well. Thanks, -Archie __ Archie Cobbs * Packet Design * http://www.packetdesign.com Thanks for verifying that the idea works. I will attempt to figure out how to do it correctly and submit a proposed fix. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
Adding a two minute delay before starting background fsck sounds like a very good idea to me. Please send me your suggested change. Kirk McKusick =-=-=-=-= Date: Fri, 6 Dec 2002 10:44:45 -0800 From: Brooks Davis <[EMAIL PROTECTED]> To: Nate Lawson <[EMAIL PROTECTED]> Cc: Kirk McKusick <[EMAIL PROTECTED]>, Archie Cobbs <[EMAIL PROTECTED]>, [EMAIL PROTECTED] Subject: Re: backgroud fsck is still locking up system (fwd) X-ASK-Info: Confirmed by User On Fri, Dec 06, 2002 at 10:27:10AM -0800, Nate Lawson wrote: > On Thu, 5 Dec 2002, Kirk McKusick wrote: > > Does the background fsck process continue to run, or does the whole > > system come to a halt? If the fsck process continues to run, what=20 > > happens when it eventually finishes? Is the system still dead, or=20 > > does it come back to life? If the system does not come back to life > > can you get me the output of `ps axl'? If not, can you break into > > the debugger and get a ps output? (You will need to have the DDB > > option specified in your config file). >=20 > Sorry for butting in. I think Archie is referring to bg fsck gaining an > unfair share of cpu due to it running due to IO completions. Last I > heard, we were waiting until after 5.0 to experiment with scheduler > changes to make it more fair. I have not seen any hard locks or other > problems with bg fsck after your commit. My experience is that, at least with my laptop (which has a very slow disk), bg fsck works OK, but starting applictions for the first time while fsck is running is _very_ painful. Even getty seems to have a hard time. I've found that adding a two minute delay before the fsck is sufficent to allow the system to finish starting up and for me to load X and my main applictions which lets me work while bg fsck is running. I posted a patch to add an optional delay in the rc scripts a while ago, but Kirk was going to re-enable the priority stuff soon so I didn't persue it. If there's intrest, I'll regenerate it and repost it. 
-- Brooks Any statement of the form "X is the one, true Y" is FALSE. PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4 To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
Thanks for reminding me about your userland change to background fsck. I have tried it out and concur that it is the right approach until we manage to get the general solution in the kernel. I suggest that you propose it to release engineering and if approved check it in. Kirk McKusick =-=-=-=-=-= To: Kirk McKusick <[EMAIL PROTECTED]> cc: Brooks Davis <[EMAIL PROTECTED]>, Nate Lawson <[EMAIL PROTECTED]>, Archie Cobbs <[EMAIL PROTECTED]>, [EMAIL PROTECTED] Subject: Re: backgroud fsck is still locking up system (fwd) In-Reply-To: Your message of "Fri, 06 Dec 2002 17:52:38 PST." <[EMAIL PROTECTED]> Date: Sat, 07 Dec 2002 14:26:39 + From: Ian Dowse <[EMAIL PROTECTED]> X-ASK-Info: Whitelist match In message <[EMAIL PROTECTED]>, Kirk McKusick wr ites: >Adding a two minute delay before starting background fsck >sounds like a very good idea to me. Please send me your >suggested change. BTW, I've been using a fsck_ffs modificaton for a while now that does something like the disabled kernel I/O slowdown, but from userland. It seems to help quite a lot in leaving some disk bandwidth for other processes. Waiting a while before starting the fsck seems like a good idea anyway though. Patch below (I think I posted an earlier version of this before). 
Ian Index: fsutil.c === RCS file: /dump/FreeBSD-CVS/src/sbin/fsck_ffs/fsutil.c,v retrieving revision 1.19 diff -u -r1.19 fsutil.c --- fsutil.c27 Nov 2002 02:18:57 - 1.19 +++ fsutil.c4 Dec 2002 02:16:28 - @@ -40,6 +40,7 @@ #endif /* not lint */ #include +#include #include #include #include @@ -62,7 +63,13 @@ #include "fsck.h" +static void slowio_start(void); +static void slowio_end(void); + long diskreads, totalreads; /* Disk cache statistics */ +struct timeval slowio_starttime; +int slowio_delay_usec = 1; /* Initial IO delay for background fsck */ +int slowio_pollcnt; int ftypeok(union dinode *dp) @@ -350,10 +357,15 @@ offset = blk; offset *= dev_bsize; + if (bkgrdflag) + slowio_start(); if (lseek(fd, offset, 0) < 0) rwerror("SEEK BLK", blk); - else if (read(fd, buf, (int)size) == size) + else if (read(fd, buf, (int)size) == size) { + if (bkgrdflag) + slowio_end(); return (0); + } rwerror("READ BLK", blk); if (lseek(fd, offset, 0) < 0) rwerror("SEEK BLK", blk); @@ -463,6 +475,39 @@ idesc.id_blkno = blkno; idesc.id_numfrags = frags; (void)pass4check(&idesc); +} + +/* Slow down IO so as to leave some disk bandwidth for other processes */ +void +slowio_start() +{ + + /* Delay one in every 8 operations by 16 times the average IO delay */ + slowio_pollcnt = (slowio_pollcnt + 1) & 7; + if (slowio_pollcnt == 0) { + usleep(slowio_delay_usec * 16); + gettimeofday(&slowio_starttime, NULL); + } +} + +void +slowio_end() +{ + struct timeval tv; + int delay_usec; + + if (slowio_pollcnt != 0) + return; + + /* Update the slowdown interval. */ + gettimeofday(&tv, NULL); + delay_usec = (tv.tv_sec - slowio_starttime.tv_sec) * 100 + + (tv.tv_usec - slowio_starttime.tv_usec); + if (delay_usec < 64) + delay_usec = 64; + if (delay_usec > 100) + delay_usec = 100; + slowio_delay_usec = (slowio_delay_usec * 63 + delay_usec) >> 6; } /* To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
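The core of the slowio patch above is a running average of how long a read takes, updated with weight 1/64 on each sampled operation and then used to size the sleep. The following is a standalone sketch of just that arithmetic, not the patch itself. Several numeric constants in the archived copy of the patch look truncated, so the upper clamp used here (`SLOWIO_MAX_USEC`) and the function names are assumptions; only the lower bound of 64 and the `(avg * 63 + sample) >> 6` update are taken from the patch text.

```c
#include <sys/time.h>

#define SLOWIO_MIN_USEC	64		/* lower clamp, as in the patch */
#define SLOWIO_MAX_USEC	1000000		/* assumed upper clamp for this sketch */

/* Microseconds elapsed between two gettimeofday() samples. */
long
elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	return ((long)(end->tv_sec - start->tv_sec) * 1000000L +
	    (long)(end->tv_usec - start->tv_usec));
}

/*
 * Fold one measured I/O delay into the running average.
 * The new sample gets weight 1/64 and the old average 63/64.
 */
int
slowio_update(int avg_usec, long sample_usec)
{
	if (sample_usec < SLOWIO_MIN_USEC)
		sample_usec = SLOWIO_MIN_USEC;
	if (sample_usec > SLOWIO_MAX_USEC)
		sample_usec = SLOWIO_MAX_USEC;
	return ((int)(((long)avg_usec * 63 + sample_usec) >> 6));
}
```

Because each sample moves the average by only 1/64 of the difference, one unusually slow read barely changes the sleep time, while a sustained slowdown (for example, another process competing for the disk) gradually lengthens it. That is the behavior the patch relies on to leave bandwidth for other processes.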
Re: backgroud fsck is still locking up system (fwd)
Date: Sat, 7 Dec 2002 11:07:23 -0800 (PST) From: Nate Lawson <[EMAIL PROTECTED]> To: Archie Cobbs <[EMAIL PROTECTED]> cc: [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: backgroud fsck is still locking up system (fwd) X-ASK-Info: Whitelist match On Fri, 6 Dec 2002, Archie Cobbs wrote: > Julian Elischer wrote: > > I put a copy of / in /usr > > then from the fixit, I mounted /usr as / and ran growfs from there.. > > the trick is to not do it while / is mounted. > > / wasn't mounted yet when I ran growfs: > > > > I ran growfs after booting single user mode but before mounting > > > any disks.. perhaps that caused it to not work. > > But it was the root partition and I was running in single user mode. > If that's a problem then the growfs man page should say so, or maybe > it should be more clear about what is meant by "mounted". growfs won't work with any mounted fs (even ro) because it needs to quiesce kenrel file ops and you can't do that from usermode (yet). I wonder if there might be some clever way to abuse snapshots to have this same effect (i.e. keep an open handle to the underlying fs cdev for growfs to use and then mount a snapshot of the fs over its own mountpoint for procs to use.) > In any case, running it from the fixit floppy didn't work either > (got a core dump), but that may be because it was already screwed up. > > So at minimum, there's a documentation bug (IMHO). I assume the superblock changes between 4 and 5 changed the ability to use 4.x growfs on 5.x ufs partitions. Also, does growfs need to be updated for ufs2? -Nate I have made the structural changes to growfs to make it work for UFS2, however, I have not done more than cursory testing. I would appreciate it if someone could try running it on various UFS2 filesystems to see if it works properly. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: backgroud fsck is still locking up system (fwd)
In theory the MNT_RELOAD command should reload all the filesystem metadata properly though this feature has not been tested with growfs. If anyone has the time to try it out and report back any problems, that would be appreciated. Kirk McKusick =-=-=-=-= From: Archie Cobbs <[EMAIL PROTECTED]> Subject: Re: backgroud fsck is still locking up system (fwd) In-Reply-To: <[EMAIL PROTECTED]> To: Bruce Evans <[EMAIL PROTECTED]> Date: Sun, 8 Dec 2002 17:03:43 -0800 (PST) CC: Archie Cobbs <[EMAIL PROTECTED]>, Kirk McKusick <[EMAIL PROTECTED]>, Julian Elischer <[EMAIL PROTECTED]>, [EMAIL PROTECTED], Thomas-Henning von Kamptz <[EMAIL PROTECTED]> X-ASK-Info: Whitelist match Bruce Evans wrote: > > > Er, it should be obvious that growfs can't reasonably work on the mounted > > > partitions. growfs.1 doesn't exist, but growfs.8 already has the warning > > > in a general form: > > > > > > Currently growfs can only enlarge unmounted file systems. Do not > > > try enlarging a mounted file system, your system may panic and you will > > > not be able to use the file system any longer... > > > > Well, I suspected that it might not work... but I would disagree that it > > was *obvious* that it would not work. This was before "mount" had been > > run, so / was supposedly mounted (?) read-only. > > Perhaps the unobvious point is that fsck could work. If the mount is r/w, > then neither growfs nor fsck can even open the partition r/w. fsck somehow > works in the case of a r/o root, but growfs apparently doesn't. I think > fsck depends on no other processes making (significant) vfs syscalls for > on the same partition while it is running (even r/o ones might be harmful > if they caused reads of metadata which might be inconsistent). Then when > fsck has finished it calls mount(... MNT_RELOAD...) to sync the metadata. > growfs doesn't do this, and even if it did it is not clear that it does > all the necessary syncing (growfs may change more or different metadata). 
> However, I think it does most of the necessary things. FYI, I submitted a bug/enhancement request to summarize this.. http://www.freebsd.org/cgi/query-pr.cgi?pr=46110 -Archie P.S. Why does submitting a bug now generate an email response from (and who the heck is) "ThinkHost Support" ?? __ Archie Cobbs * Packet Design * http://www.packetdesign.com To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Data corruption in soft updates?
Date: Mon, 9 Dec 2002 18:04:03 -0800 (PST) From: Nate Lawson <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] cc: [EMAIL PROTECTED] Subject: Data corruption in soft updates? X-ASK-Info: Whitelist match I rebuilt my kernel with today's current + the acpica-20021122 patch and rebooted. I use ufs1, no acls or special options other than SU (installed with DP1). Everything booted fine with some errors from acpi but as booting proceeded, I started getting kernel messages of "bad inode". I quickly rebooted to single user and ran fsck and got a huge set of errors. See this partial log (600KB gzipped): http://www.root.org/~nate/fsck.gz I didn't touch all those files (just booted and started getting errors) so I don't want to say "yes" to deleting them. Do I have to newfs/reinstall? Should I try using a superblock backup? -Nate It appears that you are getting all those errors (BAD block) because fsck thinks that your filesystem is smaller than it really is. If you do a dumpfs on the filesystem and check the size (about line 5), I expect that you will find that all those bad blocks exceed that size. It might be interesting to check one or more of the alternate blocks to see if they have a different size. If so, using an alternate should help. If not, then the question is why all those out of range blocks were allocated. Kirk McKusick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Panic with recent CURRENT (1 hour ago)
There was a problem with snapshots that lead to incomplete checking by background fsck which in turn could lead to the problem that you were seeing (i.e., repeated failures until fsck was run manually). This problem was fixed with version 1.54 of ufs/ffs/ffs_snapshot.c which was checked in on Dec 14, 2002. Please verify that you are running with this version. If you had this problem after that conversion please contact me directly so I can try and work out more of the details. Kirk McKusick =-=-=-=-=-= Date: Sat, 14 Dec 2002 21:47:20 +0100 (CET) From: Martin Blapp <[EMAIL PROTECTED]> To: Kirk McKusick <[EMAIL PROTECTED]> cc: <[EMAIL PROTECTED]> Subject: Panic with recent CURRENT (1 hour ago) X-ASK-Info: Confirmed by User Hi Kirk, Panic message was: "Block already free". I had to fsck -y manually, but nothing special was found and fixed. The machine rebooted over and over and paniced always at the same place. This shouln't happen I guess. #10 0xc02f055b in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:503 #11 0xc03f9a8a in ffs_blkfree (fs=0xcc27b800, devvp=0xcc284384, bno=370040, size=16384, inum=1400) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1771 #12 0xc0408fcf in indir_trunc (freeblks=0xcc586500, dbn=1481056, level=0, lbn=45068, countp=0xe6928c10) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2603 #13 0xc0408f94 in indir_trunc (freeblks=0xcc586500, dbn=1480064, level=1, lbn=4108, countp=0xe6928c10) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2599 #14 0xc0408a75 in handle_workitem_freeblocks (freeblks=0xcc586500, flags=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2469 #15 0xc0405c9a in process_worklist_item (matchmnt=0x0, flags=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:745 #16 0xc04059e0 in softdep_process_worklist (matchmnt=0x0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:624 #17 0xc034337e in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1749 #18 0xc02dcc14 in fork_exit (callout=0xc0343090 , arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:872 Martin Martin Blapp, <[EMAIL 
PROTECTED]> <[EMAIL PROTECTED]> -- ImproWare AG, UNIXSP & ISP, Zurlindenstrasse 29, 4133 Pratteln, CH Phone: +41 061 826 93 00: +41 61 826 93 01 PGP: PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E -- To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Data corruption in soft updates?
Please send me a `dumpfs /usr | head -50' output of the filesystem under the current system. Then clean it up with fsck and run the same command again. Finally, boot up under the old kernel and get the output both before and after fsck cleaning. What I am looking for is changes in the reported size of the filesystem because that getting out of sync is what is causing these problems. The basic deal is that the old UFS1 superblock stored the filesystem size in a 32-bit field. The new UFS1 superblock stores the filesystem size in a new (previously unused) 64-bit field. When you mount a UFS1 filesystem on a new kernel, it copies the 32-bit size field to the 64-bit field. At that point the filesystem size is in both places and should work equally well on old or new kernels. However, it does not update the 64-bit size field on any of the alternate superblocks. So, somehow, your using and copying an alternate into the standard location is losing the update done for the size field. I am not sure how that is happening, but I am hoping to catch where in all your messing around with alternates that is happening so I can cover that hole. Kirk McKusick =-=-=-=-=-= Date: Tue, 17 Dec 2002 12:14:12 -0800 (PST) From: Nate Lawson <[EMAIL PROTECTED]> To: Kirk McKusick <[EMAIL PROTECTED]> cc: [EMAIL PROTECTED] Subject: Re: Data corruption in soft updates? In-Reply-To: <[EMAIL PROTECTED]> X-ASK-Info: Whitelist match On Mon, 9 Dec 2002, Kirk McKusick wrote: > It appears that you are getting all those errors (BAD block) > because fsck thinks that your filesystem is smaller than it > really is. If you do a dumpfs on the filesystem and check > the size (about line 5), I expect that you will find that > all those bad blocks exceed that size. It might be interesting > to check one or more of the alternate blocks to see if they > have a different size. If so, using an alternate should help. > If not, then the question is why all those out of range blocks > were allocated. 
I booted an older kernel (Dec. 4) and ran fsck_ffs -b 32. It repaired a few simple errors (summary info bad). I then copied the alt sblock to the default location with dd. I reran fsck to make sure the sblock was copied correctly and it came up clean. Everything was fine. I rebooted into multiuser with the old kernel and everything worked fine. I did a full buildkernel with srcs as of yesterday at 5 pm without any bad block messages.

But after rebooting with that new kernel, it tried to correct the sblockloc again and my system started having the same problem again. uname and dmesg are below.

-Nate

FreeBSD 5.0-CURRENT #1: Mon Dec 16 18:05:56 PST 2002

/: correcting fs_sblockloc from 4 to 8192
bad block 1553167, ino 386832
/usr: optimization changed from TIME to SPACE
bad block 1553152, ino 387421
pid 42 (syncer), uid 0 inumber 387421 on /usr: bad block
bad block 1551181, ino 383169
pid 42 (syncer), uid 0 inumber 383169 on /usr: bad block
bad block 1632087, ino 383281
pid 42 (syncer), uid 0 inumber 383281 on /usr: bad block
bad block 1616355, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1623472, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551227, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1552592, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555160, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555208, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1550776, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551208, ino 383198
pid 42 (syncer), uid 0 inumber 383198 on /usr: bad block
bad block 1551209, ino 383241
pid 42 (syncer), uid 0 inumber 383241 on /usr: bad block
bad block 1553153, ino 387219
pid 42 (syncer), uid 0 inumber 387219 on /usr: bad block
bad block 1552704, ino 389415
pid 42 (syncer), uid 0 inumber 389415 on /usr: bad block
bad block 1552707, ino 390100
pid 42 (syncer), uid 0 inumber 390100 on /usr: bad block
bad block 1639665, ino 391119
pid 42 (syncer), uid 0 inumber 391119 on /usr: bad block
bad block 1553170, ino 39
pid 42 (syncer), uid 0 inumber 39 on /usr: bad block
bad block 1553431, ino 391118
pid 42 (syncer), uid 0 inumber 391118 on /usr: bad block
bad block 1553405, ino 391122
pid 42 (syncer), uid 0 inumber 391122 on /usr: bad block

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
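
For readers following the thread, here is a minimal sketch of the size-field promotion described in Kirk's message and of how restoring an alternate superblock could lose it. The structure, field names, and helper functions are hypothetical simplifications, not the real `struct fs` or kernel code, and the thread itself leaves open how the update actually gets lost in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified superblock: only the two size fields matter here. */
struct sb_sketch {
	int32_t old_size;	/* legacy 32-bit filesystem size */
	int64_t size;		/* newer 64-bit size field, unused by old kernels */
};

/* What a new kernel is described as doing at mount time for UFS1. */
static void
promote_size(struct sb_sketch *sb)
{
	assert(sb->old_size >= 0);
	sb->size = sb->old_size;
}

/* Models copying an alternate superblock over the primary (e.g. with dd). */
static void
restore_from_alternate(struct sb_sketch *primary, const struct sb_sketch *alt)
{
	*primary = *alt;
}
```

In this model the alternates are never promoted, so copying one over the primary brings back a 64-bit field that was never filled in, while the 32-bit field still looks correct.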
Re: panic: ffs_blkfree: freeing free block
> Date: Mon, 16 Dec 2002 22:42:07 -0600
> From: Dan Nelson <[EMAIL PROTECTED]>
> To: Aurelien Nephtali <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: panic: ffs_blkfree: freeing free block
>
> In the last episode (Dec 16), Aurelien Nephtali said:
> > Hi,
> >
> > I got a panic today which occurred during a background fsck, after a
> > hard-reboot of the system. The dump from gdb is attached and I can,
> > of course, provide more infos if needed.
>
> "Me too". My info attached as well; almost identical stack trace.
> Kernel was built from sources cvsupped just after 2002/12/15 17:41:07
> PST. (Why in the heck are all the timestamps in commitlogs in PST??)
>
> --
> Dan Nelson
> [EMAIL PROTECTED]

I introduced a bug to snapshots on 11/30/02 which did not get fixed until 12/15/02 which caused background fsck to (silently) fail to fix certain filesystem problems. If you ran background fsck on a system between 11/30 and 12/15 and then ran background fsck again on a system after that date, the earlier missed corruption causes the panic that you have seen. Once fixed on a post 12/15 system, it should not recur. You can avoid the panic by running `fsck -f -p' on all your systems after upgrading to a post 12/15 system. If you find continued evidence of trouble after following the above procedures, please send me mail.

	Kirk McKusick
Re: backgroud fsck is still locking up system (fwd)
Date: Mon, 9 Dec 2002 11:19:13 -0800
From: Brooks Davis <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
Cc: Brooks Davis <[EMAIL PROTECTED]>, Nate Lawson <[EMAIL PROTECTED]>,
    Archie Cobbs <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)

On Fri, Dec 06, 2002 at 05:52:38PM -0800, Kirk McKusick wrote:
> Adding a two minute delay before starting background fsck
> sounds like a very good idea to me. Please send me your
> suggested change.

Here it is. As written it doesn't add the delay, but you can change etc/defaults/rc.conf to do that if desired.

-- Brooks

I have added your suggested change to -current (6.0). I decided to set the default startup delay to sixty seconds as that seems to be enough time to let the initial system startup settle down. If this change proves to be popular, it can be considered for MFC'ing to 5.0.

	Kirk McKusick

=-=-=-=-=-=

From: Kirk McKusick <[EMAIL PROTECTED]>
Date: Tue, 17 Dec 2002 23:21:31 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/etc rc src/etc/defaults rc.conf src/etc/rc.d bgfsck src/share/man/man5 rc.conf.5
X-FreeBSD-CVS-Branch: HEAD

mckusick    2002/12/17 23:21:31 PST

  Modified files:
    etc                  rc
    etc/defaults         rc.conf
    etc/rc.d             bgfsck
    share/man/man5       rc.conf.5
  Log:
  Delay an optional amount of time after booting before starting
  a background fsck. The delay defaults to sixty seconds to allow
  large applications such as the X server to start before disk I/O
  bandwidth is monopolized by fsck.

  Submitted by:	Brooks Davis <[EMAIL PROTECTED]>
  Sponsored by:	DARPA & NAI Labs.

  Revision  Changes    Path
  1.165     +1 -0      src/etc/defaults/rc.conf
  1.324     +8 -2      src/etc/rc
  1.3       +13 -2     src/etc/rc.d/bgfsck
  1.168     +5 -0      src/share/man/man5/rc.conf.5
Re: panic: ffs_blkfree: freeing free block
I corrected a botched patch last night. Make sure that you are running with version 1.56 2002/12/18 07:19:41 of ufs/ffs/ffs_snapshot.c.

	Kirk McKusick

=-=-=-=-=

Date: Wed, 18 Dec 2002 11:43:25 +0100
From: Aurelien Nephtali <[EMAIL PROTECTED]>
To: Kirk McKusick <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: panic: ffs_blkfree: freeing free block
X-ASK-Info: Confirmed by User

> I introduced a bug to snapshots on 11/30/02 which did not get fixed
> until 12/15/02 which caused background fsck to (silently) fail to fix
> certain filesystem problems. If you ran background fsck on a system
> between 11/30 and 12/15 and then ran background fsck again on a system
> after that date, the earlier missed corruption causes the panic that
> you have seen. Once fixed on a post 12/15 system, it should not recur.
> You can avoid the panic by running `fsck -f -p' on all your systems
> after upgrading to a post 12/15 system. If you find continued
> evidence of trouble after following the above procedures, please
> send me mail.
>
> Kirk McKusick

I rebuilt a brand new system and the problem is still here :/.

uname -a:
FreeBSD nebula.wanadoo.fr 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Wed Dec 18 10:45:30 CET 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/NEBULA i386

I've also attached a new dump which matches to the new system.

-- Aurelien

--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=dump

Script started on Wed Dec 18 11:36:02 2002
nebula# gdb -k
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd".
(kgdb) symbol-file kernel.debug.7
Reading symbols from kernel.debug.7...done.
(kgdb) exec-file kernel.7
(kgdb) core-file vmcore.7
panic: from debugger
panic messages:
---
panic: ffs_blkfree: freeing free block
panic: from debugger
Uptime: 50s
Dumping 123 MB
ata0: resetting devices ..
done
 16 32 48 64 80 96 112
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:232
232		dumping++;
(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:232
#1  0xc021c37e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:364
#2  0xc021c5c3 in panic () at /usr/src/sys/kern/kern_shutdown.c:517
#3  0xc013c212 in db_panic () at /usr/src/sys/ddb/db_command.c:450
#4  0xc013c192 in db_command (last_cmdp=0xc03a0a00, cmd_table=0x0,
    aux_cmd_tablep=0xc039b53c, aux_cmd_tablep_end=0xc039b540)
    at /usr/src/sys/ddb/db_command.c:346
#5  0xc013c2a6 in db_command_loop () at /usr/src/sys/ddb/db_command.c:472
#6  0xc013ef9a in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:72
#7  0xc0335d42 in kdb_trap (type=3, code=0, regs=0xc850da4c)
    at /usr/src/sys/i386/i386/db_interface.c:166
#8  0xc0346b2f in trap (frame=
      {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -1061658176, tf_esi = 256, tf_ebp = -934225256, tf_isp = -934225288, tf_ebx = 0, tf_edx = 0, tf_ecx = -1069390144, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1070374940, tf_cs = 8, tf_eflags = 646, tf_esp = -1069984152, tf_ss = -1070066283})
    at /usr/src/sys/i386/i386/trap.c:603
#9  0xc0337558 in calltrap () at {standard input}:98
#10 0xc021c5ab in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:503
#11 0xc02d8f0a in ffs_blkfree (fs=0xc18f3000, devvp=0xc191dce4, bno=1088,
    size=16384, inum=1088) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1771
#12 0xc02e843f in indir_trunc (freeblks=0xc1b37500, dbn=4288, level=0, lbn=12,
    countp=0xc850dc10) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2600
#13 0xc02e7ee5 in handle_workitem_freeblocks (freeblks=0xc1b37500, flags=0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:2466
#14 0xc02e510a in process_worklist_item (matchmnt=0x0, flags=0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:742
#15 0xc02e4e50 in softdep_process_worklist (matchmnt=0x0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:621
#16 0xc026f89e in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1751
#17 0xc0208c64 in fork_exit (callout=0xc026f5b0 , arg=0x0,
    frame=0x0) at /usr/src/sys/kern/kern_fork.c:872
(kgdb) quit
nebula# exit

Script done on Wed Dec 18 11:36:32 20
Re: panic: ffs_blkfree: freeing free block
I have managed to panic my system on a hard reboot and now believe that I have found the problem on which you are faulting. I have checked in a fix to the head of the tree (sys/ufs/ffs/ffs_snapshot.c version 1.57). Let me know if it fixes your problem.

	Kirk McKusick
Re: enabling inode hashes results in kernel panics
As the person responsible for adding the inode check-hashes, sorry for the problems that they are causing you.

Gary, I may want your crash dumps and the core.txt files, but let me do some preliminary investigation of fsck to see if I can figure out why it is failing to fix the inode check-hashes. Are you running with soft updates or journaled soft updates? Is the problem that fsck is not finding that there are bad check-hashes, or is fsck finding them and not fixing them? Since fsck and fsdb share the same code for reading and updating inodes, it is odd that fsdb fixes the check-hashes, but the same code running in fsck does not.

	Kirk McKusick

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Panic in getblkx() booting from disc1.iso in Qemu VM
Thanks Rebecca for the report and Mark for the analysis of the problem. This should be fixed in -r342290.

	Kirk McKusick

=-=-=

From: Kirk McKusick
Date: Fri, 21 Dec 2018 01:09:25 +0000 (UTC)
To: src-committ...@freebsd.org, svn-src-...@freebsd.org, svn-src-h...@freebsd.org
Subject: svn commit: r342290 - head/sys/kern

Author: mckusick
Date: Fri Dec 21 01:09:25 2018
New Revision: 342290
URL: https://svnweb.freebsd.org/changeset/base/342290

Log:
  Some filesystems (like cd9660 and ext3) require that VFS_STATFS()
  be called before VFS_ROOT() is called. Move the call for VFS_STATFS()
  so that it is done after VFS_MOUNT(), but before VFS_ROOT().
  This change actually improves the robustness of the mount system
  call because it returns an error rather than failing silently
  when VFS_STATFS() returns failure.

  Reported by:	Rebecca Cran
  Sponsored by:	Netflix

Modified:
  head/sys/kern/vfs_mount.c

Modified: head/sys/kern/vfs_mount.c
==============================================================================
--- head/sys/kern/vfs_mount.c	Thu Dec 20 22:39:58 2018	(r342289)
+++ head/sys/kern/vfs_mount.c	Fri Dec 21 01:09:25 2018	(r342290)
@@ -895,6 +895,7 @@ vfs_domount_first(
 	 */
 	error1 = 0;
 	if ((error = VFS_MOUNT(mp)) != 0 ||
+	    (error1 = VFS_STATFS(mp, &mp->mnt_stat)) != 0 ||
 	    (error1 = VFS_ROOT(mp, LK_EXCLUSIVE, &newdp)) != 0) {
 		if (error1 != 0) {
 			error = error1;
@@ -916,7 +917,6 @@ vfs_domount_first(
 	vfs_freeopts(mp->mnt_opt);
 	mp->mnt_opt = mp->mnt_optnew;
 	*optlist = NULL;
-	(void)VFS_STATFS(mp, &mp->mnt_stat);
 
 	/*
 	 * Prevent external consumers of mount options from reading mnt_optnew.
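
The effect of moving the VFS_STATFS() call into the short-circuit chain can be tried outside the kernel with a small stand-alone sketch. The three step functions and the `steps` structure below are hypothetical stand-ins for VFS_MOUNT(), VFS_STATFS(), and VFS_ROOT(); the cleanup that the real vfs_domount_first() performs on failure is omitted.

```c
#include <assert.h>

/* Hypothetical stand-ins: each step returns 0 or an errno-style code. */
struct steps {
	int mount_rv;		/* result to report from the mount step */
	int statfs_rv;		/* result to report from the statfs step */
	int root_rv;		/* result to report from the root step */
	int root_called;	/* set once the root step has run */
};

static int do_mount(struct steps *s) { return (s->mount_rv); }
static int do_statfs(struct steps *s) { return (s->statfs_rv); }
static int do_root(struct steps *s) { s->root_called = 1; return (s->root_rv); }

/*
 * Same shape as the patched condition: statfs runs after mount and
 * before root, and its failure is returned instead of being discarded.
 */
static int
mount_first(struct steps *s)
{
	int error, error1;

	error1 = 0;
	if ((error = do_mount(s)) != 0 ||
	    (error1 = do_statfs(s)) != 0 ||
	    (error1 = do_root(s)) != 0) {
		if (error1 != 0)
			error = error1;
		return (error);
	}
	return (0);
}
```

Because `||` short-circuits, a failing statfs step stops the chain before the root step is attempted, which is the ordering the commit log asks for.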
Re: Checking out the CSRG repository?
> From: Alan Somers
> Date: Wed, 19 Jun 2019 14:12:21 -0600
> Subject: Checking out the CSRG repository?
> To: FreeBSD CURRENT
>
> Does anybody know how to check out a local copy of the CSRG
> repository? I can view it with ViewVC, but I would really like local
> access. It doesn't seem to be available on the usual repo.FreeBSD.org
> or svn.FreeBSD.org.
>
> $ svn checkout https://svn.FreeBSD.org/csrg csrg
> svn: E170013: Unable to connect to a repository at URL
> 'https://svn.freebsd.org/csrg'
> svn: E175009: The XML response contains invalid XML
> svn: E130003: Malformed XML: no element found at line 1
>
> $ svn co svn+ssh://asom...@repo.freebsd.org/csrg csrg
> svn: E170013: Unable to connect to a repository at URL
> 'svn+ssh://asom...@repo.freebsd.org/csrg'
> svn: E210005: No repository found in 'svn+ssh://asom...@repo.freebsd.org/csrg'
>
> -Alan

You can browse the history at

	http://svnweb.freebsd.org/csrg/

The repository is also available via FTP:

	ftp://ftp.freebsd.org/pub/FreeBSD/development/CSRG/csrg_svn.tbz

Hope this helps,

	Kirk McKusick
Re: CURRENT: supeblock hash failure - CURRENT wrecking disks
> To: Enji Cooper
> cc: "O. Hartmann" ,
>     freebsd-current , mckus...@mckusick.com
> Subject: Re: CURRENT: supeblock hash failure - CURRENT wrecking disks
> From: "Poul-Henning Kamp"
>
> In message <39fb31e6-a8ec-484c-b297-39c19a787...@gmail.com>, Enji Cooper writes:
>
> There is an "interesting" failure-mechanism when you move a disk
> between 13/current and older systems which do not support ufs-hashes.
>
> It will be prudent to make 11 and 12 clear the "use hashes" flags
> in the superblocks of all filesystems they mount R/W, to limit
> the amount of havoc this will cause when people start playing with 13.
>
> --
> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

Both stable-11 and stable-12 clear the "use hashes" flags. If the disk is moved back to a 13-head system they remain disabled until reenabled by running fsck in interactive mode and requesting that they be enabled.

	Kirk McKusick
Re: CURRENT: supeblock hash failure - CURRENT wrecking disks
> Date: Wed, 7 Aug 2019 10:37:29 +0200
> From: "O. Hartmann"
> To: freebsd-current
> Subject: CURRENT: supeblock hash failure - CURRENT wrecking disks
>
> Hello,
>
> Today I ran into a catastrophe with r350671. After installing a fresh
> compiled system and rebooted the box, UEFI loader dropped a bunch
> of errors, like some hex numbers stating, that a hash/superblock
> hash is wrong and then the booting stopped at the OK loader prompt.
>
> Rebooting the machine with the FreeBSD-13-CURRENT image from 1st
> August 2019 and trying to fsck the filesystem(s) on the boot SSD
> (UFS2, journaling and trim on), lots of unresolved block errors
> occurred. But that didn't help much. Further, after several checks,
> I saw some commits to the ffs code recently and tried to restore a
> copy of the superblock of each filesystem (in contrary to the man
> page for fsck_ufs, the first backup superblock resides in 192, not
> 160!). But things then get even worse, it seems the whole /boot
> structure is corrupted, the loader can not find the recent kernel
> and kernel.old is crashing.
>
> What's wrong here :-(
>
> The box in question has been setup 6 weeks ago with FreeBSD 13-CURRENT
> natively. It is now a wreck. Other systems running CURRENT (as of
> the most recent revision as of today) were partially installed as
> 12-STABLE/12-CURRENT and "moved on" to what is now 13-CURRENT. They
> do not(!) indicate such problems reported.
>
> Either I hit the crap installing a new system whilst there was a
> problem, or something really strange happened.
>
> The bad thing is that kernel.old exits/dies with an exception and
> /boot/kernel/ seems to be completely corrupted. Tomorrow I try to
> install a prepared pkg tar archive FreeBSD-kernel from a CURRENT pkg
> base and hope this will fix the issue.
>
> Regards,
>
> oh

The boot code checks the superblock hash and reports if it is wrong, but ignores the error and continues to try and boot. The reason to continue is to allow the system to come up so that the superblock check hash can be fixed by running fsck. So your filesystem had something more seriously wrong than just a bad superblock hash if it could not be booted.

The fix in r350671 was to recompute the superblock check hash in a place that I had missed earlier. I discovered the error when someone reported getting superblock check hash errors when booting. But that error did not cause their system to be unbootable for the reasons that I explained in the previous paragraph.

If the filesystem started on 12-stable, then moving to 13 would not have enabled superblock check hashes. They are only enabled when you run fsck manually and explicitly say yes to the request to add superblock check hashes. Running fsck -y will not add them; they are added only when you run fsck and explicitly respond yes to the superblock check hash addition request.

Filesystems created on 13 will get superblock check hashes. But if you boot a 13 filesystem using a 12-stable kernel, they will be disabled and left disabled even if you boot the filesystem on 13 again.

Thanks for pointing out the error on the fsck_ufs manual page. The first backup superblock moved from 160 to 192 when the default block size was raised from 16K to 32K. I have corrected the page in r350682.

	Kirk McKusick
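
The move from 160 to 192 can be reproduced with a little arithmetic. The sketch below is my reading of how newfs places the first alternate superblock, not a quote of its source: it assumes the area up to the end of the primary superblock is rounded up to whole fragments and then to a whole block, that a superblock occupies 8192 bytes, that the primary UFS2 superblock sits at byte offset 65536 (8192 for UFS1), that there are 8 fragments per block, and that fsck's -b option counts 512-byte sectors. The function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SBLOCKSIZE	8192	/* bytes reserved for a superblock */
#define SECTOR_SIZE	512	/* assumed unit of fsck's -b argument */

/*
 * Sector number of the first alternate superblock, given the byte
 * offset of the primary superblock, the fragment size in bytes, and
 * the number of fragments per block.
 */
static int64_t
first_alt_sblock_sector(int64_t sblockloc, int64_t fsize, int64_t frags_per_block)
{
	int64_t frags;

	assert(fsize > 0 && frags_per_block > 0);
	/* Fragments needed to cover everything through the primary superblock. */
	frags = (sblockloc + SBLOCKSIZE + fsize - 1) / fsize;
	/* Round up to a whole number of blocks. */
	frags = (frags + frags_per_block - 1) / frags_per_block * frags_per_block;
	return (frags * fsize / SECTOR_SIZE);
}
```

With a 16K block and 2K fragment size, UFS2 needs 36 fragments, rounded up to 40, which is sector 160. With 32K/4K it needs 18 fragments, rounded up to 24, which is sector 192. The same calculation gives 32 for UFS1 at 16K/2K.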
Re: fsync: giving up on dirty, umount -f fails
> From: "Bjoern A. Zeeb"
> To: "FreeBSD Current"
> Subject: fsync: giving up on dirty, umount -f fails
> Date: Thu, 24 Oct 2019 07:58:39 +
>
> Hi,
>
> I am archiving some old disks and while trying to umount [-f] them I am
> getting errors and I basically cannot get rid of the mount anymore
> without rebooting. This is on a HEAD from mid-end-August (around
> r351518M).
>
> Given there is a lot of work going on at the moment to deal with
> “disks dropping out by error” and not to panic I was just wondering
> if this is something to address as well? Somehow umount -f should be
> able to succeed (in the future)?
>
>
> fsync: giving up on dirty (error = 5)
> g_vfs_done():da0s2g[READ(offset=4666441728, length=16384)]error = 5
> 0xf803533b81e0: tag devfs, type VCHR
>     usecount 1, writecount 0, refcount 1661 rdev 0xf8015372a800
>     flags (VI_ACTIVE)
>     v_object 0xf80365537c00 ref 0 pages 8340 cleanbuf 1561 dirtybuf 97
>     lock type devfs: EXCL by thread 0xf80006a57000 (pid 26526, umount, tid 100091)
>     dev da0s2g
>
> /bz

In the above example the unmount is failing because it is getting back EIO for one of its dirty buffers. Thus it is not able to get everything written out, so refuses to do the unmount.

What we are working on doing is implementing a `very forcible' unmount (which I would love to specify using `umount -F', but regrettably -F is already in use to specify an alternate fstab file). A very forcible unmount says to simply abandon dirty buffers that it cannot write. In the event of a disk dying, that would be all of the dirty buffers.

	Kirk McKusick
Re: panic "ffs_checkblk: bad block" on recent -head kernels
> Date: Thu, 3 Dec 2015 23:47:52 +0100
> From: Mateusz Guzik
> To: Rick Macklem
> Cc: FreeBSD Current
> Subject: Re: panic "ffs_checkblk: bad block" on recent -head kernels
>
> On Thu, Dec 03, 2015 at 05:08:27PM -0500, Rick Macklem wrote:
>> Hi,
>>
>> I get a fairly reproducible panic when doing a full kernel build
>> on a 256Mbyte single core i386 when running recent kernels from -head.
>>
>> The panic is "ffs_checkblk: bad block ..". I don't actually have the
>> block # (although I think it's just 0xfff, given the backtrace),
>> because it runs off the screen. (I looked up the message via the debugger
>> from the first arg. to panic.)
>>
>> Here's the backtrace without all the numbers:
>> panic(c14f4b55, , , 0, 64,...)
>> ffs_checkblk(, 8000, fff9c, , c4a02454,...)
>> ffs_reallocblks
>> VOP_REALLOCBLKS_APV
>> cluster_write
>> ffs_write
>> VOP_WRITE_APV
>> vn_write
>> vn_io_fault_doio
>> vn_io_fault1
>> vn_io_fault
>> dofilewrite
>> kern_writev
>> sys_write
>> syscall
>>
>> It doesn't happen on a kernel dated Sep. 30, but does happen on a Nov. 30
>> one.
>> (I was away from home, so I didn't upgrade kernels for 2 months.)
>>
>> I am slowly doing a binary search for the first kernel rev. where it occurs,
>> but since each build takes hours, it's going to take a while;-).
>>
>> At this point, it doesn't appear to happen on r289278 (just before jeff@'s
>> buffer cache patch).
>> With kernels between r289279-->r290480, I get into the "R" state that
>> was fixed by r290481 before I get a crash.
>> I tried reverting r289405 and r290047 from a recent kernel and the crashes
>> still occurred, so it doesn't appear to be these commits.
>>
>> I am currently testing r290481 to see if the crash occurs for this rev.
>>
>> If anyone has some insight into which commit might cause this,
>> please let me know.
>
> Well, did it crash with r291460 or later?
>
> If so, try the kernel just before that and if that helps, try:
>
> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> index ff37de8..0ad6ef7 100644
> --- a/sys/kern/vfs_subr.c
> +++ b/sys/kern/vfs_subr.c
> @@ -2783,6 +2783,7 @@ _vdrop(struct vnode *vp, bool locked)
>  	vp->v_op = NULL;
>  #endif
>  	bzero(&vp->v_un, sizeof(vp->v_un));
> +	vp->v_lasta = vp->v_clen = vp->v_cstart = vp->v_lastw = 0;
>  	vp->v_iflag = 0;
>  	vp->v_vflag = 0;
>  	bo->bo_flag = 0;
>
> --
> Mateusz Guzik

I concur with trying this suggestion. Starting with r291460 these fields were no longer zero'ed when allocating the vnode. So you may have some residual values in there that are causing trouble.

	Kirk McKusick
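
To make the "residual values" point concrete, here is a toy model of a recycled object that carries clustering hints from its previous life unless they are cleared on release. The structure, the free list, and the `scrub` switch are all hypothetical; only the four field names are borrowed from the patch above, and nothing here is actual vnode code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a vnode: just the write-clustering hints. */
struct vn_sketch {
	long v_cstart;		/* start block of the current cluster */
	long v_lasta;		/* last allocation */
	long v_lastw;		/* last write */
	int v_clen;		/* length of the current cluster */
	struct vn_sketch *next_free;
};

static struct vn_sketch *free_list;

/* Put an object back on the free list, optionally clearing the hints. */
static void
vn_release(struct vn_sketch *vp, int scrub)
{
	if (scrub)
		vp->v_lasta = vp->v_clen = vp->v_cstart = vp->v_lastw = 0;
	vp->next_free = free_list;
	free_list = vp;
}

/* Take an object off the free list without touching its contents. */
static struct vn_sketch *
vn_reuse(void)
{
	struct vn_sketch *vp = free_list;

	if (vp != NULL)
		free_list = vp->next_free;
	return (vp);
}
```

If reuse no longer zeroes the structure, the only place left to guarantee clean hints is the release path, which is where the suggested one-line patch puts the assignment.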
Re: Followup on packaging base with pkg(8)
Glen,

I realize that you have put an enormous amount of effort into getting the packaging of base with pkg(8) into the 11.0 release and am sorry to hear that it needs to be delayed. But having watched the mailing lists during these efforts I realize that it is a much more difficult problem than it would at first appear to be. Thank you for your efforts to date and I look forward to the transition (hopefully in the 11.1 release) as I believe it will be a huge step forward.

	Kirk McKusick
Re: The futur of the roff toolchain
Thanks for all your work on this project. As I still use roff for our book and for many of my presentations, it is a topic of interest to me. That said, I am fine with roff dropping out of base as I can easily enough bring it in from ports. And I am curious to try using heirloom doctools on our book to see if it works. We do some pretty evil things with diversions, so I can easily believe that it will not work. But it would be great if it does work, because the groff in base has some bugs that are annoying to work around.

	~Kirk
Re: panic: ffs_blkfree_cg: freeing free block
> Date: Fri, 28 Oct 2011 11:16:59 +0200
> From: "deeptec...@gmail.com"
> To: freebsd-current@freebsd.org
> Subject: panic: ffs_blkfree_cg: freeing free block
>
> A panic occurred while I was ``rm -rf''ing a large file&directory tree
> (that I just created with untar) on an old drive that I have not used
> for a long time. Unfortunately I'm not 100% sure that the filesystem
> was clean when I mounted it today. Could that result in such a panic?
>
> I don't have the intermediate object files for the kernel; now I'm
> building the kernel again (from the appropriate, exact sources). That
> shouldn't harm debugging, should it? Meanwhile, I'll take any debug
> info requests, which I'll attempt to address shortly.

This panic happens when the free-block bitmap is corrupted. That can happen due to:

1) An unclean filesystem being mounted (though you should get a warning when you attempt to do this).

2) Bit-rot on the disk that is not checked for before mounting. This is typically only an issue for a disk that has been offline for a long time.

3) Write errors to the disk.

There have been no changes to the code that manages the filesystem bitmaps in decades (nearly three decades), so a software cause of this panic is unlikely to have been recently introduced. Hence, I would not spend a lot of time trying to get a backtrace, etc.

	Kirk McKusick
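
As an illustration of why a corrupted bitmap shows up as "freeing free block", here is a small sketch of the consistency check involved. The bitmap layout, names, and return convention are hypothetical; the real ffs_blkfree_cg() works on cylinder-group maps and panics rather than returning an error.

```c
#include <assert.h>

#define NBLOCKS	64

/* Hypothetical free map: one bit per block, a set bit means "free". */
struct cg_sketch {
	unsigned char free_map[NBLOCKS / 8];
};

static int
is_free(const struct cg_sketch *cg, unsigned bno)
{
	return ((cg->free_map[bno / 8] >> (bno % 8)) & 1);
}

/*
 * Release a block. Returns 0 on success, or -1 in the case where the
 * kernel would panic because the block is already marked free.
 */
static int
blk_free(struct cg_sketch *cg, unsigned bno)
{
	assert(bno < NBLOCKS);
	if (is_free(cg, bno))
		return (-1);	/* "freeing free block" */
	cg->free_map[bno / 8] |= (unsigned char)(1u << (bno % 8));
	return (0);
}
```

The check fires whenever the map already says a block is free at the moment a file that owns it is removed. A stray set bit from an unclean mount, bit-rot, or a failed write is enough to trigger it even though the code doing the freeing is correct.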
Re: dogfooding over in clusteradm land
Rather than changing BKVASIZE, I would try running the cvs2svn conversion on a 16K/2K filesystem and see if that sorts out the problem. If it does, it tells us that doubling the main block size and reducing the number of buffers by half is the problem. If that is the problem, then we will have to increase the KVM allocated to the buffer cache.

	Kirk McKusick
Re: FS hang when creating snapshots on a UFS SU+J setup
> Date: Mon, 9 Jan 2012 18:30:51 +0100
> From: Yamagi Burmeister
> To: j...@freebsd.org, mckus...@freebsd.org
> Cc: freebsd-current@freebsd.org, br...@bryce.net
> Subject: Re: FS hang when creating snapshots on a UFS SU+J setup
>
> Hello,
>
> I'm sorry to bother you, but you may not be aware of this thread and
> this problem. We are several people experiencing deadlocks, kernel
> panics and other problems when creating snapshots on file systems
> with SU+J. It would be nice to get some feedback, e.g. how can we
> help debugging and / or fixing this problem.
>
> Thank you,
> Yamagi

First step in debugging is to find out if the problem is SU+J specific. To find out, turn off SU+J but leave SU. This change is done by running:

	umount
	tunefs -j disable
	mount
	cd
	rm .sujournal

You may want to run `fsck -f' on the filesystem while you have it unmounted just to be sure that it is clean. Then run your snapshot request to see if it still fails. If it works, then we have narrowed the problem down to something related to SU+J. If it fails then we have a broader issue to deal with.

If you wish to go back to using SU+J after the test, you can reenable SU+J by running:

	umount
	tunefs -j enable
	mount

When responding to me, it is best to use my email as I tend to read it more regularly.

	Kirk McKusick
Re: SU+J on 9.1-RC2 ISO
> Date: Sun, 04 Nov 2012 21:13:36 +0900 (JST)
> To: freebsd-sta...@freebsd.org
> Subject: Re: SU+J on 9.1-RC2 ISO
> From: HATANO Tomomi
> Cc: j...@koitsu.org, b.smee...@ose.nl, fnwhiteh...@freebsd.org,
>     freebsd-current@freebsd.org
>
> Hi all.
>
> The point is:
>
> There is completely no way to take a snapshot of SU+J partition
> unless modify one's kernel.
>
> Whether some issue still exist or not,
> how about enabling snapshotting SU+J partition
> through sysctl variable?
>
> Would you mind to see patch attached?
>
> 1. Taking a snapshot of SU+J partition is controlled through sysctl variable.
>
> 2. Default to disable.
>    One who want to enable it should set the variable manually.
>
> 3. The default value in bsdinstall(8) may be left as is.
> --
> HATANO Tomomi.
>
> --- src/sys/ufs/ffs/ffs_snapshot.c.orig	2012-11-04 11:01:58.0 +0900
> +++ src/sys/ufs/ffs/ffs_snapshot.c	2012-11-04 11:13:32.0 +0900
> @@ -182,8 +182,10 @@
>   */
>  int dopersistence = 0;
>
> -#ifdef DEBUG
>  #include
> +int snapsuj = 0;
> +SYSCTL_INT(_debug, OID_AUTO, snapsuj, CTLFLAG_RW, &snapsuj, 0, "");
> +#ifdef DEBUG
>  SYSCTL_INT(_debug, OID_AUTO, dopersistence, CTLFLAG_RW, &dopersistence, 0, "");
>  static int snapdebug = 0;
>  SYSCTL_INT(_debug, OID_AUTO, snapdebug, CTLFLAG_RW, &snapdebug, 0, "");
> @@ -230,7 +232,7 @@
>  	 * At the moment, journaled soft updates cannot support
>  	 * taking snapshots.
>  	 */
> -	if (MOUNTEDSUJ(mp)) {
> +	if (MOUNTEDSUJ(mp) && (snapsuj == 0)) {
>  		vfs_mount_error(mp, "%s: Snapshots are not yet supported when "
>  		    "running with journaled soft updates", fs->fs_fsmnt);
>  		return (EOPNOTSUPP);

Snapshots are disabled when using SU+J for a reason. That reason is that the journal rollback when a snapshot is active on a filesystem DOES NOT WORK. It leaves your filesystem with duplicate blocks that can only be removed by manually running fsck and correcting the duplicate block entries by hand. If you need to use snapshots, then run with SU and not SU+J. When journal rollback properly handles snapshots, snapshots on SU+J will be enabled.

	Kirk McKusick
Re: [PATCH] Fix sbin/fsdb/fsdbutil.c for r247212
> Date: Sun, 24 Feb 2013 22:41:21 +0300
> Subject: Re: [PATCH] Fix sbin/fsdb/fsdbutil.c for r247212
> From: Sergey Kandaurov
> To: David Wolfskill
> Cc: curr...@freebsd.org, Kirk McKusick
>
> On 24 February 2013 19:25, David Wolfskill wrote:
>> On Sun, Feb 24, 2013 at 07:05:34AM -0800, David Wolfskill wrote:
>>> ...hine was:
>>> Simple patch attached; world is still building, but at least it got
>>> through the "make dependencies" phase this time.
>>> ...
>>
>> That was incomplete, as it didn't (also) address the change to
>> getdatablk().
>>
>> The attached patch actually made it through buildworld.
>>
>> Note that it is entirely possible that I erred in specifying
>> "BT_UNKNOWN" for the additional "type" argument.
>
> Hi David.
>
> Thank you for the proposed fix. I committed it with r247234.
> I'm not sure regarding BT_UNKNOWN value either. Well.. at least
> it should be not worse than it is now, and it should fix the build.
> I have not found any (regressive) changes in fsdb -d `blocks' output.
>
> --
> wbr,
> pluknet

Sorry, I am bad about keeping up with my mckus...@freebsd.org email. I do need to watch it right after making commits. I also had no idea that sbin/fsdb shared code with sbin/fsck_ffs. I really do need to get back in the habit of buildworlds before doing any commits.

All that said, the changes that you have made are correct. The type is only used for collecting statistics. So, if you do not know the type, using BT_UNKNOWN is correct. If there is ever a desire to collect type-of-I/O statistics in fsdb then that choice will need to be revisited. But, I doubt that type-of-I/O statistics are ever likely to be interesting in fsdb.

Kirk McKusick
Re: A PRIV_* flag for /dev/mem?
I pointed Robert and Pawel at your discussion on creating a new PRIV_KMEM and adding a check for it in memopen(). I am of the opinion that this is a good idea, but I am hoping that one of Robert or Pawel will comment since they are much more active in this area.

Kirk McKusick
Re: A PRIV_* flag for /dev/mem?
> Date: Sat, 15 Jun 2013 17:23:50 -0600
> From: Jamie Gritton
> To: FreeBSD Current
> CC: Kirk McKusick,
>     Konstantin Belousov,
>     Alexander Leidinger,
>     Pawel Jakub Dawidek,
>     Robert Watson
> Subject: Re: A PRIV_* flag for /dev/mem?
>
> On 05/20/13 16:56, Kirk McKusick wrote:
>> I pointed Robert and Pawel at your discussion on creating a new
>> PRIV_KMEM and adding a check for it in memopen(). I am of the opinion
>> that this is a good idea, but I am hoping that one of Robert or Pawel
>> will comment since they are much more active in this area.
>
> I suppose it's safe to say further comment isn't forthcoming. So with
> one vote for and one against (or at least questioning), I'll humbly
> leave it up to myself to be the tie-breaker :-).
>
> Here's a proposed patch. I separate kmem access into read and write, as
> I saw other similar splits in the priv list. Perhaps that's overkill,
> and I can use a single PRIV_KMEM instead of PRIV_KMEM_READ and
> PRIV_KMEM_WRITE.
>
> Perhaps this is an overreach, because PRIV_KMEM_READ is used where the
> default isn't root privilege: the file permission and expected usage are
> group kmem gets to read /dev/[k]mem. I'm not about to go hard-coding a
> gid into the kernel, so it seems the proper thing to do (not included in
> the patch) would be to allow PRIV_KMEM_READ by default. I thought there
> might already be such cases where the default is to allow, but no: this
> would be the first default-allow permission. So perhaps the best answer
> is not worry about that one, and only add PRIV_KMEM_WRITE (leaving reads
> controlled by file permission alone as they are now).
>
> - Jamie

With the correction for the error noted by Kostik, I concur with your proposed change.

Kirk McKusick
Re: Kernel crash during heavy disk access
> Date: Tue, 9 Jul 2013 18:29:01 -0700
> Subject: Re: Kernel crash during heavy disk access
> From: Adrian Chadd
> To: Benjamin Kaduk, Jeff Roberson,
>     Kirk McKusick
> Cc: Eric Camachat, curr...@freebsd.org
>
> Well, best to tell kirk and jeffr.
>
> Jeffr wrote the journaling stuff.
>
> .. but I thought they knew there's still problems?
>
> -adrian

Jeff has fixed all the journaling issues that we have some way of reproducing. We do still have some reports that there are "problems" but only a vague description and nothing that we can use to reproduce them on our systems.

One of the inherent characteristics of any type of journaling is that once it thinks that it has fixed something, it never goes back and checks it again later. So, if there is some inconsistency that gets into your filesystem through media error or an earlier journaling bug, it will stay there and continue to plague you until a full fsck is run to clean it up.

So, if you are getting filesystem related crashes, the first thing you should do is a full (fsck -f) check to make sure that you are starting from a clean state. After that, if you find that the journaling is not keeping it consistent, please send Jeff and me a report of what you are doing, what problems it creates, and most importantly a transcript of a run of `fsck_ffs -d' first using the journal and then a second time with a full check (fsck_ffs -f -d) so that we can try to analyse what is going wrong. Note that you need to run fsck_ffs explicitly because the fsck front end will not pass the -d (debug output) flag through to fsck_ffs.

Kirk McKusick

> On 9 July 2013 17:48, Benjamin Kaduk wrote:
>> On Tue, 9 Jul 2013, Adrian Chadd wrote:
>>
>>> On 9 July 2013 09:24, Eric Camachat wrote:
>>>>
>>>> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Try doing a full, non-journal fsck.
>>>>>
>>>>> -adrian
>>>>
>>>>
>>>> Thank you, it fixed the problem!
>>>> Does it mean journal didn't work?
>>>
>>>
>>> Yup :(
>>
>>
>> So, you are going to tell Kirk about it?
>>
>> -Ben
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
> Date: Tue, 3 May 2011 22:40:26 -0700
> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>     partition when filesystem full
> From: Garrett Cooper
> To: Jeff Roberson,
>     Marshall Kirk McKusick
> Cc: FreeBSD Current
>
> Hi Jeff and Dr. McKusick,
>     Ran into this panic when /usr ran out of space doing a make
> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
> after the filesystem ran out of space -- wasn't quite sure what it was
> doing at the time):
>
> ...
>
>     Let me know what other commands you would like for me to run in kgdb.
> Thanks,
> -Garrett

You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.

Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:

Index: ffs_softdep.c
===
--- ffs_softdep.c	(revision 221385)
+++ ffs_softdep.c	(working copy)
@@ -11380,7 +11380,8 @@
 			continue;
 		}
 		MNT_IUNLOCK(mp);
-		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
+		    curthread)) {
 			MNT_ILOCK(mp);
 			continue;
 		}

If you are running an 8.X system, hopefully you will be able to apply it.

Kirk McKusick
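The effect of adding LK_NOWAIT can be illustrated outside the kernel. The following is only a sketch of the general "try the lock, skip the item if it is busy" pattern, not the softdep code: the `demo_*` names and the use of a C11 `atomic_flag` as a stand-in for a vnode lock are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: a lock flag plus some state. */
struct demo_node {
	atomic_flag busy;	/* set while some holder owns the node */
	int cleaned;		/* set once the cleanup pass has visited it */
};

/* Non-blocking acquire: returns 1 on success, 0 if already held. */
int
demo_trylock(struct demo_node *n)
{
	return (!atomic_flag_test_and_set(&n->busy));
}

void
demo_unlock(struct demo_node *n)
{
	atomic_flag_clear(&n->busy);
}

/*
 * Visit every node, but never wait for one that is busy. A busy node
 * is skipped, much as the patched loop does "continue" when vget()
 * fails. Returns the number of nodes actually cleaned.
 */
size_t
demo_cleanup(struct demo_node *nodes, size_t count)
{
	size_t done = 0;

	assert(nodes != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (!demo_trylock(&nodes[i]))
			continue;	/* busy: skip rather than sleep */
		nodes[i].cleaned = 1;
		done++;
		demo_unlock(&nodes[i]);
	}
	return (done);
}
```

The trade-off is that a skipped node is simply not cleaned on this pass, so a caller that needs the space has to tolerate a partial result or retry later.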
Re: Reason why "nocache" option is not displayed in "mount"?
The issue has to do with how flags are defined in mount.h. Specifically there are the flags that are externally visible (prefixed with MNT_) and those that are for internal use (prefixed with MNTK_, the K standing for KERNEL). If it is desirable to have MNTK_NULL_NOCACHE visible, then it should be renamed to MNT_NULL_CACHE, added to MNT_VISFLAGMASK, and listed in MNTOPT_NAMES. It probably belongs in the set described as `Flags set by internal operations, but visible to the user.' With this change, it will be displayed by the mount command and show up in the statfs flags. Kirk McKusick
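The visibility rule described above can be sketched in a few lines of C. This is an illustration only, not the contents of mount.h: every `DEMO_*` value, the contents of the mask, and the option-name table are hypothetical stand-ins for MNT_VISFLAGMASK and MNTOPT_NAMES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag values; the real ones are defined in sys/mount.h. */
#define	DEMO_MNT_RDONLY		0x0001ULL
#define	DEMO_MNT_NOEXEC		0x0004ULL
#define	DEMO_MNT_NULL_NOCACHE	0x0100ULL	/* assumed user-visible flag */
#define	DEMO_MNT_INTERNAL	0x8000ULL	/* stands in for a kernel-only bit */

/* Only bits in this mask are ever reported to userland. */
#define	DEMO_MNT_VISFLAGMASK \
	(DEMO_MNT_RDONLY | DEMO_MNT_NOEXEC | DEMO_MNT_NULL_NOCACHE)

struct demo_opt {
	uint64_t flag;
	const char *name;
};

/* Stand-in for the name table that mount(8) prints from. */
static const struct demo_opt demo_optnames[] = {
	{ DEMO_MNT_RDONLY, "read-only" },
	{ DEMO_MNT_NOEXEC, "noexec" },
	{ DEMO_MNT_NULL_NOCACHE, "nocache" },
};

/* What a statfs-style call would copy out: visible bits only. */
uint64_t
demo_visible_flags(uint64_t mnt_flag)
{
	return (mnt_flag & DEMO_MNT_VISFLAGMASK);
}

/* Format the visible flags as "a, b, c"; returns the string length. */
size_t
demo_format_flags(uint64_t mnt_flag, char *buf, size_t len)
{
	uint64_t visible = demo_visible_flags(mnt_flag);
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0;
	    i < sizeof(demo_optnames) / sizeof(demo_optnames[0]); i++) {
		if ((visible & demo_optnames[i].flag) == 0)
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
		    used != 0 ? ", " : "", demo_optnames[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
	}
	return (used);
}
```

A flag that is missing from either the mask or the name table never reaches the printed list, which is why both additions are needed for the option to show up.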
Re: Reason why "nocache" option is not displayed in "mount"?
> Date: Sun, 10 Mar 2024 19:21:54 +0200
> From: Konstantin Belousov
> To: Kirk McKusick
> Cc: curr...@freebsd.org
> Subject: Re: Reason why "nocache" option is not displayed in "mount"?
>
> On Sun, Mar 10, 2024 at 01:53:05AM +, Kirk McKusick wrote:
>> The issue has to do with how flags are defined in mount.h.
>> Specifically there are the flags that are externally visible
>> (prefixed with MNT_) and those that are for internal use
>> (prefixed with MNTK_, the K standing for KERNEL). If it
>> is desirable to have MNTK_NULL_NOCACHE visible, then it
>> should be renamed to MNT_NULL_CACHE, added to MNT_VISFLAGMASK,
>> and listed in MNTOPT_NAMES. It probably belongs in the set
>> described as `Flags set by internal operations, but visible
>> to the user.' With this change, it will be displayed by
>> the mount command and show up in the statfs flags.
>
> There is no MNTK_NULL_NOCACHE flag in mnt_kern_flags.
>
> When userspace communicates the "cache" or "nocache" option to the
> VFS_MOUNT() op for nullfs, it passes plain C string using the nmount(2)
> system call. The strings are explicitly queried by nullfs_mount(), mixed
> with the "default" sysctl, and then the nullfs-mount specific data flag
> is set, in mp->mnt_data.null_flag.
>
> There is no space in the struct statfs for ABI extension.
> The getfsstat(2) system call cannot report arbitrary fs-specific options.
>
> If somebody wants to uniformly report fs-specific options, instead of
> scattered fs-specific hacks like MNT_SOFTDEP/MNT_GJOURNAL (UFS) and
> nfsstat -m (nfsclient), then some extension for nmount(2) is due,
> say MNT_QUERY_OP, which should be passed down to VFS_MOUNT() and back.

As you note there are some filesystem specific flags already in mnt_flag that get copied to the statfs f_flags field. My point is that the NOCACHE flag could be moved to mnt_flag and made visible in the f_flags field. While it is currently specific to nullfs, it might be useful to implement it in other filesystems.

Kirk McKusick
Re: VM images for 12.0-CURRENT showing checksum failed messages
> Date: Wed, 18 Oct 2017 16:40:22 +
> From: Glen Barber
> To: John Baldwin
> Cc: freebsd-current@freebsd.org, David Boyd,
>     "mckus...@mckusick.com"
> Subject: Re: VM images for 12.0-CURRENT showing checksum failed messages
>
> On Wed, Oct 18, 2017 at 09:28:40AM -0700, John Baldwin wrote:
>> On Wednesday, October 18, 2017 03:01:55 PM Glen Barber wrote:
>>> On Wed, Oct 18, 2017 at 07:49:00AM -0700, John Baldwin wrote:
>>>> On Tuesday, October 17, 2017 11:57:44 AM David Boyd wrote:
>>>>> The FreeBSD-12.0-CURRENT-amd64-20171012-r324542.vmdk image displays
>>>>> many checksum failed messages when booted. (see attachment).
>>>>>
>>>>> I think this started about 20170925.
>>>>>
>>>>> I have VirtualBox VM's running 10.4-STABLE, 11.1-STABLE and 12.0-
>>>>> CURRENT.
>>>>>
>>>>> Only the 12.0-CURRENT image exhibits this behavior.
>>>>>
>>>>> This is easily fixed by "fsck -y /" in single-user mode during the boot
>>>>> process.
>>>>>
>>>>> I can test any updates at almost any time.
>>>>
>>>> I wonder if the tool creating the snapshot images wasn't updated to
>>>> generate cg checksums when creating the initial filesystem. Glen,
>>>> do you know which tool (makefs or something else?) is used to
>>>> generate the UFS filesystem in VM images for snapshots?
>>>> (In this case it appears to be a .vmdk image)
>>>>
>>>
>>> mkimg(1) is used.
>>
>> Does makefs generate the UFS image fed into mkimg or does mkimg generate the
>> UFS partition itself?
>
> Sorry, I may have understated a bit.
>
> First, mdconfig(8) is used to create a md(4)-backed disk, onto which
> newfs(8) is run, followed by the installworld/installkernel targets.
>
> Next, mkimg(1) is used to feed the resultant md(4)-based .img
> filesystem (after umount(8)) to create the final output image.
>
> Glen

Glen,

Can you try running fsck on the md(4) disk after you do the unmount to see if it finds any problems (`fsck /dev/md0')? If that comes up clean (as it should), then I can investigate what it is about mkimg that causes problems. If fsck finds problems, then there is an issue in the base UFS infrastructure.

Kirk McKusick
Re: kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d
> From: "Chris H"
> Reply-To: bsd-li...@bsdforge.com
> To: "FreeBSD Current"
> Subject: kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d
> Date: Mon, 19 Feb 2018 14:18:15 -0800
>
> I'm seeing a number of messages like the following:
> kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d
>
> and was wondering if it's anything to be concerned with, or whether
> fsck(8) is fixing them.
> This began to happen when the power went out on a new install:
> FreeBSD dns0 12.0-CURRENT FreeBSD 12.0-CURRENT #0: Wed Dec 13 06:07:59 PST 2017
> root@dns0:/usr/obj/usr/src/amd64.amd64/sys/DNS0 amd64
> which hadn't yet been hooked up to the UPS.
> I performed an fsck in single user mode upon power-up. Which ended with the
> mount points being marked CLEAN. I was asked if I wanted to use the JOURNAL.
> I answered Y.
> FWIW the systems are UFS2 (ffs) have gpart labels, and were newfs'd thusly:
> newfs -U -j
>
> Thank you for all your time, and consideration.
>
> --Chris

This problem should have been fixed with this commit:

r328914 | mckusick | 2018-02-05 16:19:46 -0800 (Mon, 05 Feb 2018)

You need to update your kernel to get the fix.

Kirk McKusick
CFT: TRIM Consolidation on UFS/FFS filesystems
I have recently added TRIM consolidation support for the UFS/FFS filesystem. This feature consolidates large numbers of TRIM commands into a much smaller number of commands covering larger blocks of disk space. It is best described by the commit message:

Author: mckusick
Date: Sun Aug 19 16:56:42 2018
New Revision: 338056
URL: https://svnweb.freebsd.org/changeset/base/338056

Log:
  Add consolidation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.

  When deleting files on filesystems that are stored on flash-memory (solid-state) disk drives, the filesystem notifies the underlying disk of the blocks that it is no longer using. The notification allows the drive to avoid saving these blocks when it needs to flash (zero out) one of its flash pages. These notifications of no-longer-being-used blocks are referred to as TRIM notifications. In FreeBSD these TRIM notifications are sent from the filesystem to the drive using the BIO_DELETE command.

  Until now, the filesystem would send a separate message to the drive for each block of the file that was deleted. Each Gigabyte of file size resulted in over 3000 TRIM messages being sent to the drive. This burst of messages can overwhelm the drive's task queue causing multiple second delays for read and write requests.

  This implementation collects runs of contiguous blocks in the file and then consolidates them into a single BIO_DELETE command to the drive. The BIO_DELETE command describes the run of blocks as a single large block being deleted. Each Gigabyte of file size can result in as few as two BIO_DELETE commands and is typically less than ten. Though these larger BIO_DELETE commands take longer to run, they do not clog the drive task queue, so read and write commands can intersperse effectively with them.

  Though this new feature has been thoroughly reviewed and tested, it is being added disabled by default so as to minimize the possibility of disrupting the upcoming 12.0 release. It can be enabled by running ``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it. If no problems arise, we will consider requesting that it be enabled by default for 12.0.

  Reviewed by: kib
  Tested by: Peter Holm
  Sponsored by: Netflix

This support is off by default, but I am hoping that I can get enough testing to ensure that it (a) works and (b) is helpful enough that it will be reasonable to have it turned on by default in 12.0. The cutoff for turning it on by default in 12.0 is September 19th. So I am requesting your testing feedback in the near-term. Please let me know if you have managed to use it successfully (or not) and also if it provided any performance difference (good or bad).

To enable TRIM consolidation either use `sysctl vfs.ffs.dotrimcons=1' or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.

Everything you need to test TRIM consolidation is obtained by setting the above sysctl. However, if you want to collect statistics on how effectively the TRIM consolidation is working, the attached diff will allow you to easily get statistics on how the TRIM is going. Compile your kernel and the mount command. Note that if you do not do a buildworld, you will need to copy /sys/sys/mount.h to /usr/include/sys/mount.h to get the patched mount command to compile. Then run `mount -v' (or `mount -v | grep /mnt' to get just the statistics for /mnt).

Removing a 30MB file without TRIM consolidation:

/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 952 total blocks 7616)

While removing the same file with TRIM consolidation:

/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 3 total blocks 7616)

It also tracks pending blocks and pending files. These numbers are only printed out when they are non-zero. Here is an example running with soft updates right after a file has been rm'ed, but its blocks not yet released:

/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, pending blocks 7616, pending files 1)

Finally it tracks inflight BIO_DELETEs and total blocks represented by those inflight BIO_DELETEs. These numbers are also only printed out when they are non-zero. These statistics let you see how much of a backlog of BIO_DELETEs you have backed up at/in the disk drive and you can track how quickly they drain.

Kirk McKusick
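The core idea in the commit message, turning runs of contiguous freed blocks into one larger delete request, can be sketched as follows. This is a simplified illustration and not the code in sys/ufs/ffs/ffs_alloc.c: the `demo_extent` type and `demo_coalesce()` function are hypothetical, and the sketch assumes the freed block numbers arrive in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One consolidated delete request: a start block and a length. */
struct demo_extent {
	uint64_t start;
	uint64_t count;
};

/*
 * Coalesce an ascending list of freed block numbers into extents.
 * Each extent stands for a single BIO_DELETE-style request. Returns
 * the number of extents written to out, which is at most max_out.
 */
size_t
demo_coalesce(const uint64_t *blocks, size_t nblocks,
    struct demo_extent *out, size_t max_out)
{
	size_t nout = 0;

	assert(blocks != NULL || nblocks == 0);
	assert(out != NULL || max_out == 0);
	for (size_t i = 0; i < nblocks; i++) {
		/* Extend the current run if this block directly follows it. */
		if (nout > 0 &&
		    out[nout - 1].start + out[nout - 1].count == blocks[i]) {
			out[nout - 1].count++;
			continue;
		}
		if (nout == max_out)
			break;		/* no room for another extent */
		out[nout].start = blocks[i];
		out[nout].count = 1;
		nout++;
	}
	return (nout);
}
```

For example, the block list 10, 11, 12, 20, 21, 40 collapses into three extents covering 3, 2, and 1 blocks. The total number of blocks is unchanged and only the number of requests shrinks, which matches the mount output above, where the same 7616 blocks go from 952 TRIM commands to 3.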
Re: CFT: TRIM Consolidation on UFS/FFS filesystems
From: Kirk McKusick
To: FreeBSD Current, FreeBSD Filesystems
Subject: CFT: TRIM Consolidation on UFS/FFS filesystems
Date: Mon, 20 Aug 2018 12:40:56 -0700

Oops, forgot that attachments get stripped. Below are the diffs for gathering statistics. Sorry to those of you on Gmail for whom they will be mangled.

Kirk McKusick

=-=-=

Index: sbin/mount/mount.c
===
--- sbin/mount/mount.c	(revision 338054)
+++ sbin/mount/mount.c	(working copy)
@@ -686,6 +686,18 @@ prmount(struct statfs *sfp)
 			for (i = 0; i < sizeof(sfp->f_fsid); i++)
 				printf("%02x", ((u_char *)&sfp->f_fsid)[i]);
 		}
+		if (sfp->f_trim_total != 0 || sfp->f_trim_total_blks != 0)
+			(void)printf(", TRIM: total %ju total blocks %ju",
+			    (uintmax_t)sfp->f_trim_total,
+			    (uintmax_t)sfp->f_trim_total_blks);
+		if (sfp->f_trim_inflight != 0 || sfp->f_trim_inflight_blks != 0)
+			(void)printf(", TRIM: inflight %ju inflight blocks %ju",
+			    (uintmax_t)sfp->f_trim_inflight,
+			    (uintmax_t)sfp->f_trim_inflight_blks);
+		if (sfp->f_pendingblks != 0 || sfp->f_pendingfiles != 0)
+			(void)printf(", pending blocks %ju, pending files %ju",
+			    (uintmax_t)sfp->f_pendingblks,
+			    (uintmax_t)sfp->f_pendingfiles);
 	}
 	(void)printf(")\n");
 }
Index: sys/sys/mount.h
===
--- sys/sys/mount.h	(revision 338054)
+++ sys/sys/mount.h	(working copy)
@@ -85,7 +85,13 @@ struct statfs {
 	uint64_t f_asyncwrites;		/* count of async writes since mount */
 	uint64_t f_syncreads;		/* count of sync reads since mount */
 	uint64_t f_asyncreads;		/* count of async reads since mount */
-	uint64_t f_spare[10];		/* unused spare */
+	uint64_t f_trim_total;		/* count of TRIM ops since mount */
+	uint64_t f_trim_total_blks;	/* count of TRIM blocks since mount */
+	uint64_t f_trim_inflight;	/* count of TRIM ops in progress */
+	uint64_t f_trim_inflight_blks;	/* count of TRIM blocks in progress */
+	int64_t	 f_pendingblks;		/* pending free blocks */
+	int64_t	 f_pendingfiles;	/* pending free nodes */
+	uint64_t f_spare[4];		/* unused spare */
 	uint32_t f_namemax;		/* maximum filename length */
 	uid_t	  f_owner;		/* user that mounted the filesystem */
 	fsid_t	  f_fsid;		/* filesystem id */
Index: sys/ufs/ffs/ffs_vfsops.c
===
--- sys/ufs/ffs/ffs_vfsops.c	(revision 338081)
+++ sys/ufs/ffs/ffs_vfsops.c	(working copy)
@@ -1398,7 +1398,13 @@ ffs_statfs(mp, sbp)
 	sbp->f_bsize = fs->fs_fsize;
 	sbp->f_iosize = fs->fs_bsize;
 	sbp->f_blocks = fs->fs_dsize;
+	sbp->f_pendingblks = dbtofsb(fs, fs->fs_pendingblocks);
+	sbp->f_pendingfiles = fs->fs_pendinginodes;
 	UFS_LOCK(ump);
+	sbp->f_trim_total = ump->um_trim_total;
+	sbp->f_trim_total_blks = ump->um_trim_total_blks;
+	sbp->f_trim_inflight = ump->um_trim_inflight;
+	sbp->f_trim_inflight_blks = ump->um_trim_inflight_blks;
 	sbp->f_bfree = fs->fs_cstotal.cs_nbfree * fs->fs_frag +
 	    fs->fs_cstotal.cs_nffree + dbtofsb(fs, fs->fs_pendingblocks);
 	sbp->f_bavail = freespace(fs, fs->fs_minfree) +
Re: [PATCH] Typo in hastd.8 man page
> Date: Sat, 21 Dec 2019 12:03:47 -0800
> From: Steve Kargl
> To: freebsd-current@freebsd.org
> Subject: [PATCH] Typo in hastd.8 man page
>
> Patch should explain the issue.
>
> Index: hastd.8
> ===
> --- hastd.8	(revision 355983)
> +++ hastd.8	(working copy)
> @@ -44,7 +44,7 @@
>  daemon is responsible for managing highly available GEOM providers.
>  .Pp
>  .Nm
> -allows the transpaent storage of data on two physically separated machines
> +allows the transparent storage of data on two physically separated machines
>  connected over a TCP/IP network.
>  Only one machine (cluster node) can actively use storage provided by
>  .Nm .
>
> --
> Steve

Fixed in -r355995, thanks.

Kirk McKusick