On 18 Jul, Matthias Buelow wrote:
> Paul Mather <[EMAIL PROTECTED]> writes:
>
>> Why would that necessarily be more successful?  If the outstanding
>> buffers count is not reducing between time intervals, it is most likely
>> because there is some underlying hardware problem (e.g., a bad block).
>> If the count still persists in staying put, it likely means whatever the
>> hardware is doing to try and fix things (e.g., write reallocation) isn't
>> working, and so the kernel may as well give up.
>
> So the kernel is relying on guesswork whether the buffers are flushed
> or not...
>
>> You can enumerate the buffers and *try* to write them, but that doesn't
>> guarantee they will be written successfully any more than observing the
>> relative number left outstanding.
>
> That's rather nonsensical.  If I write each buffer synchronously (and
> wait for the disk's response) this is for sure a lot more reliable than
> observing changes in the number of remaining buffers.  I mean, where's
> the sense in the latter?  It would be analogous to, in userspace, having
> to monitor write(2) continuously over a given time interval and check
> whether the number it returns eventually reaches zero.  That's complete
> madness, imho.
During syncer shutdown, the numbers being printed are actually the number of vnodes that have dirty buffers.  The syncer walks the list of vnodes with dirty buffers and synchronously flushes each one to disk (modulo whatever write caching is done by the drive).  The reason that it monitors the number of dirty vnodes instead of just iterating once over the list is that with softupdates, flushing one vnode to disk can cause another vnode to be dirtied and put on the list, so it can take multiple passes to flush all the dirty vnodes.

It's normal to see this if the machine was at least moderately busy before being shut down.  The number of dirty vnodes will start off high, decrease rapidly at first, and then taper off to zero.  It is not unusual to see the number bounce from zero back into the low single digits a few times before stabilizing at zero and triggering the syncer termination code.

The syncer shutdown algorithm could definitely be improved to speed it up.  I didn't want it to push out too many vnodes at the start of the shutdown sequence, but later in the sequence the delay intervals could be shortened and more worklist buckets could be visited per interval to speed the shutdown.  One possible complication that I worry about is that new vnodes might not be added to the list synchronously, so if the syncer processes the worklist and shuts down too quickly it might miss vnodes that got added too late.

I've never seen a syncer shutdown timeout, though it could happen if either the underlying media became unwriteable or a process got wedged while holding a vnode lock.  In either case, it might never be possible to flush the dirty vnodes in question.  The old final sync code in boot() just iterated over the dirty buffers, and it was not unusual for it to get stuck on mutually dependent buffers.  I would see this quite frequently if I did a shutdown immediately after running mergemaster.
The final sync code would flush all but the last few buffers and eventually time out.  That problem was my motivation for adding the shutdown code to the syncer, so that the final sync code would hopefully have nothing left to do.  The final sync code also gets confused if you have any ext2 file systems mounted (even read-only): it times out waiting for the ext2 file system to release its private buffers, which only happens when the file system is unmounted.

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable