> On 07 Nov 2016, at 21:53, Jaromír Doleček <jaromir.dole...@gmail.com> wrote: > > 2016-11-07 12:11 GMT+01:00 J. Hannken-Illjes <hann...@eis.cs.tu-bs.de>: >> It gets used when a block of block pointers has been deallocated and >> brelsed, that is NOT written to disk. If we allow the deallocation of >> this block to fail and ufs_truncate_retry() runs ffs_truncate again >> it will read the block from disk and free already freed blocks. > > Good point. The same is however true for ALL the calls in the chain, > so then we should have to really use FORCE everywhere.
No. If one of the other calls (clearing one pointer in a block) fails we already bwrite() (if (error != 0 ...). In the wapbl case writing all blocks would result in transitions too big for the log. I tried it ... > One solution is to always bwrite() or bdwrite() back even fully zeroed > block for wapbl case instead of brelse(BC_INVAL). I think that for > fsck to reliably recover from crash within truncate, this might > actually be needed also for !wapbl case. No. The !wapbl case is safe. The block is either written after the copy (before we change it) or from ffs_truncate() when it sets "sync = 1". For wapbl case see above. > Another way how to solve this would be to try to register the > deallocation and on error bail out, before the diving. It would > require cancelling the registration if the diving call returns EAGAIN > however. > >> You mean this: >> >> error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb), >> (daddr_t)-1, level - 1, countp); >> - if (error) >> - goto out; >> + if (error == EAGAIN) >> + goto out; >> + else if (error && !allerror) >> + allerror = error; > > No, I mean the copy logic and big blocks with condition on > ip->i_ump->um_mountp->mnt_wapbl. The copy logic is here to prevent corruption beyond fsck capabilities and therefore we need in the !wapbl case. There IS a reason it was here since the beginning of time. Not sure if it would make sense to split it into ffs_indir_trunc() and ffs_indir_trunc_wapbl(). Possibly easier to understand but prone to errors fixed in only one of them. >> I don't understand ... do you want to split into two diffs? > > I prefer to split commits into incremental changes if reasonably > possible, so it's easier to review and revise. That's all. That's why > I prefer the fix for immediate corruption to go separately. My patch contains corruption issues only and it passes ATF and it passes my stress test which is a bit more than just some fsx. As -current currently corrupts file systems we should either fix it very soon or revert your changes completely. -- J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)