-------- In message <dc23d104-f5f3-4844-8638-4644dc9dd...@samsco.org>, Scott Long writes:
> Why is overloading EIO so bad? brelse() will call bdirty() when a
> BIO_WRITE command has failed with EIO. Calling bdirty() has the effect
> of retrying the I/O. This disregards the fact that disk drivers only
> return EIO when they’ve decided that the I/O cannot be retried. It has
> no termination condition for the retries, and will endlessly retry I/O
> in vain; I’ve seen this quite frequently.

The really annoying thing about this particular class of errors is that
if we propagated them up to the filesystems, very often things could be
relocated to different blocks and we would avoid the unnecessary
filesystem corruption.

The real fundamental deficiency is that we do not have a way to say
"give up if this bio cannot be completed in X time", which is what
people actually want.

That is surprisingly hard to provide. There are far too many corner
cases for me to enumerate them all, but let me just give one example:

Imagine you issue a deadlined write to a RAID5 thing. Three component
writes happen smoothly, but the last two fail the deadline, with no way
to predict how long it will take before they complete or fail.

* Does the bio write transaction fail?

* Does the bio write transaction time out?

* Do you attempt to complete the write to the RAID5?

* Where do you store a copy of the data if you do?

* What happens next time a read happens on this bio's extent?

Then for an encore, imagine it was a read bio: Three DMAs go smoothly,
two are outstanding and you don't know if/when they will complete/fail.

* If you fail or time out the bio, how do you "taint" the space being
  read into while the two remaining DMAs are outstanding?

* What if that space is mapped into userland?

* What if that space is being executed?

* What if one of the two outstanding DMAs later returns garbage?
My conclusion back when I did GEOM was that the only way to do something
like this sanely is to have a special GEOM do it for you, which always
allocates a temp-space:

    allocate temp buffer
    if (write)
        copy write data to temp buffer
    issue bio downwards on temp buffer
    if timeout
        park temp buffer until biodone
        return(timeout)
    if (read)
        copy temp buffer to read space
    return (ok/error)

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
_______________________________________________
freebsd-geom@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-geom
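For illustration only, the temp-buffer scheme from the pseudocode above can be sketched as a user-space C simulation. Nothing here is GEOM or kernel API: `deadline_io`, `lower_io_fn`, the `parked_*` helpers and the `fake_lower_*` functions are hypothetical names invented for this sketch, the "lower layer" is a plain function call that reports its own timeout, and `len` is assumed to be non-zero. One deliberate deviation is labeled in the comments: the read copy-back happens only on success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum io_dir { IO_READ, IO_WRITE };
enum io_status { IO_OK, IO_ERROR, IO_TIMEOUT };

/* Hypothetical lower layer: transfers to/from buf and reports a status. */
typedef enum io_status (*lower_io_fn)(enum io_dir dir, void *buf, size_t len);

/* Temp buffers whose I/O timed out; the lower layer may still touch them. */
struct parked_buf {
	void *buf;
	struct parked_buf *next;
};
static struct parked_buf *parked_list;

static size_t
parked_count(void)
{
	size_t n = 0;

	for (struct parked_buf *p = parked_list; p != NULL; p = p->next)
		n++;
	return (n);
}

/* Stand-in for "biodone": the late requests have finally finished. */
static void
parked_release_all(void)
{
	while (parked_list != NULL) {
		struct parked_buf *p = parked_list;

		parked_list = p->next;
		free(p->buf);
		free(p);
	}
}

static enum io_status
deadline_io(enum io_dir dir, void *data, size_t len, lower_io_fn lower)
{
	/* Allocate the parking slot up front so parking cannot fail later. */
	struct parked_buf *slot = malloc(sizeof(*slot));
	void *tmp = malloc(len);	/* assumes len > 0 */
	enum io_status st;

	if (slot == NULL || tmp == NULL) {
		free(slot);
		free(tmp);
		return (IO_ERROR);
	}
	if (dir == IO_WRITE)
		memcpy(tmp, data, len);
	st = lower(dir, tmp, len);	/* lower layer only ever sees tmp */
	if (st == IO_TIMEOUT) {
		/* Park tmp: the caller's memory was never exposed to the I/O. */
		slot->buf = tmp;
		slot->next = parked_list;
		parked_list = slot;
		return (IO_TIMEOUT);
	}
	/* Sketch choice: copy back only on success, not on error. */
	if (dir == IO_READ && st == IO_OK)
		memcpy(data, tmp, len);
	free(tmp);
	free(slot);
	return (st);
}

/* Fake lower layers used to exercise the sketch. */
static enum io_status
fake_lower_ok(enum io_dir dir, void *buf, size_t len)
{
	if (dir == IO_READ)
		memset(buf, 'R', len);
	return (IO_OK);
}

static enum io_status
fake_lower_timeout(enum io_dir dir, void *buf, size_t len)
{
	(void)dir; (void)buf; (void)len;
	return (IO_TIMEOUT);
}

static int
deadline_io_demo(void)
{
	char data[4] = { 'a', 'b', 'c', 'd' };

	/* A read that completes in time lands in the caller's space. */
	assert(deadline_io(IO_READ, data, sizeof(data), fake_lower_ok) == IO_OK);
	assert(data[0] == 'R' && data[3] == 'R');

	/* A timed-out read leaves the caller's space untouched... */
	memcpy(data, "abcd", sizeof(data));
	assert(deadline_io(IO_READ, data, sizeof(data), fake_lower_timeout) ==
	    IO_TIMEOUT);
	assert(memcmp(data, "abcd", sizeof(data)) == 0);

	/* ...and its temp buffer stays parked until completion. */
	assert(parked_count() == 1);
	parked_release_all();
	assert(parked_count() == 0);
	return (0);
}
```

The point of the extra copy is that a request which misses its deadline can be reported to the caller immediately, while any late DMA scribbles only on the parked buffer and never on memory the caller has since reused, mapped into userland, or executed. It does not answer the RAID5 questions above about what to do with the partially completed write.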