On Wed, 2013-07-24 at 13:34 +1000, NeilBrown wrote: > On Wed, 24 Jul 2013 04:02:15 +0100 Ben Hutchings <b...@decadent.org.uk> wrote: > > > Neil, does the report below sound like the bug you fixed with: > > > > commit 7bb23c4934059c64cbee2e41d5d24ce122285176 > > Author: NeilBrown <ne...@suse.de> > > Date: Tue Jul 16 16:50:47 2013 +1000 > > > > md/raid10: fix two problems with RAID10 resync. > > > > or any of the others you've recently fixed? > > Yes. The bug below sounds exactly like the one fixed by the commit above. > > NeilBrown
Thanks very much. Thomas, please test the attached patch, following the instructions at <http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official>. Ben. -- Ben Hutchings Once a job is fouled up, anything done to improve it makes it worse.
From: NeilBrown <ne...@suse.de> Date: Tue, 16 Jul 2013 16:50:47 +1000 Subject: md/raid10: fix two problems with RAID10 resync. Origin: https://git.kernel.org/linus/7bb23c4934059c64cbee2e41d5d24ce122285176 1/ When an different between blocks is found, data is copied from one bio to the other. However bv_len is used as the length to copy and this could be zero. So use r10_bio->sectors to calculate length instead. Using bv_len was probably always a bit dubious, but the introduction of bio_advance made it much more likely to be a problem. 2/ When preparing some blocks for sync, we don't set BIO_UPTODATE except on bios that we schedule for a read. This ensures that missing/failed devices don't confuse the loop at the top of sync_request write. Commit 8be185f2c9d54d6 "raid10: Use bio_reset()" removed a loop which set BIO_UPTDATE on all appropriate bios. So we need to re-add that flag. These bugs were introduced in 3.10, so this patch is suitable for 3.10-stable, and can remove a potential for data corruption. Cc: sta...@vger.kernel.org (3.10) Reported-by: Brassow Jonathan <jbras...@redhat.com> Signed-off-by: NeilBrown <ne...@suse.de> --- drivers/md/raid10.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -2075,11 +2075,17 @@ static void sync_request_write(struct md * both 'first' and 'i', so we just compare them. * All vec entries are PAGE_SIZE; */ - for (j = 0; j < vcnt; j++) + int sectors = r10_bio->sectors; + for (j = 0; j < vcnt; j++) { + int len = PAGE_SIZE; + if (sectors < (len / 512)) + len = sectors * 512; if (memcmp(page_address(fbio->bi_io_vec[j].bv_page), page_address(tbio->bi_io_vec[j].bv_page), - fbio->bi_io_vec[j].bv_len)) + len)) break; + sectors -= len/512; + } if (j == vcnt) continue; atomic64_add(r10_bio->sectors, &mddev->resync_mismatches); @@ -3386,6 +3392,7 @@ static sector_t sync_request(struct mdde if (bio->bi_end_io == end_sync_read) { md_sync_acct(bio->bi_bdev, nr_sectors); + set_bit(BIO_UPTODATE, &bio->bi_flags); generic_make_request(bio); } }
signature.asc
Description: This is a digitally signed message part