Re: [Lsf-pc] [LSF/MM ATTEND] Online Logical Head Depop and SMR disks chunked writepages
On Wed 24-02-16 01:53:24, Damien Le Moal wrote:
> 
> >On Tue 23-02-16 05:31:13, Damien Le Moal wrote:
> >> 
> >> >On 02/22/16 18:56, Damien Le Moal wrote:
> >> >> 2) Write back of dirty pages to SMR block devices:
> >> >>
> >> >> Dirty pages of a block device inode are currently processed using the
> >> >> generic_writepages function, which can be executed simultaneously
> >> >> by multiple contexts (e.g. sync, fsync, msync, sync_file_range, etc.).
> >> >> Since mutual exclusion of the dirty page processing is achieved only at
> >> >> the page level (page lock & page writeback flag), multiple processes
> >> >> executing a "sync" of overlapping block ranges over the same zone of
> >> >> an SMR disk can cause an out-of-LBA-order sequence of write requests
> >> >> to be sent to the underlying device. On a host-managed SMR disk, where
> >> >> sequential writing to disk zones is mandatory, this results in errors
> >> >> and makes it impossible for an application using raw sequential disk
> >> >> write accesses to be guaranteed successful completion of its write or
> >> >> fsync requests.
> >> >>
> >> >> Using the zone information attached to the SMR block device queue
> >> >> (introduced by Hannes), calls to the generic_writepages function can
> >> >> be made mutually exclusive on a per-zone basis by locking the zones.
> >> >> This guarantees sequential request generation for each zone and avoids
> >> >> write errors without any modification to the generic code implementing
> >> >> generic_writepages.
> >> >>
> >> >> This is but one possible solution for supporting SMR host-managed
> >> >> devices without any major rewrite of page cache management and
> >> >> write-back processing. The opinion of the audience regarding this
> >> >> solution, and a discussion of other potential solutions, would be
> >> >> greatly appreciated.
> >> >
> >> >Hello Damien,
> >> >
> >> >Is it sufficient to support filesystems like BTRFS on top of SMR drives,
> >> >or would you also like to see that filesystems like ext4 can use SMR
> >> >drives? In the latter case: the behavior of SMR drives differs so
> >> >significantly from that of other block devices that I'm not sure we
> >> >should try to support them directly from infrastructure like the page
> >> >cache. If we look e.g. at NAND SSDs, we see that the characteristics
> >> >of NAND do not match what filesystems expect (e.g. large erase blocks).
> >> >That is why every SSD vendor provides an FTL (Flash Translation Layer),
> >> >either inside the SSD or as a separate software driver. An FTL
> >> >implements a so-called LFS (log-structured filesystem). With what I know
> >> >about SMR, this technology also looks suitable for implementing an LFS.
> >> >Has it already been considered to implement an LFS driver for SMR
> >> >drives? That would make it possible for any filesystem to access an SMR
> >> >drive as any other block device. I'm not sure of this, but maybe it will
> >> >be possible to share some infrastructure with the LightNVM driver
> >> >(directory drivers/lightnvm in the Linux kernel tree). This driver
> >> >namely implements an FTL.
> >> 
> >> I totally agree with you that trying to support SMR disks by only
> >> modifying the page cache so that unmodified standard file systems like
> >> BTRFS or ext4 remain operational is not realistic at best, and more
> >> likely simply impossible. For this kind of use case, as you said, an FTL
> >> or a device mapper driver is much more suitable.
> >> 
> >> The case I am considering for this discussion is raw block device
> >> accesses by an application (writes from user space to /dev/sdxx). This
> >> is a very likely use case scenario for high-capacity SMR disks with
> >> applications like distributed object stores / key-value stores.
> >> 
> >> In this case, write-back of dirty pages in the block device file inode
> >> mapping is handled in fs/block_dev.c using the generic helper function
> >> generic_writepages. This does not guarantee the generation of the
> >> required sequential write pattern per zone necessary for host-managed
> >> disks. As I explained, aligning calls of this function to zone
> >> boundaries while locking the zones under write-back simply solves the
> >> problem (implemented and tested). This is of course only one possible
> >> solution. Pushing modifications deeper into the code or providing a
> >> "generic_sequential_writepages" helper function are other potential
> >> solutions that in my opinion are worth discussing, as other types of
> >> devices may also benefit in terms of performance (e.g. regular disk
> >> drives prefer sequential writes, and SSDs as well) and/or lighten the
> >> overhead on an underlying FTL or device mapper driver.
> >> 
> >> For a file system, an SMR compliant implementation of a file inode
> >> mapping writepages method should be provided by the file system itself
> >> as the sequentialit
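[Editorial note] As a rough illustration of the zone-serialized write-back idea described above, the sketch below clamps each generic_writepages call to one zone and holds a per-zone lock across it. The zone iteration and locking helpers (blk_zone_of, blk_next_zone, zone_start, zone_end, zone->lock) are hypothetical placeholders, not the interface proposed in this thread:

	/* Sketch only: serialize write-back per SMR zone. */
	static int blkdev_writepages_zoned(struct address_space *mapping,
					   struct writeback_control *wbc)
	{
		struct blk_zone *zone;
		int ret = 0;

		/* Walk the requested range one zone at a time. */
		for (zone = blk_zone_of(mapping, wbc->range_start);
		     zone && zone_start(zone) <= wbc->range_end;
		     zone = blk_next_zone(zone)) {
			struct writeback_control zone_wbc = *wbc;

			/* Clamp the write-back window to this zone. */
			zone_wbc.range_start = max(wbc->range_start, zone_start(zone));
			zone_wbc.range_end = min(wbc->range_end, zone_end(zone));

			/* One writer per zone keeps requests in LBA order. */
			mutex_lock(&zone->lock);
			ret = generic_writepages(mapping, &zone_wbc);
			mutex_unlock(&zone->lock);
			if (ret)
				break;
		}
		return ret;
	}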
[PATCH] osd: remove deadcode
The variable is_ver1 is always true and so OSD_CAP_LEN can never be
used.

Signed-off-by: Sudip Mukherjee
---
 drivers/scsi/osd/osd_initiator.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index d8a2b51..3b11aad 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -2006,9 +2006,8 @@ EXPORT_SYMBOL(osd_sec_init_nosec_doall_caps);
  */
 void osd_set_caps(struct osd_cdb *cdb, const void *caps)
 {
-	bool is_ver1 = true;
 	/* NOTE: They start at same address */
-	memcpy(&cdb->v1.caps, caps, is_ver1 ? OSDv1_CAP_LEN : OSD_CAP_LEN);
+	memcpy(&cdb->v1.caps, caps, OSDv1_CAP_LEN);
 }
 
 bool osd_is_sec_alldata(struct osd_security_parameters *sec_parms __unused)
--
1.9.1
[PATCH] imm: check parport_claim
parport_claim() can fail and we should be checking if we were able to
claim the port.

Signed-off-by: Sudip Mukherjee
---
 drivers/scsi/imm.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/imm.c b/drivers/scsi/imm.c
index f8b88fa..9164ce12 100644
--- a/drivers/scsi/imm.c
+++ b/drivers/scsi/imm.c
@@ -77,9 +77,10 @@ static void imm_wakeup(void *ref)
 
 	spin_lock_irqsave(&arbitration_lock, flags);
 	if (dev->wanted) {
-		parport_claim(dev->dev);
-		got_it(dev);
-		dev->wanted = 0;
+		if (parport_claim(dev->dev) == 0) {
+			got_it(dev);
+			dev->wanted = 0;
+		}
 	}
 	spin_unlock_irqrestore(&arbitration_lock, flags);
 }
--
1.9.1
[PATCH v2] osd: remove deadcode
The variable is_ver1 is always true and so OSD_CAP_LEN can never be
used.
Reported by Coverity.

Signed-off-by: Sudip Mukherjee
---

v2: Joe Perches asked to mention the tool used in the commit log.

 drivers/scsi/osd/osd_initiator.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index d8a2b51..3b11aad 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -2006,9 +2006,8 @@ EXPORT_SYMBOL(osd_sec_init_nosec_doall_caps);
  */
 void osd_set_caps(struct osd_cdb *cdb, const void *caps)
 {
-	bool is_ver1 = true;
 	/* NOTE: They start at same address */
-	memcpy(&cdb->v1.caps, caps, is_ver1 ? OSDv1_CAP_LEN : OSD_CAP_LEN);
+	memcpy(&cdb->v1.caps, caps, OSDv1_CAP_LEN);
 }
 
 bool osd_is_sec_alldata(struct osd_security_parameters *sec_parms __unused)
--
1.9.1
Re: [PATCH] imm: check parport_claim
Reviewed-by: Matthew R. Ochs
Re: [PATCH v2] osd: remove deadcode
Reviewed-by: Matthew R. Ochs
Re: [PATCH v2] osd: remove deadcode
On 02/24/2016 01:21 PM, Sudip Mukherjee wrote:
> The variable is_ver1 is always true and so OSD_CAP_LEN can never be
> used.
> Reported by Coverity.
>
> Signed-off-by: Sudip Mukherjee

ACK-by: Boaz Harrosh

Thanks

> ---
>
> v2: Joe Perches asked to mention the tool used in the commit log.
>
>  drivers/scsi/osd/osd_initiator.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
> index d8a2b51..3b11aad 100644
> --- a/drivers/scsi/osd/osd_initiator.c
> +++ b/drivers/scsi/osd/osd_initiator.c
> @@ -2006,9 +2006,8 @@ EXPORT_SYMBOL(osd_sec_init_nosec_doall_caps);
>   */
>  void osd_set_caps(struct osd_cdb *cdb, const void *caps)
>  {
> -	bool is_ver1 = true;
>  	/* NOTE: They start at same address */
> -	memcpy(&cdb->v1.caps, caps, is_ver1 ? OSDv1_CAP_LEN : OSD_CAP_LEN);
> +	memcpy(&cdb->v1.caps, caps, OSDv1_CAP_LEN);
>  }
>
>  bool osd_is_sec_alldata(struct osd_security_parameters *sec_parms __unused)
>
Re: dm-multipath test scripts
On Thu, Feb 18 2016 at 7:33pm -0500, Junichi Nomura wrote:

> Hi Mike,
>
> On 02/19/16 02:17, Mike Snitzer wrote:
> > Taking a step back:
> > These scripts don't belong in Documentation/device-mapper/mptest/ (or
> > anywhere in the kernel tree for that matter).
> >
> > I'd really prefer it if we could port your scripts over to the
> > device-mapper-test-suite, see:
> > https://github.com/jthornber/device-mapper-test-suite
>
> Yes, I agree such a project is a better place for this to live.

I was going to attempt porting your scripts to device-mapper-test-suite
but I'll have to come back to that (I have more important tasks at this
time).

So I've created a github repo for your scripts:
https://github.com/snitm/mptest

I'll let you know once I've ported to device-mapper-test-suite. But in
the meantime I'll take any changes you or others have to 'mptest'.

Thanks,
Mike
[PATCH 00/35 v4] separate operations from flags in the bio/request structs
The following patches begin to clean up the request->cmd_flags and
bio->bi_rw mess. We currently use cmd_flags to specify the operation,
attributes and state of the request. For bi_rw we use it for similar
info and also the priority, but then also have another bi_flags field
for state. At some point, we abused them so much we just made cmd_flags
64 bits, so we could add more.

The following patches separate the operation (read, write, discard,
flush, etc.) from cmd_flags/bi_rw.

This patchset was made against linux-next from today Feb 24 2016
(git tag next-20160224).

I put a git tree here:
https://github.com/mikechristie/linux-kernel.git

v4:
1. Rebased to current linux-next tree.

v3:
1. Used "=" instead of "|=" to setup bio bi_rw.
2. Removed __get_request cmd_flags compat code.
3. Merged initial dm related changes requested by Mike Snitzer.
4. Fixed ubd kbuild errors in flush related patches.
5. Fixed 80 char col issues in several patches.
6. Fixed issue with one of the btrfs patches where it looks like I
   reverted a patch when trying to fix a merge error.

v2:
1. Dropped arguments from submit_bio, and had callers setup the bio.
2. Added REQ_OP_FLUSH for request_fn users and renamed REQ_FLUSH to
   REQ_PREFLUSH for make_request_fn users.
3. Dropped bio/rq_data_dir functions, and added an op_is_write
   function instead.
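[Editorial note] For orientation before the individual patches: the net effect of the split, as it appears to a bio submitter in the conversions below, is roughly the following (a summary only, not an exact hunk from the series):

	/* before: operation and modifier flags mixed in one value */
	bio->bi_rw = WRITE | REQ_SYNC;
	submit_bio(WRITE | REQ_SYNC, bio);

	/* after: operation in bi_op, rq_flag_bits in bi_rw, no submit_bio argument */
	bio->bi_op = REQ_OP_WRITE;
	bio->bi_rw = REQ_SYNC;
	submit_bio(bio);

Code that only needs the data direction uses op_is_write() on the op field instead of the old bio/rq_data_dir helpers.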
[PATCH 05/35] fs: have ll_rw_block users pass in op and flags separately
From: Mike Christie This has ll_rw_block users pass in the operation and flags separately, so we can setup the bio->bi_op and bio-bi_rw flags. v2: 1. Fix for kbuild error in ll_rw_block comments. Signed-off-by: Mike Christie --- fs/buffer.c | 19 ++- fs/ext4/inode.c | 6 +++--- fs/ext4/namei.c | 3 ++- fs/ext4/super.c | 2 +- fs/gfs2/bmap.c | 2 +- fs/gfs2/meta_io.c | 4 ++-- fs/gfs2/quota.c | 2 +- fs/isofs/compress.c | 2 +- fs/jbd2/journal.c | 2 +- fs/jbd2/recovery.c | 4 ++-- fs/ocfs2/aops.c | 2 +- fs/ocfs2/super.c| 2 +- fs/reiserfs/journal.c | 8 fs/reiserfs/stree.c | 4 ++-- fs/reiserfs/super.c | 2 +- fs/squashfs/block.c | 4 ++-- fs/udf/dir.c| 2 +- fs/udf/directory.c | 2 +- fs/udf/inode.c | 2 +- fs/ufs/balloc.c | 2 +- include/linux/buffer_head.h | 2 +- 21 files changed, 40 insertions(+), 38 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 3492de4..5408ca6 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -588,7 +588,7 @@ void write_boundary_block(struct block_device *bdev, struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize); if (bh) { if (buffer_dirty(bh)) - ll_rw_block(WRITE, 1, &bh); + ll_rw_block(REQ_OP_WRITE, 0, 1, &bh); put_bh(bh); } } @@ -1395,7 +1395,7 @@ void __breadahead(struct block_device *bdev, sector_t block, unsigned size) { struct buffer_head *bh = __getblk(bdev, block, size); if (likely(bh)) { - ll_rw_block(READA, 1, &bh); + ll_rw_block(REQ_OP_READ, READA, 1, &bh); brelse(bh); } } @@ -1955,7 +1955,7 @@ int __block_write_begin(struct page *page, loff_t pos, unsigned len, if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh) && (block_start < from || block_end > to)) { - ll_rw_block(READ, 1, &bh); + ll_rw_block(REQ_OP_READ, 0, 1, &bh); *wait_bh++=bh; } } @@ -2852,7 +2852,7 @@ int block_truncate_page(struct address_space *mapping, if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh)) { err = -EIO; - ll_rw_block(READ, 1, &bh); + ll_rw_block(REQ_OP_READ, 0, 1, &bh); wait_on_buffer(bh); /* Uhhuh. Read error. Complain and punt. */ if (!buffer_uptodate(bh)) @@ -3052,7 +3052,8 @@ EXPORT_SYMBOL(submit_bh); /** * ll_rw_block: low-level access to block devices (DEPRECATED) - * @rw: whether to %READ or %WRITE or maybe %READA (readahead) + * @op: whether to %READ or %WRITE + * @op_flags: rq_flag_bits or %READA (readahead) * @nr: number of &struct buffer_heads in the array * @bhs: array of pointers to &struct buffer_head * @@ -3075,7 +3076,7 @@ EXPORT_SYMBOL(submit_bh); * All of the buffers must be for the same device, and must also be a * multiple of the current approved size for the device. 
*/ -void ll_rw_block(int rw, int nr, struct buffer_head *bhs[]) +void ll_rw_block(int op, int op_flags, int nr, struct buffer_head *bhs[]) { int i; @@ -3084,18 +3085,18 @@ void ll_rw_block(int rw, int nr, struct buffer_head *bhs[]) if (!trylock_buffer(bh)) continue; - if (rw == WRITE) { + if (op == WRITE) { if (test_clear_buffer_dirty(bh)) { bh->b_end_io = end_buffer_write_sync; get_bh(bh); - submit_bh(rw, 0, bh); + submit_bh(op, op_flags, bh); continue; } } else { if (!buffer_uptodate(bh)) { bh->b_end_io = end_buffer_read_sync; get_bh(bh); - submit_bh(rw, 0, bh); + submit_bh(op, op_flags, bh); continue; } } diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 8a91e98..d916f2c 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -838,7 +838,7 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode, return bh; if (!bh || buffer_uptodate(bh)) return bh; - ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh); + ll_rw_block(REQ_OP_READ, REQ_META | REQ_PRIO, 1, &bh); wait_on_buffer(bh); if (buffer_uptodate(bh)) return bh; @@ -992,7 +992,7 @@
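[Editorial note] In caller terms, the conversion pattern for ll_rw_block() users, taken from the ext4 hunk above, is simply to split the old combined rw value into the operation and its flags:

	/* before */
	ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh);

	/* after */
	ll_rw_block(REQ_OP_READ, REQ_META | REQ_PRIO, 1, &bh);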
[PATCH 32/35] block: shrink bi_rw and bi_op
From: Mike Christie There is no need for bi_op/op and bi_rw to be so large now, so this patch shrinks them. Signed-off-by: Mike Christie --- block/blk-core.c | 2 +- drivers/md/dm-flakey.c | 2 +- drivers/md/raid5.c | 13 +++-- fs/btrfs/check-integrity.c | 4 ++-- fs/btrfs/inode.c | 2 +- include/linux/bio.h| 13 ++--- include/linux/blk_types.h | 11 +++ include/linux/blkdev.h | 2 +- 8 files changed, 18 insertions(+), 31 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index bba1a69..5436c19 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1853,7 +1853,7 @@ static void handle_bad_sector(struct bio *bio) char b[BDEVNAME_SIZE]; printk(KERN_INFO "attempt to access beyond end of device\n"); - printk(KERN_INFO "%s: rw=%d,%ld, want=%Lu, limit=%Lu\n", + printk(KERN_INFO "%s: rw=%d,%u, want=%Lu, limit=%Lu\n", bdevname(bio->bi_bdev, b), bio->bi_op, bio->bi_rw, (unsigned long long)bio_end_sector(bio), diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c index b7341de..29b99fb 100644 --- a/drivers/md/dm-flakey.c +++ b/drivers/md/dm-flakey.c @@ -266,7 +266,7 @@ static void corrupt_bio_data(struct bio *bio, struct flakey_c *fc) data[fc->corrupt_bio_byte - 1] = fc->corrupt_bio_value; DMDEBUG("Corrupting data bio=%p by writing %u to byte %u " - "(rw=%c bi_rw=%lu bi_sector=%llu cur_bytes=%u)\n", + "(rw=%c bi_rw=%u bi_sector=%llu cur_bytes=%u)\n", bio, fc->corrupt_bio_value, fc->corrupt_bio_byte, (bio_data_dir(bio) == WRITE) ? 'w' : 'r', bio->bi_rw, (unsigned long long)bio->bi_iter.bi_sector, bio_bytes); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 0c53ca2..e9bc323 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1015,9 +1015,9 @@ again: : raid5_end_read_request; bi->bi_private = sh; - pr_debug("%s: for %llu schedule op %ld on disc %d\n", + pr_debug("%s: for %llu schedule op %d,%u on disc %d\n", __func__, (unsigned long long)sh->sector, - bi->bi_rw, i); + bi->bi_op, bi->bi_rw, i); atomic_inc(&sh->count); if (sh != head_sh) atomic_inc(&head_sh->count); @@ -1067,10 +1067,10 @@ again: rbi->bi_end_io = raid5_end_write_request; rbi->bi_private = sh; - pr_debug("%s: for %llu schedule op %ld on " + pr_debug("%s: for %llu schedule op %d,%u on " "replacement disc %d\n", __func__, (unsigned long long)sh->sector, - rbi->bi_rw, i); + rbi->bi_op, rbi->bi_rw, i); atomic_inc(&sh->count); if (sh != head_sh) atomic_inc(&head_sh->count); @@ -1102,8 +1102,9 @@ again: if (!rdev && !rrdev) { if (op_is_write(op)) set_bit(STRIPE_DEGRADED, &sh->state); - pr_debug("skip op %ld on disc %d for sector %llu\n", - bi->bi_rw, i, (unsigned long long)sh->sector); + pr_debug("skip op %d,%u on disc %d for sector %llu\n", +bi->bi_op, bi->bi_rw, i, +(unsigned long long)sh->sector); clear_bit(R5_LOCKED, &sh->dev[i].flags); set_bit(STRIPE_HANDLE, &sh->state); } diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c index d95c323..9dc4394 100644 --- a/fs/btrfs/check-integrity.c +++ b/fs/btrfs/check-integrity.c @@ -2941,7 +2941,7 @@ static void __btrfsic_submit_bio(struct bio *bio) if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bio(rw=%d,0x%lx, bi_vcnt=%u," + "submit_bio(rw=%d,0x%x, bi_vcnt=%u," " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n", bio->bi_op, bio->bi_rw, bio->bi_vcnt, (unsigned long long)bio->bi_iter.bi_sector, @@ -2984,7 +2984,7 @@ static void __btrfsic_submit_bio(struct bio *bio) if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bio(rw=
[PATCH 35/35] block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH
From: Mike Christie To avoid confusion between REQ_OP_FLUSH, which is handled by request_fn drivers, and upper layers requesting the block layer perform a flush sequence along with possibly a WRITE, this patch renames REQ_FLUSH to REQ_PREFLUSH. Signed-off-by: Mike Christie --- Documentation/block/writeback_cache_control.txt | 31 + Documentation/device-mapper/log-writes.txt | 10 block/blk-core.c| 12 +- block/blk-flush.c | 16 ++--- block/blk-mq.c | 4 ++-- drivers/block/drbd/drbd_actlog.c| 4 ++-- drivers/block/drbd/drbd_main.c | 2 +- drivers/block/drbd/drbd_protocol.h | 2 +- drivers/block/drbd/drbd_receiver.c | 2 +- drivers/block/drbd/drbd_req.c | 2 +- drivers/md/bcache/journal.c | 2 +- drivers/md/bcache/request.c | 8 +++ drivers/md/dm-cache-target.c| 12 +- drivers/md/dm-crypt.c | 7 +++--- drivers/md/dm-era-target.c | 4 ++-- drivers/md/dm-io.c | 2 +- drivers/md/dm-log-writes.c | 2 +- drivers/md/dm-raid1.c | 5 ++-- drivers/md/dm-region-hash.c | 4 ++-- drivers/md/dm-snap.c| 6 ++--- drivers/md/dm-stripe.c | 2 +- drivers/md/dm-thin.c| 8 +++ drivers/md/dm.c | 12 +- drivers/md/linear.c | 2 +- drivers/md/md.c | 2 +- drivers/md/md.h | 2 +- drivers/md/multipath.c | 2 +- drivers/md/raid0.c | 2 +- drivers/md/raid1.c | 3 ++- drivers/md/raid10.c | 2 +- drivers/md/raid5-cache.c| 2 +- drivers/md/raid5.c | 2 +- fs/btrfs/check-integrity.c | 8 +++ fs/jbd2/journal.c | 2 +- fs/xfs/xfs_buf.c| 2 +- include/linux/blk_types.h | 8 +++ include/linux/fs.h | 4 ++-- include/trace/events/f2fs.h | 2 +- kernel/trace/blktrace.c | 5 ++-- 39 files changed, 107 insertions(+), 102 deletions(-) diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt index ea5550f..9869f18 100644 --- a/Documentation/block/writeback_cache_control.txt +++ b/Documentation/block/writeback_cache_control.txt @@ -20,11 +20,11 @@ a forced cache flush, and the Force Unit Access (FUA) flag for requests. Explicit cache flushes -- -The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from +The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from the filesystem and will make sure the volatile cache of the storage device has been flushed before the actual I/O operation is started. This explicitly guarantees that previously completed write requests are on non-volatile -storage before the flagged bio starts. In addition the REQ_FLUSH flag can be +storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be set on an otherwise empty bio structure, which causes only an explicit cache flush without any dependent I/O. It is recommend to use the blkdev_issue_flush() helper for a pure cache flush. @@ -41,21 +41,21 @@ signaled after the data has been committed to non-volatile storage. Implementation details for filesystems -- -Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to +Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to worry if the underlying devices need any explicit cache flushing and how -the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags +the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags may both be set on a single bio. Implementation details for make_request_fn based block drivers -- -These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit +These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit directly below the submit_bio interface. 
For remapping drivers the REQ_FUA bits need to be propagated to underlying devices, and a global flush needs -to be implemented for bios with the REQ_FLUSH bit set. For real device -drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits -on non-empty bios can simply b
[PATCH 34/35] block: add QUEUE_FLAGs for flush and fua
From: Mike Christie The last patch added a REQ_OP_FLUSH for request_fn drivers and the next patch renames REQ_FLUSH to REQ_PREFLUSH which will be used by file systems and make_request_fn drivers. This leaves REQ_FLUSH/REQ_FUA defined for drivers to tell the block layer if flush/fua is supported. The names are confusing and I bet will will accidentally be used by people to request flushes. To avoid that, this patch adds QUEUE_FLAGs for flush and fua which drivers will use to indicate what they support. v2: 1. Fix kbuild failures. Forgot to update ubd driver. v3: 1. Rename dm_table_supports_flush callout function argument to callout_fn. Signed-off-by: Mike Christie --- arch/um/drivers/ubd_kern.c | 2 +- block/blk-core.c| 3 +- block/blk-flush.c | 12 block/blk-settings.c| 20 -- drivers/block/drbd/drbd_main.c | 3 +- drivers/block/loop.c| 2 +- drivers/block/mtip32xx/mtip32xx.c | 3 +- drivers/block/nbd.c | 6 ++-- drivers/block/osdblk.c | 2 +- drivers/block/ps3disk.c | 2 +- drivers/block/skd_main.c| 3 +- drivers/block/virtio_blk.c | 4 +-- drivers/block/xen-blkback/xenbus.c | 2 +- drivers/block/xen-blkfront.c| 55 ++--- drivers/ide/ide-disk.c | 6 ++-- drivers/md/bcache/super.c | 4 +-- drivers/md/dm-table.c | 32 + drivers/md/md.c | 3 +- drivers/md/raid5-cache.c| 3 +- drivers/mmc/card/block.c| 3 +- drivers/mtd/mtd_blkdevs.c | 2 +- drivers/nvme/host/core.c| 6 ++-- drivers/scsi/sd.c | 13 + drivers/target/target_core_iblock.c | 6 ++-- include/linux/blkdev.h | 6 ++-- 25 files changed, 108 insertions(+), 95 deletions(-) diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index a7dc382..44380d6 100644 --- a/arch/um/drivers/ubd_kern.c +++ b/arch/um/drivers/ubd_kern.c @@ -862,7 +862,7 @@ static int ubd_add(int n, char **error_out) goto out; } ubd_dev->queue->queuedata = ubd_dev; - blk_queue_flush(ubd_dev->queue, REQ_FLUSH); + queue_flag_set_unlocked(QUEUE_FLAG_FLUSH, ubd_dev->queue); blk_queue_max_segments(ubd_dev->queue, MAX_SG); err = ubd_disk_register(UBD_MAJOR, ubd_dev->size, n, &ubd_gendisk[n]); diff --git a/block/blk-core.c b/block/blk-core.c index 5436c19..8640b35 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1968,7 +1968,8 @@ generic_make_request_checks(struct bio *bio) * drivers without flush support don't have to worry * about them. 
*/ - if ((bio->bi_rw & (REQ_FLUSH | REQ_FUA)) && !q->flush_flags) { + if ((bio->bi_rw & (REQ_FLUSH | REQ_FUA)) && + !(blk_queue_flush(q) || blk_queue_fua(q))) { bio->bi_rw &= ~(REQ_FLUSH | REQ_FUA); if (!nr_sectors) { err = 0; diff --git a/block/blk-flush.c b/block/blk-flush.c index 070d7c7..e07ca6c 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -95,17 +95,18 @@ enum { static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq); -static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq) +static unsigned int blk_flush_policy(struct request *rq) { + struct request_queue *q = rq->q; unsigned int policy = 0; if (blk_rq_sectors(rq)) policy |= REQ_FSEQ_DATA; - if (fflags & REQ_FLUSH) { + if (blk_queue_flush(q)) { if (rq->cmd_flags & REQ_FLUSH) policy |= REQ_FSEQ_PREFLUSH; - if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA)) + if (!blk_queue_fua(q) && (rq->cmd_flags & REQ_FUA)) policy |= REQ_FSEQ_POSTFLUSH; } return policy; @@ -385,8 +386,7 @@ static void mq_flush_data_end_io(struct request *rq, int error) void blk_insert_flush(struct request *rq) { struct request_queue *q = rq->q; - unsigned int fflags = q->flush_flags; /* may change, cache */ - unsigned int policy = blk_flush_policy(fflags, rq); + unsigned int policy = blk_flush_policy(rq); struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx); /* @@ -394,7 +394,7 @@ void blk_insert_flush(struct request *rq) * REQ_FLUSH and FUA for the driver. */ rq->cmd_flags &= ~REQ_FLUSH; - if (!(fflags & REQ_FUA)) + if (!blk_queue_fua(q)) rq->cmd_flags &= ~REQ_FUA; /* diff --git a/block/blk-settings.c b/block/blk-settings.c index c7bb666..77dd6da 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -820,26 +820,6 @@ void blk_queue_update_dma_alignment(struct request_queue
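[Editorial note] Driver-side, the capability advertisement described in this commit message changes roughly as follows. QUEUE_FLAG_FLUSH and queue_flag_set_unlocked() appear verbatim in the ubd hunk above; the QUEUE_FLAG_FUA spelling is inferred from the blk_queue_fua() test and should be treated as an assumption:

	/* before: flush/FUA support requested through blk_queue_flush() */
	blk_queue_flush(q, REQ_FLUSH | REQ_FUA);

	/* after: the driver sets queue flags instead */
	queue_flag_set_unlocked(QUEUE_FLAG_FLUSH, q);
	queue_flag_set_unlocked(QUEUE_FLAG_FUA, q);	/* flag name assumed */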
[PATCH 31/35] block, fs: remove old REQ definitions.
From: Mike Christie We no longer use REQ_WRITE. REQ_WRITE_SAME and REQ_DISCARD, so this patch removes them. Signed-off-by: Mike Christie --- include/linux/blk_types.h | 21 ++--- include/linux/fs.h | 21 +++-- include/trace/events/f2fs.h | 1 - 3 files changed, 17 insertions(+), 26 deletions(-) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 6e49c91..b4251ed 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -151,7 +151,6 @@ struct bio { */ enum rq_flag_bits { /* common flags */ - __REQ_WRITE,/* not set, read. set, write */ __REQ_FAILFAST_DEV, /* no driver retries of device errors */ __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */ __REQ_FAILFAST_DRIVER, /* no driver retries of driver errors */ @@ -159,9 +158,7 @@ enum rq_flag_bits { __REQ_SYNC, /* request is sync (sync write or read) */ __REQ_META, /* metadata io request */ __REQ_PRIO, /* boost priority in cfq */ - __REQ_DISCARD, /* request to discard sectors */ - __REQ_SECURE, /* secure discard (used with __REQ_DISCARD) */ - __REQ_WRITE_SAME, /* write same block many times */ + __REQ_SECURE, /* secure discard (used with REQ_OP_DISCARD) */ __REQ_NOIDLE, /* don't anticipate more IO after this one */ __REQ_INTEGRITY,/* I/O includes block integrity payload */ @@ -197,28 +194,22 @@ enum rq_flag_bits { __REQ_NR_BITS, /* stops here */ }; -#define REQ_WRITE (1ULL << __REQ_WRITE) #define REQ_FAILFAST_DEV (1ULL << __REQ_FAILFAST_DEV) #define REQ_FAILFAST_TRANSPORT (1ULL << __REQ_FAILFAST_TRANSPORT) #define REQ_FAILFAST_DRIVER(1ULL << __REQ_FAILFAST_DRIVER) #define REQ_SYNC (1ULL << __REQ_SYNC) #define REQ_META (1ULL << __REQ_META) #define REQ_PRIO (1ULL << __REQ_PRIO) -#define REQ_DISCARD(1ULL << __REQ_DISCARD) -#define REQ_WRITE_SAME (1ULL << __REQ_WRITE_SAME) #define REQ_NOIDLE (1ULL << __REQ_NOIDLE) #define REQ_INTEGRITY (1ULL << __REQ_INTEGRITY) #define REQ_FAILFAST_MASK \ (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER) #define REQ_COMMON_MASK \ - (REQ_WRITE | REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | \ -REQ_DISCARD | REQ_WRITE_SAME | REQ_NOIDLE | REQ_FLUSH | REQ_FUA | \ -REQ_SECURE | REQ_INTEGRITY) + (REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | REQ_NOIDLE | \ +REQ_FLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY) #define REQ_CLONE_MASK REQ_COMMON_MASK -#define BIO_NO_ADVANCE_ITER_MASK (REQ_DISCARD|REQ_WRITE_SAME) - /* This mask is used for both bio and request merge checking */ #define REQ_NOMERGE_FLAGS \ (REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_FLUSH | REQ_FUA | REQ_FLUSH_SEQ) @@ -250,9 +241,9 @@ enum rq_flag_bits { enum req_op { REQ_OP_READ, - REQ_OP_WRITE= REQ_WRITE, - REQ_OP_DISCARD = REQ_DISCARD, - REQ_OP_WRITE_SAME = REQ_WRITE_SAME, + REQ_OP_WRITE, + REQ_OP_DISCARD, /* request to discard sectors */ + REQ_OP_WRITE_SAME, /* write same block many times */ }; typedef unsigned int blk_qc_t; diff --git a/include/linux/fs.h b/include/linux/fs.h index 0a10de0..7b57bb3 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -153,9 +153,10 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate); #define CHECK_IOVEC_ONLY -1 /* - * The below are the various read and write types that we support. Some of + * The below are the various read and write flags that we support. Some of * them include behavioral modifiers that send information down to the - * block layer and IO scheduler. Terminology: + * block layer and IO scheduler. They should be used along with a req_op. 
+ * Terminology: * * The block layer uses device plugging to defer IO a little bit, in * the hope that we will see more IO very shortly. This increases @@ -194,19 +195,19 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate); * non-volatile media on completion. * */ -#define RW_MASKREQ_WRITE +#define RW_MASKREQ_OP_WRITE #define RWA_MASK REQ_RAHEAD -#define READ 0 +#define READ REQ_OP_READ #define WRITE RW_MASK #define READA RWA_MASK -#define READ_SYNC (READ | REQ_SYNC) -#define WRITE_SYNC (WRITE | REQ_SYNC | REQ_NOIDLE) -#define WRITE_ODIRECT (WRITE | REQ_SYNC) -#define WRITE_FLUSH(WRITE | REQ_SYNC | REQ_NOIDLE | REQ_FLUSH) -#define WRITE_FUA (WRITE | REQ_SYNC
[PATCH 30/35] block, fs, drivers: do not test bi_rw for REQ_OPs
From: Mike Christie We no longer use the bio->bi_rw field for REQ_OPs: REQ_WRITE, REQ_DISCARD, REQ_WRITE_SAME, so this patch stops checking for them in bi_rw and also removes the related compat code. v2: 1. Remove compat code in __get_request. Signed-off-by: Mike Christie --- block/bio.c | 6 ++--- block/blk-core.c| 34 - block/blk-merge.c | 14 ++-- block/blk-mq.c | 3 +-- drivers/ata/libata-scsi.c | 2 +- drivers/block/brd.c | 2 +- drivers/block/drbd/drbd_main.c | 15 +++-- drivers/block/drbd/drbd_worker.c| 4 ++-- drivers/block/loop.c| 6 ++--- drivers/block/rbd.c | 2 +- drivers/block/rsxx/dma.c| 2 +- drivers/block/umem.c| 2 +- drivers/block/zram/zram_drv.c | 2 +- drivers/ide/ide-floppy.c| 2 +- drivers/lightnvm/rrpc.c | 2 +- drivers/md/bcache/request.c | 10 - drivers/md/dm-cache-target.c| 10 + drivers/md/dm-crypt.c | 2 +- drivers/md/dm-log-writes.c | 2 +- drivers/md/dm-raid1.c | 8 +++ drivers/md/dm-region-hash.c | 4 ++-- drivers/md/dm-stripe.c | 4 ++-- drivers/md/dm-thin.c| 15 - drivers/md/dm.c | 6 ++--- drivers/md/linear.c | 2 +- drivers/md/raid0.c | 2 +- drivers/scsi/osd/osd_initiator.c| 4 ++-- drivers/staging/lustre/lustre/llite/lloop.c | 8 +++ include/linux/bio.h | 15 - include/linux/fs.h | 25 +++-- 30 files changed, 100 insertions(+), 115 deletions(-) diff --git a/block/bio.c b/block/bio.c index 68df2df..fba4c08 100644 --- a/block/bio.c +++ b/block/bio.c @@ -669,10 +669,10 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask, bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector; bio->bi_iter.bi_size= bio_src->bi_iter.bi_size; - if (bio->bi_rw & REQ_DISCARD) + if (bio->bi_op == REQ_OP_DISCARD) goto integrity_clone; - if (bio->bi_rw & REQ_WRITE_SAME) { + if (bio->bi_op == REQ_OP_WRITE_SAME) { bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0]; goto integrity_clone; } @@ -1795,7 +1795,7 @@ struct bio *bio_split(struct bio *bio, int sectors, * Discards need a mutable bio_vec to accommodate the payload * required by the DSM TRIM and UNMAP commands. */ - if (bio->bi_rw & REQ_DISCARD) + if (bio->bi_op == REQ_OP_DISCARD) split = bio_clone_bioset(bio, gfp, bs); else split = bio_clone_fast(bio, gfp, bs); diff --git a/block/blk-core.c b/block/blk-core.c index 60a0edb..bba1a69 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1151,8 +1151,7 @@ static struct request *__get_request(struct request_list *rl, int op, blk_rq_init(q, rq); blk_rq_set_rl(rq, rl); - /* tmp compat - allow users to check either one for the op */ - rq->cmd_flags = op | op_flags | REQ_ALLOCED; + rq->cmd_flags = op_flags | REQ_ALLOCED; rq->op = op; /* init elvpriv */ @@ -1704,8 +1703,7 @@ void init_request_from_bio(struct request *req, struct bio *bio) { req->cmd_type = REQ_TYPE_FS; - /* tmp compat. 
Allow users to set bi_op or bi_rw */ - req->cmd_flags |= (bio->bi_rw | bio->bi_op) & REQ_COMMON_MASK; + req->cmd_flags |= bio->bi_rw & REQ_COMMON_MASK; if (bio->bi_rw & REQ_RAHEAD) req->cmd_flags |= REQ_FAILFAST_MASK; @@ -1855,9 +1853,9 @@ static void handle_bad_sector(struct bio *bio) char b[BDEVNAME_SIZE]; printk(KERN_INFO "attempt to access beyond end of device\n"); - printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n", + printk(KERN_INFO "%s: rw=%d,%ld, want=%Lu, limit=%Lu\n", bdevname(bio->bi_bdev, b), - bio->bi_rw, + bio->bi_op, bio->bi_rw, (unsigned long long)bio_end_sector(bio), (long long)(i_size_read(bio->bi_bdev->bd_inode) >> 9)); } @@ -1978,14 +1976,14 @@ generic_make_request_checks(struct bio *bio) } } - if ((bio->bi_rw & REQ_DISCARD) && + if ((bio->bi_op == REQ_OP_DISCARD) && (!blk_queue_discard(q) || ((bio->bi_rw & REQ_SECURE) && !blk_queue_secdiscard(q { err = -EOPNOTSUPP; goto end_io; } - if
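[Editorial note] The recurring conversion in this patch is that code which used to mask bi_rw for an operation bit now compares the dedicated op field, e.g. from the bio_split() hunk above:

	/* before */
	if (bio->bi_rw & REQ_DISCARD)
		split = bio_clone_bioset(bio, gfp, bs);

	/* after */
	if (bio->bi_op == REQ_OP_DISCARD)
		split = bio_clone_bioset(bio, gfp, bs);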
[PATCH 25/35] target: set bi_op to REQ_OP
From: Mike Christie This patch has the target modules set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie Acked-by: Nicholas Bellinger --- drivers/target/target_core_iblock.c | 38 ++--- drivers/target/target_core_pscsi.c | 2 +- 2 files changed, 24 insertions(+), 16 deletions(-) diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c index c352a64..8d3d197 100644 --- a/drivers/target/target_core_iblock.c +++ b/drivers/target/target_core_iblock.c @@ -312,7 +312,8 @@ static void iblock_bio_done(struct bio *bio) } static struct bio * -iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int rw) +iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int op, + int op_flags) { struct iblock_dev *ib_dev = IBLOCK_DEV(cmd->se_dev); struct bio *bio; @@ -334,7 +335,8 @@ iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int rw) bio->bi_private = cmd; bio->bi_end_io = &iblock_bio_done; bio->bi_iter.bi_sector = lba; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; return bio; } @@ -446,7 +448,7 @@ iblock_execute_write_same(struct se_cmd *cmd) goto fail; cmd->priv = ibr; - bio = iblock_get_bio(cmd, block_lba, 1, WRITE); + bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, 0); if (!bio) goto fail_free_ibr; @@ -459,7 +461,8 @@ iblock_execute_write_same(struct se_cmd *cmd) while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset) != sg->length) { - bio = iblock_get_bio(cmd, block_lba, 1, WRITE); + bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, +0); if (!bio) goto fail_put_bios; @@ -645,7 +648,8 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, struct scatterlist *sg; u32 sg_num = sgl_nents; unsigned bio_cnt; - int rw = 0; + int op_flags = 0; + int op = 0; int i; if (data_direction == DMA_TO_DEVICE) { @@ -656,17 +660,20 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, * is not enabled, or if initiator set the Force Unit Access bit. */ if (q->flush_flags & REQ_FUA) { - if (cmd->se_cmd_flags & SCF_FUA) - rw = WRITE_FUA; - else if (!(q->flush_flags & REQ_FLUSH)) - rw = WRITE_FUA; - else - rw = WRITE; + if (cmd->se_cmd_flags & SCF_FUA) { + op = REQ_OP_WRITE; + op_flags = WRITE_FUA; + } else if (!(q->flush_flags & REQ_FLUSH)) { + op = REQ_OP_WRITE; + op_flags = WRITE_FUA; + } else { + op = REQ_OP_WRITE; + } } else { - rw = WRITE; + op = REQ_OP_WRITE; } } else { - rw = READ; + op = REQ_OP_READ; } ibr = kzalloc(sizeof(struct iblock_req), GFP_KERNEL); @@ -680,7 +687,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, return 0; } - bio = iblock_get_bio(cmd, block_lba, sgl_nents, rw); + bio = iblock_get_bio(cmd, block_lba, sgl_nents, op, op_flags); if (!bio) goto fail_free_ibr; @@ -704,7 +711,8 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, bio_cnt = 0; } - bio = iblock_get_bio(cmd, block_lba, sg_num, rw); + bio = iblock_get_bio(cmd, block_lba, sg_num, op, +op_flags); if (!bio) goto fail_put_bios; diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c index de18790..2cf915c 100644 --- a/drivers/target/target_core_pscsi.c +++ b/drivers/target/target_core_pscsi.c @@ -922,7 +922,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, goto fail; if (rw) - bio->bi_rw |= REQ_WRITE; + bio->bi_op = REQ_OP_WRITE; pr_debug("PSCSI: Allocated bio: %p,"
[PATCH 33/35] block, drivers: add REQ_OP_FLUSH operation
From: Mike Christie This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set. For the following 3 flush related patches, I have not tested every driver. I have only tested scsi with xfs and btrfs. v2. 1. Fix kbuild failures. Forgot to update ubd driver. Signed-off-by: Mike Christie --- Documentation/block/writeback_cache_control.txt | 6 +++--- arch/um/drivers/ubd_kern.c | 2 +- block/blk-flush.c | 6 +++--- drivers/block/loop.c| 4 ++-- drivers/block/nbd.c | 2 +- drivers/block/osdblk.c | 2 +- drivers/block/ps3disk.c | 4 ++-- drivers/block/skd_main.c| 2 +- drivers/block/virtio_blk.c | 2 +- drivers/block/xen-blkfront.c| 8 drivers/ide/ide-disk.c | 2 +- drivers/md/dm.c | 2 +- drivers/mmc/card/block.c| 5 ++--- drivers/mmc/card/queue.h| 2 +- drivers/mtd/mtd_blkdevs.c | 2 +- drivers/nvme/host/pci.c | 2 +- drivers/scsi/sd.c | 7 +++ include/linux/blk_types.h | 1 + include/linux/blkdev.h | 3 +++ kernel/trace/blktrace.c | 5 - 20 files changed, 37 insertions(+), 32 deletions(-) diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt index 83407d3..ea5550f 100644 --- a/Documentation/block/writeback_cache_control.txt +++ b/Documentation/block/writeback_cache_control.txt @@ -73,9 +73,9 @@ doing: blk_queue_flush(sdkp->disk->queue, REQ_FLUSH); -and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that +and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that REQ_FLUSH requests with a payload are automatically turned into a sequence -of an empty REQ_FLUSH request followed by the actual write by the block +of an empty REQ_OP_FLUSH request followed by the actual write by the block layer. For devices that also support the FUA bit the block layer needs to be told to pass through the REQ_FUA bit using: @@ -83,4 +83,4 @@ to be told to pass through the REQ_FUA bit using: and the driver must handle write requests that have the REQ_FUA bit set in prep_fn/request_fn. If the FUA bit is not natively supported the block -layer turns it into an empty REQ_FLUSH request after the actual write. +layer turns it into an empty REQ_OP_FLUSH request after the actual write. diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index 39ba207..a7dc382 100644 --- a/arch/um/drivers/ubd_kern.c +++ b/arch/um/drivers/ubd_kern.c @@ -1286,7 +1286,7 @@ static void do_ubd_request(struct request_queue *q) req = dev->request; - if (req->cmd_flags & REQ_FLUSH) { + if (req->op == REQ_OP_FLUSH) { io_req = kmalloc(sizeof(struct io_thread_req), GFP_ATOMIC); if (io_req == NULL) { diff --git a/block/blk-flush.c b/block/blk-flush.c index e01d3ac..070d7c7 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -29,7 +29,7 @@ * The actual execution of flush is double buffered. Whenever a request * needs to execute PRE or POSTFLUSH, it queues at * fq->flush_queue[fq->flush_pending_idx]. Once certain criteria are met, a - * flush is issued and the pending_idx is toggled. When the flush + * REQ_OP_FLUSH is issued and the pending_idx is toggled. When the flush * completes, all the requests which were pending are proceeded to the next * step. This allows arbitrary merging of different types of FLUSH/FUA * requests. 
@@ -329,8 +329,8 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq) } flush_rq->cmd_type = REQ_TYPE_FS; - flush_rq->cmd_flags = WRITE_FLUSH | REQ_FLUSH_SEQ; - flush_rq->op = REQ_OP_WRITE; + flush_rq->cmd_flags = REQ_SYNC | REQ_NOIDLE | REQ_FLUSH_SEQ; + flush_rq->op = REQ_OP_FLUSH; flush_rq->rq_disk = first_rq->rq_disk; flush_rq->end_io = flush_end_io; diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 1afc03c..a3d1293 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -536,7 +536,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset; if (op_is_write(rq->op)) { - if (rq->cmd_flags & REQ_FLUSH) + if (rq->op == REQ_OP_FLUSH) ret = lo_req_flush(lo, rq); else if (rq->op == REQ_OP_D
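[Editorial note] For a request_fn driver, the visible change from this patch is that an empty cache flush arrives as its own operation instead of as a flag on a write request; mirroring the loop driver hunk above:

	/* before */
	if (rq->cmd_flags & REQ_FLUSH)
		ret = lo_req_flush(lo, rq);

	/* after */
	if (rq->op == REQ_OP_FLUSH)
		ret = lo_req_flush(lo, rq);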
[PATCH 27/35] drivers: set request op to REQ_OP
From: Mike Christie This patch has the block drivers use the request->op for REQ_OP operations and cmd_flags for rq_flag_bits. I have only tested scsi and rbd. Signed-off-by: Mike Christie --- drivers/block/loop.c | 6 +++--- drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/nbd.c | 2 +- drivers/block/rbd.c | 2 +- drivers/block/skd_main.c | 11 --- drivers/block/xen-blkfront.c | 8 +--- drivers/md/dm.c | 2 +- drivers/mmc/card/block.c | 7 +++ drivers/mmc/card/queue.c | 6 ++ drivers/mmc/card/queue.h | 5 - drivers/mtd/mtd_blkdevs.c | 2 +- drivers/nvme/host/pci.c | 4 ++-- drivers/scsi/sd.c | 25 - 13 files changed, 44 insertions(+), 38 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 423f4ca..e771bab 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -538,7 +538,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) if (rq->cmd_flags & REQ_WRITE) { if (rq->cmd_flags & REQ_FLUSH) ret = lo_req_flush(lo, rq); - else if (rq->cmd_flags & REQ_DISCARD) + else if (rq->op == REQ_OP_DISCARD) ret = lo_discard(lo, rq, pos); else if (lo->transfer) ret = lo_write_transfer(lo, rq, pos); @@ -1653,8 +1653,8 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, if (lo->lo_state != Lo_bound) return -EIO; - if (lo->use_dio && !(cmd->rq->cmd_flags & (REQ_FLUSH | - REQ_DISCARD))) + if (lo->use_dio && (!(cmd->rq->cmd_flags & REQ_FLUSH) || +cmd->rq->op == REQ_OP_DISCARD)) cmd->use_aio = true; else cmd->use_aio = false; diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 9b180db..3995a9e 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -3670,7 +3670,7 @@ static int mtip_submit_request(struct blk_mq_hw_ctx *hctx, struct request *rq) return -ENXIO; } - if (rq->cmd_flags & REQ_DISCARD) { + if (rq->op == REQ_OP_DISCARD) { int err; err = mtip_send_trim(dd, blk_rq_pos(rq), blk_rq_sectors(rq)); diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index e4c5cc1..dd8f3e9 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -242,7 +242,7 @@ static int nbd_send_req(struct nbd_device *nbd, struct request *req) if (req->cmd_type == REQ_TYPE_DRV_PRIV) type = NBD_CMD_DISC; - else if (req->cmd_flags & REQ_DISCARD) + else if (req->op == REQ_OP_DISCARD) type = NBD_CMD_TRIM; else if (req->cmd_flags & REQ_FLUSH) type = NBD_CMD_FLUSH; diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 4a87678..1d0f464 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3373,7 +3373,7 @@ static void rbd_queue_workfn(struct work_struct *work) goto err; } - if (rq->cmd_flags & REQ_DISCARD) + if (rq->op == REQ_OP_DISCARD) op_type = OBJ_OP_DISCARD; else if (rq->cmd_flags & REQ_WRITE) op_type = OBJ_OP_WRITE; diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c index 586f916..f89a0c8 100644 --- a/drivers/block/skd_main.c +++ b/drivers/block/skd_main.c @@ -576,7 +576,6 @@ static void skd_request_fn(struct request_queue *q) struct request *req = NULL; struct skd_scsi_request *scsi_req; struct page *page; - unsigned long io_flags; int error; u32 lba; u32 count; @@ -624,12 +623,11 @@ static void skd_request_fn(struct request_queue *q) lba = (u32)blk_rq_pos(req); count = blk_rq_sectors(req); data_dir = rq_data_dir(req); - io_flags = req->cmd_flags; - if (io_flags & REQ_FLUSH) + if (req->cmd_flags & REQ_FLUSH) flush++; - if (io_flags & REQ_FUA) + if (req->cmd_flags & REQ_FUA) fua++; pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) " @@ -735,7 +733,7 @@ static void skd_request_fn(struct 
request_queue *q) else skreq->sg_data_dir = SKD_DATA_DIR_HOST_TO_CARD; - if (io_flags & REQ_DISCARD) { + if (req->op == REQ_OP_DISCARD) { page = alloc_page(GFP_ATOMIC | __GFP_ZERO); if (!page) { pr_err("request_fn:Page allocation failed.\n"); @@ -852,9 +850,8 @@ static void skd_end_request(struct skd_device *skdev,
[PATCH 11/35] f2fs: set bi_op to REQ_OP
From: Mike Christie This patch has f2fs set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- fs/f2fs/checkpoint.c| 10 ++ fs/f2fs/data.c | 33 - fs/f2fs/f2fs.h | 5 +++-- fs/f2fs/gc.c| 9 ++--- fs/f2fs/inline.c| 3 ++- fs/f2fs/node.c | 8 +--- fs/f2fs/segment.c | 10 +++--- fs/f2fs/trace.c | 7 --- include/trace/events/f2fs.h | 34 +- 9 files changed, 74 insertions(+), 45 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index f55355d..12ca43e 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -55,14 +55,15 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index, struct f2fs_io_info fio = { .sbi = sbi, .type = META, - .rw = READ_SYNC | REQ_META | REQ_PRIO, + .op = REQ_OP_READ, + .op_flags = READ_SYNC | REQ_META | REQ_PRIO, .old_blkaddr = index, .new_blkaddr = index, .encrypted_page = NULL, }; if (unlikely(!is_meta)) - fio.rw &= ~REQ_META; + fio.op_flags &= ~REQ_META; repeat: page = grab_cache_page(mapping, index); if (!page) { @@ -149,13 +150,14 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages, struct f2fs_io_info fio = { .sbi = sbi, .type = META, - .rw = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA, + .op = REQ_OP_READ, + .op_flags = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA, .encrypted_page = NULL, }; struct blk_plug plug; if (unlikely(type == META_POR)) - fio.rw &= ~REQ_META; + fio.op_flags &= ~REQ_META; blk_start_plug(&plug); for (; nrpages-- > 0; blkno++) { diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ff623b2..586658c 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -105,11 +105,12 @@ static void __submit_merged_bio(struct f2fs_bio_info *io) if (!io->bio) return; - if (is_read_io(fio->rw)) + if (is_read_io(fio->op)) trace_f2fs_submit_read_bio(io->sbi->sb, fio, io->bio); else trace_f2fs_submit_write_bio(io->sbi->sb, fio, io->bio); - io->bio->bi_rw = fio->rw; + io->bio->bi_op = fio->op; + io->bio->bi_rw = fio->op_flags; submit_bio(io->bio); io->bio = NULL; @@ -177,10 +178,12 @@ static void __f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, /* change META to META_FLUSH in the checkpoint procedure */ if (type >= META_FLUSH) { io->fio.type = META_FLUSH; + io->fio.op = REQ_OP_WRITE; if (test_opt(sbi, NOBARRIER)) - io->fio.rw = WRITE_FLUSH | REQ_META | REQ_PRIO; + io->fio.op_flags = WRITE_FLUSH | REQ_META | REQ_PRIO; else - io->fio.rw = WRITE_FLUSH_FUA | REQ_META | REQ_PRIO; + io->fio.op_flags = WRITE_FLUSH_FUA | REQ_META | + REQ_PRIO; } __submit_merged_bio(io); out: @@ -215,13 +218,14 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) f2fs_trace_ios(fio, 0); /* Allocate a new bio */ - bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->rw)); + bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->op)); if (bio_add_page(bio, page, PAGE_CACHE_SIZE, 0) < PAGE_CACHE_SIZE) { bio_put(bio); return -EFAULT; } - bio->bi_rw = fio->rw; + bio->bi_op = fio->op; + bio->bi_rw = fio->op_flags; submit_bio(bio); return 0; @@ -232,7 +236,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio) struct f2fs_sb_info *sbi = fio->sbi; enum page_type btype = PAGE_TYPE_OF_BIO(fio->type); struct f2fs_bio_info *io; - bool is_read = is_read_io(fio->rw); + bool is_read = is_read_io(fio->op); struct page *bio_page; io = is_read ? 
&sbi->read_io : &sbi->write_io[btype]; @@ -247,7 +251,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio) inc_page_count(sbi, F2FS_WRITEBACK); if (io->bio && (io->last_block_in_bio != fio->new_blkaddr - 1 || - io->fio.rw != fio->rw)) + (io->fio.op != fio->op || io->fio.op_flags != fio->op_flags))) __submit_merged_bio(io); alloc_new: if (io->bio == NULL) { @@ -345,7 +349,7 @@ int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index
[PATCH 26/35] block: set op to REQ_OP
From: Mike Christie This patch converts the request related block layer code to set request->op to a REQ_OP and cmd_flags to rq_flag_bits. There is some tmp compat code when setting up cmd_flags so it still carries both the op and flags. It will be removed in in later patches in this set when I have converted all drivers. I have not been able to test the mq paths with real mq hardware. Signed-off-by: Mike Christie --- block/blk-core.c | 60 ++ block/blk-flush.c | 1 + block/blk-merge.c | 10 block/blk-mq.c | 38 - block/cfq-iosched.c| 53 +++- block/elevator.c | 8 +++ include/linux/blk-cgroup.h | 13 +- include/linux/blkdev.h | 28 +++--- include/linux/elevator.h | 4 ++-- 9 files changed, 120 insertions(+), 95 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 74aa201..60a0edb 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -959,10 +959,10 @@ static void __freed_request(struct request_list *rl, int sync) * A request has just been released. Account for it, update the full and * congestion status, wake up any waiters. Called under q->queue_lock. */ -static void freed_request(struct request_list *rl, unsigned int flags) +static void freed_request(struct request_list *rl, int op, unsigned int flags) { struct request_queue *q = rl->q; - int sync = rw_is_sync(flags); + int sync = rw_is_sync(op, flags); q->nr_rqs[sync]--; rl->count[sync]--; @@ -1054,7 +1054,8 @@ static struct io_context *rq_ioc(struct bio *bio) /** * __get_request - get a free request * @rl: request list to allocate from - * @rw_flags: RW and SYNC flags + * @op: REQ_OP_READ/REQ_OP_WRITE + * @op_flags: rq_flag_bits * @bio: bio to allocate request for (can be %NULL) * @gfp_mask: allocation mask * @@ -1065,21 +1066,22 @@ static struct io_context *rq_ioc(struct bio *bio) * Returns ERR_PTR on failure, with @q->queue_lock held. * Returns request pointer on success, with @q->queue_lock *not held*. */ -static struct request *__get_request(struct request_list *rl, int rw_flags, -struct bio *bio, gfp_t gfp_mask) +static struct request *__get_request(struct request_list *rl, int op, +int op_flags, struct bio *bio, +gfp_t gfp_mask) { struct request_queue *q = rl->q; struct request *rq; struct elevator_type *et = q->elevator->type; struct io_context *ioc = rq_ioc(bio); struct io_cq *icq = NULL; - const bool is_sync = rw_is_sync(rw_flags) != 0; + const bool is_sync = rw_is_sync(op, op_flags) != 0; int may_queue; if (unlikely(blk_queue_dying(q))) return ERR_PTR(-ENODEV); - may_queue = elv_may_queue(q, rw_flags); + may_queue = elv_may_queue(q, op, op_flags); if (may_queue == ELV_MQUEUE_NO) goto rq_starved; @@ -1123,7 +1125,7 @@ static struct request *__get_request(struct request_list *rl, int rw_flags, /* * Decide whether the new request will be managed by elevator. If -* so, mark @rw_flags and increment elvpriv. Non-zero elvpriv will +* so, mark @op_flags and increment elvpriv. Non-zero elvpriv will * prevent the current elevator from being destroyed until the new * request is freed. This guarantees icq's won't be destroyed and * makes creating new ones safe. @@ -1132,14 +1134,14 @@ static struct request *__get_request(struct request_list *rl, int rw_flags, * it will be created after releasing queue_lock. 
*/ if (blk_rq_should_init_elevator(bio) && !blk_queue_bypass(q)) { - rw_flags |= REQ_ELVPRIV; + op_flags |= REQ_ELVPRIV; q->nr_rqs_elvpriv++; if (et->icq_cache && ioc) icq = ioc_lookup_icq(ioc, q); } if (blk_queue_io_stat(q)) - rw_flags |= REQ_IO_STAT; + op_flags |= REQ_IO_STAT; spin_unlock_irq(q->queue_lock); /* allocate and init request */ @@ -1149,10 +1151,12 @@ static struct request *__get_request(struct request_list *rl, int rw_flags, blk_rq_init(q, rq); blk_rq_set_rl(rq, rl); - rq->cmd_flags = rw_flags | REQ_ALLOCED; + /* tmp compat - allow users to check either one for the op */ + rq->cmd_flags = op | op_flags | REQ_ALLOCED; + rq->op = op; /* init elvpriv */ - if (rw_flags & REQ_ELVPRIV) { + if (op_flags & REQ_ELVPRIV) { if (unlikely(et->icq_cache && !icq)) { if (ioc) icq = ioc_create_icq(ioc, q, gfp_mask); @@ -1178,7 +1182,7 @@ out: if
[PATCH 28/35] blktrace: get op from req->op/bio->bi_op
From: Mike Christie The bio and request struct now store the operation in bio->bi_op/request->op. This patch has blktrace not check bi_rw/cmd_flags. This patch is only compile tested. Signed-off-by: Mike Christie --- include/linux/blktrace_api.h | 2 +- include/trace/events/bcache.h | 12 ++ include/trace/events/block.h | 31 +- kernel/trace/blktrace.c | 52 +++ 4 files changed, 57 insertions(+), 40 deletions(-) diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h index afc1343..ee25ba4 100644 --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -109,7 +109,7 @@ static inline int blk_cmd_buf_len(struct request *rq) } extern void blk_dump_cmd(char *buf, struct request *rq); -extern void blk_fill_rwbs(char *rwbs, u32 rw, int bytes); +extern void blk_fill_rwbs(char *rwbs, int op, u32 rw, int bytes); #endif /* CONFIG_EVENT_TRACING && CONFIG_BLOCK */ diff --git a/include/trace/events/bcache.h b/include/trace/events/bcache.h index 981acf7..8abe564 100644 --- a/include/trace/events/bcache.h +++ b/include/trace/events/bcache.h @@ -27,7 +27,8 @@ DECLARE_EVENT_CLASS(bcache_request, __entry->sector = bio->bi_iter.bi_sector; __entry->orig_sector= bio->bi_iter.bi_sector - 16; __entry->nr_sector = bio->bi_iter.bi_size >> 9; - blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size); + blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw, + bio->bi_iter.bi_size); ), TP_printk("%d,%d %s %llu + %u (from %d,%d @ %llu)", @@ -101,7 +102,8 @@ DECLARE_EVENT_CLASS(bcache_bio, __entry->dev= bio->bi_bdev->bd_dev; __entry->sector = bio->bi_iter.bi_sector; __entry->nr_sector = bio->bi_iter.bi_size >> 9; - blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size); + blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw, + bio->bi_iter.bi_size); ), TP_printk("%d,%d %s %llu + %u", @@ -136,7 +138,8 @@ TRACE_EVENT(bcache_read, __entry->dev= bio->bi_bdev->bd_dev; __entry->sector = bio->bi_iter.bi_sector; __entry->nr_sector = bio->bi_iter.bi_size >> 9; - blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size); + blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw, + bio->bi_iter.bi_size); __entry->cache_hit = hit; __entry->bypass = bypass; ), @@ -167,7 +170,8 @@ TRACE_EVENT(bcache_write, __entry->inode = inode; __entry->sector = bio->bi_iter.bi_sector; __entry->nr_sector = bio->bi_iter.bi_size >> 9; - blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size); + blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw, + bio->bi_iter.bi_size); __entry->writeback = writeback; __entry->bypass = bypass; ), diff --git a/include/trace/events/block.h b/include/trace/events/block.h index e8a5eca..4416dcd 100644 --- a/include/trace/events/block.h +++ b/include/trace/events/block.h @@ -84,7 +84,8 @@ DECLARE_EVENT_CLASS(block_rq_with_error, 0 : blk_rq_sectors(rq); __entry->errors= rq->errors; - blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq)); + blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags, + blk_rq_bytes(rq)); blk_dump_cmd(__get_str(cmd), rq); ), @@ -162,7 +163,7 @@ TRACE_EVENT(block_rq_complete, __entry->nr_sector = nr_bytes >> 9; __entry->errors= rq->errors; - blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, nr_bytes); + blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags, nr_bytes); blk_dump_cmd(__get_str(cmd), rq); ), @@ -198,7 +199,8 @@ DECLARE_EVENT_CLASS(block_rq, __entry->bytes = (rq->cmd_type == REQ_TYPE_BLOCK_PC) ? 
blk_rq_bytes(rq) : 0; - blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq)); + blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags, + blk_rq_bytes(rq)); blk_dump_cmd(__get_str(cmd), rq); memcpy(__entry->comm, current->comm, TASK_COMM_LEN); ), @@ -272,7 +274,8 @@ TRACE_EVENT(block_bio_bounce, bio->bi_bdev->bd_dev : 0; __entry->sector = bio->bi_iter.bi_sector; __entry->nr_sector
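For the tracing side, the substance of the change is simply that blk_fill_rwbs() now receives the operation and the flag word separately instead of deducing both from one bitmap. A minimal standalone sketch of that idea follows; it is not the kernel implementation, and the letters and flag values are stand-ins:

#include <stdio.h>

/* stand-in values, not the kernel's */
enum { OP_READ, OP_WRITE, OP_DISCARD };
#define F_SYNC	(1u << 0)
#define F_META	(1u << 1)
#define F_FUA	(1u << 2)

/* the operation picks the base letter; the flags add modifiers */
static void fill_rwbs(char *rwbs, int op, unsigned int flags, int bytes)
{
	int i = 0;

	switch (op) {
	case OP_DISCARD:
		rwbs[i++] = 'D';
		break;
	case OP_WRITE:
		rwbs[i++] = 'W';
		break;
	default:
		rwbs[i++] = bytes ? 'R' : 'N';
		break;
	}
	if (flags & F_FUA)
		rwbs[i++] = 'F';
	if (flags & F_SYNC)
		rwbs[i++] = 'S';
	if (flags & F_META)
		rwbs[i++] = 'M';
	rwbs[i] = '\0';
}

int main(void)
{
	char buf[8];

	fill_rwbs(buf, OP_WRITE, F_SYNC | F_META, 4096);
	printf("%s\n", buf);	/* prints "WSM" */
	return 0;
}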
[PATCH 20/35] dm: pass dm stats data dir instead of bi_rw
From: Mike Christie It looks like dm stats cares about the data direction (READ vs WRITE) and does not need the bio/request flags. Commands like REQ_FLUSH, REQ_DISCARD and REQ_WRITE_SAME are currently always set with REQ_WRITE, so the extra check for REQ_DISCARD in dm_stats_account_io is not needed. This patch has it use the bio and request data_dir helpers instead of accessing the bi_rw/cmd_flags directly. This makes the next patches that remove the operation from the cmd_flags and bi_rw easier, because we will no longer have the REQ_WRITE bit set for operations like discards. This patch is compile tested only. v2: 1. Merged Mike Snitzer's fixes to pass in int instead of unsigned long. 2. Fix 80 char col issues. Signed-off-by: Mike Christie --- drivers/md/dm-stats.c | 9 - drivers/md/dm.c | 21 - 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c index 8289804..4fba26c 100644 --- a/drivers/md/dm-stats.c +++ b/drivers/md/dm-stats.c @@ -514,11 +514,10 @@ static void dm_stat_round(struct dm_stat *s, struct dm_stat_shared *shared, } static void dm_stat_for_entry(struct dm_stat *s, size_t entry, - unsigned long bi_rw, sector_t len, + int idx, sector_t len, struct dm_stats_aux *stats_aux, bool end, unsigned long duration_jiffies) { - unsigned long idx = bi_rw & REQ_WRITE; struct dm_stat_shared *shared = &s->stat_shared[entry]; struct dm_stat_percpu *p; @@ -584,7 +583,7 @@ static void dm_stat_for_entry(struct dm_stat *s, size_t entry, #endif } -static void __dm_stat_bio(struct dm_stat *s, unsigned long bi_rw, +static void __dm_stat_bio(struct dm_stat *s, int bi_rw, sector_t bi_sector, sector_t end_sector, bool end, unsigned long duration_jiffies, struct dm_stats_aux *stats_aux) @@ -645,8 +644,8 @@ void dm_stats_account_io(struct dm_stats *stats, unsigned long bi_rw, last = raw_cpu_ptr(stats->last); stats_aux->merged = (bi_sector == (ACCESS_ONCE(last->last_sector) && - ((bi_rw & (REQ_WRITE | REQ_DISCARD)) == - (ACCESS_ONCE(last->last_rw) & (REQ_WRITE | REQ_DISCARD))) + ((bi_rw == WRITE) == + (ACCESS_ONCE(last->last_rw) == WRITE)) )); ACCESS_ONCE(last->last_sector) = end_sector; ACCESS_ONCE(last->last_rw) = bi_rw; diff --git a/drivers/md/dm.c b/drivers/md/dm.c index fe03b26..38c7d93 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -723,8 +723,9 @@ static void start_io_acct(struct dm_io *io) atomic_inc_return(&md->pending[rw])); if (unlikely(dm_stats_used(&md->stats))) - dm_stats_account_io(&md->stats, bio->bi_rw, bio->bi_iter.bi_sector, - bio_sectors(bio), false, 0, &io->stats_aux); + dm_stats_account_io(&md->stats, bio_data_dir(bio), + bio->bi_iter.bi_sector, bio_sectors(bio), + false, 0, &io->stats_aux); } static void end_io_acct(struct dm_io *io) @@ -738,8 +739,9 @@ static void end_io_acct(struct dm_io *io) generic_end_io_acct(rw, &dm_disk(md)->part0, io->start_time); if (unlikely(dm_stats_used(&md->stats))) - dm_stats_account_io(&md->stats, bio->bi_rw, bio->bi_iter.bi_sector, - bio_sectors(bio), true, duration, &io->stats_aux); + dm_stats_account_io(&md->stats, bio_data_dir(bio), + bio->bi_iter.bi_sector, bio_sectors(bio), + true, duration, &io->stats_aux); /* * After this is decremented the bio must not be touched if it is @@ -1121,9 +1123,9 @@ static void rq_end_stats(struct mapped_device *md, struct request *orig) if (unlikely(dm_stats_used(&md->stats))) { struct dm_rq_target_io *tio = tio_from_request(orig); tio->duration_jiffies = jiffies - tio->duration_jiffies; - dm_stats_account_io(&md->stats, orig->cmd_flags, 
blk_rq_pos(orig), - tio->n_sectors, true, tio->duration_jiffies, - &tio->stats_aux); + dm_stats_account_io(&md->stats, rq_data_dir(orig), + blk_rq_pos(orig), tio->n_sectors, true, + tio->duration_jiffies, &tio->stats_aux); } } @@ -2069,8 +2071,9 @@ static void dm_start_request(struct mapped_device *md, struct request *orig) struct dm_rq_
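The motivation is easy to miss in the diff: dm-stats only needs a direction, and the old "bi_rw & REQ_WRITE" test only classified discards correctly because discards used to be submitted with REQ_WRITE OR'd in. A small self-contained demonstration of why the explicit data_dir argument is needed once that stops being true; the flag values are placeholders, not the kernel's:

#include <assert.h>

/* placeholder flag bits, not the kernel's values */
#define F_WRITE		(1u << 0)
#define F_DISCARD	(1u << 7)

enum { DIR_READ = 0, DIR_WRITE = 1 };

/* old scheme: infer the direction from the flag word */
static int dir_from_flags(unsigned int rw)
{
	return (rw & F_WRITE) ? DIR_WRITE : DIR_READ;
}

int main(void)
{
	unsigned int old_discard = F_DISCARD | F_WRITE;	/* before this series */
	unsigned int new_discard = F_DISCARD;		/* after the op/flag split */

	assert(dir_from_flags(old_discard) == DIR_WRITE);
	/* the old test now mis-classifies a discard as a read, which is
	 * why dm_stats_account_io() takes bio_data_dir()/rq_data_dir()
	 * from the caller instead of looking at bi_rw/cmd_flags */
	assert(dir_from_flags(new_discard) == DIR_READ);
	return 0;
}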
[PATCH 24/35] xen: set bi_op to REQ_OP
From: Mike Christie This patch has xen set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- drivers/block/xen-blkback/blkback.c | 29 + 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 79fe493..854ecca 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -501,7 +501,7 @@ static int xen_vbd_translate(struct phys_req *req, struct xen_blkif *blkif, struct xen_vbd *vbd = &blkif->vbd; int rc = -EACCES; - if ((operation != READ) && vbd->readonly) + if ((operation != REQ_OP_READ) && vbd->readonly) goto out; if (likely(req->nr_sects)) { @@ -1014,7 +1014,7 @@ static int dispatch_discard_io(struct xen_blkif_ring *ring, preq.sector_number = req->u.discard.sector_number; preq.nr_sects = req->u.discard.nr_sectors; - err = xen_vbd_translate(&preq, blkif, WRITE); + err = xen_vbd_translate(&preq, blkif, REQ_OP_WRITE); if (err) { pr_warn("access denied: DISCARD [%llu->%llu] on dev=%04x\n", preq.sector_number, @@ -1229,6 +1229,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, struct bio **biolist = pending_req->biolist; int i, nbio = 0; int operation; + int operation_flags = 0; struct blk_plug plug; bool drain = false; struct grant_page **pages = pending_req->segments; @@ -1247,17 +1248,19 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, switch (req_operation) { case BLKIF_OP_READ: ring->st_rd_req++; - operation = READ; + operation = REQ_OP_READ; break; case BLKIF_OP_WRITE: ring->st_wr_req++; - operation = WRITE_ODIRECT; + operation = REQ_OP_WRITE; + operation_flags = WRITE_ODIRECT; break; case BLKIF_OP_WRITE_BARRIER: drain = true; case BLKIF_OP_FLUSH_DISKCACHE: ring->st_f_req++; - operation = WRITE_FLUSH; + operation = REQ_OP_WRITE; + operation_flags = WRITE_FLUSH; break; default: operation = 0; /* make gcc happy */ @@ -1269,7 +1272,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, nseg = req->operation == BLKIF_OP_INDIRECT ? req->u.indirect.nr_segments : req->u.rw.nr_segments; - if (unlikely(nseg == 0 && operation != WRITE_FLUSH) || + if (unlikely(nseg == 0 && operation_flags != WRITE_FLUSH) || unlikely((req->operation != BLKIF_OP_INDIRECT) && (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) || unlikely((req->operation == BLKIF_OP_INDIRECT) && @@ -1310,7 +1313,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, if (xen_vbd_translate(&preq, ring->blkif, operation) != 0) { pr_debug("access denied: %s of [%llu,%llu] on dev=%04x\n", -operation == READ ? "read" : "write", +operation == REQ_OP_READ ? "read" : "write", preq.sector_number, preq.sector_number + preq.nr_sects, ring->blkif->vbd.pdevice); @@ -1369,7 +1372,8 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, bio->bi_private = pending_req; bio->bi_end_io = end_block_io_op; bio->bi_iter.bi_sector = preq.sector_number; - bio->bi_rw = operation; + bio->bi_op = operation; + bio->bi_rw = operation_flags; } preq.sector_number += seg[i].nsec; @@ -1377,7 +1381,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, /* This will be hit if the operation was a flush or discard. 
*/ if (!bio) { - BUG_ON(operation != WRITE_FLUSH); + BUG_ON(operation_flags != WRITE_FLUSH); bio = bio_alloc(GFP_KERNEL, 0); if (unlikely(bio == NULL)) @@ -1387,7 +1391,8 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, bio->bi_bdev= preq.bdev; bio->bi_private = pending_req; bio->bi_end_io = end_block_io_op; - bio->bi_rw = operation; + bio->bi_op = operation; + bio->bi_rw = operation_flags; } atomic_set(&pending_req->pendcnt, nbio); @@ -1399,9 +1404,9 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring, /* Let the I/Os go.. */ blk_finish_plug(&plug); -
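A point that is easy to lose in the xen-blkback churn: one blkif opcode can now expand into an operation plus a flag word, so a cache flush becomes an ordinary write op carrying a flush flag, and checks such as the BUG_ON above move from the operation to operation_flags. A rough standalone sketch of that mapping; all constants below are placeholders for the real blkif and block-layer definitions:

#include <assert.h>

/* placeholder constants standing in for blkif.h and the block headers */
enum { BLKIF_OP_READ, BLKIF_OP_WRITE, BLKIF_OP_FLUSH_DISKCACHE };
enum { REQ_OP_READ, REQ_OP_WRITE };
#define WRITE_ODIRECT	(1u << 1)
#define WRITE_FLUSH	(1u << 2)

static void map_blkif_op(int blkif_op, int *op, int *op_flags)
{
	switch (blkif_op) {
	case BLKIF_OP_READ:
		*op = REQ_OP_READ;
		*op_flags = 0;
		break;
	case BLKIF_OP_WRITE:
		*op = REQ_OP_WRITE;
		*op_flags = WRITE_ODIRECT;
		break;
	default:	/* FLUSH_DISKCACHE / WRITE_BARRIER */
		*op = REQ_OP_WRITE;
		*op_flags = WRITE_FLUSH;
		break;
	}
}

int main(void)
{
	int op, flags;

	map_blkif_op(BLKIF_OP_FLUSH_DISKCACHE, &op, &flags);
	/* a flush is a write operation plus a flush flag, not its own op */
	assert(op == REQ_OP_WRITE && flags == WRITE_FLUSH);
	return 0;
}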
[PATCH 09/35] btrfs: update __btrfs_map_block for bi_op transition
From: Mike Christie We no longer pass in a bitmap of rq_flag_bits bits to __btrfs_map_block. It will always be a REQ_OP, or the btrfs specific REQ_GET_READ_MIRRORS, so this drops the bit tests. Signed-off-by: Mike Christie --- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/inode.c | 2 +- fs/btrfs/volumes.c | 55 +++--- fs/btrfs/volumes.h | 4 ++-- 4 files changed, 34 insertions(+), 29 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 083783b..1db6bd0 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2043,7 +2043,7 @@ int btrfs_discard_extent(struct btrfs_root *root, u64 bytenr, /* Tell the block device(s) that the sectors can be discarded */ - ret = btrfs_map_block(root->fs_info, REQ_DISCARD, + ret = btrfs_map_block(root->fs_info, REQ_OP_DISCARD, bytenr, &num_bytes, &bbio, 0); /* Error condition is -ENOMEM */ if (!ret) { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 19f38f5..49842d2 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8279,7 +8279,7 @@ static int btrfs_submit_direct_hook(int rw, struct btrfs_dio_private *dip, int i; map_length = orig_bio->bi_iter.bi_size; - ret = btrfs_map_block(root->fs_info, rw, start_sector << 9, + ret = btrfs_map_block(root->fs_info, orig_bio->bi_op, start_sector << 9, &map_length, NULL, 0); if (ret) return -EIO; diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 2be39f6..62fcbd2 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -5207,7 +5207,7 @@ void btrfs_put_bbio(struct btrfs_bio *bbio) kfree(bbio); } -static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, +static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int op, u64 logical, u64 *length, struct btrfs_bio **bbio_ret, int mirror_num, int need_raid_map) @@ -5285,7 +5285,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, raid56_full_stripe_start *= full_stripe_len; } - if (rw & REQ_DISCARD) { + if (op == REQ_OP_DISCARD) { /* we don't discard raid56 yet */ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) { ret = -EOPNOTSUPP; @@ -5298,7 +5298,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, For other RAID types and for RAID[56] reads, just allow a single stripe (on a single disk). 
*/ if ((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && - (rw & REQ_WRITE)) { + (op == REQ_OP_WRITE)) { max_len = stripe_len * nr_data_stripes(map) - (offset - raid56_full_stripe_start); } else { @@ -5323,8 +5323,8 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, btrfs_dev_replace_set_lock_blocking(dev_replace); if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 && - !(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)) && - dev_replace->tgtdev != NULL) { + op != REQ_OP_WRITE && op != REQ_OP_DISCARD && + op != REQ_GET_READ_MIRRORS && dev_replace->tgtdev != NULL) { /* * in dev-replace case, for repair case (that's the only * case where the mirror is selected explicitly when @@ -5411,15 +5411,17 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, (offset + *length); if (map->type & BTRFS_BLOCK_GROUP_RAID0) { - if (rw & REQ_DISCARD) + if (op == REQ_OP_DISCARD) num_stripes = min_t(u64, map->num_stripes, stripe_nr_end - stripe_nr_orig); stripe_nr = div_u64_rem(stripe_nr, map->num_stripes, &stripe_index); - if (!(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS))) + if (op != REQ_OP_WRITE && op != REQ_OP_DISCARD && + op != REQ_GET_READ_MIRRORS) mirror_num = 1; } else if (map->type & BTRFS_BLOCK_GROUP_RAID1) { - if (rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)) + if (op == REQ_OP_WRITE || op == REQ_OP_DISCARD || + op == REQ_GET_READ_MIRRORS) num_stripes = map->num_stripes; else if (mirror_num) stripe_index = mirror_num - 1; @@ -5432,7 +5434,8 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw, } } else if (map->type & BTRFS_BLOCK_GROUP_DUP) {
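The recurring shape of these btrfs hunks: a bitmask membership test such as rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS) has to become a chain of equality checks, because an op is a single exclusive value rather than a set of bits, and the btrfs-private REQ_GET_READ_MIRRORS simply travels in the same int. A compilable sketch of that shape, with placeholder values:

#include <assert.h>
#include <stdbool.h>

/* placeholder values; REQ_GET_READ_MIRRORS only needs to be distinct
 * from the generic REQ_OP_* values */
enum {
	REQ_OP_READ,
	REQ_OP_WRITE,
	REQ_OP_DISCARD,
	REQ_GET_READ_MIRRORS = 1 << 30,
};

static bool op_targets_all_mirrors(int op)
{
	return op == REQ_OP_WRITE || op == REQ_OP_DISCARD ||
	       op == REQ_GET_READ_MIRRORS;
}

int main(void)
{
	assert(op_targets_all_mirrors(REQ_OP_DISCARD));
	assert(!op_targets_all_mirrors(REQ_OP_READ));
	return 0;
}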
[PATCH 07/35] btrfs: have submit_one_bio users setup bio bi_op
From: Mike Christie This patch has btrfs's submit_one_bio callers set the bio->bi_op to a REQ_OP and the bi_rw to rq_flag_bits. The next patches will continue to convert btrfs, so submit_bio_hook and merge_bio_hook related code will be modified to take only the bio. I did not do it in this patch to try and keep it smaller. Signed-off-by: Mike Christie --- fs/btrfs/extent_io.c | 88 +++- 1 file changed, 45 insertions(+), 43 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 454100e..4472d69 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2377,7 +2377,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset, int read_mode; int ret; - BUG_ON(failed_bio->bi_rw & REQ_WRITE); + BUG_ON(failed_bio->bi_op == REQ_OP_WRITE); ret = btrfs_get_io_failure_record(inode, start, end, &failrec); if (ret) @@ -2403,6 +2403,8 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset, free_io_failure(inode, failrec); return -EIO; } + bio->bi_op = REQ_OP_READ; + bio->bi_rw = read_mode; pr_debug("Repair Read Error: submitting new read[%#x] to this_mirror=%d, in_validation=%d\n", read_mode, failrec->this_mirror, failrec->in_validation); @@ -2714,8 +2716,8 @@ struct bio *btrfs_io_bio_alloc(gfp_t gfp_mask, unsigned int nr_iovecs) } -static int __must_check submit_one_bio(int rw, struct bio *bio, - int mirror_num, unsigned long bio_flags) +static int __must_check submit_one_bio(struct bio *bio, int mirror_num, + unsigned long bio_flags) { int ret = 0; struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1; @@ -2726,12 +2728,12 @@ static int __must_check submit_one_bio(int rw, struct bio *bio, start = page_offset(page) + bvec->bv_offset; bio->bi_private = NULL; - bio->bi_rw = rw; bio_get(bio); if (tree->ops && tree->ops->submit_bio_hook) - ret = tree->ops->submit_bio_hook(page->mapping->host, rw, bio, - mirror_num, bio_flags, start); + ret = tree->ops->submit_bio_hook(page->mapping->host, +bio->bi_rw, bio, mirror_num, +bio_flags, start); else btrfsic_submit_bio(bio); @@ -2739,20 +2741,20 @@ static int __must_check submit_one_bio(int rw, struct bio *bio, return ret; } -static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page, +static int merge_bio(struct extent_io_tree *tree, struct page *page, unsigned long offset, size_t size, struct bio *bio, unsigned long bio_flags) { int ret = 0; if (tree->ops && tree->ops->merge_bio_hook) - ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio, - bio_flags); + ret = tree->ops->merge_bio_hook(bio->bi_op, page, offset, size, + bio, bio_flags); BUG_ON(ret < 0); return ret; } -static int submit_extent_page(int rw, struct extent_io_tree *tree, +static int submit_extent_page(int op, int op_flags, struct extent_io_tree *tree, struct writeback_control *wbc, struct page *page, sector_t sector, size_t size, unsigned long offset, @@ -2780,10 +2782,9 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree, if (prev_bio_flags != bio_flags || !contig || force_bio_submit || - merge_bio(rw, tree, page, offset, page_size, bio, bio_flags) || + merge_bio(tree, page, offset, page_size, bio, bio_flags) || bio_add_page(bio, page, page_size, offset) < page_size) { - ret = submit_one_bio(rw, bio, mirror_num, -prev_bio_flags); + ret = submit_one_bio(bio, mirror_num, prev_bio_flags); if (ret < 0) { *bio_ret = NULL; return ret; @@ -2804,6 +2805,8 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree, bio_add_page(bio, page, page_size, offset); bio->bi_end_io = end_io_func; 
bio->bi_private = tree; + bio->bi_op = op; + bio->bi_rw = op_flags; if (wbc) { wbc_init_bio(wbc, bio); wbc_account_io(wbc, page, page_size); @@ -2812,7 +2815,7 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree, if (bio_ret) *bio_ret = bio;
[PATCH 06/35] direct-io: set bi_op to REQ_OP
From: Mike Christie This patch has the dio code set the bio bi_op to a REQ_OP. It also begins to convert btrfs's dio_submit_t related code, because of the submit_io callout use. In the btrfs_submit_direct change, I OR'd the op and flag back together. It is only temporary. The next patch will completely convert all the btrfs code paths. Signed-off-by: Mike Christie --- fs/btrfs/inode.c | 9 + fs/direct-io.c | 35 +-- include/linux/fs.h | 2 +- 3 files changed, 27 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 25dcff7..3a7fe66 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8380,14 +8380,14 @@ out_err: return 0; } -static void btrfs_submit_direct(int rw, struct bio *dio_bio, - struct inode *inode, loff_t file_offset) +static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, + loff_t file_offset) { struct btrfs_dio_private *dip = NULL; struct bio *io_bio = NULL; struct btrfs_io_bio *btrfs_bio; int skip_sum; - int write = rw & REQ_WRITE; + bool write = (dio_bio->bi_op == REQ_OP_WRITE); int ret = 0; skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM; @@ -8438,7 +8438,8 @@ static void btrfs_submit_direct(int rw, struct bio *dio_bio, dio_data->unsubmitted_oe_range_end; } - ret = btrfs_submit_direct_hook(rw, dip, skip_sum); + ret = btrfs_submit_direct_hook(dio_bio->bi_op | dio_bio->bi_rw, dip, + skip_sum); if (!ret) return; diff --git a/fs/direct-io.c b/fs/direct-io.c index 1bce5c3..7cabf74 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -108,7 +108,8 @@ struct dio_submit { /* dio_state communicated between submission path and end_io */ struct dio { int flags; /* doesn't change */ - int rw; + int op; + int op_flags; blk_qc_t bio_cookie; struct block_device *bio_bdev; struct inode *inode; @@ -163,7 +164,7 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio) ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES, &sdio->from); - if (ret < 0 && sdio->blocks_available && (dio->rw & WRITE)) { + if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) { struct page *page = ZERO_PAGE(0); /* * A memory fault, but the filesystem has some outstanding @@ -242,7 +243,8 @@ static ssize_t dio_complete(struct dio *dio, loff_t offset, ssize_t ret, transferred = dio->result; /* Check for short read case */ - if ((dio->rw == READ) && ((offset + transferred) > dio->i_size)) + if ((dio->op == REQ_OP_READ) && + ((offset + transferred) > dio->i_size)) transferred = dio->i_size - offset; } @@ -265,7 +267,7 @@ static ssize_t dio_complete(struct dio *dio, loff_t offset, ssize_t ret, inode_dio_end(dio->inode); if (is_async) { - if (dio->rw & WRITE) { + if (dio->op == REQ_OP_WRITE) { int err; err = generic_write_sync(dio->iocb->ki_filp, offset, @@ -374,7 +376,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio, bio->bi_bdev = bdev; bio->bi_iter.bi_sector = first_sector; - bio->bi_rw = dio->rw; + bio->bi_op = dio->op; + bio->bi_rw = dio->op_flags; if (dio->is_async) bio->bi_end_io = dio_bio_end_aio; else @@ -402,14 +405,13 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio) dio->refcount++; spin_unlock_irqrestore(&dio->bio_lock, flags); - if (dio->is_async && dio->rw == READ && dio->should_dirty) + if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) bio_set_pages_dirty(bio); dio->bio_bdev = bio->bi_bdev; if (sdio->submit_io) { - sdio->submit_io(dio->rw, bio, dio->inode, - sdio->logical_offset_in_bio); + sdio->submit_io(bio, dio->inode, 
sdio->logical_offset_in_bio); dio->bio_cookie = BLK_QC_T_NONE; } else dio->bio_cookie = submit_bio(bio); @@ -477,14 +479,14 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio) if (bio->bi_error) dio->io_error = -EIO; - if (dio->is_async && dio->rw == READ && dio->should_dirty) { + if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) { err = bio->bi_error; bio_check_pages_dirty(bio); /* transfers ownership */ } else {
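A side effect worth noting: once the bio records its own operation in bi_op/bi_rw, callouts that used to take the direction as an extra argument can drop it, which is why the submit_io hook above loses its rw parameter. Illustrative signatures only, not the exact kernel typedefs:

struct bio;
struct inode;

/* before: the caller had to pass the direction alongside the bio */
typedef void old_submit_io_t(int rw, struct bio *bio, struct inode *inode,
			     long long file_offset);

/* after: everything needed travels inside the bio itself */
typedef void new_submit_io_t(struct bio *bio, struct inode *inode,
			     long long file_offset);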
[PATCH 23/35] md/raid: set bi_op to REQ_OP
From: Mike Christie This patch has md/raid set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- drivers/md/bitmap.c | 2 +- drivers/md/dm-raid.c | 5 +++-- drivers/md/md.c | 11 +++ drivers/md/md.h | 3 ++- drivers/md/raid1.c | 34 drivers/md/raid10.c | 50 ++-- drivers/md/raid5-cache.c | 25 +++- drivers/md/raid5.c | 48 ++ 8 files changed, 101 insertions(+), 77 deletions(-) diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c index c8e4124..6c241f8 100644 --- a/drivers/md/bitmap.c +++ b/drivers/md/bitmap.c @@ -160,7 +160,7 @@ static int read_sb_page(struct mddev *mddev, loff_t offset, if (sync_page_io(rdev, target, roundup(size, bdev_logical_block_size(rdev->bdev)), -page, READ, true)) { +page, REQ_OP_READ, 0, true)) { page->index = index; return 0; } diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index a090121..43a749c 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -792,7 +792,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size) if (rdev->sb_loaded) return 0; - if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, 1)) { + if (!sync_page_io(rdev, 0, size, rdev->sb_page, REQ_OP_READ, 0, 1)) { DMERR("Failed to read superblock of device at position %d", rdev->raid_disk); md_error(rdev->mddev, rdev); @@ -1646,7 +1646,8 @@ static void attempt_restore_of_faulty_devices(struct raid_set *rs) for (i = 0; i < rs->md.raid_disks; i++) { r = &rs->dev[i].rdev; if (test_bit(Faulty, &r->flags) && r->sb_page && - sync_page_io(r, 0, r->sb_size, r->sb_page, READ, 1)) { + sync_page_io(r, 0, r->sb_size, r->sb_page, REQ_OP_READ, 0, +1)) { DMINFO("Faulty %s device #%d has readable super block." " Attempting to revive it.", rs->raid_type->name, i); diff --git a/drivers/md/md.c b/drivers/md/md.c index 6f6102e..8cdd37f 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -391,6 +391,7 @@ static void submit_flushes(struct work_struct *ws) bi->bi_end_io = md_end_flush; bi->bi_private = rdev; bi->bi_bdev = rdev->bdev; + bi->bi_op = REQ_OP_WRITE; bi->bi_rw = WRITE_FLUSH; atomic_inc(&mddev->flush_pending); submit_bio(bi); @@ -737,6 +738,7 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev, bio_add_page(bio, page, size, 0); bio->bi_private = rdev; bio->bi_end_io = super_written; + bio->bi_op = REQ_OP_WRITE; bio->bi_rw = WRITE_FLUSH_FUA; atomic_inc(&mddev->pending_writes); @@ -750,14 +752,15 @@ void md_super_wait(struct mddev *mddev) } int sync_page_io(struct md_rdev *rdev, sector_t sector, int size, -struct page *page, int rw, bool metadata_op) +struct page *page, int op, int op_flags, bool metadata_op) { struct bio *bio = bio_alloc_mddev(GFP_NOIO, 1, rdev->mddev); int ret; bio->bi_bdev = (metadata_op && rdev->meta_bdev) ? 
rdev->meta_bdev : rdev->bdev; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; if (metadata_op) bio->bi_iter.bi_sector = sector + rdev->sb_start; else if (rdev->mddev->reshape_position != MaxSector && @@ -783,7 +786,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size) if (rdev->sb_loaded) return 0; - if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, true)) + if (!sync_page_io(rdev, 0, size, rdev->sb_page, REQ_OP_READ, 0, true)) goto fail; rdev->sb_loaded = 1; return 0; @@ -1469,7 +1472,7 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_ return -EINVAL; bb_sector = (long long)offset; if (!sync_page_io(rdev, bb_sector, sectors << 9, - rdev->bb_page, READ, true)) + rdev->bb_page, REQ_OP_READ, 0, true)) return -EIO; bbp = (u64 *)page_address(rdev->bb_page); rdev->badblocks.shift = sb->bblog_shift; diff --git a/drivers/md/md.h b/drivers/md/md.h index b5c4be7..2e0918f 100644 --- a/drivers/md/md.h +++ b/d
[PATCH 29/35] ide cd: do not set REQ_WRITE on requests.
From: Mike Christie

The block layer will set the correct READ/WRITE operation flags/fields
when creating a request, so there is no need for drivers to set the
REQ_WRITE flag.

This patch is compile tested only.

Signed-off-by: Mike Christie
---
 drivers/ide/ide-cd_ioctl.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/ide/ide-cd_ioctl.c b/drivers/ide/ide-cd_ioctl.c
index 474173e..5887a7a 100644
--- a/drivers/ide/ide-cd_ioctl.c
+++ b/drivers/ide/ide-cd_ioctl.c
@@ -459,9 +459,6 @@ int ide_cdrom_packet(struct cdrom_device_info *cdi,
	   layer. the packet must be complete, as we do not touch it at all. */

-	if (cgc->data_direction == CGC_DATA_WRITE)
-		flags |= REQ_WRITE;
-
	if (cgc->sense)
		memset(cgc->sense, 0, sizeof(struct request_sense));
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[PATCH 17/35] ocfs2: set bi_op to REQ_OP
From: Mike Christie This patch has ocfs2 set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- fs/ocfs2/cluster/heartbeat.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c index bbc6203..4635f89 100644 --- a/fs/ocfs2/cluster/heartbeat.c +++ b/fs/ocfs2/cluster/heartbeat.c @@ -531,7 +531,8 @@ static void o2hb_bio_end_io(struct bio *bio) static struct bio *o2hb_setup_one_bio(struct o2hb_region *reg, struct o2hb_bio_wait_ctxt *wc, unsigned int *current_slot, - unsigned int max_slots, int rw) + unsigned int max_slots, int op, + int op_flags) { int len, current_page; unsigned int vec_len, vec_start; @@ -557,7 +558,8 @@ static struct bio *o2hb_setup_one_bio(struct o2hb_region *reg, bio->bi_bdev = reg->hr_bdev; bio->bi_private = wc; bio->bi_end_io = o2hb_bio_end_io; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; vec_start = (cs << bits) % PAGE_CACHE_SIZE; while(cs < max_slots) { @@ -594,7 +596,7 @@ static int o2hb_read_slots(struct o2hb_region *reg, while(current_slot < max_slots) { bio = o2hb_setup_one_bio(reg, &wc, ¤t_slot, max_slots, -READ); +REQ_OP_READ, 0); if (IS_ERR(bio)) { status = PTR_ERR(bio); mlog_errno(status); @@ -626,7 +628,8 @@ static int o2hb_issue_node_write(struct o2hb_region *reg, slot = o2nm_this_node(); - bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, WRITE_SYNC); + bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, REQ_OP_WRITE, +WRITE_SYNC); if (IS_ERR(bio)) { status = PTR_ERR(bio); mlog_errno(status); -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/35] xfs: set bi_op to REQ_OP
From: Mike Christie This patch has xfs set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. Signed-off-by: Mike Christie Acked-by: Dave Chinner --- fs/xfs/xfs_aops.c | 3 ++- fs/xfs/xfs_buf.c | 27 +++ 2 files changed, 17 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 9cd38d1..21867fc 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -393,7 +393,8 @@ xfs_submit_ioend_bio( atomic_inc(&ioend->io_remaining); bio->bi_private = ioend; bio->bi_end_io = xfs_end_bio; - bio->bi_rw = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE); + bio->bi_op = REQ_OP_WRITE; + bio->bi_rw = WB_SYNC_ALL ? WRITE_SYNC : 0; submit_bio(bio); } diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 079bb77..917774e 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -1131,7 +1131,8 @@ xfs_buf_ioapply_map( int map, int *buf_offset, int *count, - int rw) + int op, + int op_flags) { int page_index; int total_nr_pages = bp->b_page_count; @@ -1170,7 +1171,8 @@ next_chunk: bio->bi_iter.bi_sector = sector; bio->bi_end_io = xfs_buf_bio_end_io; bio->bi_private = bp; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; for (; size && nr_pages; nr_pages--, page_index++) { int rbytes, nbytes = PAGE_SIZE - offset; @@ -1214,7 +1216,8 @@ _xfs_buf_ioapply( struct xfs_buf *bp) { struct blk_plug plug; - int rw; + int op; + int op_flags = 0; int offset; int size; int i; @@ -1233,14 +1236,13 @@ _xfs_buf_ioapply( bp->b_ioend_wq = bp->b_target->bt_mount->m_buf_workqueue; if (bp->b_flags & XBF_WRITE) { + op = REQ_OP_WRITE; if (bp->b_flags & XBF_SYNCIO) - rw = WRITE_SYNC; - else - rw = WRITE; + op_flags = WRITE_SYNC; if (bp->b_flags & XBF_FUA) - rw |= REQ_FUA; + op_flags |= REQ_FUA; if (bp->b_flags & XBF_FLUSH) - rw |= REQ_FLUSH; + op_flags |= REQ_FLUSH; /* * Run the write verifier callback function if it exists. If @@ -1270,13 +1272,14 @@ _xfs_buf_ioapply( } } } else if (bp->b_flags & XBF_READ_AHEAD) { - rw = READA; + op = REQ_OP_READ; + op_flags = REQ_RAHEAD; } else { - rw = READ; + op = REQ_OP_READ; } /* we only use the buffer cache for meta-data */ - rw |= REQ_META; + op_flags |= REQ_META; /* * Walk all the vectors issuing IO on them. Set up the initial offset @@ -1288,7 +1291,7 @@ _xfs_buf_ioapply( size = BBTOB(bp->b_io_length); blk_start_plug(&plug); for (i = 0; i < bp->b_map_count; i++) { - xfs_buf_ioapply_map(bp, i, &offset, &size, rw); + xfs_buf_ioapply_map(bp, i, &offset, &size, op, op_flags); if (bp->b_error) break; if (size <= 0) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
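One detail worth double-checking in the xfs_aops.c hunk above: the replacement line tests the constant WB_SYNC_ALL rather than wbc->sync_mode, which would make every ioend submission WRITE_SYNC. Judging from the line it replaces, and from the mpage conversion later in the series, the intended form is presumably:

	bio->bi_op = REQ_OP_WRITE;
	bio->bi_rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : 0;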
[PATCH 15/35] mpage: set bi_op to REQ_OP
From: Mike Christie This patch has the mpage.c code set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. I have run xfstest with xfs, but I am not sure if I have stressed these code paths well. Signed-off-by: Mike Christie --- fs/mpage.c | 41 + 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/fs/mpage.c b/fs/mpage.c index 9479e73..f92df71 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -56,11 +56,12 @@ static void mpage_end_io(struct bio *bio) bio_put(bio); } -static struct bio *mpage_bio_submit(int rw, struct bio *bio) +static struct bio *mpage_bio_submit(int op, int op_flags, struct bio *bio) { bio->bi_end_io = mpage_end_io; - bio->bi_rw = rw; - guard_bio_eod(rw, bio); + bio->bi_op = op; + bio->bi_rw = op_flags; + guard_bio_eod(op, bio); submit_bio(bio); return NULL; } @@ -270,7 +271,7 @@ do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages, * This page will go to BIO. Do we need to send this BIO off first? */ if (bio && (*last_block_in_bio != blocks[0] - 1)) - bio = mpage_bio_submit(READ, bio); + bio = mpage_bio_submit(REQ_OP_READ, 0, bio); alloc_new: if (bio == NULL) { @@ -287,7 +288,7 @@ alloc_new: length = first_hole << blkbits; if (bio_add_page(bio, page, length, 0) < length) { - bio = mpage_bio_submit(READ, bio); + bio = mpage_bio_submit(REQ_OP_READ, 0, bio); goto alloc_new; } @@ -295,7 +296,7 @@ alloc_new: nblocks = map_bh->b_size >> blkbits; if ((buffer_boundary(map_bh) && relative_block == nblocks) || (first_hole != blocks_per_page)) - bio = mpage_bio_submit(READ, bio); + bio = mpage_bio_submit(REQ_OP_READ, 0, bio); else *last_block_in_bio = blocks[blocks_per_page - 1]; out: @@ -303,7 +304,7 @@ out: confused: if (bio) - bio = mpage_bio_submit(READ, bio); + bio = mpage_bio_submit(REQ_OP_READ, 0, bio); if (!PageUptodate(page)) block_read_full_page(page, get_block); else @@ -385,7 +386,7 @@ mpage_readpages(struct address_space *mapping, struct list_head *pages, } BUG_ON(!list_empty(pages)); if (bio) - mpage_bio_submit(READ, bio); + mpage_bio_submit(REQ_OP_READ, 0, bio); return 0; } EXPORT_SYMBOL(mpage_readpages); @@ -406,7 +407,7 @@ int mpage_readpage(struct page *page, get_block_t get_block) bio = do_mpage_readpage(bio, page, 1, &last_block_in_bio, &map_bh, &first_logical_block, get_block, gfp); if (bio) - mpage_bio_submit(READ, bio); + mpage_bio_submit(REQ_OP_READ, 0, bio); return 0; } EXPORT_SYMBOL(mpage_readpage); @@ -487,7 +488,7 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc, struct buffer_head map_bh; loff_t i_size = i_size_read(inode); int ret = 0; - int wr = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE); + int op_flags = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : 0); if (page_has_buffers(page)) { struct buffer_head *head = page_buffers(page); @@ -596,7 +597,7 @@ page_is_mapped: * This page will go to BIO. Do we need to send this BIO off first? 
*/ if (bio && mpd->last_block_in_bio != blocks[0] - 1) - bio = mpage_bio_submit(wr, bio); + bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio); alloc_new: if (bio == NULL) { @@ -623,7 +624,7 @@ alloc_new: wbc_account_io(wbc, page, PAGE_SIZE); length = first_unmapped << blkbits; if (bio_add_page(bio, page, length, 0) < length) { - bio = mpage_bio_submit(wr, bio); + bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio); goto alloc_new; } @@ -633,7 +634,7 @@ alloc_new: set_page_writeback(page); unlock_page(page); if (boundary || (first_unmapped != blocks_per_page)) { - bio = mpage_bio_submit(wr, bio); + bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio); if (boundary_block) { write_boundary_block(boundary_bdev, boundary_block, 1 << blkbits); @@ -645,7 +646,7 @@ alloc_new: confused: if (bio) - bio = mpage_bio_submit(wr, bio); + bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio); if (mpd->use_writepage) { ret = mapping->a_ops->writepage(page, wbc); @@ -702,9 +703,9 @@ mpage_writepages(struct address_space *mapping, ret = write_cache_pages(mapping, wbc, __mpage_writepage, &mpd); if (mpd.bio) { -
[PATCH 19/35] dm: set bi_op to REQ_OP
From: Mike Christie This patch has dm set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. I did some basic dm tests, but I think this patch should be considered compile tested only. I have not tested all the dm targets and I did not stress every code path I have touched. Signed-off-by: Mike Christie --- drivers/md/dm-bufio.c | 8 +++--- drivers/md/dm-crypt.c | 1 + drivers/md/dm-io.c | 57 ++--- drivers/md/dm-kcopyd.c | 25 +- drivers/md/dm-log-writes.c | 6 ++--- drivers/md/dm-log.c | 5 ++-- drivers/md/dm-raid1.c | 11 +--- drivers/md/dm-snap-persistent.c | 24 + drivers/md/dm-thin.c| 7 ++--- drivers/md/dm.c | 1 + include/linux/dm-io.h | 3 ++- 11 files changed, 82 insertions(+), 66 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index 9d3ee7f..b6055f2 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -574,7 +574,8 @@ static void use_dmio(struct dm_buffer *b, int rw, sector_t block, { int r; struct dm_io_request io_req = { - .bi_rw = rw, + .bi_op = rw, + .bi_op_flags = 0, .notify.fn = dmio_complete, .notify.context = b, .client = b->c->dm_io, @@ -634,7 +635,7 @@ static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block, * the dm_buffer's inline bio is local to bufio. */ b->bio.bi_private = end_io; - b->bio.bi_rw = rw; + b->bio.bi_op = rw; /* * We assume that if len >= PAGE_SIZE ptr is page-aligned. @@ -1327,7 +1328,8 @@ EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers); int dm_bufio_issue_flush(struct dm_bufio_client *c) { struct dm_io_request io_req = { - .bi_rw = WRITE_FLUSH, + .bi_op = REQ_OP_WRITE, + .bi_op_flags = WRITE_FLUSH, .mem.type = DM_IO_KMEM, .mem.ptr.addr = NULL, .client = c->dm_io, diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 4f3cb35..70fbf11 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1136,6 +1136,7 @@ static void clone_init(struct dm_crypt_io *io, struct bio *clone) clone->bi_private = io; clone->bi_end_io = crypt_endio; clone->bi_bdev= cc->dev->bdev; + clone->bi_op = io->base_bio->bi_op; clone->bi_rw = io->base_bio->bi_rw; } diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c index 50f17e3..0f723ca 100644 --- a/drivers/md/dm-io.c +++ b/drivers/md/dm-io.c @@ -278,8 +278,9 @@ static void km_dp_init(struct dpages *dp, void *data) /*- * IO routines that accept a list of pages. *---*/ -static void do_region(int rw, unsigned region, struct dm_io_region *where, - struct dpages *dp, struct io *io) +static void do_region(int op, int op_flags, unsigned region, + struct dm_io_region *where, struct dpages *dp, + struct io *io) { struct bio *bio; struct page *page; @@ -295,24 +296,25 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where, /* * Reject unsupported discard and write same requests. */ - if (rw & REQ_DISCARD) + if (op == REQ_OP_DISCARD) special_cmd_max_sectors = q->limits.max_discard_sectors; - else if (rw & REQ_WRITE_SAME) + else if (op == REQ_OP_WRITE_SAME) special_cmd_max_sectors = q->limits.max_write_same_sectors; - if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) && special_cmd_max_sectors == 0) { + if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_SAME) && + special_cmd_max_sectors == 0) { dec_count(io, region, -EOPNOTSUPP); return; } /* -* where->count may be zero if rw holds a flush and we need to +* where->count may be zero if op holds a flush and we need to * send a zero-sized flush. */ do { /* * Allocate a suitably sized-bio. 
*/ - if ((rw & REQ_DISCARD) || (rw & REQ_WRITE_SAME)) + if ((op == REQ_OP_DISCARD) || (op == REQ_OP_WRITE_SAME)) num_bvecs = 1; else num_bvecs = min_t(int, BIO_MAX_PAGES, @@ -322,14 +324,15 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where, bio->bi_iter.bi_sector = where->sector + (where->count - remaining); bio->bi_bdev = where->bdev; bio->bi_end_io = endio; - bio->bi_rw = rw; + bio->bi_op = op;
[PATCH 22/35] drbd: set bi_op to REQ_OP
From: Mike Christie This patch has drbd set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. Lars and Philip, I might have split this patch up a little weird. Thisi patch handles setting up the bio, and then patch 30 (0030-block-fs-drivers-do-not-test-bi_rw-for-REQ_OPs.patch) handles where were check/read bio->bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- drivers/block/drbd/drbd_actlog.c | 29 - drivers/block/drbd/drbd_bitmap.c | 6 +++--- drivers/block/drbd/drbd_int.h | 4 ++-- drivers/block/drbd/drbd_main.c | 5 +++-- drivers/block/drbd/drbd_receiver.c | 37 + drivers/block/drbd/drbd_worker.c | 3 ++- 6 files changed, 51 insertions(+), 33 deletions(-) diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c index 6069e15..2fa8534 100644 --- a/drivers/block/drbd/drbd_actlog.c +++ b/drivers/block/drbd/drbd_actlog.c @@ -137,19 +137,19 @@ void wait_until_done_or_force_detached(struct drbd_device *device, struct drbd_b static int _drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev *bdev, -sector_t sector, int rw) +sector_t sector, int op) { struct bio *bio; /* we do all our meta data IO in aligned 4k blocks. */ const int size = 4096; - int err; + int err, op_flags = 0; device->md_io.done = 0; device->md_io.error = -ENODEV; - if ((rw & WRITE) && !test_bit(MD_NO_FUA, &device->flags)) - rw |= REQ_FUA | REQ_FLUSH; - rw |= REQ_SYNC | REQ_NOIDLE; + if ((op == REQ_OP_WRITE) && !test_bit(MD_NO_FUA, &device->flags)) + op_flags |= REQ_FUA | REQ_FLUSH; + op_flags |= REQ_SYNC | REQ_NOIDLE; bio = bio_alloc_drbd(GFP_NOIO); bio->bi_bdev = bdev->md_bdev; @@ -159,9 +159,10 @@ static int _drbd_md_sync_page_io(struct drbd_device *device, goto out; bio->bi_private = device; bio->bi_end_io = drbd_md_endio; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; - if (!(rw & WRITE) && device->state.disk == D_DISKLESS && device->ldev == NULL) + if (op != REQ_OP_WRITE && device->state.disk == D_DISKLESS && device->ldev == NULL) /* special case, drbd_md_read() during drbd_adm_attach(): no get_ldev */ ; else if (!get_ldev_if_state(device, D_ATTACHING)) { @@ -174,7 +175,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device, bio_get(bio); /* one bio_put() is in the completion handler */ atomic_inc(&device->md_io.in_use); /* drbd_md_put_buffer() is in the completion handler */ device->md_io.submit_jif = jiffies; - if (drbd_insert_fault(device, (rw & WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD)) + if (drbd_insert_fault(device, (op == REQ_OP_WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD)) bio_io_error(bio); else submit_bio(bio); @@ -188,7 +189,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device, } int drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev *bdev, -sector_t sector, int rw) +sector_t sector, int op) { int err; D_ASSERT(device, atomic_read(&device->md_io.in_use) == 1); @@ -197,19 +198,21 @@ int drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev *bd dynamic_drbd_dbg(device, "meta_data io: %s [%d]:%s(,%llus,%s) %pS\n", current->comm, current->pid, __func__, -(unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ", +(unsigned long long)sector, (op == REQ_OP_WRITE) ? "WRITE" : "READ", (void*)_RET_IP_ ); if (sector < drbd_md_first_sector(bdev) || sector + 7 > drbd_md_last_sector(bdev)) drbd_alert(device, "%s [%d]:%s(,%llus,%s) out of range md access!\n", current->comm, current->pid, __func__, -(unsigned long long)sector, (rw & WRITE) ? 
"WRITE" : "READ"); +(unsigned long long)sector, +(op == REQ_OP_WRITE) ? "WRITE" : "READ"); - err = _drbd_md_sync_page_io(device, bdev, sector, rw); + err = _drbd_md_sync_page_io(device, bdev, sector, op); if (err) { drbd_err(device, "drbd_md_sync_page_io(,%llus,%s) failed with error %d\n", - (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ", err); + (unsigned long long)sector, + (op == REQ_OP_WRITE) ? "WRITE" : "READ", err); } return err; } diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c index e8959fe..126bf4a 100644 --- a/drivers/block/drbd/drbd_bitmap
[PATCH 21/35] bcache: set bi_op to REQ_OP
From: Mike Christie This patch has bcache set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only Signed-off-by: Mike Christie --- drivers/md/bcache/btree.c | 2 ++ drivers/md/bcache/debug.c | 2 ++ drivers/md/bcache/io.c| 2 +- drivers/md/bcache/journal.c | 7 --- drivers/md/bcache/movinggc.c | 2 +- drivers/md/bcache/request.c | 9 + drivers/md/bcache/super.c | 26 +++--- drivers/md/bcache/writeback.c | 4 ++-- 8 files changed, 32 insertions(+), 22 deletions(-) diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 22b9e34..752a44f 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -295,6 +295,7 @@ static void bch_btree_node_read(struct btree *b) closure_init_stack(&cl); bio = bch_bbio_alloc(b->c); + bio->bi_op = REQ_OP_READ; bio->bi_rw = REQ_META|READ_SYNC; bio->bi_iter.bi_size = KEY_SIZE(&b->key) << 9; bio->bi_end_io = btree_node_read_endio; @@ -397,6 +398,7 @@ static void do_btree_node_write(struct btree *b) b->bio->bi_end_io = btree_node_write_endio; b->bio->bi_private = cl; + b->bio->bi_op = REQ_OP_WRITE; b->bio->bi_rw = REQ_META|WRITE_SYNC|REQ_FUA; b->bio->bi_iter.bi_size = roundup(set_bytes(i), block_bytes(b->c)); bch_bio_map(b->bio, i); diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c index 52b6bcf..8df9e66 100644 --- a/drivers/md/bcache/debug.c +++ b/drivers/md/bcache/debug.c @@ -52,6 +52,7 @@ void bch_btree_verify(struct btree *b) bio->bi_bdev= PTR_CACHE(b->c, &b->key, 0)->bdev; bio->bi_iter.bi_sector = PTR_OFFSET(&b->key, 0); bio->bi_iter.bi_size= KEY_SIZE(&v->key) << 9; + bio->bi_op = REQ_OP_READ; bio->bi_rw = REQ_META|READ_SYNC; bch_bio_map(bio, sorted); @@ -114,6 +115,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio) check = bio_clone(bio, GFP_NOIO); if (!check) return; + check->bi_op = REQ_OP_READ; check->bi_rw |= READ_SYNC; if (bio_alloc_pages(check, GFP_NOIO)) diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c index 86a0bb8..f10a9a0 100644 --- a/drivers/md/bcache/io.c +++ b/drivers/md/bcache/io.c @@ -111,7 +111,7 @@ void bch_bbio_count_io_errors(struct cache_set *c, struct bio *bio, struct bbio *b = container_of(bio, struct bbio, bio); struct cache *ca = PTR_CACHE(c, &b->key, 0); - unsigned threshold = bio->bi_rw & REQ_WRITE + unsigned threshold = op_is_write(bio->bi_op) ? 
c->congested_write_threshold_us : c->congested_read_threshold_us; diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index af3f9f7..68fa0f0 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -54,7 +54,7 @@ reread: left = ca->sb.bucket_size - offset; bio_reset(bio); bio->bi_iter.bi_sector = bucket + offset; bio->bi_bdev= ca->bdev; - bio->bi_rw = READ; + bio->bi_op = REQ_OP_READ; bio->bi_iter.bi_size= len << 9; bio->bi_end_io = journal_read_endio; @@ -452,7 +452,7 @@ static void do_journal_discard(struct cache *ca) bio->bi_iter.bi_sector = bucket_to_sector(ca->set, ca->sb.d[ja->discard_idx]); bio->bi_bdev= ca->bdev; - bio->bi_rw = REQ_WRITE|REQ_DISCARD; + bio->bi_op = REQ_OP_DISCARD; bio->bi_max_vecs= 1; bio->bi_io_vec = bio->bi_inline_vecs; bio->bi_iter.bi_size= bucket_bytes(ca); @@ -626,7 +626,8 @@ static void journal_write_unlocked(struct closure *cl) bio_reset(bio); bio->bi_iter.bi_sector = PTR_OFFSET(k, i); bio->bi_bdev= ca->bdev; - bio->bi_rw = REQ_WRITE|REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA; + bio->bi_op = REQ_OP_WRITE; + bio->bi_rw = REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA; bio->bi_iter.bi_size = sectors << 9; bio->bi_end_io = journal_write_endio; diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c index b929fc9..f33860a 100644 --- a/drivers/md/bcache/movinggc.c +++ b/drivers/md/bcache/movinggc.c @@ -163,7 +163,7 @@ static void read_moving(struct cache_set *c) moving_init(io); bio = &io->bio.bio; - bio->bi_rw = READ; + bio->bi_op = REQ_OP_READ; bio->bi_end_io = read_moving_endio; if (bio_alloc_pages(bio, GFP_KERNEL)) dif
[PATCH 14/35] hfsplus: set bi_op to REQ_OP
From: Mike Christie This patch has hfsplus set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- fs/hfsplus/hfsplus_fs.h | 2 +- fs/hfsplus/part_tbl.c | 5 +++-- fs/hfsplus/super.c | 6 -- fs/hfsplus/wrapper.c| 15 +-- 4 files changed, 17 insertions(+), 11 deletions(-) diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h index f91a1fa..80154aa 100644 --- a/fs/hfsplus/hfsplus_fs.h +++ b/fs/hfsplus/hfsplus_fs.h @@ -525,7 +525,7 @@ int hfsplus_compare_dentry(const struct dentry *parent, /* wrapper.c */ int hfsplus_submit_bio(struct super_block *sb, sector_t sector, void *buf, - void **data, int rw); + void **data, int op, int op_flags); int hfsplus_read_wrapper(struct super_block *sb); /* time macros */ diff --git a/fs/hfsplus/part_tbl.c b/fs/hfsplus/part_tbl.c index eb355d8..63164eb 100644 --- a/fs/hfsplus/part_tbl.c +++ b/fs/hfsplus/part_tbl.c @@ -112,7 +112,8 @@ static int hfs_parse_new_pmap(struct super_block *sb, void *buf, if ((u8 *)pm - (u8 *)buf >= buf_size) { res = hfsplus_submit_bio(sb, *part_start + HFS_PMAP_BLK + i, -buf, (void **)&pm, READ); +buf, (void **)&pm, REQ_OP_READ, +0); if (res) return res; } @@ -136,7 +137,7 @@ int hfs_part_find(struct super_block *sb, return -ENOMEM; res = hfsplus_submit_bio(sb, *part_start + HFS_PMAP_BLK, -buf, &data, READ); +buf, &data, REQ_OP_READ, 0); if (res) goto out; diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c index 5d54490..01cf313 100644 --- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -219,7 +219,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait) error2 = hfsplus_submit_bio(sb, sbi->part_start + HFSPLUS_VOLHEAD_SECTOR, - sbi->s_vhdr_buf, NULL, WRITE_SYNC); + sbi->s_vhdr_buf, NULL, REQ_OP_WRITE, + WRITE_SYNC); if (!error) error = error2; if (!write_backup) @@ -227,7 +228,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait) error2 = hfsplus_submit_bio(sb, sbi->part_start + sbi->sect_count - 2, - sbi->s_backup_vhdr_buf, NULL, WRITE_SYNC); + sbi->s_backup_vhdr_buf, NULL, REQ_OP_WRITE, + WRITE_SYNC); if (!error) error2 = error; out: diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c index d026bb3..c5c916d 100644 --- a/fs/hfsplus/wrapper.c +++ b/fs/hfsplus/wrapper.c @@ -30,7 +30,8 @@ struct hfsplus_wd { * @sector: block to read or write, for blocks of HFSPLUS_SECTOR_SIZE bytes * @buf: buffer for I/O * @data: output pointer for location of requested data - * @rw: direction of I/O + * @op: direction of I/O + * @op_flags: request op flags * * The unit of I/O is hfsplus_min_io_size(sb), which may be bigger than * HFSPLUS_SECTOR_SIZE, and @buf must be sized accordingly. On reads @@ -44,7 +45,7 @@ struct hfsplus_wd { * will work correctly. 
*/ int hfsplus_submit_bio(struct super_block *sb, sector_t sector, - void *buf, void **data, int rw) + void *buf, void **data, int op, int op_flags) { struct bio *bio; int ret = 0; @@ -65,9 +66,10 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t sector, bio = bio_alloc(GFP_NOIO, 1); bio->bi_iter.bi_sector = sector; bio->bi_bdev = sb->s_bdev; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; - if (!(rw & WRITE) && data) + if (op != WRITE && data) *data = (u8 *)buf + offset; while (io_size > 0) { @@ -182,7 +184,7 @@ int hfsplus_read_wrapper(struct super_block *sb) reread: error = hfsplus_submit_bio(sb, part_start + HFSPLUS_VOLHEAD_SECTOR, sbi->s_vhdr_buf, (void **)&sbi->s_vhdr, - READ); + REQ_OP_READ, 0); if (error) goto out_free_backup_vhdr; @@ -214,7 +216,8 @@ reread: error = hfsplus_submit_bio(sb, part_start + part_size - 2, sbi->s_backup_vhdr_buf, - (void **)&sbi->s_backup_vhdr, READ); + (void **)&sbi->s_backup_vhdr, REQ_OP_READ, +
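A small consistency note on the wrapper.c hunk: the read-side check still compares the new op value against the legacy WRITE constant. That only works while REQ_OP_WRITE temporarily aliases REQ_WRITE and therefore has the same value as WRITE, so the form that survives the transition is presumably:

	if (op != REQ_OP_WRITE && data)
		*data = (u8 *)buf + offset;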
[PATCH 10/35] btrfs: don't pass rq_flag_bits if there is a bio
From: Mike Christie The bio bi_op and bi_rw is now setup, so there is no need to pass around the rq_flag_bits bits too. v2: 1. Fix merge_bio issue where instead of removing rw/op argument I passed it in again to the merge_bio related functions. Signed-off-by: Mike Christie --- fs/btrfs/compression.c | 13 ++--- fs/btrfs/ctree.h | 2 +- fs/btrfs/disk-io.c | 30 -- fs/btrfs/disk-io.h | 2 +- fs/btrfs/extent_io.c | 12 +--- fs/btrfs/extent_io.h | 8 fs/btrfs/inode.c | 44 fs/btrfs/volumes.c | 6 +++--- fs/btrfs/volumes.h | 2 +- 9 files changed, 53 insertions(+), 66 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 7e64f3e..90028305 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -374,7 +374,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, page = compressed_pages[pg_index]; page->mapping = inode->i_mapping; if (bio->bi_iter.bi_size) - ret = io_tree->ops->merge_bio_hook(WRITE, page, 0, + ret = io_tree->ops->merge_bio_hook(page, 0, PAGE_CACHE_SIZE, bio, 0); else @@ -402,7 +402,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(root, WRITE, bio, 0, 1); + ret = btrfs_map_bio(root, bio, 0, 1); BUG_ON(ret); /* -ENOMEM */ bio_put(bio); @@ -433,7 +433,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(root, WRITE, bio, 0, 1); + ret = btrfs_map_bio(root, bio, 0, 1); BUG_ON(ret); /* -ENOMEM */ bio_put(bio); @@ -659,7 +659,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, page->index = em_start >> PAGE_CACHE_SHIFT; if (comp_bio->bi_iter.bi_size) - ret = tree->ops->merge_bio_hook(READ, page, 0, + ret = tree->ops->merge_bio_hook(page, 0, PAGE_CACHE_SIZE, comp_bio, 0); else @@ -690,8 +690,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, sums += DIV_ROUND_UP(comp_bio->bi_iter.bi_size, root->sectorsize); - ret = btrfs_map_bio(root, READ, comp_bio, - mirror_num, 0); + ret = btrfs_map_bio(root, comp_bio, mirror_num, 0); if (ret) { bio->bi_error = ret; bio_endio(comp_bio); @@ -721,7 +720,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, BUG_ON(ret); /* -ENOMEM */ } - ret = btrfs_map_bio(root, READ, comp_bio, mirror_num, 0); + ret = btrfs_map_bio(root, comp_bio, mirror_num, 0); if (ret) { bio->bi_error = ret; bio_endio(comp_bio); diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b69ad13..1c6bae3 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -4084,7 +4084,7 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans, struct btrfs_root *new_root, struct btrfs_root *parent_root, u64 new_dirid); -int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset, +int btrfs_merge_bio_hook(struct page *page, unsigned long offset, size_t size, struct bio *bio, unsigned long bio_flags); int btrfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 49d2f29..9aa2b53 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -124,7 +124,6 @@ struct async_submit_bio { struct list_head list; extent_submit_bio_hook_t *submit_bio_start; extent_submit_bio_hook_t *submit_bio_done; - int rw; int mirror_num; unsigned long bio_flags; /* @@ -789,7 +788,7 @@ static void run_one_async_start(struct btrfs_work *work) int ret; async = container_of(work, struct async_submit_bio, work); - ret = async->submit_bio_start(async->inode, async->rw, async->bio, + ret = 
async->submit_bio_start(async->inode, async->bio, async->mirror
[PATCH 02/35] block: add REQ_OP definitions and bi_op/op fields
From: Mike Christie The following patches separate the operation (write, read, discard, etc) from the flags in bi_rw/cmd_flags. This patch adds definitions for request/bio operations, adds fields to the request/bio to set them, and some temporary compat code so the kernel/modules can use either one. In the final patches this compat code will be removed when everything is converted. Also, in this patch the REQ_OPs match the REQ rq_flag_bits ones for compat reasons while all the code is converted in this set. In the last patches that will also be removed. Signed-off-by: Mike Christie --- block/blk-core.c | 19 --- include/linux/blk_types.h | 15 ++- include/linux/blkdev.h| 1 + include/linux/fs.h| 37 +++-- 4 files changed, 66 insertions(+), 6 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index f23d1b0..74aa201 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1697,7 +1697,8 @@ void init_request_from_bio(struct request *req, struct bio *bio) { req->cmd_type = REQ_TYPE_FS; - req->cmd_flags |= bio->bi_rw & REQ_COMMON_MASK; + /* tmp compat. Allow users to set bi_op or bi_rw */ + req->cmd_flags |= (bio->bi_rw | bio->bi_op) & REQ_COMMON_MASK; if (bio->bi_rw & REQ_RAHEAD) req->cmd_flags |= REQ_FAILFAST_MASK; @@ -2032,6 +2033,12 @@ blk_qc_t generic_make_request(struct bio *bio) struct bio_list bio_list_on_stack; blk_qc_t ret = BLK_QC_T_NONE; + /* tmp compat. Allow users to set either one or both. +* This will be removed when we have converted +* everyone in the next patches. +*/ + bio->bi_rw |= bio->bi_op; + if (!generic_make_request_checks(bio)) goto out; @@ -2101,6 +2108,12 @@ EXPORT_SYMBOL(generic_make_request); */ blk_qc_t submit_bio(struct bio *bio) { + /* tmp compat. Allow users to set either one or both. +* This will be removed when we have converted +* everyone in the next patches. +*/ + bio->bi_rw |= bio->bi_op; + /* * If it's a regular read/write or a barrier with data attached, * go through the normal accounting stuff before submission. @@ -2974,8 +2987,8 @@ EXPORT_SYMBOL_GPL(__blk_end_request_err); void blk_rq_bio_prep(struct request_queue *q, struct request *rq, struct bio *bio) { - /* Bit 0 (R/W) is identical in rq->cmd_flags and bio->bi_rw */ - rq->cmd_flags |= bio->bi_rw & REQ_WRITE; + /* tmp compat. Allow users to set bi_op or bi_rw */ + rq->cmd_flags |= bio_data_dir(bio); if (bio_has_data(bio)) rq->nr_phys_segments = bio_phys_segments(q, bio); diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 86a38ea..6e49c91 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -48,9 +48,15 @@ struct bio { struct block_device *bi_bdev; unsigned intbi_flags; /* status, command, etc */ int bi_error; - unsigned long bi_rw; /* bottom bits READ/WRITE, + unsigned long bi_rw; /* bottom bits rq_flags_bits * top bits priority */ + /* +* this will be a u8 in the next patches and bi_rw can be shrunk to +* a u32. For compat in these transistional patches op is a int here. 
+*/ + int bi_op; /* REQ_OP */ + struct bvec_iterbi_iter; @@ -242,6 +248,13 @@ enum rq_flag_bits { #define REQ_HASHED (1ULL << __REQ_HASHED) #define REQ_MQ_INFLIGHT(1ULL << __REQ_MQ_INFLIGHT) +enum req_op { + REQ_OP_READ, + REQ_OP_WRITE= REQ_WRITE, + REQ_OP_DISCARD = REQ_DISCARD, + REQ_OP_WRITE_SAME = REQ_WRITE_SAME, +}; + typedef unsigned int blk_qc_t; #define BLK_QC_T_NONE -1U #define BLK_QC_T_SHIFT 16 diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 40c0241..d4d3b06 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -96,6 +96,7 @@ struct request { struct request_queue *q; struct blk_mq_ctx *mq_ctx; + int op; u64 cmd_flags; unsigned cmd_type; unsigned long atomic_flags; diff --git a/include/linux/fs.h b/include/linux/fs.h index 3d9fdf4..399b22b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2434,15 +2434,48 @@ extern void make_bad_inode(struct inode *); extern bool is_bad_inode(struct inode *); #ifdef CONFIG_BLOCK + +static inline bool op_is_write(int op) +{ + switch (op) { + case REQ_OP_WRITE: + case REQ_OP_WRITE_SAME: + case REQ_OP_DISCARD: + return true; + default: + return false; + } +} +
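Since this is the hinge of the whole series, a condensed standalone model of the compat scheme may help: the REQ_OP values temporarily alias the old rq_flag_bits, submit_bio()/generic_make_request() fold bi_op back into bi_rw so unconverted "& REQ_WRITE" style tests keep working, and op_is_write() gives callers a direction test that also covers discard and write-same. Everything below uses simplified stand-in values and struct layouts, not the kernel definitions quoted above:

#include <assert.h>
#include <stdbool.h>

/* stand-ins mirroring the compat layout: the op values temporarily
 * share encodings with the old flag bits */
#define REQ_WRITE	(1u << 0)
#define REQ_SYNC	(1u << 4)
#define REQ_DISCARD	(1u << 7)
#define REQ_WRITE_SAME	(1u << 8)

enum req_op {
	REQ_OP_READ       = 0,
	REQ_OP_WRITE      = REQ_WRITE,
	REQ_OP_DISCARD    = REQ_DISCARD,
	REQ_OP_WRITE_SAME = REQ_WRITE_SAME,
};

struct bio_model {
	unsigned long bi_rw;	/* flag bits (and, during compat, the op too) */
	int bi_op;		/* one REQ_OP_* value */
};

static bool op_is_write(int op)
{
	switch (op) {
	case REQ_OP_WRITE:
	case REQ_OP_WRITE_SAME:
	case REQ_OP_DISCARD:
		return true;
	default:
		return false;
	}
}

int main(void)
{
	struct bio_model bio = { .bi_rw = REQ_SYNC, .bi_op = REQ_OP_WRITE };

	/* what generic_make_request()/submit_bio() do temporarily:
	 * fold the op back into bi_rw so unconverted code still sees it */
	bio.bi_rw |= bio.bi_op;

	assert(bio.bi_rw & REQ_WRITE);		/* old-style test still works */
	assert(bio.bi_op == REQ_OP_WRITE);	/* new-style test works too */
	assert(op_is_write(REQ_OP_DISCARD));	/* discards count as writes */
	return 0;
}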
[PATCH 04/35] fs: have submit_bh users pass in op and flags separately
From: Mike Christie This has submit_bh users pass in the operation and flags separately, so we can setup the bio->bi_op and bio-bi_rw flags. Signed-off-by: Mike Christie --- drivers/md/bitmap.c | 4 ++-- fs/btrfs/check-integrity.c | 24 ++-- fs/btrfs/check-integrity.h | 2 +- fs/btrfs/disk-io.c | 4 ++-- fs/buffer.c | 54 +++-- fs/ext4/balloc.c| 2 +- fs/ext4/ialloc.c| 2 +- fs/ext4/inode.c | 2 +- fs/ext4/mmp.c | 4 ++-- fs/fat/misc.c | 2 +- fs/gfs2/bmap.c | 2 +- fs/gfs2/dir.c | 2 +- fs/gfs2/meta_io.c | 6 ++--- fs/jbd2/commit.c| 6 ++--- fs/jbd2/journal.c | 8 +++ fs/nilfs2/btnode.c | 6 ++--- fs/nilfs2/btnode.h | 2 +- fs/nilfs2/btree.c | 6 +++-- fs/nilfs2/gcinode.c | 5 +++-- fs/nilfs2/mdt.c | 11 - fs/ntfs/aops.c | 6 ++--- fs/ntfs/compress.c | 2 +- fs/ntfs/file.c | 2 +- fs/ntfs/logfile.c | 2 +- fs/ntfs/mft.c | 4 ++-- fs/ocfs2/buffer_head_io.c | 8 +++ fs/reiserfs/inode.c | 4 ++-- fs/reiserfs/journal.c | 6 ++--- fs/ufs/util.c | 2 +- include/linux/buffer_head.h | 9 30 files changed, 103 insertions(+), 96 deletions(-) diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c index d80cce4..c8e4124 100644 --- a/drivers/md/bitmap.c +++ b/drivers/md/bitmap.c @@ -295,7 +295,7 @@ static void write_page(struct bitmap *bitmap, struct page *page, int wait) atomic_inc(&bitmap->pending_writes); set_buffer_locked(bh); set_buffer_mapped(bh); - submit_bh(WRITE | REQ_SYNC, bh); + submit_bh(REQ_OP_WRITE, REQ_SYNC, bh); bh = bh->b_this_page; } @@ -390,7 +390,7 @@ static int read_page(struct file *file, unsigned long index, atomic_inc(&bitmap->pending_writes); set_buffer_locked(bh); set_buffer_mapped(bh); - submit_bh(READ, bh); + submit_bh(REQ_OP_READ, 0, bh); } block++; bh = bh->b_this_page; diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c index 9c51373..1c3c40a 100644 --- a/fs/btrfs/check-integrity.c +++ b/fs/btrfs/check-integrity.c @@ -2854,12 +2854,12 @@ static struct btrfsic_dev_state *btrfsic_dev_state_lookup( return ds; } -int btrfsic_submit_bh(int rw, struct buffer_head *bh) +int btrfsic_submit_bh(int op, int op_flags, struct buffer_head *bh) { struct btrfsic_dev_state *dev_state; if (!btrfsic_is_initialized) - return submit_bh(rw, bh); + return submit_bh(op, op_flags, bh); mutex_lock(&btrfsic_mutex); /* since btrfsic_submit_bh() might also be called before @@ -2868,26 +2868,26 @@ int btrfsic_submit_bh(int rw, struct buffer_head *bh) /* Only called to write the superblock (incl. FLUSH/FUA) */ if (NULL != dev_state && - (rw & WRITE) && bh->b_size > 0) { + (op == REQ_OP_WRITE) && bh->b_size > 0) { u64 dev_bytenr; dev_bytenr = 4096 * bh->b_blocknr; if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bh(rw=0x%x, blocknr=%llu (bytenr %llu)," - " size=%zu, data=%p, bdev=%p)\n", - rw, (unsigned long long)bh->b_blocknr, + "submit_bh(op=0x%x,0x%x, blocknr=%llu " + "(bytenr %llu), size=%zu, data=%p, bdev=%p)\n", + op, op_flags, (unsigned long long)bh->b_blocknr, dev_bytenr, bh->b_size, bh->b_data, bh->b_bdev); btrfsic_process_written_block(dev_state, dev_bytenr, &bh->b_data, 1, NULL, - NULL, bh, rw); - } else if (NULL != dev_state && (rw & REQ_FLUSH)) { + NULL, bh, op_flags); + } else if (NULL != dev_state && (op_flags & REQ_FLUSH)) { if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bh(rw=0x%x FLUSH, bdev=%p)\n", - rw, bh->b_bdev); + "submit_bh(op=0x%x,0x%x FLUSH, bdev=%p)\n", + op, op_flags, bh->b_bdev); if (!dev_state->dummy_block_
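
The conversion itself is mechanical at every call site; roughly, with placeholder flags (this shows the pattern, not a hunk from any particular filesystem above):

	/* old: operation and flags mixed into one rw argument */
	submit_bh(WRITE | REQ_SYNC | REQ_META, bh);

	/* new: REQ_OP first, rq_flag_bits second */
	submit_bh(REQ_OP_WRITE, REQ_SYNC | REQ_META, bh);

	/* reads that need no modifier flags pass 0 */
	submit_bh(REQ_OP_READ, 0, bh);
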
[PATCH 03/35] block, fs, mm, drivers: set bi_op to REQ_OP
From: Mike Christie This patch converts the simple bi_rw use cases in the block, drivers, mm and fs code to use bi_op for a REQ_OP and bi_rw for rq_flag_bits. These should be simple one liner cases, so I just did them in one patch. The next patches handle the more complicated cases in a module per patch. Signed-off-by: Mike Christie --- block/bio.c | 8 +--- block/blk-flush.c| 1 + block/blk-lib.c | 7 --- block/blk-map.c | 2 +- drivers/block/floppy.c | 2 +- drivers/block/pktcdvd.c | 4 ++-- drivers/lightnvm/rrpc.c | 4 ++-- drivers/scsi/osd/osd_initiator.c | 8 fs/exofs/ore.c | 2 +- fs/ext4/crypto.c | 2 +- fs/ext4/page-io.c| 8 +--- fs/ext4/readpage.c | 2 +- fs/jfs/jfs_logmgr.c | 2 ++ fs/jfs/jfs_metapage.c| 4 ++-- fs/logfs/dev_bdev.c | 12 ++-- fs/nfs/blocklayout/blocklayout.c | 2 +- mm/page_io.c | 4 ++-- 17 files changed, 41 insertions(+), 33 deletions(-) diff --git a/block/bio.c b/block/bio.c index 7e4d050..68df2df 100644 --- a/block/bio.c +++ b/block/bio.c @@ -581,6 +581,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) */ bio->bi_bdev = bio_src->bi_bdev; bio_set_flag(bio, BIO_CLONED); + bio->bi_op = bio_src->bi_op; bio->bi_rw = bio_src->bi_rw; bio->bi_iter = bio_src->bi_iter; bio->bi_io_vec = bio_src->bi_io_vec; @@ -663,6 +664,7 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask, return NULL; bio->bi_bdev= bio_src->bi_bdev; + bio->bi_op = bio_src->bi_op; bio->bi_rw = bio_src->bi_rw; bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector; bio->bi_iter.bi_size= bio_src->bi_iter.bi_size; @@ -1171,7 +1173,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, goto out_bmd; if (iter->type & WRITE) - bio->bi_rw |= REQ_WRITE; + bio->bi_op = REQ_OP_WRITE; ret = 0; @@ -1341,7 +1343,7 @@ struct bio *bio_map_user_iov(struct request_queue *q, * set data direction, and check if mapped pages need bouncing */ if (iter->type & WRITE) - bio->bi_rw |= REQ_WRITE; + bio->bi_op = REQ_OP_WRITE; bio_set_flag(bio, BIO_USER_MAPPED); @@ -1534,7 +1536,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len, bio->bi_private = data; } else { bio->bi_end_io = bio_copy_kern_endio; - bio->bi_rw |= REQ_WRITE; + bio->bi_op = REQ_OP_WRITE; } return bio; diff --git a/block/blk-flush.c b/block/blk-flush.c index f2fbf9a..b05acca 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -484,6 +484,7 @@ int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask, bio = bio_alloc(gfp_mask, 0); bio->bi_bdev = bdev; + bio->bi_op = REQ_OP_WRITE; bio->bi_rw = WRITE_FLUSH; ret = submit_bio_wait(bio); diff --git a/block/blk-lib.c b/block/blk-lib.c index 87e3de4..d01b5f2 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -42,7 +42,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector, { DECLARE_COMPLETION_ONSTACK(wait); struct request_queue *q = bdev_get_queue(bdev); - int type = REQ_WRITE | REQ_DISCARD; + int type = 0; unsigned int granularity; int alignment; struct bio_batch bb; @@ -102,6 +102,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector, bio->bi_end_io = bio_batch_end_io; bio->bi_bdev = bdev; bio->bi_private = &bb; + bio->bi_op = REQ_OP_DISCARD; bio->bi_rw = type; bio->bi_iter.bi_size = req_sects << 9; @@ -178,7 +179,7 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector, bio->bi_io_vec->bv_page = page; bio->bi_io_vec->bv_offset = 0; bio->bi_io_vec->bv_len = bdev_logical_block_size(bdev); - bio->bi_rw = REQ_WRITE | REQ_WRITE_SAME; + bio->bi_op = REQ_OP_WRITE_SAME; if (nr_sects > 
max_write_same_sectors) { bio->bi_iter.bi_size = max_write_same_sectors << 9; @@ -240,7 +241,7 @@ static int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, bio->bi_bdev = bdev; bio->bi_end_io = bio_batch_end_io; bio->bi_private = &bb; - bio->bi_rw = WRITE; + bio->bi_op = REQ_OP_WRITE; while (nr_sects != 0) { sz = min((sector_t) PAGE_SIZE >> 9 ,
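
The one-liner cases above all follow the same shape; as a hedged summary of the pattern (the discard case mirrors the blk-lib.c hunk, the flags value is illustrative):

	/* old: the operation was encoded as flags in bi_rw */
	bio->bi_rw = REQ_WRITE | REQ_DISCARD;

	/* new: the operation goes in bi_op, bi_rw keeps only modifier flags */
	bio->bi_op = REQ_OP_DISCARD;
	bio->bi_rw = 0;		/* or e.g. REQ_SECURE for a secure discard */
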
[PATCH 01/35] block/fs/drivers: remove rw argument from submit_bio
From: Mike Christie This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes that use the same as generic_make_request and how we set the other bio fields. v2. 1. Set bi_rw instead of ORing it. For cloned bios, I still OR it to keep the old behavior incase there bits we wanted to keep. Signed-off-by: Mike Christie Reviewed-by: Bart Van Assche Reviewed-by: Christoph Hellwig --- block/bio.c | 7 +++ block/blk-core.c| 11 --- block/blk-flush.c | 3 ++- block/blk-lib.c | 9 ++--- drivers/block/drbd/drbd_actlog.c| 2 +- drivers/block/drbd/drbd_bitmap.c| 4 ++-- drivers/block/floppy.c | 3 ++- drivers/block/xen-blkback/blkback.c | 4 +++- drivers/block/xen-blkfront.c| 4 ++-- drivers/md/bcache/debug.c | 6 -- drivers/md/bcache/journal.c | 2 +- drivers/md/bcache/super.c | 4 ++-- drivers/md/dm-bufio.c | 3 ++- drivers/md/dm-io.c | 3 ++- drivers/md/dm-log-writes.c | 9 ++--- drivers/md/dm-thin.c| 3 ++- drivers/md/md.c | 10 +++--- drivers/md/raid1.c | 3 ++- drivers/md/raid10.c | 4 +++- drivers/md/raid5-cache.c| 7 --- drivers/target/target_core_iblock.c | 24 +--- fs/btrfs/check-integrity.c | 18 ++ fs/btrfs/check-integrity.h | 4 ++-- fs/btrfs/disk-io.c | 3 ++- fs/btrfs/extent_io.c| 7 --- fs/btrfs/raid56.c | 17 - fs/btrfs/scrub.c| 16 +++- fs/btrfs/volumes.c | 14 +++--- fs/buffer.c | 3 ++- fs/direct-io.c | 3 ++- fs/ext4/crypto.c| 3 ++- fs/ext4/page-io.c | 3 ++- fs/ext4/readpage.c | 9 + fs/f2fs/data.c | 13 - fs/f2fs/segment.c | 6 -- fs/gfs2/lops.c | 3 ++- fs/gfs2/meta_io.c | 3 ++- fs/gfs2/ops_fstype.c| 3 ++- fs/hfsplus/wrapper.c| 3 ++- fs/jfs/jfs_logmgr.c | 6 -- fs/jfs/jfs_metapage.c | 10 ++ fs/logfs/dev_bdev.c | 15 ++- fs/mpage.c | 3 ++- fs/nfs/blocklayout/blocklayout.c| 22 -- fs/nilfs2/segbuf.c | 3 ++- fs/ocfs2/cluster/heartbeat.c| 12 +++- fs/xfs/xfs_aops.c | 3 ++- fs/xfs/xfs_buf.c| 4 ++-- include/linux/bio.h | 2 +- include/linux/fs.h | 2 +- kernel/power/swap.c | 5 +++-- mm/page_io.c| 10 ++ 52 files changed, 212 insertions(+), 141 deletions(-) diff --git a/block/bio.c b/block/bio.c index cf75915..7e4d050 100644 --- a/block/bio.c +++ b/block/bio.c @@ -859,21 +859,20 @@ static void submit_bio_wait_endio(struct bio *bio) /** * submit_bio_wait - submit a bio, and wait until it completes - * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead) * @bio: The &struct bio which describes the I/O * * Simple wrapper around submit_bio(). Returns 0 on success, or the error from * bio_endio() on failure. */ -int submit_bio_wait(int rw, struct bio *bio) +int submit_bio_wait(struct bio *bio) { struct submit_bio_ret ret; - rw |= REQ_SYNC; init_completion(&ret.event); bio->bi_private = &ret; bio->bi_end_io = submit_bio_wait_endio; - submit_bio(rw, bio); + bio->bi_rw |= REQ_SYNC; + submit_bio(bio); wait_for_completion_io(&ret.event); return ret.error; diff --git a/block/blk-core.c b/block/blk-core.c index 827f8ba..f23d1b0 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2092,7 +2092,6 @@ EXPORT_SYMBOL(generic_make_request); /** * submit_bio - submit a bio to the block device layer for I/O - * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead) * @bio: The &struct bio which describes the I/O * * submit_bio() is very similar in purpose to generic_make_request(), and @@ -2100,10 +2099,8 @@ EXPORT_SYMBOL(generic_make_request); * interfaces; @bio must be presetup and ready for I/O. 
* */ -blk_qc_t submit_bio(int rw, struct bio *bio) +blk_qc_t submit_bio(struct bio *bio) { - bio->bi_rw |= rw; - /* * If it's a regular read/write or a barrier with data attached, * go through the normal accounting stuff before submission. @@ -2111,12 +2108,12 @@ blk_qc_t submit_bio(int rw, struct bio *bio) if (bio_has_data(bio)) { unsigned int count; - if (unl
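
For callers, the resulting pattern matches what generic_make_request users already do (a sketch; the flag combination is illustrative):

	/* before: direction and flags passed as an argument */
	submit_bio(WRITE | REQ_SYNC, bio);

	/* after: callers set bi_rw themselves and pass only the bio */
	bio->bi_rw = WRITE | REQ_SYNC;
	submit_bio(bio);

	/* submit_bio_wait() likewise drops its rw argument and ORs in
	 * REQ_SYNC internally, so a plain call is enough */
	ret = submit_bio_wait(bio);
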
[PATCH 12/35] gfs2: set bi_op to REQ_OP
From: Mike Christie This patch has gfs2 set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. v2: Bob, I did not add your signed off, because there was the gfs2_submit_bhs changes since last time you reviewed it. Signed-off-by: Mike Christie --- fs/gfs2/log.c| 8 fs/gfs2/lops.c | 12 +++- fs/gfs2/lops.h | 2 +- fs/gfs2/meta_io.c| 8 +--- fs/gfs2/ops_fstype.c | 1 + 5 files changed, 18 insertions(+), 13 deletions(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 0ff028c..e58ccef0 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -657,7 +657,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags) struct gfs2_log_header *lh; unsigned int tail; u32 hash; - int rw = WRITE_FLUSH_FUA | REQ_META; + int op_flags = WRITE_FLUSH_FUA | REQ_META; struct page *page = mempool_alloc(gfs2_page_pool, GFP_NOIO); enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state); lh = page_address(page); @@ -682,12 +682,12 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags) if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags)) { gfs2_ordered_wait(sdp); log_flush_wait(sdp); - rw = WRITE_SYNC | REQ_META | REQ_PRIO; + op_flags = WRITE_SYNC | REQ_META | REQ_PRIO; } sdp->sd_log_idle = (tail == sdp->sd_log_flush_head); gfs2_log_write_page(sdp, page); - gfs2_log_flush_bio(sdp, rw); + gfs2_log_flush_bio(sdp, REQ_OP_WRITE, op_flags); log_flush_wait(sdp); if (sdp->sd_log_tail != tail) @@ -738,7 +738,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, gfs2_ordered_write(sdp); lops_before_commit(sdp, tr); - gfs2_log_flush_bio(sdp, WRITE); + gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0); if (sdp->sd_log_head != sdp->sd_log_flush_head) { log_flush_wait(sdp); diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index ce28242..c1099b4 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -230,17 +230,19 @@ static void gfs2_end_log_write(struct bio *bio) /** * gfs2_log_flush_bio - Submit any pending log bio * @sdp: The superblock - * @rw: The rw flags + * @op: REQ_OP + * @op_flags: rq_flag_bits * * Submit any pending part-built or full bio to the block device. If * there is no pending bio, then this is a no-op. 
*/ -void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw) +void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags) { if (sdp->sd_log_bio) { atomic_inc(&sdp->sd_log_in_flight); - sdp->sd_log_bio->bi_rw = rw; + sdp->sd_log_bio->bi_op = op; + sdp->sd_log_bio->bi_rw = op_flags; submit_bio(sdp->sd_log_bio); sdp->sd_log_bio = NULL; } @@ -300,7 +302,7 @@ static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, u64 blkno) nblk >>= sdp->sd_fsb2bb_shift; if (blkno == nblk) return bio; - gfs2_log_flush_bio(sdp, WRITE); + gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0); } return gfs2_log_alloc_bio(sdp, blkno); @@ -329,7 +331,7 @@ static void gfs2_log_write(struct gfs2_sbd *sdp, struct page *page, bio = gfs2_log_get_bio(sdp, blkno); ret = bio_add_page(bio, page, size, offset); if (ret == 0) { - gfs2_log_flush_bio(sdp, WRITE); + gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0); bio = gfs2_log_alloc_bio(sdp, blkno); ret = bio_add_page(bio, page, size, offset); WARN_ON(ret == 0); diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h index a65a7ba..e529f53 100644 --- a/fs/gfs2/lops.h +++ b/fs/gfs2/lops.h @@ -27,7 +27,7 @@ extern const struct gfs2_log_operations gfs2_databuf_lops; extern const struct gfs2_log_operations *gfs2_log_ops[]; extern void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page); -extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw); +extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags); extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh); static inline unsigned int buf_limit(struct gfs2_sbd *sdp) diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c index 96c8140..f790f38 100644 --- a/fs/gfs2/meta_io.c +++ b/fs/gfs2/meta_io.c @@ -213,7 +213,8 @@ static void gfs2_meta_read_endio(struct bio *bio) * Submit several consecutive buffer head I/O requests as a single bio I/O * request. (See submit_bh_wbc.) */ -static void gfs2_submit_bhs(int rw, struct buffer_head *bhs[], int num) +static void gfs2_submit_bhs(int op, int op_flags, struct buffer_head *bhs[], + int num) { struct buffer_head *bh = bhs[0]; struct bio *bio; @@ -230,7 +231,8 @@ static void gfs2_submit_bhs(int r
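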
[PATCH 08/35] btrfs: set bi_op to REQ_OP
From: Mike Christie This patch has btrfs set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. Signed-off-by: Mike Christie --- fs/btrfs/check-integrity.c | 19 +-- fs/btrfs/compression.c | 4 fs/btrfs/disk-io.c | 7 --- fs/btrfs/inode.c | 20 +--- fs/btrfs/raid56.c | 10 +- fs/btrfs/scrub.c | 9 + fs/btrfs/volumes.c | 20 ++-- 7 files changed, 50 insertions(+), 39 deletions(-) diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c index 1c3c40a..d95c323 100644 --- a/fs/btrfs/check-integrity.c +++ b/fs/btrfs/check-integrity.c @@ -1671,7 +1671,7 @@ static int btrfsic_read_block(struct btrfsic_state *state, } bio->bi_bdev = block_ctx->dev->bdev; bio->bi_iter.bi_sector = dev_bytenr >> 9; - bio->bi_rw = READ; + bio->bi_op = REQ_OP_READ; for (j = i; j < num_pages; j++) { ret = bio_add_page(bio, block_ctx->pagev[j], @@ -2920,7 +2920,6 @@ int btrfsic_submit_bh(int op, int op_flags, struct buffer_head *bh) static void __btrfsic_submit_bio(struct bio *bio) { struct btrfsic_dev_state *dev_state; - int rw = bio->bi_rw; if (!btrfsic_is_initialized) return; @@ -2930,7 +2929,7 @@ static void __btrfsic_submit_bio(struct bio *bio) * btrfsic_mount(), this might return NULL */ dev_state = btrfsic_dev_state_lookup(bio->bi_bdev); if (NULL != dev_state && - (rw & WRITE) && NULL != bio->bi_io_vec) { + (bio->bi_op == REQ_OP_WRITE) && NULL != bio->bi_io_vec) { unsigned int i; u64 dev_bytenr; u64 cur_bytenr; @@ -2942,9 +2941,9 @@ static void __btrfsic_submit_bio(struct bio *bio) if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bio(rw=0x%x, bi_vcnt=%u," + "submit_bio(rw=%d,0x%lx, bi_vcnt=%u," " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n", - rw, bio->bi_vcnt, + bio->bi_op, bio->bi_rw, bio->bi_vcnt, (unsigned long long)bio->bi_iter.bi_sector, dev_bytenr, bio->bi_bdev); @@ -2975,18 +2974,18 @@ static void __btrfsic_submit_bio(struct bio *bio) btrfsic_process_written_block(dev_state, dev_bytenr, mapped_datav, bio->bi_vcnt, bio, &bio_is_patched, - NULL, rw); + NULL, bio->bi_rw); while (i > 0) { i--; kunmap(bio->bi_io_vec[i].bv_page); } kfree(mapped_datav); - } else if (NULL != dev_state && (rw & REQ_FLUSH)) { + } else if (NULL != dev_state && (bio->bi_rw & REQ_FLUSH)) { if (dev_state->state->print_mask & BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH) printk(KERN_INFO - "submit_bio(rw=0x%x FLUSH, bdev=%p)\n", - rw, bio->bi_bdev); + "submit_bio(rw=%d,0x%lx FLUSH, bdev=%p)\n", + bio->bi_op, bio->bi_rw, bio->bi_bdev); if (!dev_state->dummy_block_for_bio_bh_flush.is_iodone) { if ((dev_state->state->print_mask & (BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH | @@ -3004,7 +3003,7 @@ static void __btrfsic_submit_bio(struct bio *bio) block->never_written = 0; block->iodone_w_error = 0; block->flush_gen = dev_state->last_flush_gen + 1; - block->submit_bio_bh_rw = rw; + block->submit_bio_bh_rw = bio->bi_rw; block->orig_bio_bh_private = bio->bi_private; block->orig_bio_bh_end_io.bio = bio->bi_end_io; block->next_in_same_bio = NULL; diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 3346cd8..7e64f3e 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -363,6 +363,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, kfree(cb); return -ENOMEM; } + bio->bi_op = REQ_OP_WRITE; bio->bi_private = cb; bio->bi_end_io = end_compressed_bio_write; atomic_inc(&cb->pending_bios); @@ -408,6 +409,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, bio = compressed_bio_alloc(bdev, first_byte, GFP_N
[PATCH 18/35] pm: set bi_op to REQ_OP
From: Mike Christie This patch has the pm swap code set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- kernel/power/swap.c | 31 +++ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index 4d050eb..adbcb1b 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -250,7 +250,7 @@ static void hib_end_io(struct bio *bio) bio_put(bio); } -static int hib_submit_io(int rw, pgoff_t page_off, void *addr, +static int hib_submit_io(int op, int op_flags, pgoff_t page_off, void *addr, struct hib_bio_batch *hb) { struct page *page = virt_to_page(addr); @@ -260,7 +260,8 @@ static int hib_submit_io(int rw, pgoff_t page_off, void *addr, bio = bio_alloc(__GFP_RECLAIM | __GFP_HIGH, 1); bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9); bio->bi_bdev = hib_resume_bdev; - bio->bi_rw = rw; + bio->bi_op = op; + bio->bi_rw = op_flags; if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) { printk(KERN_ERR "PM: Adding page to bio failed at %llu\n", @@ -296,7 +297,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, unsigned int flags) { int error; - hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL); + hib_submit_io(REQ_OP_READ, READ_SYNC, swsusp_resume_block, + swsusp_header, NULL); if (!memcmp("SWAP-SPACE",swsusp_header->sig, 10) || !memcmp("SWAPSPACE2",swsusp_header->sig, 10)) { memcpy(swsusp_header->orig_sig,swsusp_header->sig, 10); @@ -305,8 +307,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, unsigned int flags) swsusp_header->flags = flags; if (flags & SF_CRC32_MODE) swsusp_header->crc32 = handle->crc32; - error = hib_submit_io(WRITE_SYNC, swsusp_resume_block, - swsusp_header, NULL); + error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC, + swsusp_resume_block, swsusp_header, NULL); } else { printk(KERN_ERR "PM: Swap header not found!\n"); error = -ENODEV; @@ -379,7 +381,7 @@ static int write_page(void *buf, sector_t offset, struct hib_bio_batch *hb) } else { src = buf; } - return hib_submit_io(WRITE_SYNC, offset, src, hb); + return hib_submit_io(REQ_OP_WRITE, WRITE_SYNC, offset, src, hb); } static void release_swap_writer(struct swap_map_handle *handle) @@ -982,7 +984,8 @@ static int get_swap_reader(struct swap_map_handle *handle, return -ENOMEM; } - error = hib_submit_io(READ_SYNC, offset, tmp->map, NULL); + error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset, + tmp->map, NULL); if (error) { release_swap_reader(handle); return error; @@ -1006,7 +1009,7 @@ static int swap_read_page(struct swap_map_handle *handle, void *buf, offset = handle->cur->entries[handle->k]; if (!offset) return -EFAULT; - error = hib_submit_io(READ_SYNC, offset, buf, hb); + error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset, buf, hb); if (error) return error; if (++handle->k >= MAP_PAGE_ENTRIES) { @@ -1508,7 +1511,8 @@ int swsusp_check(void) if (!IS_ERR(hib_resume_bdev)) { set_blocksize(hib_resume_bdev, PAGE_SIZE); clear_page(swsusp_header); - error = hib_submit_io(READ_SYNC, swsusp_resume_block, + error = hib_submit_io(REQ_OP_READ, READ_SYNC, + swsusp_resume_block, swsusp_header, NULL); if (error) goto put; @@ -1516,7 +1520,8 @@ int swsusp_check(void) if (!memcmp(HIBERNATE_SIG, swsusp_header->sig, 10)) { memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10); /* Reset swap signature now */ - error = hib_submit_io(WRITE_SYNC, swsusp_resume_block, + error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC, + swsusp_resume_block, swsusp_header, NULL); } else { error = 
-EINVAL; @@ -1560,10 +1565,12 @@ int swsusp_unmark(void) { int error; - hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL); + hib_submit_io(REQ_OP_READ, READ_SYNC, swsusp_resume_block, + swsusp_header, NULL); if (!memcmp(HIBERNATE_SIG,s
[PATCH 16/35] nilfs: set bi_op to REQ_OP
From: Mike Christie This patch has nilfs set the bio bi_op to a REQ_OP, and rq_flag_bits to bi_rw. This patch is compile tested only. Signed-off-by: Mike Christie --- fs/nilfs2/segbuf.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c index 7666f1d..7b13e14 100644 --- a/fs/nilfs2/segbuf.c +++ b/fs/nilfs2/segbuf.c @@ -350,7 +350,8 @@ static void nilfs_end_bio_write(struct bio *bio) } static int nilfs_segbuf_submit_bio(struct nilfs_segment_buffer *segbuf, - struct nilfs_write_info *wi, int mode) + struct nilfs_write_info *wi, int mode, + int mode_flags) { struct bio *bio = wi->bio; int err; @@ -368,7 +369,8 @@ static int nilfs_segbuf_submit_bio(struct nilfs_segment_buffer *segbuf, bio->bi_end_io = nilfs_end_bio_write; bio->bi_private = segbuf; - bio->bi_rw = mode; + bio->bi_op = mode; + bio->bi_rw = mode_flags; submit_bio(bio); segbuf->sb_nbio++; @@ -442,7 +444,7 @@ static int nilfs_segbuf_submit_bh(struct nilfs_segment_buffer *segbuf, return 0; } /* bio is FULL */ - err = nilfs_segbuf_submit_bio(segbuf, wi, mode); + err = nilfs_segbuf_submit_bio(segbuf, wi, mode, 0); /* never submit current bh */ if (likely(!err)) goto repeat; @@ -466,19 +468,19 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer *segbuf, { struct nilfs_write_info wi; struct buffer_head *bh; - int res = 0, rw = WRITE; + int res = 0; wi.nilfs = nilfs; nilfs_segbuf_prepare_write(segbuf, &wi); list_for_each_entry(bh, &segbuf->sb_segsum_buffers, b_assoc_buffers) { - res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw); + res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE); if (unlikely(res)) goto failed_bio; } list_for_each_entry(bh, &segbuf->sb_payload_buffers, b_assoc_buffers) { - res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw); + res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE); if (unlikely(res)) goto failed_bio; } @@ -488,8 +490,8 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer *segbuf, * Last BIO is always sent through the following * submission. */ - rw |= REQ_SYNC; - res = nilfs_segbuf_submit_bio(segbuf, &wi, rw); + res = nilfs_segbuf_submit_bio(segbuf, &wi, REQ_OP_WRITE, + REQ_SYNC); } failed_bio: -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG] "block: make generic_make_request handle arbitrarily sized bios" breaks boot on parisc-linux
Hi Ming Lei, On 24.02.2016 08:59, Ming Lei wrote: > On Wed, Feb 24, 2016 at 10:28 AM, John David Anglin > wrote: >> The following block change breaks boot on parisc-linux: >> >> commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e >> Author: Kent Overstreet >> Date: Thu Apr 23 22:37:18 2015 -0700 >> >>block: make generic_make_request handle arbitrarily sized bios >> >>The way the block layer is currently written, it goes to great lengths >>to avoid having to split bios; upper layer code (such as bio_add_page()) >>checks what the underlying device can handle and tries to always create >>bios that don't need to be split. >> >>But this approach becomes unwieldy and eventually breaks down with >>stacked devices and devices with dynamic limits, and it adds a lot of >>complexity. If the block layer could split bios as needed, we could >>eliminate a lot of complexity elsewhere - particularly in stacked >>drivers. Code that creates bios can then create whatever size bios are >>convenient, and more importantly stacked drivers don't have to deal with >>both their own bio size limitations and the limitations of the >>(potentially multiple) devices underneath them. In the future this will >>let us delete merge_bvec_fn and a bunch of other code. >> >>We do this by adding calls to blk_queue_split() to the various >>make_request functions that need it - a few can already handle arbitrary >>size bios. Note that we add the call _after_ any call to >>blk_queue_bounce(); this means that blk_queue_split() and >>blk_recalc_rq_segments() don't need to be concerned with bouncing >>affecting segment merging. >> >>Some make_request_fn() callbacks were simple enough to audit and verify >>they don't need blk_queue_split() calls. The skipped ones are: >> >> * nfhd_make_request (arch/m68k/emu/nfblock.c) >> * axon_ram_make_request (arch/powerpc/sysdev/axonram.c) >> * simdisk_make_request (arch/xtensa/platforms/iss/simdisk.c) >> * brd_make_request (ramdisk - drivers/block/brd.c) >> * mtip_submit_request (drivers/block/mtip32xx/mtip32xx.c) >> * loop_make_request >> * null_queue_bio >> * bcache's make_request fns >> >>Some others are almost certainly safe to remove now, but will be left >>for future patches. >> >>Cc: Jens Axboe >>Cc: Christoph Hellwig >>Cc: Al Viro >>Cc: Ming Lei >>Cc: Neil Brown >>Cc: Alasdair Kergon >>Cc: Mike Snitzer >>Cc: dm-de...@redhat.com >>Cc: Lars Ellenberg >>Cc: drbd-u...@lists.linbit.com >>Cc: Jiri Kosina >>Cc: Geoff Levand >>Cc: Jim Paris >>Cc: Philip Kelleher >>Cc: Minchan Kim >>Cc: Nitin Gupta >>Cc: Oleg Drokin >>Cc: Andreas Dilger >>Acked-by: NeilBrown (for the 'md/md.c' bits) >>Acked-by: Mike Snitzer >>Reviewed-by: Martin K. Petersen >>Signed-off-by: Kent Overstreet >>[dpark: skip more mq-based drivers, resolve merge conflicts, etc.] >>Signed-off-by: Dongsu Park >>Signed-off-by: Ming Lin >>Signed-off-by: Jens Axboe >> >> This thread on the linux-parisc has most of the discussion and analysis: >> http://www.spinics.net/lists/linux-parisc/msg06710.html >> >> Essentially, the SCSI layer underestimates the number of sg segments needed >> and we run off the end of the sg list and crash. >> This happens because the protect bit is ignored. As a result 4.3 and later >> kernels fail to boot. This includes the current Debian >> kernel for hppa. >> >> Hopefully, the block group can help resolve this issue. We can help with >> testing if needed. >> > > We fixed several similar bugs, but maybe there is another one, :-( Thanks for your help! 
> Could you apply the attached debug patch and post the log after the issue is > triggered? Sadly I was not yet able to produce the requested output for you. It seems we have - probably triggered due to the block splitting itself - some kind of memory/stack corruption in here as well. Note, the stack grows upwards(!) on parisc, so maybe local variables get overwritten somehow...? First I applied your patch as is. Here is the system log so far: [ 24.94] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 25.00] sd 3:0:6:0: [sdb] 71132960 512-byte logical blocks: (36.4 GB/33.9 GiB) [ 25.092000] sd 3:0:5:0: [sda] Write Protect is off [ 25.152000] sd 3:0:5:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 25.268000] sd 3:0:6:0: [sdb] Write Protect is off [ 25.328000] sd 3:0:6:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 25.452000] sda: sda1 sda2 sda3 < sda5 sda6 > [ 25.508000] sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 > [ 25.584000] scsi_id(112): unaligned access to 0xfacac009 at ip=0x4100390b [ 25.688000] sd 3:0:6:0: [sdb] Attached SCSI disk [ 25.752000] sd 3:0:5:0: [sda] Attached SCSI disk [ 25.84] scsi_id(113): unaligned
Re: [PATCH 26/35] block: set op to REQ_OP
Hi Mike, [auto build test WARNING on next-20160224] [cannot apply to dm/for-next v4.5-rc5 v4.5-rc4 v4.5-rc3 v4.5-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/mchristi-redhat-com/separate-operations-from-flags-in-the-bio-request-structs/20160225-041726 reproduce: make htmldocs All warnings (new ones prefixed by >>): lib/crc32.c:148: warning: No description found for parameter 'tab)[256]' lib/crc32.c:148: warning: Excess function parameter 'tab' description in 'crc32_le_generic' lib/crc32.c:293: warning: No description found for parameter 'tab)[256]' lib/crc32.c:293: warning: Excess function parameter 'tab' description in 'crc32_be_generic' lib/crc32.c:1: warning: no structured comments found >> block/blk-core.c:1248: warning: No description found for parameter 'op' >> block/blk-core.c:1248: warning: No description found for parameter 'op' vim +/op +1248 block/blk-core.c da8303c6 block/blk-core.c Tejun Heo 2011-10-19 1232 * @q: request_queue to allocate request from 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1233 * op: REQ_OP_READ/REQ_OP_WRITE 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1234 * @op_flags: rq_flag_bits da8303c6 block/blk-core.c Tejun Heo 2011-10-19 1235 * @bio: bio to allocate request for (can be %NULL) a06e05e6 block/blk-core.c Tejun Heo 2012-06-04 1236 * @gfp_mask: allocation mask da8303c6 block/blk-core.c Tejun Heo 2011-10-19 1237 * d0164adc block/blk-core.c Mel Gorman 2015-11-06 1238 * Get a free request from @q. If %__GFP_DIRECT_RECLAIM is set in @gfp_mask, d0164adc block/blk-core.c Mel Gorman 2015-11-06 1239 * this function keeps retrying under memory pressure and fails iff @q is dead. d6344532 drivers/block/ll_rw_blk.c Nick Piggin2005-06-28 1240 * da3dae54 block/blk-core.c Masanari Iida 2014-09-09 1241 * Must be called with @q->queue_lock held and, a492f075 block/blk-core.c Joe Lawrence 2014-08-28 1242 * Returns ERR_PTR on failure, with @q->queue_lock held. a492f075 block/blk-core.c Joe Lawrence 2014-08-28 1243 * Returns request pointer on success, with @q->queue_lock *not held*. 
^1da177e drivers/block/ll_rw_blk.c Linus Torvalds 2005-04-16 1244 */ 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1245 static struct request *get_request(struct request_queue *q, int op, 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1246 int op_flags, struct bio *bio, 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1247 gfp_t gfp_mask) ^1da177e drivers/block/ll_rw_blk.c Linus Torvalds 2005-04-16 @1248 { 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1249 const bool is_sync = rw_is_sync(op, op_flags) != 0; 450991bc drivers/block/ll_rw_blk.c Nick Piggin2005-06-28 1250 DEFINE_WAIT(wait); a051661c block/blk-core.c Tejun Heo 2012-06-26 1251 struct request_list *rl; a06e05e6 block/blk-core.c Tejun Heo 2012-06-04 1252 struct request *rq; a051661c block/blk-core.c Tejun Heo 2012-06-26 1253 a051661c block/blk-core.c Tejun Heo 2012-06-26 1254 rl = blk_get_rl(q, bio);/* transferred to @rq on success */ a06e05e6 block/blk-core.c Tejun Heo 2012-06-04 1255 retry: 3cbf3506 block/blk-core.c Mike Christie 2016-02-24 1256 rq = __get_request(rl, op, op_flags, bio, gfp_mask); :: The code at line 1248 was first introduced by commit :: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2 :: TO: Linus Torvalds :: CC: Linus Torvalds --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
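
Judging from the quoted context, the likely cause is that the new kernel-doc line for the first parameter reads "* op:" rather than "* @op:", so the parser never associates it with the argument (note that @op_flags on the next line does have the "@" and is not warned about). Assuming that is the intended fix, the comment block would look like this:

	/**
	 * get_request - get a free request
	 * @q: request_queue to allocate request from
	 * @op: REQ_OP_READ/REQ_OP_WRITE
	 * @op_flags: rq_flag_bits
	 * @bio: bio to allocate request for (can be %NULL)
	 * @gfp_mask: allocation mask
	 *
	 * (rest of the comment unchanged)
	 */
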
[PATCH v1 3/3] scsi: allow scsi devices to use direct complete
This allows scsi devices to remain runtime suspended for system suspend. Since runtime suspend is stricter than system suspend callbacks, this is just returning a positive number for the prepare callback. Signed-off-by: Derek Basehore Reviewed-by: Eric Caruso --- drivers/scsi/scsi_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c index b44c1bb..7af76ad 100644 --- a/drivers/scsi/scsi_pm.c +++ b/drivers/scsi/scsi_pm.c @@ -178,7 +178,7 @@ static int scsi_bus_prepare(struct device *dev) /* Wait until async scanning is finished */ scsi_complete_async_scans(); } - return 0; + return 1; } static int scsi_bus_suspend(struct device *dev) -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
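
For readers less familiar with the direct_complete machinery: the PM core interprets a positive return from a .prepare() callback as "this device may be left runtime suspended across system suspend", provided all of its descendants opt in as well. A minimal, hypothetical sketch of that contract for an invented driver (not the actual scsi_pm.c body, which as the hunk shows also waits for async scanning first):

	static int foo_prepare(struct device *dev)
	{
		/*
		 * A positive return value lets the PM core skip the
		 * suspend/resume callbacks for this device when it is
		 * already runtime suspended (the "direct complete" path).
		 */
		return 1;
	}

	static const struct dev_pm_ops foo_pm_ops = {
		.prepare = foo_prepare,
	};
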
[PATCH v1 1/3] PM / sleep: Check legacy pm callbacks for direct complete
This adds checks for legacy pm callbacks when setting no_pm_callbacks. This fixes an issue where these suspend/resume callbacks were incorrectly ignored during suspend/resume with direct complete. Fixes: 4534d9d881f9 "PM / sleep: Go direct_complete if driver has..." Signed-off-by: Derek Basehore --- drivers/base/power/main.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 6e7c3cc..e0017d9 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1764,8 +1764,10 @@ void device_pm_check_callbacks(struct device *dev) { spin_lock_irq(&dev->power.lock); dev->power.no_pm_callbacks = - (!dev->bus || pm_ops_is_empty(dev->bus->pm)) && - (!dev->class || pm_ops_is_empty(dev->class->pm)) && + (!dev->bus || (!dev->bus->resume && !dev->bus->suspend && + pm_ops_is_empty(dev->bus->pm))) && + (!dev->class || (!dev->class->resume && !dev->class->suspend && +pm_ops_is_empty(dev->class->pm))) && (!dev->type || pm_ops_is_empty(dev->type->pm)) && (!dev->pm_domain || pm_ops_is_empty(&dev->pm_domain->ops)) && (!dev->driver || pm_ops_is_empty(dev->driver->pm)); -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 2/3] PM / sleep: try to runtime suspend for direct complete
This tries to runtime suspend devices that are still active for direct complete. This is for cases such as autosuspend delays which leaves devices able to runtime suspend but still active. It's beneficial in this case to runtime suspend the device to take advantage of direct complete when possible. Signed-off-by: Derek Basehore Reviewed-by: Eric Caruso --- drivers/base/power/main.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index e0017d9..9693032 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1380,7 +1380,12 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async) goto Complete; if (dev->power.direct_complete) { - if (pm_runtime_status_suspended(dev)) { + /* +* Check if we're runtime suspended. If not, try to runtime +* suspend for autosuspend cases. +*/ + if (pm_runtime_status_suspended(dev) || + !pm_runtime_suspend(dev)) { pm_runtime_disable(dev); if (pm_runtime_status_suspended(dev)) goto Complete; -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
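
One detail worth calling out, since the condition reads a bit backwards at first glance: pm_runtime_suspend() returns 0 on success, so "!pm_runtime_suspend(dev)" is true exactly when the opportunistic suspend attempt just worked. The same check, re-quoted with that convention annotated:

	if (pm_runtime_status_suspended(dev) ||	/* already runtime suspended */
	    !pm_runtime_suspend(dev)) {		/* or suspends successfully now:
						 * pm_runtime_suspend() returns
						 * 0 on success */
		pm_runtime_disable(dev);
		/* ... unchanged from here ... */
	}
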
[PATCH v2 1/3] PM / sleep: Check legacy pm callbacks for direct complete
This adds checks for legacy pm callbacks when setting no_pm_callbacks. This fixes an issue where these suspend/resume callbacks were incorrectly ignored during suspend/resume with direct complete. Fixes: aa8e54b55947 "PM / sleep: Go direct_complete if driver has..." Signed-off-by: Derek Basehore --- drivers/base/power/main.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 6e7c3cc..e0017d9 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1764,8 +1764,10 @@ void device_pm_check_callbacks(struct device *dev) { spin_lock_irq(&dev->power.lock); dev->power.no_pm_callbacks = - (!dev->bus || pm_ops_is_empty(dev->bus->pm)) && - (!dev->class || pm_ops_is_empty(dev->class->pm)) && + (!dev->bus || (!dev->bus->resume && !dev->bus->suspend && + pm_ops_is_empty(dev->bus->pm))) && + (!dev->class || (!dev->class->resume && !dev->class->suspend && +pm_ops_is_empty(dev->class->pm))) && (!dev->type || pm_ops_is_empty(dev->type->pm)) && (!dev->pm_domain || pm_ops_is_empty(&dev->pm_domain->ops)) && (!dev->driver || pm_ops_is_empty(dev->driver->pm)); -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/3] scsi: allow scsi devices to use direct complete
This allows scsi devices to remain runtime suspended for system suspend. Since runtime suspend is stricter than system suspend callbacks, this is just returning a positive number for the prepare callback. Signed-off-by: Derek Basehore Reviewed-by: Eric Caruso --- drivers/scsi/scsi_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c index b44c1bb..7af76ad 100644 --- a/drivers/scsi/scsi_pm.c +++ b/drivers/scsi/scsi_pm.c @@ -178,7 +178,7 @@ static int scsi_bus_prepare(struct device *dev) /* Wait until async scanning is finished */ scsi_complete_async_scans(); } - return 0; + return 1; } static int scsi_bus_suspend(struct device *dev) -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/3] PM / sleep: try to runtime suspend for direct complete
This tries to runtime suspend devices that are still active for direct complete. This is for cases such as autosuspend delays which leaves devices able to runtime suspend but still active. It's beneficial in this case to runtime suspend the device to take advantage of direct complete when possible. Signed-off-by: Derek Basehore Reviewed-by: Eric Caruso --- drivers/base/power/main.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index e0017d9..9693032 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1380,7 +1380,12 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async) goto Complete; if (dev->power.direct_complete) { - if (pm_runtime_status_suspended(dev)) { + /* +* Check if we're runtime suspended. If not, try to runtime +* suspend for autosuspend cases. +*/ + if (pm_runtime_status_suspended(dev) || + !pm_runtime_suspend(dev)) { pm_runtime_disable(dev); if (pm_runtime_status_suspended(dev)) goto Complete; -- 2.7.0.rc3.207.g0ac5344 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 31/35] block, fs: remove old REQ definitions.
Hi Mike, [auto build test WARNING on next-20160224] [cannot apply to dm/for-next v4.5-rc5 v4.5-rc4 v4.5-rc3 v4.5-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/mchristi-redhat-com/separate-operations-from-flags-in-the-bio-request-structs/20160225-041726 config: i386-allmodconfig (attached as .config) reproduce: # save the attached .config to linux build tree make ARCH=i386 All warnings (new ones prefixed by >>): In file included from include/linux/pagemap.h:8:0, from fs/crypto/crypto.c:24: fs/crypto/crypto.c: In function 'fscrypt_zeroout_range': >> include/linux/fs.h:198:19: warning: passing argument 1 of 'submit_bio_wait' >> makes pointer from integer without a cast [-Wint-conversion] #define RW_MASK REQ_OP_WRITE ^ include/linux/fs.h:202:17: note: in expansion of macro 'RW_MASK' #define WRITE RW_MASK ^ fs/crypto/crypto.c:325:25: note: in expansion of macro 'WRITE' err = submit_bio_wait(WRITE, bio); ^ In file included from fs/crypto/crypto.c:29:0: include/linux/bio.h:449:12: note: expected 'struct bio *' but argument is of type 'int' extern int submit_bio_wait(struct bio *bio); ^ fs/crypto/crypto.c:325:9: error: too many arguments to function 'submit_bio_wait' err = submit_bio_wait(WRITE, bio); ^ In file included from fs/crypto/crypto.c:29:0: include/linux/bio.h:449:12: note: declared here extern int submit_bio_wait(struct bio *bio); ^ vim +/submit_bio_wait +198 include/linux/fs.h 182 * READAUsed for read-ahead operations. Lower priority, and the 183 * block layer could (in theory) choose to ignore this 184 * request if it runs into resource problems. 185 * WRITEA normal async write. Device will be plugged. 186 * WRITE_SYNC Synchronous write. Identical to WRITE, but passes down 187 * the hint that someone will be waiting on this IO 188 * shortly. The write equivalent of READ_SYNC. 189 * WRITE_ODIRECTSpecial case write for O_DIRECT only. 190 * WRITE_FLUSH Like WRITE_SYNC but with preceding cache flush. 191 * WRITE_FUALike WRITE_SYNC but data is guaranteed to be on 192 * non-volatile media on completion. 193 * WRITE_FLUSH_FUA Combination of WRITE_FLUSH and FUA. The IO is preceded 194 * by a cache flush and data is guaranteed to be on 195 * non-volatile media on completion. 196 * 197 */ > 198 #define RW_MASK REQ_OP_WRITE 199 #define RWA_MASKREQ_RAHEAD 200 201 #define READREQ_OP_READ 202 #define WRITE RW_MASK 203 #define READA RWA_MASK 204 205 #define READ_SYNC REQ_SYNC 206 #define WRITE_SYNC (REQ_SYNC | REQ_NOIDLE) --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data
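
The warning points at a caller that still uses the old two-argument form. Assuming the same conversion applied throughout the series, the call site in fs/crypto/crypto.c would need to change along these lines (a sketch of the pattern only, since fscrypt_zeroout_range() itself is not quoted here):

	/* old form, which no longer exists once the rw argument is removed */
	err = submit_bio_wait(WRITE, bio);

	/* converted form: the operation lives in the bio itself */
	bio->bi_op = REQ_OP_WRITE;
	err = submit_bio_wait(bio);
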
Re: [BUG] "block: make generic_make_request handle arbitrarily sized bios" breaks boot on parisc-linux
On Thu, Feb 25, 2016 at 7:28 AM, John David Anglin wrote: > On 2016-02-24, at 4:36 PM, Helge Deller wrote: > >> Maybe Dave has more luck, otherwise I'll continue to try to get some info. > > I tried your patch on the commit in linux-block which first failed to boot. > As with Helge, the > system crashed and no useful data was output on console. I then applied > following patch > to give some extra segments and tired again: > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index b1a2631..b421f03 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -595,6 +595,11 @@ static int scsi_alloc_sgtable(struct scsi_data_buffer > *sdb, int nents, bool mq) > > BUG_ON(!nents); > > + /* Provide extra entries in case of split. */ > + nents += 8; > + if (nents > SCSI_MAX_SG_SEGMENTS) > + nents = SCSI_MAX_SG_SEGMENTS; > + Yeah, this is needed for sake of safety. > if (mq) { > if (nents <= SCSI_MAX_SG_SEGMENTS) { > sdb->table.nents = nents; > > The attached file shows the crash in first boot. The second boot was > successful and various output > was generated by your check code. >From the following log(just select one simple, and looks all are similar) in 2nd boot, the bi_phys_segments is figured out as one by block core , which is wrong because the max segment size is 64k according to your investigation in the below link, but the whole req/bio is 192k(4k*48). http://www.spinics.net/lists/linux-parisc/msg06749.html Looks weird, it shouldn't have happened because blk_bio_segment_split() does respect the max segment size limit. BTW, what is the scsi driver for the device? blk_rq_map_sg: merge bug: 3 1, extra_len 0, dma_drain 0 check_bvec: dump bvec for 7e53c5f0(f:2449, t:1) 0: 0 4096 245852 7e2c4c40 1: 0 4096 245853 7e2c4c40 2: 0 4096 245854 7e2c4c40 3: 0 4096 245855 7e2c4c40 4: 0 4096 245856 7e2c4c40 5: 0 4096 245857 7e2c4c40 6: 0 4096 245858 7e2c4c40 7: 0 4096 245859 7e2c4c40 8: 0 4096 245860 7e2c4c40 9: 0 4096 245861 7e2c4c40 10: 0 4096 245862 7e2c4c40 11: 0 4096 245863 7e2c4c40 12: 0 4096 245864 7e2c4c40 13: 0 4096 245865 7e2c4c40 14: 0 4096 245866 7e2c4c40 15: 0 4096 245867 7e2c4c40 16: 0 4096 245868 7e2c4c40 17: 0 4096 245869 7e2c4c40 18: 0 4096 245870 7e2c4c40 19: 0 4096 245871 7e2c4c40 20: 0 4096 245872 7e2c4c40 21: 0 4096 245873 7e2c4c40 22: 0 4096 245874 7e2c4c40 23: 0 4096 245875 7e2c4c40 24: 0 4096 245876 7e2c4c40 25: 0 4096 245877 7e2c4c40 26: 0 4096 245878 7e2c4c40 27: 0 4096 245879 7e2c4c40 28: 0 4096 245880 7e2c4c40 29: 0 4096 245881 7e2c4c40 30: 0 4096 245882 7e2c4c40 31: 0 4096 245883 7e2c4c40 32: 0 4096 245884 7e2c4c40 33: 0 4096 245885 7e2c4c40 34: 0 4096 245886 7e2c4c40 35: 0 4096 245887 7e2c4c40 36: 0 4096 245888 7e2c4c40 37: 0 4096 245889 7e2c4c40 38: 0 4096 245890 7e2c4c40 39: 0 4096 245891 7e2c4c40 40: 0 4096 245892 7e2c4c40 41: 0 4096 245893 7e2c4c40 42: 0 4096 245894 7e2c4c40 43: 0 4096 245895 7e2c4c40 44: 0 4096 245896 7e2c4c40 45: 0 4096 245897 7e2c4c40 46: 0 4096 245898 7e2c4c40 47: 0 4096 245899 7e2c4c40 Thanks, Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
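
As a quick sanity check on the numbers above (48 contiguous 4k bvecs against the 64k max segment size mentioned earlier in the thread), the splitting code should have produced at least three physical segments even with perfect merging; a hedged sketch of the arithmetic:

	/* 48 pages * 4096 bytes = 196608 bytes (192k) in the bio */
	/* with queue max_segment_size = 65536 (64k), even fully contiguous
	 * pages cannot be merged into fewer than: */
	unsigned int min_segs = DIV_ROUND_UP(48 * 4096, 65536);	/* == 3 */

	/* so bi_phys_segments == 1 for this request cannot be correct,
	 * which matches the "merge bug: 3 1" line in the debug output */
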