the orig patch better.
As I mentioned last time, why can't we fix ->cmd_per_lun in
scsi_add_host() using .can_queue?
--
Ming
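The fix suggested above could be modeled as follows. This is a hypothetical userspace sketch of clamping a SCSI host's `->cmd_per_lun` against its `.can_queue` at `scsi_add_host()` time; the helper name and standalone form are illustrative, not the actual SCSI midlayer code.

```c
#include <assert.h>

/*
 * Model of the suggested fix: when registering a SCSI host, clamp the
 * per-LUN queue depth so it never exceeds the host-wide can_queue.
 * Illustrative only; not the real scsi_add_host() logic.
 */
static int clamp_cmd_per_lun(int cmd_per_lun, int can_queue)
{
	if (can_queue > 0 && cmd_per_lun > can_queue)
		return can_queue;
	return cmd_per_lun;
}
```

The point of clamping at registration time is that drivers never have to remember the invariant themselves.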
if (at_head || blk_rq_is_passthrough(rq)) {
> - if (at_head)
> - list_add(&rq->queuelist, &dd->dispatch);
> - else
> - list_add_tail(&rq->queuelist, &dd->dispatch);
> + if (at_head) {
> + list_add(&rq->queuelist, &dd->dispatch);
> } else {
> deadline_add_rq_rb(dd, rq);
>
> --
> 2.30.2
>
Looks fine:
Reviewed-by: Ming Lei
Thanks,
Ming
/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -272,6 +272,12 @@ static inline bool bio_is_passthrough(struct bio *bio)
> return blk_op_is_scsi(op) || blk_op_is_private(op);
> }
>
> +static inline bool blk_op_is_passthrough(unsigned int op)
> +{
> + return (blk_op_is_
eun, can you check if multiple get_more_blocks() is called for submitting
32MB in your test?
In my 32MB sync dio f2fs test on an x86_64 VM, one buffer_head mapping can
hold 32MB, but it is a freshly created f2fs.
I'd suggest understanding the issue completely before settling on one
solution.
[1]
https://lore.kernel.org/linux-block/20210202041204.28995-1-nanich@samsung.com/
Thanks,
Ming
mpfs, memory is charged appropriately.
>
> This patch also exports cgroup_get_e_css and int_active_memcg so they
> can be used by the loop module.
>
> Signed-off-by: Dan Schatzberg
> Acked-by: Johannes Weiner
Reviewed-by: Ming Lei
--
Ming Lei
gt;lo_work_lock);
> }
>
> static const struct blk_mq_ops loop_mq_ops = {
> .queue_rq = loop_queue_rq,
> - .init_request = loop_init_request,
> .complete = lo_complete_rq,
> };
>
> @@ -2164,6 +2300,7 @@ static int loop_add(struct loop_device **l, int i)
> mutex_init(&lo->lo_mutex);
> lo->lo_number = i;
> spin_lock_init(&lo->lo_lock);
> + spin_lock_init(&lo->lo_work_lock);
> disk->major = LOOP_MAJOR;
> disk->first_minor = i << part_shift;
> disk->fops = &lo_fops;
> diff --git a/drivers/block/loop.h b/drivers/block/loop.h
> index a3c04f310672..9289c1cd6374 100644
> --- a/drivers/block/loop.h
> +++ b/drivers/block/loop.h
> @@ -14,7 +14,6 @@
> #include
> #include
> #include
> -#include
> #include
>
> /* Possible states of device */
> @@ -54,8 +53,13 @@ struct loop_device {
>
> spinlock_t lo_lock;
> int lo_state;
> - struct kthread_worker worker;
> - struct task_struct *worker_task;
> + spinlock_t lo_work_lock;
> + struct workqueue_struct *workqueue;
> + struct work_struct rootcg_work;
> + struct list_head rootcg_cmd_list;
> + struct list_head idle_worker_list;
> + struct rb_root worker_tree;
> + struct timer_list timer;
> bool use_dio;
> bool sysfs_inited;
>
> @@ -66,7 +70,7 @@ struct loop_device {
> };
>
> struct loop_cmd {
> - struct kthread_work work;
> + struct list_head list_entry;
> bool use_aio; /* use AIO interface to handle I/O */
> atomic_t ref; /* only for aio */
> long ret;
> --
> 2.30.2
>
Reviewed-by: Ming Lei
--
Ming Lei
On Sat, Apr 03, 2021 at 04:10:16PM +0800, Ming Lei wrote:
> On Fri, Apr 02, 2021 at 07:27:30PM +0200, Christoph Hellwig wrote:
> > On Wed, Mar 31, 2021 at 08:16:50AM +0800, Ming Lei wrote:
> > > On Tue, Mar 30, 2021 at 06:53:30PM +0200, Christoph Hellwig wrote:
> > > &
On Fri, Apr 02, 2021 at 07:27:30PM +0200, Christoph Hellwig wrote:
> On Wed, Mar 31, 2021 at 08:16:50AM +0800, Ming Lei wrote:
> > On Tue, Mar 30, 2021 at 06:53:30PM +0200, Christoph Hellwig wrote:
> > > On Tue, Mar 23, 2021 at 04:14:39PM +0800, Ming Lei wrote:
> > > &
On Thu, Apr 01, 2021 at 04:27:37PM +0000, Gulam Mohamed wrote:
> Hi Ming,
>
> Thanks for taking a look into this. Can you please see my inline
> comments in below mail?
>
> Regards,
> Gulam Mohamed.
>
> -Original Message-
> From: Ming Lei
> Se
On Tue, Mar 30, 2021 at 06:53:30PM +0200, Christoph Hellwig wrote:
> On Tue, Mar 23, 2021 at 04:14:39PM +0800, Ming Lei wrote:
> > blktrace may allocate lots of memory, if the process is terminated
> > by user or OOM, we need to provide one chance to remove the trace
> > buf
On Tue, Mar 30, 2021 at 10:57:04AM +0800, Su Yue wrote:
>
> On Tue 23 Mar 2021 at 16:14, Ming Lei wrote:
>
> > On some ARCHs, such as aarch64, page size may be 64K, meantime there may
> > be lots of CPU cores. relay_open() needs to allocate pages on each CPU
> > b
On Tue, Mar 23, 2021 at 04:14:38PM +0800, Ming Lei wrote:
> blktrace may pass big trace buffer size via '-b', meantime the system
> may have lots of CPU cores, so too much memory can be allocated for
> blktrace.
>
The 1st patch shuts down blktrace in blkdev_close() in ca
I am not sure how people are using partial object accounting. I
believe it is used as a memory usage hint of slabs.
On Mon, Mar 22, 2021 at 6:22 PM Vlastimil Babka wrote:
>
> On 3/22/21 2:46 AM, Shu Ming wrote:
> > More precisely, ss will count partial objects like dentry objects wi
0 RCX: 00465d67
> RDX: 7ffda32c37f3 RSI: 004bfab2 RDI: 7ffda32c37e0
> RBP: R08: R09: 7ffda32c35a0
> R10: 7ffda32c3457 R11: 0202 R12: 0001
> R13: R14: 0001 R15: 7ffda32c37e0
This is another & un-related warning with original report, so I think
the patch in above
tree fixes the issue.
--
Ming Lei
On Sun, Mar 14, 2021 at 7:10 PM syzbot
wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 280d542f Merge tag 'drm-fixes-2021-03-05' of git://anongit..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15ade5aed0
> kernel config:
still added by
parted(systemd-udev)?
ioctl(BLKRRPART) needs to read the partition table for adding partitions back;
if IOs fail because of the iscsi logout, I guess the issue can be avoided too?
--
Ming
in
> kernel between the systemd-udevd and iscsi-logout processing as described
> above. We are able to reproduce this even with latest upstream kernel.
>
> We have come across a patch from Ming Lei which was created for "avoid to
> drop & re-add partitions if partitions a
case of 'blktrace -b 8192', which is used by the device-mapper
test suite[1]. This can cause OOM easily.
Fix the issue by limiting the max allowed pages to 1/8 of totalram_pages().
[1] https://github.com/jthornber/device-mapper-test-suite.git
Signed-off-by: Ming Lei
---
kernel/trace/
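The limit described above could be sketched like this. The helper name is hypothetical; the real check lives in kernel/trace/blktrace.c, and only the "cap the buffer at 1/8 of system RAM" arithmetic is modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the limit: reject a blktrace buffer allocation whose total
 * page count exceeds 1/8 of the pages in the system. Hypothetical
 * helper; the kernel's check uses totalram_pages() directly.
 */
static bool blk_trace_pages_allowed(unsigned long req_pages,
				    unsigned long totalram_pages)
{
	return req_pages <= totalram_pages / 8;
}
```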
uffer size for avoiding potential
OOM.
Ming Lei (2):
block: shutdown blktrace in case of fatal signal pending
blktrace: limit allowed total trace buffer size
fs/block_dev.c | 6 ++
kernel/trace/blktrace.c | 32
2 files changed, 38
blktrace may allocate lots of memory. If the process is terminated
by the user or by OOM, we need to provide a chance to remove the trace
buffer; otherwise a memory leak may be caused.
Fix the issue by shutting down blktrace in case of task exiting in
blkdev_close().
Signed-off-by: Ming Lei
---
fs
On Sun, Mar 14, 2021 at 7:10 PM syzbot
wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 280d542f Merge tag 'drm-fixes-2021-03-05' of git://anongit..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15ade5aed0
> kernel config:
ux a5f35387ebdc
It should be the same issue which was addressed by
aebf5db91705 ("block: fix use-after-free in disk_part_iter_next"),
but converting to xarray introduced the issue again.
--
Ming Lei
More precisely, ss will count partial objects like dentry objects with
"/sys/kernel/slab/dentry/partial", whose number can become huge.
On Thu, Mar 18, 2021 at 8:56 PM Xunlei Pang wrote:
>
>
>
> On 3/18/21 8:18 PM, Vlastimil Babka wrote:
> > On 3/17/21 8:54 AM, Xunlei Pang wrote:
> >> The node li
On Tue, Mar 16, 2021 at 07:35:44PM +0300, Sergei Shtepa wrote:
> The 03/16/2021 11:09, Ming Lei wrote:
> > On Fri, Mar 12, 2021 at 06:44:54PM +0300, Sergei Shtepa wrote:
> > > bdev_interposer allows to redirect bio requests to another devices.
> > >
> &g
lock);
> +
> + if (bdev_has_interposer(original))
> + ret = -EBUSY;
> + else {
> + original->bd_interposer = bdgrab(interposer);
> + if (!original->bd_interposer)
> + ret = -ENODEV;
> + }
> +
> + mutex_unlock(&bdev_interposer_attach_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(bdev_interposer_attach);
> +
> +void bdev_interposer_detach(struct block_device *original)
> +{
> + if (WARN_ON(!original))
> + return;
> +
> + if (WARN_ON(!blk_mq_is_queue_frozen(original->bd_disk->queue)))
> + return;
The original request queue may become live now...
--
Ming
other request queues attached to the same HBA.
This patch will cause a performance regression, because userspace may
switch the scheduler according to the medium or workload; at that time,
other LUNs will be affected by this patch.
--
Ming
On 03/06/2021 10:36 AM, Qing Zhang wrote:
Add default config for 2K1000.
Signed-off-by: Jiaxun Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
On 03/06/2021 10:36 AM, Qing Zhang wrote:
Add liointc-2.0 properties support, so update the maxItems and description.
Signed-off-by: Jiaxun Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
On 03/06/2021 10:36 AM, Qing Zhang wrote:
Distinguish between 3A series CPU and 2K1000 CPU UART0.
Signed-off-by: Jiaxun Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
On 03/06/2021 10:36 AM, Qing Zhang wrote:
Get the fixed-clock from the CPU0 node of the device tree.
Signed-off-by: Jiaxun Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
On 03/06/2021 10:36 AM, Qing Zhang wrote:
Add DeviceTree files for Loongson 2K1000 processor, currently only
supports single-core boot.
Signed-off-by: Jiaxun Yang
Signed-off-by: Qing Zhang
Tested-by: Ming Wang
Thanks,
Ming
q_entity_service_tree(entity);
1181	is_in_service = entity == sd->in_service_entity;
1182
1183	bfq_calc_finish(entity, entity->service);
1184
1185	if (is_in_service)
Seems entity->sched_data points to NULL.
>
> Thanks,
> Paolo
>
> > I
Hello Hillf,
Thanks for the debug patch.
On Fri, Mar 5, 2021 at 5:00 PM Hillf Danton wrote:
>
> On Thu, 4 Mar 2021 16:42:30 +0800 Ming Lei wrote:
> > On Sat, Oct 10, 2020 at 1:40 PM Mikhail Gavrilov
> > wrote:
> > >
> > > Paolo, Jens I am sorry for the noi
t; FS: () GS:8dc90e0c() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 003e8ebe4000 CR3: 0007c2546000 CR4: 00350ee0
> Call Trace:
> bfq_deactivate_entity+0x4f/0xc0
Hello,
The same stack trace was observed in an RH internal test too, with kernel
5.11.0-0.rc6, but there isn't a reproducer yet.
--
Ming Lei
On Mon, Aug 10, 2020 at 8:22 PM Xunlei Pang wrote:
> static inline void
> @@ -2429,12 +2439,12 @@ static unsigned long partial_counter(struct
> kmem_cache_node *n,
> unsigned long ret = 0;
>
> if (item == PARTIAL_FREE) {
> - ret = atomic_long_read(&n->partial_free_
Any progress on this? The problem addressed by this patch has also
caused jitter in our online apps, which is quite annoying.
On Mon, Aug 24, 2020 at 6:05 PM xunlei wrote:
>
> > On 2020/8/20 10:02 PM, Pekka Enberg wrote:
> > On Mon, Aug 10, 2020 at 3:18 PM Xunlei Pang
> > wrote:
> >>
> >> v1->v2:
On Wed, Feb 24, 2021 at 09:18:25AM +0100, Christoph Hellwig wrote:
> On Wed, Feb 24, 2021 at 11:58:26AM +0800, Ming Lei wrote:
> > Hi Guys,
> >
> > The two patches changes block ioctl(BLKRRPART) for avoiding drop &
> > re-add partitions if partitions state isn'
d to extend for supporting SCSI XCOPY in the future, or similar
block copy commands, without breaking existing applications, so please CC
linux-scsi
and the SCSI guys on your next post.
--
Ming
On Wed, Feb 17, 2021 at 08:16:29AM +0100, Christoph Hellwig wrote:
> On Wed, Feb 17, 2021 at 11:07:14AM +0800, Ming Lei wrote:
> > Do you think it is correct for ioctl(BLKRRPART) to always drop/re-add
> > partition device node?
>
> Yes, that is what it is designed to do. Th
On Tue, Feb 16, 2021 at 09:44:30AM +0100, Christoph Hellwig wrote:
> On Mon, Feb 15, 2021 at 12:03:41PM +0800, Ming Lei wrote:
> > Hello,
>
> I think this is a fundamentally bad idea. We should not keep the
> parsed partition state around forever just to work around some
i-segment discards. It calculates the correct discard segment
> count by counting the number of bio as each discard bio is considered its
> own segment.
>
> Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into
> a single request")
> Signed-off-b
On Fri, Feb 05, 2021 at 10:17:06AM +0800, Ming Lei wrote:
> Hi Guys,
>
> The two patches changes block ioctl(BLKRRPART) for avoiding drop &
> re-add partitions if partitions state isn't changed. The current
> behavior confuses userspace because partitions can disapp
In line 824, it is trying to enable `out_ep`, so I
believe that in line 826, it should print `out_ep`
instead of `in_ep`.
Signed-off-by: Wei Ming Chen
---
drivers/usb/gadget/function/f_printer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function
Hi,
I believe there is unexpected RES page accounting when doing
multiple page mappings. The sample code is pasted below. In the
sample code, the same 1G pages are mapped three times, and it is
expected that the process gets 1G RES instead of 3G RES pages (as the
top command shows).
On Fri, Feb 05, 2021 at 08:14:29AM +0100, Christoph Hellwig wrote:
> On Fri, Feb 05, 2021 at 10:17:08AM +0800, Ming Lei wrote:
> > block ioctl(BLKRRPART) always drops current partitions and adds
> > partitions again, even though there isn't any change in partitions table.
>
No functional change; make the code more readable, and prepare for
supporting safe re-reading of partitions.
Cc: Ewan D. Milne
Signed-off-by: Ming Lei
---
block/partitions/core.c | 51 ++---
1 file changed, 37 insertions(+), 14 deletions(-)
diff --git a/block
Hi Guys,
The two patches change block ioctl(BLKRRPART) to avoid dropping &
re-adding partitions if the partition state isn't changed. The current
behavior confuses userspace because partitions can disappear at any time
during ioctl(BLKRRPART).
Ming Lei (2):
block: move partitions check code in
s way may confuse userspace or users; for example, a normal
workable partition device node may disappear at any time.
Fix this issue by checking whether there is a real change in the partition
state, and only drop & re-add partitions when the state has really changed.
Cc: Ewan D. Milne
Signed-off-by:
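The "only drop & re-add when the state changed" idea can be modeled as a comparison of partition tables. The struct and helper below are hypothetical stand-ins: a partition is summarized as a (start, length) pair, and the re-add path would be skipped when the old and new tables match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one partition entry. */
struct part_state {
	unsigned long long start, len;
};

/* Return true only when the two tables genuinely differ. */
static bool parts_changed(const struct part_state *old_tbl,
			  const struct part_state *new_tbl,
			  size_t n_old, size_t n_new)
{
	if (n_old != n_new)
		return true;
	for (size_t i = 0; i < n_old; i++)
		if (old_tbl[i].start != new_tbl[i].start ||
		    old_tbl[i].len != new_tbl[i].len)
			return true;
	return false;
}

/* Sample tables for illustration. */
static const struct part_state tbl_a[2] = { {2048, 1024}, {4096, 2048} };
static const struct part_state tbl_b[2] = { {2048, 1024}, {4096, 4096} };
```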
return nr_phys_segs;
> + }
> + /* fall through */
> case REQ_OP_SECURE_ERASE:
> case REQ_OP_WRITE_ZEROES:
> return 0;
blk_rq_nr_discard_segments() always returns >= 1 segments, so there is no
similar issue in the case of a single-range discard.
Reviewed-by: Ming Lei
And it can be tagged with:
Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a
single request")
--
Ming
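The segment accounting being discussed can be modeled in a few lines. This is an illustrative stand-in for `blk_rq_nr_discard_segments()`: each discard bio counts as its own segment, and the result is never reported as 0, so a single-range discard is 1 segment.

```c
#include <assert.h>

/*
 * Model of blk_rq_nr_discard_segments(): one segment per discard bio,
 * with a floor of 1 so a single-range discard never reports 0 segments.
 * Stand-in code, not the block layer implementation.
 */
static unsigned int nr_discard_segments(unsigned int nr_bios)
{
	return nr_bios > 1 ? nr_bios : 1;
}
```

The floor of 1 is exactly what avoids the multi-range/single-range inconsistency mentioned above.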
On Wed, Feb 03, 2021 at 11:23:37AM -0500, David Jeffery wrote:
> On Wed, Feb 03, 2021 at 10:35:17AM +0800, Ming Lei wrote:
> >
> > On Tue, Feb 02, 2021 at 03:43:55PM -0500, David Jeffery wrote:
> > > The return 0 does seem to be an old relic that does not make sense
>
reason? Maybe it is an arm64-specific issue.
> >
> > BTW, bio_iov_iter_get_pages() just takes ~200us on one x86_64 VM with THP,
> > which is
> > observed via bcc/funclatency when running the following workload:
> >
>
> I think you focused on bio_iov_iter_get_pages() because I just commented page
> merge delay only. Sorry about that. I missed details of this issue.
> Actually there are many operations during while-loop in do_direct_IO().
> Page merge operation is just one among them. Page merge operation is called
> by dio_send_cur_page() in while-loop. Below is call stack.
>
> __bio_try_merge_page+0x4c/0x614
> bio_add_page+0x40/0x12c
> dio_send_cur_page+0x13c/0x374
> submit_page_section+0xb4/0x304
> do_direct_IO+0x3d4/0x854
> do_blockdev_direct_IO+0x488/0xa18
> __blockdev_direct_IO+0x30/0x3c
> f2fs_direct_IO+0x6d0/0xb80
> generic_file_read_iter+0x284/0x45c
> f2fs_file_read_iter+0x3c/0xac
> __vfs_read+0x19c/0x204
> vfs_read+0xa4/0x144
>
> 2ms delay is not only caused by the page merge operation; it includes many
> other operations too. But those operations, including page merge, should
> be executed more if the bio size grows.
OK, got it.
Then I think you can just limit the bio size in dio_bio_add_page() instead of
doing it for all cases.
--
Ming
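The suggestion of limiting bio size in `dio_bio_add_page()` amounts to a simple capacity check before adding a page. The sketch below is hypothetical: the cap value and helper name are assumptions, and only the "refuse to add once the bio would exceed the cap, forcing earlier submission" decision is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed 1MB cap for illustration; the real limit would be a policy
 * choice (or derived from queue limits). */
#define DIO_MAX_BIO_BYTES (1024u * 1024u)

/*
 * Model of the check: a page may be added only while the bio stays
 * under the cap; otherwise the caller submits the current bio and
 * starts a new one.
 */
static bool dio_can_add_page(unsigned int cur_bio_bytes,
			     unsigned int page_bytes)
{
	return cur_bio_bytes + page_bytes <= DIO_MAX_BIO_BYTES;
}
```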
On Tue, Feb 02, 2021 at 03:43:55PM -0500, David Jeffery wrote:
> On Tue, Feb 02, 2021 at 11:33:43AM +0800, Ming Lei wrote:
> >
> > On Mon, Feb 01, 2021 at 11:48:50AM -0500, David Jeffery wrote:
> > > When a stacked block device inserts a request into another
return nr_phys_segs;
> + }
> + /* fall through */
> case REQ_OP_SECURE_ERASE:
REQ_OP_SECURE_ERASE needs to be covered, since the block layer treats
the two in a very similar way from the discard viewpoint.
Also, single-range discard should be fixed too, since the block layer
thinks a single-range discard request has 1 segment. Otherwise, the warning in
virtblk_setup_discard_write_zeroes() may still be triggered, at least.
--
Ming
       767            : 0    |                                        |
     32768 -> 65535   : 0    |                                        |
     65536 -> 131071  : 0    |                                        |
    131072 -> 262143  : 1842 |****************************************|
    262144 -> 524287  : 125  |**                                      |
    524288 -> 1048575 : 6    |                                        |
   1048576 -> 2097151 : 0    |                                        |
   2097152 -> 4194303 : 1    |                                        |
   4194304 -> 8388607 : 0    |                                        |
   8388608 -> 16777215: 1    |                                        |
Detaching...
--
Ming
> [timing diagram elided]
> total 17ms elapsed to complete 32MB read done from device.
Can you tell us whether enabling THP in your application can avoid this
issue? BTW, you need to make the 32MB buffer aligned to the huge page size.
IMO, THP perfectly fits your case.
Thanks,
Ming
t_blocksize() in the I/O path. Did I perhaps overlook something?
>
> I don't know the exact mechanism how this change affects the speed,
> I'm not an expert in the block device subsystem (I'm a networking
> guy). This commit was found by git bisect, and my performance test
> confirmed that reverting it fixes the bug.
>
> It looks to me as this function sets the block size as part of control
> flow, and this size is used later in the fast path, and the commit
> that removed the loop decreased this block size.
Right, the issue is the stupid __block_write_full_page(), which submits a
single bio for each buffer head. And I have tried to improve the situation by
merging BHs into a single bio; see the patch below:
https://lore.kernel.org/linux-block/2020123815.3448707-1-ming@redhat.com/
The above patch should improve perf for your test case.
--
Ming
On Tue, Jan 26, 2021 at 06:26:02AM +0000, Damien Le Moal wrote:
> On 2021/01/26 15:07, Ming Lei wrote:
> > On Tue, Jan 26, 2021 at 04:06:06AM +, Damien Le Moal wrote:
> >> On 2021/01/26 12:58, Ming Lei wrote:
> >>> On Tue, Jan 26, 2021 at 10:32:34AM +0900, Changh
On Tue, Jan 26, 2021 at 04:06:06AM +0000, Damien Le Moal wrote:
> On 2021/01/26 12:58, Ming Lei wrote:
> > On Tue, Jan 26, 2021 at 10:32:34AM +0900, Changheun Lee wrote:
> >> bio size can grow up to 4GB when multi-page bvec is enabled.
> >> but sometimes it would le
bio->bi_opf |= REQ_FUA;
- ret = bio_iov_iter_get_pages(bio, from);
+ ret = bio_iov_iter_get_pages(bio, from, is_sync_kiocb(iocb));
if (unlikely(ret))
goto out_release;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 676870b2c88d..fa3a503b955c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -472,7 +472,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page
*page,
unsigned int len, unsigned int off, bool *same_page);
void __bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int off);
-int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
+int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter, bool sync);
void bio_release_pages(struct bio *bio, bool mark_dirty);
extern void bio_set_pages_dirty(struct bio *bio);
extern void bio_check_pages_dirty(struct bio *bio);
Thanks,
Ming
17), in Taiwan (5 Aug
2019) and Australia (25 Dec 2019 to 9 Jan 2020):
[1] https://tdtemcerts.wordpress.com/
[2] https://tdtemcerts.blogspot.sg/
[3] https://www.scribd.com/user/270125049/Teo-En-Ming
-END EMAIL SIGNATURE-
ze;\
> bio_clone_blkg_association(dst, src); \
> } while (0)
>
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index 866f74261b3b..e5dd5b7d8fc1 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -270,6 +270,7 @@ struct bio {
>*/
>
> unsigned short bi_max_vecs; /* max bvl_vecs we can hold */
> + unsigned int bi_max_size; /* max data size we can hold */
People don't like to extend struct bio, which currently fits exactly in two
cachelines, and adding one 'int' will make it cross three cache lines.
Thanks,
Ming
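The cacheline objection above can be made concrete with stand-in structs: a struct occupying exactly two 64-byte cachelines spills into a third the moment a single `unsigned int` is appended. The structs below only model sizes; they are not the real `struct bio` layout.

```c
#include <stddef.h>

#define CACHELINE_BYTES 64

/* Stand-in for a struct that fills exactly two cachelines. */
struct bio_two_lines { char payload[128]; };

/* Same struct with one extra 'unsigned int' field appended. */
struct bio_grown { char payload[128]; unsigned int bi_max_size; };

/* Number of cachelines an object of size sz touches when aligned. */
static size_t cachelines_used(size_t sz)
{
	return (sz + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}
```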
Subject: How to Block Polycom Root Call IP Addresses in HPE MSR2003
Router
Author: Mr. Turritopsis Dohrnii Teo En Ming (TARGETED INDIVIDUAL)
Country: Singapore
Date: 15 Jan 2021 Friday Singapore Time
Type of Publication: Plain Text
Document version: 20210115.01
DETAILED STEPS
On Thu, Jan 14, 2021 at 11:24:35AM -0600, Brian King wrote:
> On 1/13/21 7:27 PM, Ming Lei wrote:
> > On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote:
> >> On 1/12/21 6:33 PM, Tyrel Datwyler wrote:
> >>> On 1/12/21 2:54 PM, Brian King wrote:
> >&g
On Wed, Jan 13, 2021 at 12:02:44PM +0000, Damien Le Moal wrote:
> On 2021/01/13 20:48, Ming Lei wrote:
> > On Wed, Jan 13, 2021 at 11:16:11AM +, Damien Le Moal wrote:
> >> On 2021/01/13 19:25, Ming Lei wrote:
> >>> On Wed, Jan 13, 2021 at 09:28:02AM +, Damien
t.
Actually it isn't related to commit 6eb045e092ef, because
blk_mq_alloc_tag_set()
uses .can_queue to create the driver tag sbitmap and request pool.
So even without 6eb045e092ef, the updated .can_queue can't work
as expected, because the max driver tag depth has been fixed by blk-mq already.
What 6eb045e092ef does is just remove the double check on the max
host-wide allowed commands, because that is already respected by blk-mq
driver tag allocation.
>
> I started looking through drivers that do this, and so far, it looks like the
> following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely
> others...
>
> We probably need an API that lets us change shost->can_queue dynamically.
I'd suggest confirming that changing .can_queue is a real use case.
Thanks,
Ming
On Wed, Jan 13, 2021 at 11:16:11AM +0000, Damien Le Moal wrote:
> On 2021/01/13 19:25, Ming Lei wrote:
> > On Wed, Jan 13, 2021 at 09:28:02AM +, Damien Le Moal wrote:
> >> On 2021/01/13 18:19, Ming Lei wrote:
> >>> On Wed, Jan 13, 2021 at 12:09 PM Changheun Lee
On Wed, Jan 13, 2021 at 09:28:02AM +0000, Damien Le Moal wrote:
> On 2021/01/13 18:19, Ming Lei wrote:
> > On Wed, Jan 13, 2021 at 12:09 PM Changheun Lee
> > wrote:
> >>
> >>> On 2021/01/12 21:14, Changheun Lee wrote:
> >>>>> On 2021/01/12
So what is the actual total
> >latency difference for the entire 32MB user IO? That is I think what needs
> >to be compared here.
> >
> >Also, what is your device max_sectors_kb and max queue depth ?
> >
>
> 32MB total latency is about 19ms including merge time without this patch.
> But with this patch, total latency is about 17ms including merge time too.
19ms looks too big just for preparing one 32MB sized bio, which isn't supposed
to take so long. Can you investigate where the 19ms goes just in preparing one
32MB sized bio?
It might be iov_iter_get_pages() handling page faults. If yes, one suggestion
is to enable THP (Transparent HugePage Support) in your application.
--
Ming Lei
f19319 ("block: warn if
!__GFP_DIRECT_RECLAIM in bio_crypt_set_ctx()").
I see no difference between the two kernels (fio on null_blk with 224 hw
queues, and 'stress-ng --stackmmap-ops') on one 224-core, dual-socket system.
BTW, this patch itself doesn't touch fast-path code, so it is not supposed
to affect performance.
Can you double-check that the test itself is good?
Note: cf785af19319 is 2b0d3d3e4fcf^
Thanks,
Ming
direct I/O as memory stall */
> bio_clear_flag(bio, BIO_WORKINGSET);
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index d8f9077c43ef..1d30572a8c53 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -444,10 +444,13 @@ static inline void bio_wouldblock_error(struct bio *bio)
>
> /*
> * Calculate number of bvec segments that should be allocated to fit data
> - * pointed by @iter.
> + * pointed by @iter. If @iter is backed by bvec it's going to be reused
> + * instead of allocating a new one.
> */
> static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_segs)
> {
> + if (iov_iter_is_bvec(iter))
> + return 0;
> return iov_iter_npages(iter, max_segs);
> }
>
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
in blk_types.h */
> #include
> +#include
>
> #define BIO_DEBUG
>
> @@ -441,6 +442,15 @@ static inline void bio_wouldblock_error(struct bio *bio)
> bio_endio(bio);
> }
>
> +/*
> + * Calculate number of bvec segments that should be allocated to fit data
> + * pointed by @iter.
> + */
> +static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_segs)
> +{
> + return iov_iter_npages(iter, max_segs);
> +}
> +
> struct request_queue;
>
> extern int submit_bio_wait(struct bio *bio);
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
+ if (iov_iter_is_bvec(i)) {
> + iov_iter_bvec_advance(i, size);
> + return;
> + }
> iterate_and_advance(i, size, v, 0, 0, 0)
> }
> EXPORT_SYMBOL(iov_iter_advance);
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
iov_iter_bvec(&iter, is_write, aio_cmd->bvecs, sgl_nents, len);
>
> aio_cmd->cmd = cmd;
> aio_cmd->len = len;
> @@ -307,8 +301,6 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist
> *sgl, u32 sgl_nents,
> else
> ret = call_read_iter(file, &aio_cmd->iocb, &iter);
>
> - kfree(bvec);
> -
> if (ret != -EIOCBQUEUED)
> cmd_rw_aio_complete(&aio_cmd->iocb, ret, 0);
>
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
; +++ b/fs/direct-io.c
> @@ -426,6 +426,8 @@ static inline void dio_bio_submit(struct dio *dio, struct
> dio_submit *sdio)
> unsigned long flags;
>
> bio->bi_private = dio;
> + /* don't account direct I/O as memory stall */
> + bio_clear_flag(bio, BIO_WORKINGSET);
>
> spin_lock_irqsave(&dio->bio_lock, flags);
> dio->refcount++;
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
continue; \
> (void)(STEP); \
> } \
> }
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
left -= this_len;
> + n++;
> }
>
> iov_iter_bvec(&from, WRITE, array, n, sd.total_len - left);
> --
> 2.24.0
>
Reviewed-by: Ming Lei
--
Ming
On Thu, Jan 07, 2021 at 09:21:11AM +1100, Dave Chinner wrote:
> On Wed, Jan 06, 2021 at 04:45:48PM +0800, Ming Lei wrote:
> > On Tue, Jan 05, 2021 at 07:39:38PM +0100, Christoph Hellwig wrote:
> > > At least for iomap I think this is the wrong approach. Between t
is purpose? But iomap->length
still is still too big in case of xfs.
--
Ming
b_vcnt.bt
Cc: Alexander Viro
Cc: Darrick J. Wong
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: Ming Lei
---
fs/block_dev.c| 1 +
fs/iomap/buffered-io.c| 13 +
include/linux/bio.h | 2 --
include/linux/blk
On Mon, Jan 04, 2021 at 09:44:15AM +0100, Christoph Hellwig wrote:
> On Wed, Dec 30, 2020 at 08:08:15AM +0800, Ming Lei wrote:
> > It is observed that __block_write_full_page() always submit bio with size
> > of block size,
> > which is often 512 bytes.
> >
> >
Manage the bio slab cache via xarray, using the slab cache size as the xarray
index and storing the 'struct bio_slab' instance in the xarray.
The code is simplified a lot, and meanwhile it is more readable than before.
Signed-off-by: Ming Lei
---
block/b
bvec_alloc(), bvec_free() and bvec_nr_vecs() are only used inside block
layer core functions, so there is no need to declare them in a public header.
Signed-off-by: Ming Lei
---
block/blk.h | 4
include/linux/bio.h | 3 ---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/block
This bioset is just for allocating bios from bio_next_split, and it doesn't
need bvecs, so remove the flag.
Cc: linux-bca...@vger.kernel.org
Cc: Coly Li
Signed-off-by: Ming Lei
---
drivers/md/bcache/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/b
bvec_alloc() may allocate more bio vectors than requested, so set .bi_max_vecs
to the actual allocated vector count, instead of the requested number. This
can help filesystems build bigger bios, because a new bio often won't be
allocated until the current one becomes full.
Signed-off-by: Min
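The reason `.bi_max_vecs` can exceed the requested count is that bvec slabs come in fixed bucket sizes, so a request for, say, 5 vectors really receives 16. The bucket values below are assumptions for illustration only, not taken from the block layer.

```c
/*
 * Model of bvec slab bucketing: round a requested vector count up to
 * the smallest bucket that can hold it. Bucket sizes are assumed.
 */
static unsigned int bvec_bucket(unsigned int nr_requested)
{
	static const unsigned int buckets[] = { 1, 4, 16, 64, 128, 256 };
	unsigned int i;

	for (i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		if (nr_requested <= buckets[i])
			return buckets[i];
	return 0;	/* too large for any bucket */
}
```

Recording the bucket size (the actual allocation) in `.bi_max_vecs` is what lets later `bio_add_page()` calls use the spare slots for free.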
q->bio_split is only used by bio_split() for fast-cloning bios, and there is
no need to allocate bvecs, so remove this flag.
Signed-off-by: Ming Lei
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 96e5fcd7f
Hello,
All are bioset / bvec improvements, and most of them are quite
straightforward.
Ming Lei (6):
block: manage bio slab cache by xarray
block: don't pass BIOSET_NEED_BVECS for q->bio_split
block: don't allocate inline bvecs if this bioset needn't bvecs
block: s
The inline bvecs won't be used if the user doesn't need bvecs (by not passing
BIOSET_NEED_BVECS), so don't allocate bvecs in this situation.
Signed-off-by: Ming Lei
---
block/bio.c | 11 +++
include/linux/bio.h | 1 +
2 files changed, 8 insertions(+), 4 deletions(-)
dif
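The change above can be modeled as a conditional front-pad reservation: a bioset only reserves inline bvec slots when it was created with the need-bvecs flag. The `BIO_INLINE_VECS` value below is an assumption for illustration.

```c
#include <stdbool.h>

/* Assumed inline vector count for illustration. */
#define BIO_INLINE_VECS 4

/*
 * Model of the change: no inline bvec slots are reserved in the
 * bioset's front-pad unless the bioset actually needs bvecs.
 */
static unsigned int inline_vecs_for(bool need_bvecs)
{
	return need_bvecs ? BIO_INLINE_VECS : 0;
}
```

Skipping the reservation shrinks every bio allocated from such a bioset, which is the whole point of the patch.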
l.org
Cc: Christoph Hellwig
Cc: Jens Axboe
Signed-off-by: Ming Lei
---
fs/buffer.c | 112 +---
1 file changed, 90 insertions(+), 22 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 32647d2011df..6bcf9ce5d7f8 100644
--- a/fs/buffer.c
+++ b/fs/
On Mon, Dec 28, 2020 at 05:02:50PM +0800, yukuai (C) wrote:
> Hi
>
> On 2020/12/28 16:28, Ming Lei wrote:
> > Another candidate solution may be to always return true from
> > hctx_may_queue()
> > for this kind of queue because queue_depth has provided fair allocation
On Mon, Dec 28, 2020 at 09:56:15AM +0800, yukuai (C) wrote:
> Hi,
>
> On 2020/12/27 19:58, Ming Lei wrote:
> > Hi Yu Kuai,
> >
> > On Sat, Dec 26, 2020 at 06:28:06PM +0800, Yu Kuai wrote:
> > > When sharing a tag set, if most disks are issuing small amount of
mize,
or is it just an improvement in theory? And what is the disk (hdd, ssd
or nvme) and host? And how many disks are in your setup? And how deep is the
tagset depth?
Thanks,
Ming
Subject: Teo En Ming's Guide to Configuring Asterisk/FreePBX with Cisco
7960 IP Phones
Author: Mr. Turritopsis Dohrnii Teo En Ming (TARGETED INDIVIDUAL)
Country: Singapore
Date: 24 December 2020 Thursday Singapore Time
Type of Publication: Plain Text
Document version: 202012
to be split into pairs of entangled
photons.
The theory that so riled Einstein is also referred to as 'spooky action
at a distance'.
Einstein wasn't happy with the theory, because it suggested that information
could travel faster than light.
Mr. Turritopsis Dohrnii Teo En Min
,stripe=32)
> >> /dev/sda1 on /boot/efi type vfat
> >> (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
> > Hi John,
> >
>
> Hi Bart, Ming,
>
> > Thanks for the clarification. I want to take back my
===
# nano RINGLIST.DAT (Create configuration file)
===
FlintPhone FlintPhone.raw
HarpSynth HarpSynth.raw
Jamaica Jamaica.raw
Klaxons Klaxons.raw
KotoEffect KotoEffect.raw
MusicBox MusicBox.raw
Oh
Subject: List of Enterprise Networking Devices Bought on 19 Dec 2020
Saturday and Earlier
Mr. Turritopsis Dohrnii Teo En Ming (42 year old Singapore Targeted
Individual) bought the following enterprise networking devices on 19 Dec
2020 Saturday and earlier.
(1) Refurbished Fortigate 60E
mic_titer_usage_counter;
> +
> struct sbitmap_queue *bitmap_tags;
> struct sbitmap_queue *breserved_tags;
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 8465d7c5ebf0..a61279be0120 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2315,7 +2315,9 @@ void __blk_mq_free_rqs_ext(struct blk_mq_tag_set *set,
> struct blk_mq_tags *tags,
> void blk_mq_free_rqs_ext(struct blk_mq_tag_set *set, struct blk_mq_tags
> *tags,
>unsigned int hctx_idx, struct blk_mq_tags *ref_tags)
> {
> + while (atomic_cmpxchg(&ref_tags->iter_usage_counter, 1, 0) != 1);
> __blk_mq_free_rqs_ext(set, tags, hctx_idx, ref_tags);
> + atomic_set(&ref_tags->iter_usage_counter, 1);
> }
I guess it is simpler to sync the two code paths by adding a mutex to
'ref_tags' and holding it in both __blk_mq_free_rqs_ext() and the above two
iterator helpers.
thanks,
Ming
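The mutex suggestion above can be sketched with pthreads: instead of the `atomic_cmpxchg()` busy-wait in the quoted patch, both the request-freeing path and the tag iterators take the same lock. All names are illustrative stand-ins for the blk-mq helpers.

```c
#include <pthread.h>

/* One lock serializes freeing against iteration (stand-in names). */
static pthread_mutex_t iter_lock = PTHREAD_MUTEX_INITIALIZER;
static int rqs_freed;

/* Stands in for blk_mq_free_rqs_ext(): free under the lock. */
static void free_rqs_sync(void)
{
	pthread_mutex_lock(&iter_lock);	/* excludes concurrent iteration */
	rqs_freed = 1;			/* models __blk_mq_free_rqs_ext() */
	pthread_mutex_unlock(&iter_lock);
}

/* Stands in for a tag iterator: requests cannot be freed under us. */
static int iterate_tags_sync(void)
{
	int saw_freed;

	pthread_mutex_lock(&iter_lock);
	saw_freed = rqs_freed;
	pthread_mutex_unlock(&iter_lock);
	return saw_freed;
}
```

Compared with spinning on `atomic_cmpxchg()`, a mutex blocks instead of burning CPU and makes the exclusion explicit in both code paths.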
On Mon, Dec 14, 2020 at 10:24:22AM -0500, Steven Rostedt wrote:
> On Mon, 14 Dec 2020 10:22:17 +0800
> Ming Lei wrote:
>
> > trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU
> > read lock.
> > So in case of PREEMPT_RT, warning of 'BU