On 4/18/21 11:09 PM, Junxiao Bi wrote:
- finish_wait(&rqw->wait, &data.wq);
+ mutex_lock(&rqw->throttle_mutex);
+ wait_event(rqw->wait, acquire_inflight_cb(rqw, private_data));
+ mutex_unlock(&rqw->throttle_mutex);
This will break the throttle? There
On 4/18/21 5:33 AM, Hillf Danton wrote:
On Sat, 17 Apr 2021 14:37:57 Junxiao Bi wrote:
On 4/17/21 3:10 AM, Hillf Danton wrote:
+ if (acquire_inflight_cb(rqw, private_data))
This function is to increase atomic variable rq_wait->inflight.
You are right.
What's the mutex
On 4/17/21 3:10 AM, Hillf Danton wrote:
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -260,19 +260,17 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
.cb = acquire_inflight_cb,
.private_data = private_data,
};
- bool has_sleeper;
On 4/14/21 9:11 PM, Hillf Danton wrote:
On Wed, 14 Apr 2021 14:18:30 Junxiao Bi wrote:
There is a race bug which can cause an io hang when multiple processes
run in parallel in rq_qos_wait().
Let's assume there are 4 processes P1/P2/P3/P4; P1/P2 were at the entry
of rq_qos_wait, and P3/P4 were
= true;
set_current_state(TASK_UNINTERRUPTIBLE);
} while (1);
finish_wait(&rqw->wait, &data.wq);
}
Cc: sta...@vger.kernel.org
Signed-off-by: Junxiao Bi
---
block/blk-rq-qos.c | 9 +++--
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/block/blk-rq-qos.c b/
On 3/24/21 5:37 PM, Ming Lei wrote:
On Wed, Mar 24, 2021 at 12:37:03PM +, Gulam Mohamed wrote:
Hi All,
We are facing a stale link (of the device) issue during the iscsi-logout
process if we use the parted command just before the iscsi logout. Here are
the details:
As part of
.
Thanks,
Junxiao.
On 12/14/20 3:10 PM, Junxiao Bi wrote:
On 12/13/20 11:43 PM, Konstantin Khlebnikov wrote:
On Sun, Dec 13, 2020 at 9:52 PM Junxiao Bi <junxiao...@oracle.com> wrote:
On 12/11/20 11:32 PM, Konstantin Khlebnikov wrote:
> On Thu, Dec 10, 2020 at 2:01 A
Hi Konstantin,
We tested this patch set recently and found it limits negative dentries
to a small part of total memory. We also don't see any performance
regression with it. Do you have any plan to integrate it into mainline? It
will help a lot with the memory fragmentation issue caused by dentry
This issue has been fixed. I sent the following patch in another thread.
Please take a look. Thank you.
[PATCH] md: get sysfs entry after redundancy attr group create
Thanks,
Junxiao.
On 8/3/20 9:00 AM, Junxiao Bi wrote:
Hi Song,
I am working on setting up an env to reproduce, will update soon
"sync_completed" and "degraded" belongs to redundancy attr group,
it was not exist yet when md device was created.
Reported-by: kernel test robot
Fixes: e1a86dbbbd6a ("md: fix deadlock causing by sysfs_notify")
Signed-off-by: Junxiao Bi
---
drivers/md/md.c
Hi Song,
I am working on setting up an env to reproduce, will update soon.
Thanks,
Junxiao.
On 8/2/20 10:52 PM, Song Liu wrote:
On Jul 29, 2020, at 2:04 AM, kernel test robot wrote:
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: e1a86dbbbd6a77f73c3d099030495fa31f1
fffc01f0375 [raid1]
#12 [b87c4df07ea0] md_thread at 9a680348
#13 [b87c4df07f08] kthread at 9a0b8005
#14 [b87c4df07f50] ret_from_fork at 9aa00344
Signed-off-by: Junxiao Bi
---
v2 <- v1
- fix sysfs_notify for sysfs file 'level' to align
On 7/14/20 9:18 AM, Song Liu wrote:
On Mon, Jul 13, 2020 at 11:41 PM Junxiao Bi wrote:
On 7/13/20 11:17 PM, Song Liu wrote:
On Thu, Jul 9, 2020 at 4:36 PM Junxiao Bi wrote:
The following deadlock was captured. The first process is holding 'kernfs_mutex'
and hung by io. The io w
On 7/13/20 11:17 PM, Song Liu wrote:
On Thu, Jul 9, 2020 at 4:36 PM Junxiao Bi wrote:
The following deadlock was captured. The first process is holding 'kernfs_mutex'
and hung by io. The io was staged in 'r1conf.pending_bio_list' of raid1 device,
this pending bio list
io. The io would be flushed from
raid1d()->flush_pending_writes() by process 'md127_raid1', but it was
hung by 'kernfs_mutex' in md_check_recovery()->md_update_sb() before
flush_pending_writes().
Thanks,
Junxiao.
On 7/9/20 4:35 PM, Junxiao Bi wrote:
The follow
fffc01f0375 [raid1]
#12 [b87c4df07ea0] md_thread at 9a680348
#13 [b87c4df07f08] kthread at 9a0b8005
#14 [b87c4df07f50] ret_from_fork at ffff9aa00344
Cc: sta...@vger.kernel.org
Signed-off-by: Junxiao Bi
---
drivers/md/md-bitmap.c | 2 +-
drivers/md/md.c
On 7/2/20 3:24 PM, Linus Torvalds wrote:
On Thu, Jul 2, 2020 at 2:17 PM Pavel Machek wrote:
commit 4cd9973f9ff69e37dd0ba2bd6e6423f8179c329a upstream.
Patch series "ocfs2: fix nfsd over ocfs2 issues", v2.
This causes locking imbalance:
This seems to be true upstream too.
When ocfs2_nfs_sy
On 6/22/20 5:47 PM, Matthew Wilcox wrote:
On Sun, Jun 21, 2020 at 10:15:39PM -0700, Junxiao Bi wrote:
On 6/20/20 9:27 AM, Matthew Wilcox wrote:
On Fri, Jun 19, 2020 at 05:42:45PM -0500, Eric W. Biederman wrote:
Junxiao Bi writes:
Still high lock contention. Collect the following hot path
On 6/22/20 8:20 AM, ebied...@xmission.com wrote:
If I understand correctly, the Java VM is not exiting. Just some of
its threads.
That is a very different problem to deal with. There are many
optimizations that are possible when _all_ of the threads are exiting
that are not possible when _many
On 6/20/20 9:27 AM, Matthew Wilcox wrote:
On Fri, Jun 19, 2020 at 05:42:45PM -0500, Eric W. Biederman wrote:
Junxiao Bi writes:
Still high lock contention. Collect the following hot path.
A different location this time.
I know of at least exit_signal and exit_notify that take thread wide
On 6/19/20 10:24 AM, ebied...@xmission.com wrote:
Junxiao Bi writes:
Hi Eric,
The patch didn't improve lock contention.
Which raises the question where is the lock contention coming from.
Especially with my first variant. Only the last thread to be reaped
would free up anything i
[k]
_raw_spin_lock_irqsave
Thanks,
Junxiao.
On 6/19/20 7:09 AM, ebied...@xmission.com wrote:
Junxiao Bi reported:
When debugging some performance issue, i found that thousands of threads exit
around same time could cause a severe spin lock contention on proc dentry
"/proc/$parent_proces
On 6/18/20 5:02 PM, ebied...@xmission.com wrote:
Matthew Wilcox writes:
On Thu, Jun 18, 2020 at 03:17:33PM -0700, Junxiao Bi wrote:
When debugging some performance issue, i found that thousands of threads
exit around same time could cause a severe spin lock contention on proc
dentry "
Hi,
When debugging some performance issue, I found that thousands of threads
exiting around the same time could cause severe spin lock contention on the
proc dentry "/proc/$parent_process_pid/task/"; that's because threads need
to clean up their pid files from that dir when they exit. Check the following
s
Could anybody help review this bug?
Thanks,
Junxiao.
On 8/5/19 1:01 PM, Junxiao Bi wrote:
When md raid1 is used with imsm metadata, during the boot stage,
the raid device will first be set to read-only, then mdmon will set
it read-write later. When there are some partitions in this device
While loading the fw crashdump in function fw_crash_buffer_show(), the
bytes left in one dma chunk were not checked; if the copying size goes
beyond them, the out-of-bounds access will cause a kernel panic.
Signed-off-by: Junxiao Bi
---
drivers/scsi/megaraid/megaraid_sas_base.c | 3 +++
1 file changed, 3 insertions(+)
diff
e, including permission check, (get|set)_(acl|attr), and
> the gfs2 code also do so.
>
> Changes since v1:
> - Let ocfs2_is_locked_by_me() just return true/false to indicate if the
> process gets the cluster lock - suggested by: Joseph Qi
> and Junxiao Bi .
>
> - Chang
he previous patch) for
> these funcs above, ocfs2_permission(), ocfs2_iop_[set|get]_acl(),
> ocfs2_setattr().
>
> Changes since v1:
> - Let ocfs2_is_locked_by_me() just return true/false to indicate if the
> process gets the cluster lock - suggested by: Joseph Qi
> and Junxiao
he previous patch) for
> these funcs above, ocfs2_permission(), ocfs2_iop_[set|get]_acl(),
> ocfs2_setattr().
>
> Changes since v1:
> 1. Let ocfs2_is_locked_by_me() just return true/false to indicate if the
> process gets the cluster lock - suggested by: Joseph Qi
> and Junx
On 01/16/2017 11:06 AM, Eric Ren wrote:
> Hi Junxiao,
>
> On 01/16/2017 10:46 AM, Junxiao Bi wrote:
>>>> If had_lock==true, it is a bug? I think we should BUG_ON for it, that
>>>> can help us catch bug at the first time.
>>> Good idea! But I'm not
On 01/13/2017 02:19 PM, Eric Ren wrote:
> Hi!
>
> On 01/13/2017 12:22 PM, Junxiao Bi wrote:
>> On 01/05/2017 11:31 PM, Eric Ren wrote:
>>> Commit 743b5f1434f5 ("ocfs2: take inode lock in
>>> ocfs2_iop_set/get_acl()")
>>> results in a de
On 01/13/2017 02:12 PM, Eric Ren wrote:
> Hi Junxiao!
>
> On 01/13/2017 11:59 AM, Junxiao Bi wrote:
>> On 01/05/2017 11:31 PM, Eric Ren wrote:
>>> We are in the situation that we have to avoid recursive cluster locking,
>>> but there is no way to check if a
On 01/05/2017 11:31 PM, Eric Ren wrote:
> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> results in a deadlock, as the author "Tariq Saeed" realized shortly
> after the patch was merged. The discussion happened here
> (https://oss.oracle.com/pipermail/ocfs2-devel/2015-S
On 01/05/2017 11:31 PM, Eric Ren wrote:
> We are in the situation that we have to avoid recursive cluster locking,
> but there is no way to check if a cluster lock has been taken by a
> process already.
>
> Mostly, we can avoid recursive locking by writing code carefully.
> However, we found that
Hi,
The following panic is triggered when run ocfs2 xattr test on
linux-next-20160225. Did anybody ever see this?
[ 254.604228] BUG: unable to handle kernel paging request at
0002000800c0
[ 254.605013] IP: [] kmem_cache_alloc+0x78/0x160
[ 254.605013] PGD 7bbe5067 PUD 0
[ 254.605013] Oops:
On 12/23/2015 04:59 PM, Florian Westphal wrote:
> Junxiao Bi wrote:
>> The following panic happened when I run ocfs2-test on linux-next. Kernel
>> config is attached.
>>
>> [64910.905501] BUG: unable to handle kernel NULL pointer dereference at
>>
On 11/25/2015 01:04 PM, Gang He wrote:
> Hi Mark and Junxiao,
>
>
>>>>
>> Hi Mark,
>>
>> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>>> Hi Junxiao,
>>>
>>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>>&
Hi Gang,
On 11/25/2015 11:29 AM, Gang He wrote:
> Hi Mark and Junxiao,
>
>
>>>>
>> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> On 11/03/2015 03:54 PM, Gang He wrote:
>>>> Hi Junxiao,
>>
On 11/25/2015 05:46 AM, Mark Fasheh wrote:
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> On 11/03/2015 03:54 PM, Gang He wrote:
>>> Hi Junxiao,
>>>
>>> Thank for your reviewing.
>>> Current design, we use a
Hi Mark,
On 11/25/2015 06:16 AM, Mark Fasheh wrote:
> Hi Junxiao,
>
> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
&
On 11/03/2015 04:47 PM, Gang He wrote:
>
>
>
>> On 11/03/2015 04:15 PM, Gang He wrote:
>>> Hello Junxiao,
>>>
>>> See my comments inline.
>>>
>>>
>>
Hi Gang,
This does not look like a right patch.
First, online file check only checks inode's block number, valid flag,
f
On 11/03/2015 04:15 PM, Gang He wrote:
> Hello Junxiao,
>
> See my comments inline.
>
>
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened
Hi Gang,
On 11/03/2015 03:54 PM, Gang He wrote:
> Hi Junxiao,
>
> Thank for your reviewing.
> Current design, we use a sysfile as a interface to check/fix a file (via pass
> a ino number).
> But, this operation is manually triggered by user, instead of automatically
> fix in the kernel.
> Why?
Hi Gang,
I didn't see a need to add a sysfs file for the check and repair. This
leaves a hard problem for the customer to decide: how do they decide whether
they should repair the bad inode, since this may cause even worse
corruption?
I think the error should be fixed by this feature automatically if repa
Hi Gang,
This does not look like a right patch.
First, online file check only checks the inode's block number, valid flag,
fs generation value, and meta ecc. I have never seen a real corruption
happen only on these fields; if these fields are corrupted, that means
something bad may have happened in other places. So fix th
O in the direct reclaim path.
v1 thread at:
https://lkml.org/lkml/2014/9/3/32
v2 changes:
patch log update to make the issue more clear.
Signed-off-by: Junxiao Bi
Cc: Dave Chinner
Cc: joyce.xue
Cc: Ming Lei
Cc:
---
include/linux/sched.h |6 --
1 file changed, 4 insertions(+), 2 del
On 09/05/2014 10:32 AM, Junxiao Bi wrote:
> On 09/04/2014 05:23 PM, Dave Chinner wrote:
>> On Wed, Sep 03, 2014 at 01:54:54PM +0800, Junxiao Bi wrote:
>>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
>>> during memory allocation")
On 09/04/2014 05:23 PM, Dave Chinner wrote:
> On Wed, Sep 03, 2014 at 01:54:54PM +0800, Junxiao Bi wrote:
>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
>> during memory allocation")
>> introduces PF_MEMALLOC_NOIO flag to avoid doing I
On 09/04/2014 10:30 AM, Andrew Morton wrote:
> On Thu, 04 Sep 2014 10:08:09 +0800 Junxiao Bi wrote:
>
>> On 09/04/2014 07:10 AM, Andrew Morton wrote:
>>> On Wed, 3 Sep 2014 13:54:54 +0800 Junxiao Bi wrote:
>>>
>>>> commit 21caf2fc1931 ("mm:
On 09/03/2014 08:20 PM, Trond Myklebust wrote:
> On Wed, Sep 3, 2014 at 1:54 AM, Junxiao Bi wrote:
>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
>> during memory allocation")
>> introduces PF_MEMALLOC_NOIO flag to avoid doing I
On 09/04/2014 07:10 AM, Andrew Morton wrote:
> On Wed, 3 Sep 2014 13:54:54 +0800 Junxiao Bi wrote:
>
>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
>> during memory allocation")
>> introduces PF_MEMALLOC_NOIO flag to avoid doin
may still
run into I/O, like in superblock shrinker.
Signed-off-by: Junxiao Bi
Cc: joyce.xue
Cc: Ming Lei
---
include/linux/sched.h |6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..2fb2c47 100644
--- a/inc
On 09/03/2014 11:10 AM, Dave Chinner wrote:
> On Wed, Sep 03, 2014 at 09:38:31AM +0800, Junxiao Bi wrote:
>> Hi Jiufei,
>>
>> On 09/02/2014 05:03 PM, Xue jiufei wrote:
>>> Hi, Dave
>>> On 2014/9/2 7:51, Dave Chinner wrote:
>>>> On Fri, Aug 29, 20
Hi Jiufei,
On 09/02/2014 05:03 PM, Xue jiufei wrote:
> Hi, Dave
> On 2014/9/2 7:51, Dave Chinner wrote:
>> On Fri, Aug 29, 2014 at 05:57:22PM +0800, Xue jiufei wrote:
> The patch tries to solve one deadlock problem caused by cluster
>>> fs, like ocfs2. And the problem may happen at least in the b
---[ end trace b09ff97496363201 ]---
Signed-off-by: Junxiao Bi
---
block/blk-merge.c | 29 +++--
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index b3bf0df..ae4f4c8 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge
On 06/27/2014 03:24 PM, Junxiao Bi wrote:
> This uint overflow will cause req->__data_len < req->bio->bi_size,
> which will confuse the block layer and device driver.
>
> I watched a panic caused by this when mkfs.ext4 a volume of a large
> virtual disk on vm guest, blkdev_is
gt;__data_len is less
than req->bio->bi_size.
Signed-off-by: Junxiao Bi
---
block/blk-merge.c | 40 ++--
1 file changed, 34 insertions(+), 6 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index b3bf0df..340c0a7 100644
--- a/block
On 06/10/2014 11:12 AM, Jens Axboe wrote:
> On 2014-06-09 20:50, Junxiao Bi wrote:
>> On 06/10/2014 10:41 AM, Jens Axboe wrote:
>>> On 2014-06-09 20:31, Junxiao Bi wrote:
>>>> commit 7b5a3522 (loop: Limit the number of requests in the bio list)
>>>> limi
On 06/10/2014 10:41 AM, Jens Axboe wrote:
> On 2014-06-09 20:31, Junxiao Bi wrote:
>> commit 7b5a3522 (loop: Limit the number of requests in the bio list)
>> limit
>> the request number in loop queue to not over 128. Since the
>> "request_fn" of
>> loop
Signed-off-by: Junxiao Bi
---
block/blk-core.c |6 ++
block/blk-sysfs.c |9 +++--
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 40d6548..58c4bd4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -851,6 +
On 06/09/2014 11:53 PM, Jens Axboe wrote:
> On 2014-06-09 01:29, Andreas Mohr wrote:
>> Hi,
>>
>> having had a look at current mainline sources,
>> frankly I've (well, initially...) got trouble understanding
>> what this patch is doing.
>>
>> It's replacing an aggressive error-type bail-out (-EINVA
mnt]# cat /sys/block/loop0/queue/nr_requests
1024
[root@bijx mnt]# dd if=/dev/zero of=/dev/loop0 bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 464.481 s, 11.3 MB/s
Signed-off-by: Junxiao Bi
---
block/blk-sysfs.c |8 +---
1 file changed, 5 inserti
Hi All,
Through an iperf test between two 1000M NICs, I got very different cpu
usage in server mode when binding iperf to cpu 0 versus binding to another
cpu; the cpu usage when bound to cpu 0 is about two times higher than
the other, but the network bandwidth is nearly the same. My NIC
doesn't support multi-