On Thu, Dec 18, 2014 at 04:22:59PM -0800, Andy Lutomirski wrote:
> On Thu, Dec 18, 2014 at 3:30 PM, Andy Lutomirski wrote:
> > On Wed, Dec 17, 2014 at 3:12 PM, Shaohua Li wrote:
> >> This primarily speeds up clock_gettime(CLOCK_THREAD_CPUTIME_ID, ..). We
> >> use the
On Fri, Dec 19, 2014 at 09:53:24AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 19, 2014 at 9:42 AM, Chris Mason wrote:
> >
> >
> > On Fri, Dec 19, 2014 at 11:48 AM, Andy Lutomirski
> > wrote:
> >>
> >> On Fri, Dec 19, 2014 at 3:23 AM, Peter Zijlstra
> >> wrote:
> >>>
> >>> On Thu, Dec 18, 2014
) {
> > memset(512M);
> > madvise(MADV_FREE or MADV_DONTNEED);
> > }
> >
> > 1) dontneed: 6.78user 234.09system 0:48.89elapsed
> > 2) madvfree: 6.03user 401.17system 1:30.67elapsed
> > 3) madvfree + this patch: 5.68user 113.42system 0:36.52elapsed
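The quoted benchmark loop can be sketched in userspace C roughly as follows. This is a minimal sketch, not the test program used in the thread: the function name `touch_and_release` and its parameters are hypothetical, the iteration count is left to the caller because the loop header is truncated above, and `MADV_FREE` is only available where the kernel and libc headers define it.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: dirty an anonymous mapping, then hand the pages
 * back with the given madvise() advice (MADV_DONTNEED, or MADV_FREE where
 * it is defined). Returns 0 on success, -1 on any failure.
 */
int touch_and_release(size_t len, int advice, int iterations)
{
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    for (int i = 0; i < iterations; i++) {
        /* Touch every page so there is something to free. */
        memset(buf, 0xa5, len);
        if (madvise(buf, len, advice) != 0) {
            munmap(buf, len);
            return -1;
        }
    }
    return munmap(buf, len);
}
```

With `len` set to 512 MiB this mirrors the `memset(512M)` step quoted above; timing two runs that differ only in `advice` would give a comparison of the same shape as the numbers listed.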
On Wed, Feb 25, 2015 at 04:11:18PM +0900, Minchan Kim wrote:
> On Wed, Feb 25, 2015 at 09:08:09AM +0900, Minchan Kim wrote:
> > Hi Michal,
> >
> > On Tue, Feb 24, 2015 at 04:43:18PM +0100, Michal Hocko wrote:
> > > On Tue 24-02-15 17:18:14, Minchan Kim wrote:
> > > > Recently, Shaohua reported tha
On Wed, Mar 11, 2015 at 06:19:27PM -0400, Tony Battersby wrote:
> On 03/11/2015 05:45 PM, Jens Axboe wrote:
> > On 03/11/2015 02:15 PM, Tony Battersby wrote:
> >> This reverts commits 12cb5ce101abfaf74421f8cc9f196e708209eb79 and
> >> 98bd4be1ba95f2fe7f543910792b7163a5de06eb.
> >>
> >> Commit 12cb5c
't directly match to ata tag. We use the new flag for
sas ata tag allocation.
Reported-by: Tony Battersby
Signed-off-by: Shaohua Li
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4c35f08..ef150eb 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.
On Thu, Mar 12, 2015 at 09:59:26AM -0400, Tejun Heo wrote:
> On Thu, Mar 12, 2015 at 05:46:01AM -0700, Shaohua Li wrote:
> > ata: Add a new flag for sas controller
> >
> > Add a new flag to distinguish sas controller. sas controller has its own tag
> > allocation, whic
kload here is fsync write a block device. Without plug
merge, sequential write (fsync makes it sync IO) will dispatch 4k IO.
Cc: Jens Axboe
Cc: Christoph Hellwig
Signed-off-by: Shaohua Li
---
block/blk-mq.c | 98 ++
1 file changed, 71 inser
On Fri, Feb 06, 2015 at 02:51:03PM +0900, Minchan Kim wrote:
> Hi Shaohua,
>
> On Thu, Feb 05, 2015 at 04:33:11PM -0800, Shaohua Li wrote:
> >
> > Hi Minchan,
> >
> > Sorry to jump into this thread so late, and apologies if some issues
> > were discussed before.
>
On Fri, Feb 06, 2015 at 01:58:25PM +0100, Michal Hocko wrote:
> On Thu 05-02-15 16:33:11, Shaohua Li wrote:
> [...]
> > Did you think about moving the MADV_FREE pages to the head of inactive LRU, so
> > they can be reclaimed easily?
>
> Yes this makes sense for pages livin
c: Peter Zijlstra
Cc: Andy Lutomirski
Cc: Ingo Molnar
Signed-off-by: Shaohua Li
---
kernel/events/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 04d8b48..98105cf 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5950
: Shaohua Li
---
kernel/events/core.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 19efcf1..04d8b48 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1769,6 +1769,10 @@ event_sched_in(struct perf_event
Hi Minchan,
Sorry to jump into this thread so late, and apologies if some issues were discussed before.
I'm interested in this patch, so I tried it here. I used a simple test with
jemalloc. Obviously this can improve performance when there is no memory
pressure. Did you try a setup with memory pressure?
In my t
On Mon, Feb 09, 2015 at 04:15:53PM +0900, Minchan Kim wrote:
> On Fri, Feb 06, 2015 at 10:29:18AM -0800, Shaohua Li wrote:
> > On Fri, Feb 06, 2015 at 02:51:03PM +0900, Minchan Kim wrote:
> > > Hi Shaohua,
> > >
> > > On Thu, Feb 05, 2015 at 04:33:11PM -0800, S
On Wed, Feb 11, 2015 at 09:56:20AM +0900, Minchan Kim wrote:
> Hi Shaohua,
>
> On Tue, Feb 10, 2015 at 02:38:26PM -0800, Shaohua Li wrote:
> > On Mon, Feb 09, 2015 at 04:15:53PM +0900, Minchan Kim wrote:
> > > On Fri, Feb 06, 2015 at 10:29:18AM -0800, Shaohua Li wrote
Ping!
On Fri, Jan 23, 2015 at 07:57:24AM -0800, Shaohua Li wrote:
> On Fri, Jan 23, 2015 at 09:44:51AM +0100, Peter Zijlstra wrote:
> > On Thu, Jan 22, 2015 at 01:09:02PM -0800, Shaohua Li wrote:
> > > ---
> > > kernel/events/core.c | 3 +++
> >
Currently vdso data is one page. The next patches will add per-cpu data to
the vdso, which requires several pages if the number of CPUs is large. This
patch makes the VDSO data support multiple pages.
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Ingo Molnar
Signed-off-by: Shaohua Li
---
arch/x86/include/asm/vdso.h
an be used to detect if context switch occurs.
Andy suggested we can use a timestamp, so in the next patch we can save some
instructions. But the principle isn't changed here. This patch uses the
timestamp approach.
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Ingo Molnar
Signed-off-by: Shaohua Li
etected on x86_64 context
switch code. Most archs that don't support vsyscalls will have this code
disabled via jump labels.
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Ingo Molnar
Signed-off-by: Kumar Sundararajan
Signed-off-by: Arun Sharma
Signed-off-by: Chris Mason
Signed-off-by: Shaohua Li
On Thu, Feb 26, 2015 at 09:42:06AM +0900, Minchan Kim wrote:
> Hello,
>
> On Wed, Feb 25, 2015 at 10:37:48AM -0800, Shaohua Li wrote:
> > On Wed, Feb 25, 2015 at 04:11:18PM +0900, Minchan Kim wrote:
> > > On Wed, Feb 25, 2015 at 09:08:09AM +0900, Minchan Kim
On Wed, Mar 09, 2016 at 12:58:25PM +1100, Neil Brown wrote:
>
> break_stripe_batch_list breaks up a batch and copies some flags from
> the batch head to the members, preserving others.
>
> It doesn't preserve or copy STRIPE_PREREAD_ACTIVE. This is not
> normally a problem as STRIPE_PREREAD_ACTIV
On Thu, Mar 10, 2016 at 06:19:42AM +1100, Neil Brown wrote:
> On Thu, Mar 10 2016, Shaohua Li wrote:
>
> > On Wed, Mar 09, 2016 at 12:58:25PM +1100, Neil Brown wrote:
> >>
> >> break_stripe_batch_list breaks up a batch and copies some flags from
> >> th
otplug notifier transitions
> to free the scratch buffer.
>
> CC: Shaohua Li
> CC: linux-r...@vger.kernel.org
> Signed-off-by: Anna-Maria Gleixner
Applied, thanks!
nges up to f9a67b1182e5abfcfcec24762ea95a77332f035e:
md/bitmap: clear bitmap if bitmap_create failed (2016-04-01 13:05:50 -0700)
Guoqing Jiang (1):
md/bitmap: clear bitmap if bitmap_create failed
Shaohua Li (1):
MD: add rdev reference
ake_request handle arbitrarily sized bios)
this bug is introduced by d2be537c3ba
> Reported-by: Sebastian Roesner
> Reported-by: Eric Wheeler
> Cc: sta...@vger.kernel.org (4.2+)
> Cc: Shaohua Li
> Signed-off-by: Ming Lei
> ---
> I can reproduce the issue and verify the fix
On Tue, Apr 05, 2016 at 04:27:33PM -0800, Kent Overstreet wrote:
> On Tue, Apr 05, 2016 at 11:27:21AM -0700, Shaohua Li wrote:
> > On Wed, Apr 06, 2016 at 01:44:06AM +0800, Ming Lei wrote:
> > > After arbitrary bio size is supported, the incoming bio may
> > > be very bi
On Tue, Apr 05, 2016 at 04:36:04PM -0800, Kent Overstreet wrote:
> On Tue, Apr 05, 2016 at 05:30:07PM -0700, Shaohua Li wrote:
> > this one:
> > http://marc.info/?l=linux-kernel&m=145926976808760&w=2
>
> Ah. that patch won't actually fix the bug, since md isn't
On Tue, Apr 05, 2016 at 03:36:57PM +0200, Lars Ellenberg wrote:
> blk_check_plugged() will return a pointer
> to an object linked on current->plug->cb_list.
>
> That list may "at any time" be implicitly cleared by
> blk_flush_plug_list()
> flush_plug_callbacks()
> either as a result of blk_finish
On Tue, Apr 05, 2016 at 04:45:55PM -0800, Kent Overstreet wrote:
> On Tue, Apr 05, 2016 at 05:41:47PM -0700, Shaohua Li wrote:
> > On Tue, Apr 05, 2016 at 04:36:04PM -0800, Kent Overstreet wrote:
> > > On Tue, Apr 05, 2016 at 05:30:07PM -0700, Shaohua Li wrote:
> > > >
On Wed, Apr 06, 2016 at 08:47:56AM +0800, Ming Lei wrote:
> On Wed, Apr 6, 2016 at 2:27 AM, Shaohua Li wrote:
> > On Wed, Apr 06, 2016 at 01:44:06AM +0800, Ming Lei wrote:
> >> After arbitrary bio size is supported, the incoming bio may
> >> be very big. We have to sp
On Mon, Apr 18, 2016 at 10:05:22AM -0700, John Stultz wrote:
> On Mon, Apr 11, 2016 at 5:57 PM, Shaohua Li wrote:
> > Calvin found 'perf record -a --call-graph dwarf -- sleep 5' making
> > clocksource
> > switching to hpet. We found similar symptom in another m
On Mon, Apr 18, 2016 at 10:42:38AM -0700, John Stultz wrote:
> On Mon, Apr 18, 2016 at 10:32 AM, Shaohua Li wrote:
> > On Mon, Apr 18, 2016 at 10:05:22AM -0700, John Stultz wrote:
> >> On Mon, Apr 11, 2016 at 5:57 PM, Shaohua Li wrote:
> >> > Calvin found
bio might have NOMERGE flag set, for example blk_queue_split sets it.
When we initiate request, copy this flag too.
Signed-off-by: Shaohua Li
---
include/linux/blk_types.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
https://bugzilla.kernel.org/show_bug.cgi?id=117051
Reported-by: Park Ju Hyung
Fixes: 6ac45aeb6bca(block: avoid to merge splitted bio)
Cc: sta...@vger.kernel.org (v4.3+)
Cc: Ming Lei
Cc: Jens Axboe
Cc: Neil Brown
Signed-off-by: Shaohua Li
---
drivers/md/md.c | 2 ++
1 file changed, 2 insertions(+)
diff
On Tue, Mar 29, 2016 at 02:18:33PM -0700, Christoph Hellwig wrote:
> On Tue, Mar 29, 2016 at 09:42:33AM -0700, Shaohua Li wrote:
> > bio_alloc_bioset() allocates bvecs from bvec_slabs which can only
> > allocate maximum 256 bvec (eg, 1M for 4k pages). We can't bump
> &
On Wed, Mar 30, 2016 at 09:39:35AM +0800, Ming Lei wrote:
> On Wed, Mar 30, 2016 at 12:42 AM, Shaohua Li wrote:
> > bio_alloc_bioset() allocates bvecs from bvec_slabs which can only
> > allocate maximum 256 bvec (eg, 1M for 4k pages). We can't bump
> > BLK_DEF_MAX_SE
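The "256 bvec (eg, 1M for 4k pages)" figure quoted above is plain multiplication. A minimal sketch, assuming each bvec covers exactly one full page (the helper name `max_bio_bytes` is hypothetical and not a kernel function):

```c
#include <stddef.h>

/*
 * Largest payload a bio can carry when every bvec maps one full page.
 * Hypothetical helper for illustration only.
 */
size_t max_bio_bytes(size_t nr_bvecs, size_t page_size)
{
    return nr_bvecs * page_size;
}
```

For 256 bvecs and 4096-byte pages this gives 1048576 bytes, that is 1 MiB, matching the limit mentioned in the quoted text.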
On Tue, Mar 29, 2016 at 11:51:51PM -0700, Christoph Hellwig wrote:
> On Tue, Mar 29, 2016 at 03:01:10PM -0700, Shaohua Li wrote:
> > The problem is bcache allocates a big bio (with bio_alloc). The bio is
> > split with blk_queue_split, but it isn't split to small size because
On Wed, Mar 30, 2016 at 08:13:07PM +0800, Ming Lei wrote:
> Hi Shaohua,
>
> On Wed, Mar 30, 2016 at 10:27 AM, Shaohua Li wrote:
> > On Wed, Mar 30, 2016 at 09:39:35AM +0800, Ming Lei wrote:
> >> On Wed, Mar 30, 2016 at 12:42 AM, Shaohua Li wrote:
> >> > bio
value.
In the relevant machine, the hpet counter doesn't read to 0x later.
The chance hpet has 0x counter is very small, this patch should have no
impact for good hpet.
I'm open to suggestions if there is a better solution.
Reported-by: Calvin Owens
Signed-off-by: Shaohua Li
---
arch/x86/ker
ff-by: Shaohua Li
---
kernel/time/clocksource.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 56ece14..36aff4e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -122,9 +122,10 @@ stati
On Wed, Nov 04, 2015 at 05:05:47PM -0500, Daniel Micay wrote:
> > With enough pages at once, though, munmap would be fine, too.
>
> That implies lots of page faults and zeroing though. The zeroing alone
> is a major performance issue.
>
> There are separate issues with munmap since it ends up res
On Fri, Oct 30, 2015 at 05:02:47PM +0300, Roman Gushchin wrote:
> > Isn't the 4.1 fix just:
> >
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index e5befa356dbe..6e4350a78257 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -3522,16 +3522,16 @@ returnbi:
> >
On Fri, Oct 30, 2015 at 04:01:37PM +0900, Minchan Kim wrote:
> +static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
> + unsigned long end, struct mm_walk *walk)
> +
> +{
> + struct mmu_gather *tlb = walk->private;
> + struct mm_struct *mm = tlb->mm;
On Fri, Oct 30, 2015 at 04:01:41PM +0900, Minchan Kim wrote:
> MADV_FREE is a hint that it's okay to discard pages if there is memory
> pressure and we use reclaimers(ie, kswapd and direct reclaim) to free them
> so there is no value keeping them in the active anonymous LRU so this
> patch moves th
hread+0xf8/0x110
[ 28.020071] [] ?kthread_create_on_node+0x200/0x200
[ 28.020071] [] ret_from_fork+0x3f/0x70
[ 28.020071] [] ?kthread_create_on_node+0x200/0x200
Signed-off-by: Shaohua Li
---
kernel/workqueue.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/k
On Tue, Nov 03, 2015 at 09:52:23AM +0900, Minchan Kim wrote:
> On Fri, Oct 30, 2015 at 10:22:12AM -0700, Shaohua Li wrote:
> > On Fri, Oct 30, 2015 at 04:01:41PM +0900, Minchan Kim wrote:
> > > MADV_FREE is a hint that it's okay to discard pages if there is memory
>
On Wed, Nov 04, 2015 at 09:53:42AM -0800, Shaohua Li wrote:
> On Tue, Nov 03, 2015 at 09:52:23AM +0900, Minchan Kim wrote:
> > On Fri, Oct 30, 2015 at 10:22:12AM -0700, Shaohua Li wrote:
> > > On Fri, Oct 30, 2015 at 04:01:41PM +0900, Minchan Kim wrote:
> > > > MADV
On Wed, Nov 04, 2015 at 10:25:55AM +0900, Minchan Kim wrote:
> Linux doesn't have the ability to free pages lazily, while other OSes already
> support it under the name madvise(MADV_FREE).
>
> The gain is clear that kernel can discard freed pages rather than swapping
> out or OOM if memory press
bad:
md-cluster: release RESYNC lock after the last resync message (2018-08-31 17:38:10 -0700)
Guoqing Jiang (1):
md-cluster: release RESYNC lock after the last resync message
Shaohua Li (1):
md/raid5-cache: disable reshape
Hi,
A few fixes of MD for this merge window. Mostly bug fixes:
- raid5 stripe batch fix from Amy
- Read error handling for raid1 FailFast device from Gioh
- raid10 recovery NULL pointer dereference fix from Guoqing
- Support write hint for raid5 stripe cache from Mariusz
- Fixes for device hot add/
Hi,
3 small fixes for MD:
- md-cluster fix for faulty device from Guoqing
- writehint fix for writebehind IO for raid1 from Mariusz
- a live lock fix for interrupted recovery from Yufen
Please pull!
The following changes since commit f8cf2f16a7c95acce497bfafa90e7c6d8397d653:
Merge branch 'next
On Fri, Nov 18, 2016 at 04:16:11PM +1100, Neil Brown wrote:
> Hi,
>
> I've been sitting on these patches for a while because although they
> solve a real problem, it is a fairly limited use-case, and I don't
> really like some of the details.
>
> So I'm posting them as RFC in the hope that a
On Tue, Nov 22, 2016 at 03:02:53PM -0500, Tejun Heo wrote:
> Hello, Shaohua.
>
> Sorry about the delay.
>
> On Mon, Nov 14, 2016 at 02:22:09PM -0800, Shaohua Li wrote:
> > @@ -1376,11 +1414,37 @@ static ssize_t tg_set_max(struct kernfs_open_file
> > *of,
> >
On Tue, Nov 22, 2016 at 03:16:43PM -0500, Tejun Heo wrote:
> On Mon, Nov 14, 2016 at 02:22:10PM -0800, Shaohua Li wrote:
> > each queue will have a state machine. Initially queue is in LIMIT_HIGH
> > state, which means all cgroups will be throttled according to their high
>
On Tue, Nov 22, 2016 at 04:27:15PM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 14, 2016 at 02:22:14PM -0800, Shaohua Li wrote:
> > throtl_slice is important for blk-throttling. A lot of stuff depends on
> > it, for example, throughput measurement. It has 100ms defaul
On Tue, Nov 22, 2016 at 04:42:00PM -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Nov 22, 2016 at 04:21:21PM -0500, Tejun Heo wrote:
> > 1. A cgroup and its high and max limits don't have much to do with
> >other cgroups and their limits. I don't get how the choice between
> >high and max l
On Wed, Nov 23, 2016 at 04:23:35PM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 14, 2016 at 02:22:16PM -0800, Shaohua Li wrote:
> > cg1/cg2 bps: 10/80 -> 15/105 -> 20/100 -> 25/95 -> 30/90 -> 35/85 -> 40/80
> > -> 45/75 -> 10/80
>
> I wonde
On Wed, Nov 23, 2016 at 04:32:43PM -0500, Tejun Heo wrote:
> On Mon, Nov 14, 2016 at 02:22:18PM -0800, Shaohua Li wrote:
> > Add interface to configure the threshold
> >
> > Signed-off-by: Shaohua Li
> > ---
> > block/blk-sysfs.c| 7 +
On Wed, Nov 23, 2016 at 04:46:19PM -0500, Tejun Heo wrote:
> Hello, Shaohua.
>
> On Mon, Nov 14, 2016 at 02:22:17PM -0800, Shaohua Li wrote:
> > Unfortunately it's very hard to determine if a cgroup is really idle. This
> > patch uses the 'think time check' i
On Fri, Dec 23, 2016 at 12:52:30AM +, Colin King wrote:
> From: Colin Ian King
>
> Trivial fix to spelling mistake "recoverying" to "recovering" in
> pr_dbg message.
applied, thanks
> Signed-off-by: Colin Ian King
> ---
> drivers/md/raid5-cache.c | 2 +-
> 1 file changed, 1 insertion(+),
On Fri, Dec 23, 2016 at 07:25:56PM +0100, MasterPrenium wrote:
> Hello Guys,
>
> I'm having some trouble on a new system I'm setting up. I'm getting a kernel
> BUG message, seems to be related with the use of Xen (when I boot the system
> _without_ Xen, I don't get any crash).
> Here is configu
Khlebnikov ?
Right.
> Do you want the "ext4.dat" fio file ? It will be really difficult for me to
> provide it to you as I've only a poor ADSL network connection.
Not necessary.
Thanks,
Shaohua
> Thanks for your help,
>
> MasterPrenium
>
> On 04/01/2017 at
arc.info/?l=linux-block&m=147916216512915&w=2
V2->V3:
- Rebase
- Fix several bugs
- Make harddisk think time threshold bigger
http://marc.info/?l=linux-kernel&m=147552964708965&w=2
V1->V2:
- Drop io.low interface for simplicity and the interface isn't a must-have to
ff-by: Shaohua Li
---
block/blk-throttle.c | 20 ++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d3ad43c..3bc6deb 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -212,12 +212,28 @@ static s
configuration. Old bps/iops fields in throtl_grp will be the
actual limit we use for throttling.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 142 +--
1 file changed, 114 insertions(+), 28 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk
Last patch introduces a way to detect idle cgroup. We use it to make
upgrade/downgrade decision. And the new algorithm can detect completely
idle cgroup too, so we can delete the corresponding code.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 40
be treated idle and
other cgroups can dispatch more IO.
Currently this latency target check is only for SSD as we can't
calculate the latency target for hard disk. And this is only for cgroup
leaf node so far.
Signed-off-by: Shaohua Li
---
block/blk-
The throtl_slice is 100ms by default. This is a long time for SSD, a lot
of IO can run. To make cgroups have smoother throughput, we choose a
small value (20ms) for SSD.
Signed-off-by: Shaohua Li
---
block/blk-sysfs.c| 2 ++
block/blk-throttle.c | 18 +++---
block/blk.h
'us' with 'ns >> 10'. This is fast but loses
precision, which should not be a big deal.
Signed-off-by: Shaohua Li
---
block/bio.c | 2 ++
block/blk-throttle.c | 79 ++-
block/blk.h | 2 ++
include
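The precision trade-off of replacing microseconds with 'ns >> 10' can be illustrated with a small sketch. The function names below are hypothetical and are not taken from the patch; the shift divides by 1024 rather than 1000, so the result underestimates the true microsecond count by roughly 2.3%.

```c
#include <stdint.h>

/* Exact conversion: one 64-bit division per call. */
uint64_t ns_to_us_exact(uint64_t ns)
{
    return ns / 1000;
}

/* Approximation from the commit message: divide by 1024 with a shift. */
uint64_t ns_to_us_shift(uint64_t ns)
{
    return ns >> 10;
}
```

For example, 1,000,000 ns is exactly 1000 us, while the shifted value is 976. The relative error stays constant, so comparisons between values that are all computed with the shift remain consistent with each other.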
We are going to support low/max limit, each cgroup will have 2 limits
after that. This patch prepares for the multiple limits change.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 110 ---
1 file changed, 70 insertions(+), 40 deletions
ningless. As long as parent's bps/iops (which is a sum of children's
bps/iops) cross low limit, we can upgrade queue state.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 100 ---
1 file changed, 96 insertions(+), 4 deletions(-)
diff --git
other cgroups.
User will configure the interface in this way:
echo "8:16 rbps=2097152 wbps=max latency=100 idle=200" > io.low
latency is in microseconds.
By default, latency target is 0, which means to guarantee IO latency.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c |
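To make the shape of the `io.low` line above concrete, here is a minimal userspace sketch that parses it. This is not the kernel's parser: the struct, the function names, the choice of `UINT64_MAX` to represent `max`, and the zero default for keys that are not given are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed io.low line. */
struct io_low_cfg {
    unsigned int major, minor;
    uint64_t rbps, wbps;        /* UINT64_MAX stands for "max" (assumption) */
    uint64_t latency_us, idle_us;
};

/* Parse a decimal number or the literal "max". Returns 0 on success. */
static int parse_value(const char *s, uint64_t *out)
{
    char *end;

    if (strcmp(s, "max") == 0) {
        *out = UINT64_MAX;
        return 0;
    }
    *out = strtoull(s, &end, 10);
    return (end != s && *end == '\0') ? 0 : -1;
}

/* Parse "MAJ:MIN key=value ..." into cfg. Returns 0 on success, -1 on error. */
int parse_io_low(const char *line, struct io_low_cfg *cfg)
{
    char buf[128];
    char *tok;

    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);

    memset(cfg, 0, sizeof(*cfg));   /* unspecified keys default to 0 */
    tok = strtok(buf, " ");
    if (!tok || sscanf(tok, "%u:%u", &cfg->major, &cfg->minor) != 2)
        return -1;

    while ((tok = strtok(NULL, " ")) != NULL) {
        char *eq = strchr(tok, '=');
        uint64_t v;

        if (!eq)
            return -1;
        *eq = '\0';
        if (parse_value(eq + 1, &v) != 0)
            return -1;
        if (strcmp(tok, "rbps") == 0)
            cfg->rbps = v;
        else if (strcmp(tok, "wbps") == 0)
            cfg->wbps = v;
        else if (strcmp(tok, "latency") == 0)
            cfg->latency_us = v;
        else if (strcmp(tok, "idle") == 0)
            cfg->idle_us = v;
        else
            return -1;              /* unknown key */
    }
    return 0;
}
```

Fed the example line, the sketch yields device 8:16, a read limit of 2097152 bytes per second (2 MiB/s), an unlimited write rate, a latency target of 100 and an idle threshold of 200.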
Add interface to configure the threshold. The io.low interface will
look like:
echo "8:16 rbps=2097152 wbps=max idle=2000" > io.low
idle is in microseconds.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 41 -
1 file changed, 28 inse
When queue state machine is in LIMIT_MAX state, but a cgroup is below
its low limit for some time, the queue should be downgraded to lower
state as one cgroup's low limit isn't met.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 156
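The downgrade rule described above can be sketched as a simple predicate. This is an illustration under stated assumptions, not the code from the patch: the struct fields, the function name `should_downgrade`, and the explicit `threshold_ms` parameter standing in for "some time" are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum limit_state { LIMIT_LOW, LIMIT_MAX };

/* Hypothetical per-cgroup sample used only for this sketch. */
struct cgroup_sample {
    uint64_t bps;           /* currently measured throughput */
    uint64_t low_bps;       /* configured low limit */
    uint64_t below_low_ms;  /* how long bps has stayed under low_bps */
};

/*
 * Return true when the queue is in LIMIT_MAX and at least one cgroup has
 * been under its low limit for threshold_ms or longer.
 */
bool should_downgrade(enum limit_state state,
                      const struct cgroup_sample *cgs, size_t n,
                      uint64_t threshold_ms)
{
    if (state != LIMIT_MAX)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (cgs[i].bps < cgs[i].low_bps &&
            cgs[i].below_low_ms >= threshold_ms)
            return true;
    }
    return false;
}
```

A single cgroup missing its low limit is enough to trigger the downgrade, which matches the reasoning in the commit message that the queue should drop to the lower state as soon as one cgroup's low limit isn't met.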
the sysfs name
'throttle_sample_time' reflects its character better.
Signed-off-by: Shaohua Li
---
Documentation/block/queue-sysfs.txt | 6 +++
block/blk-sysfs.c | 10 +
block/blk-throttle.c| 77 ++---
block/blk.h
clean up the code to avoid using -1
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 32
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a6bb4fe..e45bf50 100644
--- a/block/blk-throttle.c
, which is still a very long time.
Signed-off-by: Shaohua Li
---
block/blk-core.c | 2 +-
block/blk-mq.c| 2 +-
block/blk-stat.c | 7 ---
block/blk-stat.h | 29 +++--
block/blk-wbt.h | 10 +-
include/linux
If the scale becomes 0, we then
fully downgrade the queue to LIMIT_LOW state.
Note this doesn't completely avoid cgroup running under its low limit.
The best way to guarantee cgroup doesn't run under its limit is to set
max limit. For example, if we set cg1 max limit to 40, cg2 will nev
ead of request size. Currently this feature is SSD only, we probably
can use a fixed threshold like 4ms for hard disk though.
Signed-off-by: Shaohua Li
---
block/blk-stat.c | 4 ++
block/blk-throttle.c | 162 --
block/blk.h
roup sleep time not too big wouldn't change cgroup
bps/iops, but could make it wakeup more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 11 +++
1 file changed, 11 insertions(+)
diff --git
roups, so I
leave it here.
Signed-off-by: Shaohua Li
---
block/blk-throttle.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 2d05c91..b3ce176 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@
On Tue, Jan 31, 2017 at 01:59:49PM -0500, Johannes Weiner wrote:
> Hi Shaohua,
>
> On Sun, Jan 29, 2017 at 09:51:17PM -0800, Shaohua Li wrote:
> > We are trying to use MADV_FREE in jemalloc. Several issues are found.
> > Without
> > solving the issues, jemalloc can't
blk_mq_tags/requests of specific hardware queue are mostly used in
specific cpus, which might not be in the same numa node as disk. For
example, an nvme card is in node 0. Half of the hardware queues will be used by
node 0, the other half by node 1.
Signed-off-by: Shaohua Li
---
block/blk-mq.c | 14
nvme_queue is a per-cpu queue (mostly). Allocate it on the node where blk-mq
will use it.
Signed-off-by: Shaohua Li
---
drivers/nvme/host/pci.c | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 3faefab
On Tue, Jan 31, 2017 at 04:38:10PM -0500, Johannes Weiner wrote:
> On Tue, Jan 31, 2017 at 11:45:47AM -0800, Shaohua Li wrote:
> > On Tue, Jan 31, 2017 at 01:59:49PM -0500, Johannes Weiner wrote:
> > > Hi Shaohua,
> > >
> > > On Sun, Jan 29, 2017 at 09:51:17PM
nvme_queue is a per-cpu queue (mostly). Allocate it on the node where blk-mq
will use it.
Signed-off-by: Shaohua Li
---
drivers/nvme/host/pci.c | 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 032237c..9733008
Next patch will use the API to get the node from vector for nvme device
Signed-off-by: Shaohua Li
---
drivers/pci/msi.c | 16
include/linux/pci.h | 6 ++
2 files changed, 22 insertions(+)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 50c5003..ab7aee7 100644
blk_mq_tags/requests of specific hardware queue are mostly used in
specific cpus, which might not be in the same numa node as disk. For
example, an nvme card is in node 0. Half of the hardware queues will be used by
node 0, the other half by node 1.
Signed-off-by: Shaohua Li
---
block/blk-mq.c | 21
is
more natural.
Cc: Fenghua Yu
Cc: Thomas Gleixner
Signed-off-by: Shaohua Li
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 8af04af..7e81527 100644
-
On Mon, Jan 09, 2017 at 11:03:59PM +0100, Thomas Gleixner wrote:
> On Mon, 9 Jan 2017, Fenghua Yu wrote:
> > On Fri, Jan 06, 2017 at 04:05:19PM -0800, Shaohua Li wrote:
> > But since you come here now, I would think resetting the CBM in
> > closid_free() is better.
>
>
On Mon, Jan 09, 2017 at 04:46:35PM -0500, Tejun Heo wrote:
> Hello,
>
> Sorry about the long delay. Generally looks good to me. Overall,
> there are only a few things that I think should be addressed.
Thanks for your time!
> * Low limit should default to zero.
I forgot to change it after cha
On Sun, Jan 08, 2017 at 02:31:15PM +0100, MasterPrenium wrote:
> Hello,
>
> Replies below + :
> - I don't know if this can help but after the crash, when the system
> reboots, the Raid 5 stack is re-synchronizing
> [ 37.028239] md10: Warning: Device sdc1 is misaligned
> [ 37.028541] created bi
On Fri, Jan 20, 2017 at 10:29:52PM +0800, Geliang Tang wrote:
> Since i_blocksize() helper has been defined in fs.h, use it instead
> of open-coding.
which tree is this patch applied to? I can't find it in Linus's tree
> Signed-off-by: Geliang Tang
> ---
> drivers/md/bitmap.c | 6 +++---
> 1 fi
4 11:26:06 -0800)
--------
Shaohua Li (1):
md/raid5-cache: delete meaningless code
Song Liu (5):
md/r5cache: read data into orig_page for prexor of cached data
md/raid5: move comment of fetch_block to right location
E page can be promoted to
active page there. But there isn't mm_struct context at that place. Iterating
vma there sounds too silly. The patchset doesn't fix this issue yet. Hopefully
somebody can share a hint how to fix this issue.
Thanks,
Shaohua
Minchan previous patches:
http://marc
n normal way.
Cc: Michal Hocko
Cc: Minchan Kim
Cc: Hugh Dickins
Cc: Johannes Weiner
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Shaohua Li
---
mm/rmap.c | 7 ++-
mm/vmscan.c | 56
2 files changed, 54 insertions(+), 9 deleti
ickins
Cc: Johannes Weiner
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Shaohua Li
---
drivers/base/node.c | 2 ++
drivers/staging/android/lowmemorykiller.c | 3 ++-
fs/proc/meminfo.c | 1 +
include/linux/mm_inline.h
m
Cc: Hugh Dickins
Cc: Johannes Weiner
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Shaohua Li
---
fs/proc/task_mmu.c | 8 +++-
include/linux/mm_inline.h | 5 +
include/linux/page-flags.h | 6 ++
mm/huge_memory.c | 1 +
mm/migrate.c | 2 ++
5
: Mel Gorman
Signed-off-by: Shaohua Li
---
include/linux/swap.h | 2 +-
mm/huge_memory.c | 5 ++---
mm/madvise.c | 3 +--
mm/swap.c| 51 +--
4 files changed, 33 insertions(+), 28 deletions(-)
diff --git a/include/linux/swa