On Thu, Apr 25, 2013 at 03:00:57PM -0700, Andrew Morton wrote:
> On Mon, 22 Apr 2013 17:26:28 +0900 Joonsoo Kim wrote:
>
> > We can get virtual address without virtual field.
> > So remove it.
> >
> > ...
> >
> > --- a/mm/highmem.c
> > +++ b/mm
On Mon, Mar 25, 2013 at 02:32:35PM -0400, Steven Rostedt wrote:
> On Mon, 2013-03-25 at 18:27 +, Christoph Lameter wrote:
> > On Mon, 25 Mar 2013, Steven Rostedt wrote:
> >
> > > If this makes it more deterministic, and lower worse case latencies,
> > > then it's definitely worth the price.
>
On Tue, Mar 26, 2013 at 11:30:32PM -0400, Steven Rostedt wrote:
> On Wed, 2013-03-27 at 11:59 +0900, Joonsoo Kim wrote:
>
> > How about using spin_try_lock() in unfreeze_partials() and
> > using spin_lock_contented() in get_partial_node() to reduce latency?
> > IMHO, thi
Remove one division operation in find_busiest_queue().
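As a rough, hypothetical illustration of the general technique (not
necessarily the exact change in the hunk below), a division inside a
comparison can be replaced by cross-multiplication; the variable names
here are made up for the example:

#include <stdio.h>

int main(void)
{
	/* candidate rq vs. current busiest rq; the values are arbitrary */
	unsigned long wl = 3072, power = 1024;
	unsigned long busiest_wl = 2048, busiest_power = 512;

	/* with a division:    wl / power         > busiest_wl / busiest_power */
	/* without a division: wl * busiest_power > busiest_wl * power         */
	if (wl * busiest_power > busiest_wl * power)
		puts("candidate becomes the busiest queue");
	else
		puts("keep the current busiest queue");

	return 0;
}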
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6f238d2..1d8774f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4911,7 +4911,7 @@ static struct rq *find_busiest_queue(struct lb_env *env
ity is ensured.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 204a9a9..e232421 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -631,23 +631,20 @@ static u64 __sched_period(unsigned long nr_running)
*/
static u64 sched_slice(struct cfs_rq *
d-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 95ec757..204a9a9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4175,36 +4175,6 @@ static unsigned long task_h_load(struct task_struct *p)
/** Helpers for find_busiest_
nice -20 is sysctl_sched_min_granularity * 10 * (88761 / 97977),
that is, approximately sysctl_sched_min_granularity * 9. This value
can be much larger if there are more tasks with nice 0.
So we should limit this possible weird situation.
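As a back-of-the-envelope check of the numbers above (a standalone
sketch assuming the usual prio_to_weight values of 88761 for nice -20
and 1024 for nice 0, with one nice -20 task sharing a cfs_rq with nine
nice 0 tasks; this is an illustration, not kernel code):

#include <stdio.h>

int main(void)
{
	double w_m20  = 88761.0;             /* weight of a nice -20 task */
	double w_0    = 1024.0;              /* weight of a nice 0 task   */
	double total  = w_m20 + 9 * w_0;     /* 97977 for the 10 tasks    */

	/* with 10 runnable tasks the period stretches to roughly
	 * 10 * sysctl_sched_min_granularity, and the nice -20 task
	 * gets its weight's share of that period */
	double slice = 10.0 * w_m20 / total; /* in min_granularity units */

	printf("slice ~= %.2f * sysctl_sched_min_granularity\n", slice); /* ~9.06 */
	return 0;
}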
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched
   text    data     bss     dec     hex filename
  34243    1136     116   35495    8aa7 kernel/sched/fair.o
In addition, rename @balance to @should_balance in order to represent
its purpose more clearly.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1d8774f..95ec757
sched_slice()
Feel free to comment on this patchset.
It's based on v3.9-rc4, on top of my previous patchset. But, perhaps,
it may not really depend on my previous patchset. :)
https://lkml.org/lkml/2013/3/26/28
"[PATCH v2 0/6] correct load_balance()"
Thanks.
Joonsoo Kim (5):
sc
Hello, Preeti.
On Fri, Mar 29, 2013 at 12:42:53PM +0530, Preeti U Murthy wrote:
> Hi Joonsoo,
>
> On 03/28/2013 01:28 PM, Joonsoo Kim wrote:
> > Following-up upper se in sched_slice() should not be done,
> > because sched_slice() is used for checking that resched is nee
Hello Preeti.
On Fri, Mar 29, 2013 at 05:05:37PM +0530, Preeti U Murthy wrote:
> Hi Joonsoo
>
> On 03/28/2013 01:28 PM, Joonsoo Kim wrote:
> > sched_slice() compute ideal runtime slice. If there are many tasks
> > in cfs_rq, period for this cfs_rq is extended to guarantee t
Hello, Peter.
On Fri, Mar 29, 2013 at 12:45:14PM +0100, Peter Zijlstra wrote:
> On Thu, 2013-03-28 at 16:58 +0900, Joonsoo Kim wrote:
> > There is not enough reason to place this checking at
> > update_sg_lb_stats(),
> > except saving one iteration for sched_group_cpus. But
On Fri, Mar 29, 2013 at 12:58:26PM +0100, Peter Zijlstra wrote:
> On Thu, 2013-03-28 at 16:58 +0900, Joonsoo Kim wrote:
> > +static int should_we_balance(struct lb_env *env)
> > +{
> > + struct sched_group *sg = env->sd->groups;
> >
ed to pay this cost at all times. With this rationale,
change the code to initialize kprobe_blacklist when it is first used.
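A minimal user-space sketch of the init-on-first-use idea described
above; pthread_once() stands in for whatever synchronization the kprobes
code actually uses, and all names here are illustrative:

#include <pthread.h>
#include <stdio.h>

static pthread_once_t blacklist_once = PTHREAD_ONCE_INIT;

static void build_blacklist(void)
{
	/* expensive setup, paid only once and only if anyone asks */
	puts("initializing blacklist");
}

static int address_is_blacklisted(unsigned long addr)
{
	pthread_once(&blacklist_once, build_blacklist);  /* lazy init */
	return addr == 0;                                /* placeholder check */
}

int main(void)
{
	printf("%d\n", address_is_blacklisted(0x1234));
	printf("%d\n", address_is_blacklisted(0));
	return 0;
}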
Cc: Ananth N Mavinakayanahalli
Cc: Anil S Keshavamurthy
Cc: "David S. Miller"
Cc: Masami Hiramatsu
Signed-off-by: Joonsoo Kim
---
I forgot to add lkml.
Sorry f
Hello, Russell.
On Mon, Mar 25, 2013 at 09:48:16AM +, Russell King - ARM Linux wrote:
> On Mon, Mar 25, 2013 at 01:11:13PM +0900, Joonsoo Kim wrote:
> > nobootmem use max_low_pfn for computing boundary in free_all_bootmem()
> > So we need proper value to max_low_pfn.
> &g
Hello, Christoph.
On Mon, Apr 01, 2013 at 03:33:23PM +, Christoph Lameter wrote:
> Subject: slub: Fix object counts in acquire_slab V2
>
> It seems that we were overallocating objects from the slab queues
> since get_partial_node() assumed that page->inuse was undisturbed by
> acquire_slab().
Hello, Christoph.
On Mon, Apr 01, 2013 at 03:32:43PM +, Christoph Lameter wrote:
> On Thu, 28 Mar 2013, Paul Gortmaker wrote:
>
> > > Index: linux/init/Kconfig
> > > ===
> > > --- linux.orig/init/Kconfig 2013-03-28 12:14:26.9
Hello, Preeti.
On Mon, Apr 01, 2013 at 12:15:50PM +0530, Preeti U Murthy wrote:
> Hi Joonsoo,
>
> On 04/01/2013 10:39 AM, Joonsoo Kim wrote:
> > Hello Preeti.
> > So we should limit this possible weird situation.
> >>>
> >>> Signed-off-by: Joons
Hello, Preeti.
On Mon, Apr 01, 2013 at 12:36:52PM +0530, Preeti U Murthy wrote:
> Hi Joonsoo,
>
> On 04/01/2013 09:38 AM, Joonsoo Kim wrote:
> > Hello, Preeti.
> >
>
> >>
> >> Ideally the children's cpu share must add upto the parent's share
Hello, Nicolas.
On Tue, Mar 05, 2013 at 05:36:12PM +0800, Nicolas Pitre wrote:
> On Mon, 4 Mar 2013, Joonsoo Kim wrote:
>
> > With SMP and enabling kmap_high_get(), it makes users of kmap_atomic()
> > sequential ordered, because kmap_high_get() use global kmap_lock().
>
When we find that the flag has a bit of PAGE_FLAGS_CHECK_AT_PREP set,
we reset the flag. If we always reset the flag, we can save one
branch operation. So remove the check.
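A standalone sketch of the branch being saved, with a made-up mask value
rather than the real PAGE_FLAGS_CHECK_AT_PREP definition:

#include <stdio.h>

#define CHECK_AT_PREP_MASK 0x3ful            /* placeholder, not the kernel mask */

static unsigned long prep_with_branch(unsigned long flags)
{
	if (flags & CHECK_AT_PREP_MASK)      /* before: test, then clear */
		flags &= ~CHECK_AT_PREP_MASK;
	return flags;
}

static unsigned long prep_always_clear(unsigned long flags)
{
	return flags & ~CHECK_AT_PREP_MASK;  /* after: clear unconditionally */
}

int main(void)
{
	unsigned long f = 0x145ul;
	printf("%lx %lx\n", prep_with_branch(f), prep_always_clear(f));
	return 0;
}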
Cc: Hugh Dickins
Signed-off-by: Joonsoo Kim
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8fcced7..778f2a9 100644
--- a/mm
2013/3/7 Nicolas Pitre :
> On Thu, 7 Mar 2013, Joonsoo Kim wrote:
>
>> Hello, Nicolas.
>>
>> On Tue, Mar 05, 2013 at 05:36:12PM +0800, Nicolas Pitre wrote:
>> > On Mon, 4 Mar 2013, Joonsoo Kim wrote:
>> >
>> > > With SMP and e
Hello, Hugh.
On Thu, Mar 07, 2013 at 10:54:15AM -0800, Hugh Dickins wrote:
> On Thu, 7 Mar 2013, Joonsoo Kim wrote:
>
> > When we found that the flag has a bit of PAGE_FLAGS_CHECK_AT_PREP,
> > we reset the flag. If we always reset the flag, we can reduce one
> > branch
Hello, Russell.
On Thu, Mar 07, 2013 at 01:26:23PM +, Russell King - ARM Linux wrote:
> On Mon, Mar 04, 2013 at 01:50:09PM +0900, Joonsoo Kim wrote:
> > In kmap_atomic(), kmap_high_get() is invoked for checking already
> > mapped area. In __flush_dcache_page() and dma_c
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 12637ce..08bc2a4 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -23,6 +23,41 @@ enum slab_state slab_state;
> LIST_HEAD(slab_caches);
> DEFINE_MUTEX(slab_mutex);
>
> +static int kmem_cache_sanity_check(const char *name, siz
2012/8/16 Joonsoo Kim :
> When we try to free object, there is some of case that we need
> to take a node lock. This is the necessary step for preventing a race.
> After taking a lock, then we try to cmpxchg_double_slab().
> But, there is a possible scenario that cmpxchg_double_slab
cpu_partial of the kmem_cache struct is a bit awkward.
It means the maximum number of objects kept in the per-cpu slab
and the cpu partial lists of a processor. However, the current name
seems to represent objects kept in the cpu partial lists only.
So, this patch renames it.
Signed-off-by: Joonsoo Kim
Cc
ts available to the cpu without locking.
This isn't what we want.
Therefore fix it to imply the same meaning in both cases
and rename "available" to "cpu_slab_objects" for readability.
Signed-off-by: Joonsoo Kim
Cc: Christoph Lameter
diff --git a/mm/slub.c b/mm/s
2012/8/25 Christoph Lameter :
> On Sat, 25 Aug 2012, Joonsoo Kim wrote:
>
>> index d597530..c96e0e4 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -1538,6 +1538,7 @@ static void *get_partial_node(struct kmem_cache *s,
>> {
>> struct page *
2012/8/25 Christoph Lameter :
> On Sat, 25 Aug 2012, JoonSoo Kim wrote:
>
>> But, when using "cpu_partial_objects", I have a coding style problem.
>>
>> if (kmem_cache_debug(s)
>> |
consider pfmemalloc_match() in get_partial_node().
It prevents "deactivate -> re-get" in get_partial().
Instead, new_slab() is called. It may return a !PFMEMALLOC page,
so the situation above will eventually stop.
Signed-off-by: Joonsoo Kim
Cc: David Miller
Cc: Neil Brown
Cc: Peter Zi
Now, we do ClearSlabPfmemalloc() only for the first page of a slab
when we clear the SlabPfmemalloc flag. This is a problem because, in
__ac_put_obj(), we sometimes test the flag of a page which is not the
first page of the slab.
So add code to do ClearSlabPfmemalloc() for all pages of the slab.
Signed-off-by: Joonsoo Kim
Cc
In the array cache, there is an object at index 0.
So fix it.
Signed-off-by: Joonsoo Kim
Cc: Mel Gorman
Cc: Christoph Lameter
diff --git a/mm/slab.c b/mm/slab.c
index 45cf59a..eb74bf5 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -976,7 +976,7 @@ static void *__ac_get_obj(struct kmem_cache *cachep
Hi, Glauber.
2012/9/18 Glauber Costa :
> +/*
> + * We need to verify if the allocation against current->mm->owner's memcg is
> + * possible for the given order. But the page is not allocated yet, so we'll
> + * need a further commit step to do the final arrangements.
> + *
> + * It is possible for
se it.
Using it makes the code robust and prevents future mistakes.
So change the code to use this enum value.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 692d976..188eef8 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -183,7 +183,8 @@ struct global_cwq {
WQ_NON_REENTRANT.
Changing it to the cpu argument prevents going into a sub-optimal path.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 188eef8..bc5c5e1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1158,6 +1158,8 @@ int queue_delayed_work_on(int
owing patch fixes this issue properly.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index bc5c5e1..f69f094 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -269,12 +269,14 @@ struct workqueue_struct {
};
struct workqueue_struct *system
To speed up cpu-down processing, use system_highpri_wq.
As the scheduling priority of its workers is higher than system_wq's
and it is not contended by other normal work on this cpu, work on it
is processed faster than on system_wq.
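For reference, a minimal, hypothetical module-style sketch of queueing
work on a WQ_HIGHPRI workqueue; it only illustrates the API, it is not
the hotplug code touched by this patch, and the names are made up:

#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *demo_highpri_wq;

static void demo_fn(struct work_struct *work)
{
	pr_info("running on a high-priority worker\n");
}

static DECLARE_WORK(demo_work, demo_fn);

static int __init demo_init(void)
{
	/* workers of a WQ_HIGHPRI queue run at elevated priority, so the
	 * queued work is picked up ahead of normal system_wq work */
	demo_highpri_wq = alloc_workqueue("demo_highpri", WQ_HIGHPRI, 0);
	if (!demo_highpri_wq)
		return -ENOMEM;

	queue_work(demo_highpri_wq, &demo_work);
	return 0;
}

static void __exit demo_exit(void)
{
	destroy_workqueue(demo_highpri_wq);  /* flushes pending work first */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");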
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel
ker
to match these. This implements it.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f69f094..e0e1d41 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -489,6 +489,11 @@ static int worker_pool_pri(struct worker_pool *pool)
return pool - pool
2012/8/14 Tejun Heo :
> Hello,
>
> On Tue, Aug 14, 2012 at 01:17:49AM +0900, Joonsoo Kim wrote:
>> We assign cpu id into work struct in queue_delayed_work_on().
>> In current implementation, when work is come in first time,
>> current running cpu id is assigned.
>&g
> I think it would be better to just opencode system_wq selection in
> rebind_workers().
Sorry for my poor English skill.
Could you elaborate on what "opencode system_wq selection" means?
Thanks!
2012/8/14 Tejun Heo :
> On Tue, Aug 14, 2012 at 01:17:52AM +0900, Joonsoo Kim wrote:
>> To speed cpu down processing up, use system_highpri_wq.
>> As scheduling priority of workers on it is higher than system_wq and
>> it is not contended by other normal works on this
>> And, do u mean @cpu is WORK_CPU_UNBOUND?
>
> @cpu could be WORK_CPU_UNBOUND at that point. The timer will be added
> to local CPU but @work->data would be pointing to WORK_CPU_UNBOUND,
> again triggering the condition. Given that @cpu being
> WORK_CPU_UNBOUND is far more common than an actual
2012/8/14 Tejun Heo :
> On Tue, Aug 14, 2012 at 01:57:10AM +0900, JoonSoo Kim wrote:
>> > I think it would be better to just opencode system_wq selection in
>> > rebind_workers().
>>
>> Sorry for my poor English skill.
>> Could you elaborate "openc
2012/8/14 Tejun Heo :
> Hello,
>
> On Tue, Aug 14, 2012 at 02:02:31AM +0900, JoonSoo Kim wrote:
>> 2012/8/14 Tejun Heo :
>> > On Tue, Aug 14, 2012 at 01:17:52AM +0900, Joonsoo Kim wrote:
>> > Is this from an actual workload? ie. do you have a test case where
>&
> Why not just do
>
> if (cpu == WORK_CPU_UNBOUND)
> cpu = raw_smp_processor_id();
>
> if (!(wq->flags...) {
> ...
> if (gcwq && gcwq->cpu != WORK_CPU_UNBOUND)
> lcpu = gcwq->cpu;
> else
>
t 3 patches are for our purpose.
Joonsoo Kim (6):
workqueue: use enum value to set array size of pools in gcwq
workqueue: correct req_cpu in trace_workqueue_queue_work()
workqueue: change value of lcpu in __queue_delayed_work_on()
workqueue: introduce system_highpri_wq
workqueue: use sy
emporary local variable for storing local cpu.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 49d8f4a..6a17ab0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1198,6 +1198,7 @@ static void __queue_work(unsigned int cpu, struct
workqu
owing patch fixes this issue properly.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f55ac26..470b0eb 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -269,12 +269,14 @@ struct workqueue_struct {
};
struct workqueue_struct *system
se it.
Using it makes the code robust and prevents future mistakes.
So change the code to use this enum value.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4fef952..49d8f4a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -183,7 +183,8 @@ struct global_cwq {
To speed up cpu-down processing, use system_highpri_wq.
As the scheduling priority of its workers is higher than system_wq's
and it is not contended by other normal work on this cpu, work on it
is processed faster than on system_wq.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel
ker
to match these. This implements it.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 470b0eb..4c5733c1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1738,6 +1738,7 @@ retry:
/* rebind busy workers */
for_each_busy_worker(worker,
UNBOUND.
It is sufficient to prevent going into a sub-optimal path.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6a17ab0..f55ac26 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1358,9 +1358,10 @@ static void __queue_delayed_work(int cpu,
2012/8/15 Tejun Heo :
> Hello,
>
> On Wed, Aug 15, 2012 at 03:10:12AM +0900, Joonsoo Kim wrote:
>> When we do tracing workqueue_queue_work(), it records requested cpu.
>> But, if !(@wq->flag & WQ_UNBOUND) and @cpu is WORK_CPU_UNBOUND,
>> requested cpu is chang
To speed up cpu-down processing, use system_highpri_wq.
As the scheduling priority of its workers is higher than system_wq's
and it is not contended by other normal work on this cpu, work on it
is processed faster than on system_wq.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel
owing patch fixes this issue properly.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 32c4f79..a768ffd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -269,12 +269,14 @@ struct workqueue_struct {
};
struct workqueue_struct *system
ker
to match these. This implements it.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a768ffd..2945734 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1742,6 +1742,7 @@ retry:
/* rebind busy workers */
for_each_busy_worker(worker,
tem_wq
[6/6] No change
This patchset introduces system_highpri_wq
in order to use the proper cwq for highpri workers.
The first 3 patches are not related to that purpose;
they just fix arbitrary issues.
The last 3 patches are for our purpose.
Joonsoo Kim (6):
workqueue: use enum value to set array size of pools in g
emporary local variable for storing requested cpu.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 49d8f4a..c29f2dc 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1198,6 +1198,7 @@ static void __queue_work(unsigned int cpu, struct
workqu
UNBOUND.
It is sufficient to prevent going into a sub-optimal path.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c29f2dc..32c4f79 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1356,9 +1356,16 @@ static void __queue_delayed_work(int cpu,
se it.
Using it makes the code robust and prevents future mistakes.
So change the code to use this enum value.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4fef952..49d8f4a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -183,7 +183,8 @@ struct global_cwq {
re that all the cpu partial slabs are removed
from the cpu partial list. At this point, we can expect that
this_cpu_cmpxchg mostly succeeds.
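A generic user-space sketch of the kind of optimistic compare-and-swap
loop referred to above, with C11 atomics standing in for
this_cpu_cmpxchg and illustrative names:

#include <stdatomic.h>
#include <stdio.h>

static _Atomic long percpu_partial;          /* stands in for a per-cpu slot */

static void publish(long new_val)
{
	long expected = atomic_load(&percpu_partial);

	/* retry only if someone raced with us; once the partial list has
	 * been detached this is expected to succeed on the first attempt */
	while (!atomic_compare_exchange_weak(&percpu_partial, &expected, new_val))
		;
}

int main(void)
{
	publish(42);
	printf("%ld\n", atomic_load(&percpu_partial));
	return 0;
}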
Signed-off-by: Joonsoo Kim
Cc: Christoph Lameter
Cc: David Rientjes
Acked-by: Christoph Lameter
---
Hello, Pekka.
These two patches get "Acked-by: Chri
cond, it may reduce lock contention.
When we retry, the status of the slab has already changed,
so we don't need a lock anymore in almost every case.
A "release the lock first, and re-take it if necessary" policy
helps with this.
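A generic sketch of the resulting structure -- try the lockless path
first and take the lock only on the rare path that needs it; the pthread
mutex stands in for the slab node lock and the helper is a placeholder:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;

/* stands in for "the retried cmpxchg saw the slab state already changed" */
static bool lockless_path_possible(void)
{
	return true;
}

static void free_object(void)
{
	if (lockless_path_possible()) {
		puts("freed without taking the node lock (common case)");
		return;
	}

	pthread_mutex_lock(&node_lock);      /* rare case that still needs it */
	puts("freed under the node lock");
	pthread_mutex_unlock(&node_lock);
}

int main(void)
{
	free_object();
	return 0;
}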
Signed-off-by: Joonsoo Kim
Cc: Christoph Lameter
Acke
ssigning 0 to the objects count when we get it for the cpu freelist.
Signed-off-by: Joonsoo Kim
Cc: Christoph Lameter
Cc: David Rientjes
diff --git a/mm/slub.c b/mm/slub.c
index efce427..88dca1d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1550,7 +1550,12 @@ static void *get_partial_node(struct kmem_
2012/8/16 Christoph Lameter :
> On Thu, 16 Aug 2012, Joonsoo Kim wrote:
>
>> s->cpu_partial determine the maximum number of objects kept
>> in the per cpu partial lists of a processor. Currently, it is used for
>> not only per cpu partial list but also cpu freelist. Th
>> I think that s->cpu_partial is for cpu partial slab, not cpu slab.
>
> Ummm... Not entirely. s->cpu_partial is the mininum number of objects to
> "cache" per processor. This includes the objects available in the per cpu
> slab and the other slabs on the per cpu partial list.
Hmm..
When we do te
2012/7/16 Thomas Gleixner :
> - static const struct sched_param param = {
> - .sched_priority = MAX_RT_PRIO-1
> - };
> -
> - p = per_cpu(ksoftirqd, hotcpu);
> - per_cpu(ksoftirqd, hotcpu) = NULL;
> - sched_s
2012/7/16 Thomas Gleixner :
> The following series implements the infrastructure for parking and
> unparking kernel threads to avoid the full teardown and fork on cpu
> hotplug operations along with management infrastructure for hotplug
> and users.
>
> Changes vs. V2:
>
> Use callbacks for all fu
2012/8/17 Christoph Lameter :
> On Thu, 16 Aug 2012, JoonSoo Kim wrote:
>
>> But, if you prefer that s->cpu_partial is for both cpu slab and cpu
>> partial slab,
>> get_partial_node() needs an another minor fix.
>> We should add number of objects in cpu sla
2012/8/17 Tejun Heo :
> On Wed, Aug 15, 2012 at 11:25:35PM +0900, Joonsoo Kim wrote:
>> Change from v2
>> [1/6] No change
>> [2/6] Change local variable name and use it directly for TP
>> [3/6] Add a comment.
>> [4/6] No change
>> [5/6] Add a comment. F
2012/8/17 Christoph Lameter :
> On Fri, 17 Aug 2012, JoonSoo Kim wrote:
>
>> > What difference does this patch make? At the end of the day you need the
>> > total number of objects available in the partial slabs and the cpu slab
>> > for comparison.
>>
&g
2012/7/9 David Rientjes :
> On Mon, 9 Jul 2012, JoonSoo Kim wrote:
>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 8c691fa..5d41cad 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -1324,8 +1324,14 @@ static struct page *allocate_slab(struct
>
2012/7/9 David Rientjes :
> On Sun, 8 Jul 2012, JoonSoo Kim wrote:
>
>> >> __alloc_pages_direct_compact has many arguments so invoking it is very
>> >> costly.
>> >> And in almost invoking case, order is 0, so return immediately.
>> >>
>
2012/7/10 Mel Gorman :
> You say that invoking the function is very costly. I agree that a function
> call with that many parameters is hefty but it is also in the slow path of
> the allocator. For order-0 allocations we are about to enter direct reclaim
> where I would expect the cost far exceeds
Hello, Minchan.
On Thu, Jan 17, 2013 at 08:59:22AM +0900, Minchan Kim wrote:
> Hi Joonsoo,
>
> On Wed, Jan 16, 2013 at 05:08:55PM +0900, Joonsoo Kim wrote:
> > If object is on boundary of page, zs_map_object() copy content of object
> > to pre-allocated page and retu
loop over all processors in bootstrap().
Signed-off-by: Joonsoo Kim
diff --git a/mm/slub.c b/mm/slub.c
index 7204c74..8b95364 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3614,10 +3614,15 @@ static int slab_memory_callback(struct notifier_block
*self,
static struct kmem_cache * __init boo
After the boot phase, 'n' always exists.
So add the 'likely' macro to help the compiler.
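A tiny user-space sketch of what the 'likely' hint boils down to, a
__builtin_expect() branch-prediction hint; the lookup function is a
placeholder, not the slub code:

#include <stdio.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

static int node_present(const void *n)
{
	if (likely(n))          /* after boot, 'n' always exists */
		return 1;
	return 0;               /* cold path the compiler can move out of line */
}

int main(void)
{
	printf("%d\n", node_present("node"));
	return 0;
}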
Acked-by: Christoph Lameter
Signed-off-by: Joonsoo Kim
diff --git a/mm/slub.c b/mm/slub.c
index 8b95364..ddbd401 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1005,7 +1005,7 @@ static inline void i
..
Therefore, "available > s->cpu_partial / 2" is always false and
we always go to the second iteration.
This patch corrects the problem.
After that, we don't need the return value of put_cpu_partial().
So remove it.
v2: calculate the number of objects using new.objects and new.inuse.
It is more accu
On Fri, Jan 18, 2013 at 10:55:01AM -0500, Steven Rostedt wrote:
> On Fri, 2013-01-18 at 10:04 -0500, Steven Rostedt wrote:
>
> Just to be more complete:
>
> > CPU0CPU1
> >
> c = __this_cpu_ptr(s->cpu_slab);
>
Hello, Tejun.
On Wed, Jan 16, 2013 at 05:42:32PM -0800, Tejun Heo wrote:
> Hello,
>
> Currently, on the backend side, there are two layers of abstraction.
> For each CPU and the special unbound wq-specific CPU, there's one
> global_cwq. gcwq in turn hosts two worker_pools - one for normal
> prio
On Wed, Jan 16, 2013 at 05:42:47PM -0800, Tejun Heo wrote:
> global_cwq is now nothing but a container for per-pcu standard
s/per-pcu/per-cpu/
> worker_pools. Declare the worker pools directly as
> cpu/unbound_std_worker_pools[] and remove global_cwq.
>
> * get_gcwq() is replaced with std_worke
;v3:
coverletter: refer a link related to this work
[2/3]: drop @flags of find_static_vm_vaddr
Rebased on v3.8-rc4
v1->v2:
[2/3]: patch description is improved.
Rebased on v3.7-rc7
Joonsoo Kim (3):
ARM: vmregion: remove vmregion code entirely
ARM: static_vm: introduce an infrastructure for
From: Joonsoo Kim
Now, there is no user for vmregion.
So remove it.
Signed-off-by: Joonsoo Kim
Signed-off-by: Joonsoo Kim
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 8a9c4cb..4e333fa 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -6,7 +6,7 @@ obj-y
From: Joonsoo Kim
In the current implementation, we use an ARM-specific flag, that is,
VM_ARM_STATIC_MAPPING, to distinguish ARM-specific static mapped areas.
The purpose of a static mapped area is to re-use it when the
entire physical address range of the ioremap request can be covered
From: Joonsoo Kim
A static mapped area is ARM-specific, so it is better not to use the
generic vmalloc data structures, that is, vmlist and vmlist_lock,
for managing static mapped areas. Using them causes some needless
overhead, and reducing this overhead is a better idea.
Now, we have newly introduced
This name doesn't convey a specific meaning.
So rename it to reflect its purpose.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..e6f8783 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6814,7 +6814,7 @@ struct
cur_ld_moved is reset if env.flags hits LBF_NEED_BREAK.
So there is a possibility that we miss doing resched_cpu().
Correct it by moving resched_cpu()
before the LBF_NEED_BREAK check.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81fa536
Commit 88b8dac0 makes load_balance() consider other cpus in its group.
So now, when we redo in load_balance(), we should reset some fields of
lb_env to ensure that load_balance() works for the initial cpu, not for
other cpus in its group. So correct it.
Cc: Srivatsa Vaddagiri
Signed-off-by: Joonsoo
Vaddagiri
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e6f8783..d4c6ed0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6814,6 +6814,7 @@ struct task_group root_task_group;
LIST_HEAD(task_groups);
#endif
+DECLARE_PER_CPU(cpumask_var_t
not be moved
to other cpus and, of course, this situation may continue after
we change the target cpu. So this patch clears LBF_ALL_PINNED in order
to mitigate useless redo overhead if can_migrate_task() fails
for the above reason.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c
Some validation for task moving is performed in move_tasks() and
move_one_task(). We can move this code to can_migrate_task(),
which already exists for this purpose.
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 97498f4..849bc8e 100644
--- a/kernel
't be moved due to cpu affinity. But, currently,
if the imbalance is not large enough compared to the task's load, we leave
the LBF_ALL_PINNED flag set and 'redo' is triggered. This is not our
intention, so correct it.
These are based on v3.8-rc7.
Joonsoo Kim (8):
sched: change position of resched_c
Srivatsa Vaddagiri
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0c6aaf6..97498f4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5016,8 +5016,15 @@ static int load_balance(int this_cpu, struct rq *this_rq,
After commit 88b8dac0, dst-cpu can be changed in load_balance(),
so we can't know the cpu_idle_type of the dst-cpu when load_balance()
returns positive. So add explicit cpu_idle_type checking.
Cc: Srivatsa Vaddagiri
Signed-off-by: Joonsoo Kim
diff --git a/kernel/sched/fair.c b/kernel/sched/f
Hello, Steven.
On Fri, Feb 15, 2013 at 01:13:39AM -0500, Steven Rostedt wrote:
> Performance counter stats for '/work/c/hackbench 500' (100 runs):
>
> 199820.045583 task-clock#8.016 CPUs utilized
>( +- 5.29% ) [100.00%]
> 3,594,264 context-switche
Hello, Alex.
On Mon, Feb 18, 2013 at 01:07:28PM +0800, Alex Shi wrote:
> We need initialize the se.avg.{decay_count, load_avg_contrib} to zero
> after a new task forked.
> Otherwise random values of above variables cause mess when do new task
I think that these are not random values. In arch_dup_
Hello, Alex.
On Mon, Feb 18, 2013 at 01:07:37PM +0800, Alex Shi wrote:
> If the waked/execed task is transitory enough, it will has a chance to be
> packed into a cpu which is busy but still has time to care it.
> For powersaving policy, only the history util < 25% task has chance to
> be packed,
Hello, Eric.
2012/10/14 Eric Dumazet :
> SLUB was really bad in the common workload you describe (allocations
> done by one cpu, freeing done by other cpus), because all kfree() hit
> the slow path and cpus contend in __slab_free() in the loop guarded by
> cmpxchg_double_slab(). SLAB has a cache f
anyone help me?
Thanks.
> On Tue, Dec 25, 2012 at 7:30 AM, JoonSoo Kim wrote:
>
> > 2012/12/26 Joonsoo Kim :
> > > commit cce89f4f6911286500cf7be0363f46c9b0a12ce0('Move kmem_cache
> > > refcounting to common code') moves some refcount manipulation code to
ration.
After that, we don't need the return value of put_cpu_partial().
So remove it.
Signed-off-by: Joonsoo Kim
---
These are based on v3.8-rc3 and there is no dependency between them.
If a rebase is needed, please notify me.
diff --git a/mm/slub.c b/mm/slub.c
index ba2ca53..abef30e 1006