[RFC PATCH 2/2] bfq/mq-deadline: remove redundant check for passthrough request

2021-04-14 Thread Lin Feng
th hdds under SAS controller and hdds under AHCI controller but obviously not covers all. Not sure if passthrough request can still escape into IO scheduler from blk_mq_sched_insert_requests, which is used by blk_mq_flush_plug_list and has lots of indirect callers.) Signed-off-by: Lin Feng -

[PATCH 1/2] blk-mq: bypass IO scheduler's limit_depth for passthrough request

2021-04-14 Thread Lin Feng
s patch introduce a new wrapper to make code not that ugly. Signed-off-by: Lin Feng --- block/blk-mq.c | 3 ++- include/linux/blkdev.h | 6 ++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index d4d7c1caa439..927189a55575 100644 --- a/block/b

Re: [PATCH] Revert "bfq: Fix computation of shallow depth"

2021-02-02 Thread Lin Feng
Hi all, On 2/2/21 22:20, Jens Axboe wrote: On 2/2/21 5:28 AM, Jan Kara wrote: Hello! On Fri 29-01-21 19:18:08, Lin Feng wrote: This reverts commit 6d4d273588378c65915acaf7b2ee74e9dd9c130a. bfq.limit_depth passes word_depths[] as shallow_depth down to sbitmap core sbitmap_get_shallow, which

Re: [PATCH] Revert "bfq: Fix computation of shallow depth"

2021-01-31 Thread Lin Feng
on codes for bfq's word_depths array are not necessary and one variable is enough. But IMHO async depth limitation for slow drivers is essential, which is what we always did in cfq age. On 1/29/21 19:18, Lin Feng wrote: This reverts commit 6d4d273588378c65915acaf7b2ee74e9dd9c130a. bf

[PATCH] x86/kaslr: try process e820 entries if can not get suitable regions from efi

2021-01-05 Thread Lin Feng
: Physical KASLR disabled: no suitable memory region! To enable physical kaslr with kexec, call process_e820_entries when no suitable regions in efi memmaps. Signed-off-by: Lin Feng --- I find a regular of Kernel code and data placement with kexec. It seems unsafe. The reason is showed above. I

[PATCH] sysctl.c: fix underflow value setting risk in vm_table

2020-12-23 Thread Lin Feng
vfs_cache_pressure and zone_reclaim_mode, -1 is apparently not a valid value, but we can set to them. And then kernel may crash. # echo -1 > /proc/sys/vm/vfs_cache_pressure Signed-off-by: Lin Feng --- kernel/sysctl.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --gi

[PATCH] [RFC] init/main: fix broken buffer_init when DEFERRED_STRUCT_PAGE_INIT set

2020-11-23 Thread Lin Feng
uffer_heads_over_limit in vmscan since we used a half done value of zone->managed_pages before, or should we use a smaller factor(<10%) in previous formula. Signed-off-by: Lin Feng --- init/main.c | 2 -- mm/page_alloc.c | 3 +++ 2 files changed, 3 insertions(+), 2 deletions(-)

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-19 Thread Lin Feng
On 9/19/19 11:49, Matthew Wilcox wrote: On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote: On 9/18/19 20:33, Michal Hocko wrote: I absolutely agree here. From you changelog it is also not clear what is the underlying problem. Both congestion_wait and wait_iff_congested should wake up

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
On 9/18/19 20:33, Michal Hocko wrote: +mm_reclaim_congestion_wait_jiffies +== + +This control is used to define how long kernel will wait/sleep while +system memory is under pressure and memroy reclaim is relatively active. +Lower values will decrease the kernel wait/sleep time. + +It'

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
Hi, On 9/18/19 19:38, Matthew Wilcox wrote: On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote: Adding a new tunable is not the right solution. The right way is to make Linux auto-tune itself to avoid the problem. For example, bdi_writeback contains an estimated write bandwidth

Re: [PATCH] [RESEND] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
On 9/18/19 20:27, Michal Hocko wrote: Please do not post a new version with a minor compile fixes until there is a general agreement on the approach. Willy had comments which really need to be resolved first. Sorry, but thanks for pointing out. Also does this [...] Reported-by: kbuild t

[PATCH] [RESEND] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
og of this patch. Signed-off-by: Lin Feng Reported-by: kbuild test robot --- Documentation/admin-guide/sysctl/vm.rst | 17 + kernel/sysctl.c | 10 ++ mm/vmscan.c | 14 +++--- 3 files changed, 38 insertions(+), 3

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
On 9/17/19 20:06, Matthew Wilcox wrote: On Tue, Sep 17, 2019 at 07:58:24PM +0800, Lin Feng wrote: In direct and background(kswapd) pages reclaim paths both may fall into calling msleep(100) or congestion_wait(HZ/10) or wait_iff_congested(HZ/10) while under IO pressure, and the sleep length

[PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu22 : 1.0%us, 1.0%sy, 0.0%ni, 98.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu23 : 0.7%us, 0.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st Signed-off-by: Lin Feng --- Documentation/admin-guide/sysctl/vm.rst | 17 + kerne

[PATCH 2/2] kernel/latencytop.c: remove unnecessary checks for latencytop_enabled

2019-02-26 Thread Lin Feng
function clear_global_latency_tracing. Notes: These changes only visible to users who sets CONFIG_LATENCYTOP and won't change user tool latencytop's behaviors. Signed-off-by: Lin Feng --- kernel/latencytop.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/kernel/latencytop.c b/kernel/latencytop.c

[PATCH 1/2] kernel/latencytop.c: rename clear_all_latency_tracing to clear_tsk_latency_tracing

2019-02-26 Thread Lin Feng
The name clear_all_latency_tracing is misleading, in fact which only clear per task's latency_record[], and we do have another function named clear_global_latency_tracing which clear the global latency_record[] buffer. Signed-off-by: Lin Feng --- fs/proc/base.c | 2 +- include/

Re: [PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-07 Thread Lin Feng
Hi Andreas, Thanks for your reply and review. On 06/08/2016 05:01 AM, Andreas Dilger wrote: On Jun 2, 2016, at 6:01 AM, Lin Feng wrote: Descriptions: ext4 block allocation core stack: ext4_mb_new_blocks ext4_mb_normalize_request ext4_mb_regular_allocator ext4_mb_find_by_goal

[PATCH] ext4: mballoc.h typo fix: correct wrong comments about MB_DEFAULT_STREAM_THRESHOLD

2016-06-06 Thread Lin Feng
ream allocation mode") and the comments for MB_DEFAULT_STREAM_THRESHOLD became stale. Signed-off-by: Lin Feng --- fs/ext4/mballoc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index 3ef1df6..2e64c0e 100644 --- a/fs/ext4/mba

Re: [PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-05 Thread Lin Feng
4_MB_HINT_MERGE is only tested once and nowhere teaches how to use it. IIUC it also should be folded into EXT4_MB_HINT_TRY_GOAL path or simply skip EXT4_MB_HINT_MERGE test at -L1871. thanks, linfeng On 06/02/2016 08:01 PM, Lin Feng wrote: Descriptions: ext4 block allocation core st

[PATCH] ext4: mballoc.c: fix ac_g_ex and ac_f_ex misuse bug in EXT4_MB_HINT_TRY_GOAL path

2016-06-02 Thread Lin Feng
e file may get fragments even if the physical blocks in the hole is free, which is expected to be merged into a single extent. Signed-off-by: Lin Feng --- fs/ext4/mballoc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c ind

Re: [PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi will, On 04/08/2013 06:55 PM, Will Deacon wrote: > Given that we don't have NUMA support or memory-hotplug on arm64 yet, I'm > not sure that this change makes much sense at the moment. early_pfn_to_nid > will always return 0 and we only ever have one node. > > To be honest, I'm not sure what t

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Yinghai, On 04/09/2013 02:40 AM, Yinghai Lu wrote: > On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng wrote: >> In hot add node(memory) case, vmemmap pages are always allocated from other >> node, > > that is broken, and should be fixed. > vmemmap should be on local no

Re: [PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
Hi Andrew, On 04/09/2013 04:55 AM, Andrew Morton wrote: > On Mon, 8 Apr 2013 11:40:11 -0700 Yinghai Lu wrote: > >> On Mon, Apr 8, 2013 at 2:56 AM, Lin Feng wrote: >>> In hot add node(memory) case, vmemmap pages are always allocated from other >>> node, >> >

Re: [PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Hi all, On 04/08/2013 05:56 PM, Lin Feng wrote: > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index 474e28f..e2a7277 100644 > --- a/arch/x86/mm/init_64.c > +++ b/arch/x86/mm/init_64.c > @@ -1318,6 +1318,8 @@ vmemmap_populate(struct page *start_page, unsigned lo

[PATCH 0/2] mm: vmemmap: add vmemmap_verify check for hot-add node/memory case

2013-04-08 Thread Lin Feng
In hot add node(memory) case, vmemmap pages are always allocated from other node, but the current logic just skip vmemmap_verify check. So we should also issue "potential offnode page_structs" warning messages if we are the case Lin Feng (2): mm: vmemmap: x86: add vmemmap_verify che

[PATCH 2/2] mm: vmemmap: arm64: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
Deacon Cc: Arnd Bergmann Cc: Tony Lindgren Cc: Ben Hutchings Cc: Andrew Morton Reported-by: Yasuaki Ishimatsu Signed-off-by: Lin Feng --- arch/arm64/mm/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 70b8cd4..9f1e

[PATCH 1/2] mm: vmemmap: x86: add vmemmap_verify check for hot-add node case

2013-04-08 Thread Lin Feng
ar Cc: "H. Peter Anvin" Cc: Yinghai Lu Cc: Andrew Morton Reported-by: Yasuaki Ishimatsu Signed-off-by: Lin Feng --- arch/x86/mm/init_64.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 474e28f..e2a7277 1

Re: [PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
Hi Wanpeng, On 04/02/2013 06:57 PM, Wanpeng Li wrote: >> >PS. For clarifying calling chains are showed as follows: >> >setup_arch() >> > ... >> > initmem_init() >> >x86_numa_init() >> > numa_init() >> >numa_register_memblks() >> > setup_node_data() >> >NODE_

[PATCH] x86: numa: mm: kill double initialization for NODE_DATA

2013-04-02 Thread Lin Feng
_early(pgdat->node_id,...) ... zone_sizes_init() free_area_init_nodes() free_area_init_node() pgdat->node_id = nid; pgdat->node_start_pfn = node_start_pfn; calculate_node_totalpages(); pgdat->node_spanned_pages = totalpages; Signed-off-by: Lin Feng ---

Re: THP: AnonHugePages in /proc/[pid]/smaps is correct or not?

2013-04-01 Thread Lin Feng
Hi Zhouping, On 04/02/2013 11:09 AM, Zhouping Liu wrote: > I don't understand clearly the last sentence 'you'll probably only get 100% > hugepages only 1/512th of the time.' > could you please explain more details about 'only 1/512th of the time'? IIUC, thp size is 2M so it may be comprised of 5

Re: [PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-27 Thread Lin Feng
Hi Bjorn and others, On 03/28/2013 01:27 AM, Bjorn Helgaas wrote: >> - printk(KERN_ERR "run of slot in ranges\n"); >> > + pr_err("%s: run out of slot in ranges\n", >> > + __func__); >> >

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
Hi, On 03/25/2013 05:00 PM, Lenky Gao wrote: > I have found a comment in function physflat_cpu_mask_to_apicid to explain why. > > static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask) > { > int cpu; > > /* >* We're using fixed IRQ delivery, can only r

Re: [PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-25 Thread Lin Feng
Hi Andrew, On 03/19/2013 02:52 AM, Yinghai Lu wrote: > On Mon, Mar 18, 2013 at 3:21 AM, Lin Feng wrote: >> Since add_range_with_merge() return the max none zero element of the array, >> it's >> suffice to use it to instruct clean_sort_range() to do the sort. Or the

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-25 Thread Lin Feng
On 03/25/2013 02:46 PM, Lenky Gao wrote: >> Do you mean on your old machine the irq will be distributed automatically >> among the cpus set by smp_affinity? >> > > Yes. My another machine's interrupts are as follows: And without irqbalance service? It sounds weird to me.. thanks, linfeng > >

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:44 AM, Lenky Gao wrote: >> On 03/25/2013 11:18 AM, Lenky Gao wrote: >>> The irqbalance service has been stopped. >> So try start irqbalance to see what happen? >> It should help to give what you want ;-) > > Using the irqbalance service to dynamically change the IRQ-bound? It

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi, On 03/25/2013 11:18 AM, Lenky Gao wrote: > The irqbalance service has been stopped. So try start irqbalance to see what happen? It should help to give what you want ;-) thanks, linfeng

Re: Question: How to distribute the interrupts over multiple cores?

2013-03-24 Thread Lin Feng
Hi Gao, On 03/25/2013 10:33 AM, Lenky Gao wrote: > [root@localhost ~]# echo 6 > /proc/irq/25/smp_affinity > [root@localhost ~]# cat /proc/irq/25/smp_affinity > 06 Seems you bind the nic irq to second and third cpu for the bit mask you set is 110, so now eth9's irq is working on the 3rd cpu. H
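
The mask arithmetic in the reply above (hex 6 is binary 110, so the second and third CPUs are selected) can be checked with a small sketch. The helper name mask_to_cpus is hypothetical and is not a kernel or procfs API; it only decodes a bit mask of the kind written to /proc/irq/N/smp_affinity, assuming bit 0 stands for CPU0.

```c
/*
 * Hypothetical helper, for illustration only (not a kernel function).
 * Decodes an affinity bit mask: bit k set means CPU k is selected.
 * Writes up to 'max' CPU indices into 'cpus' and returns how many
 * were written.
 */
int mask_to_cpus(unsigned long mask, int *cpus, int max)
{
    int n = 0;

    for (int cpu = 0; mask != 0 && n < max; cpu++, mask >>= 1) {
        if (mask & 1UL)
            cpus[n++] = cpu;
    }
    return n;
}
```

Calling mask_to_cpus(0x6, cpus, 8) fills cpus with the indices 1 and 2, i.e. the second and third CPUs when counting from CPU0.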

Re: [patch] mm: speedup in __early_pfn_to_nid

2013-03-24 Thread Lin Feng
On 03/24/2013 04:37 AM, Yinghai Lu wrote: > +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP > +int __init_memblock memblock_search_pfn_nid(unsigned long pfn, > + unsigned long *start_pfn, unsigned long *end_pfn) > +{ > + struct memblock_type *type = &memblock.memory; > + int mi

[PATCH] kernel/range.c: subtract_range: fix the broken phrase issued by printk

2013-03-18 Thread Lin Feng
Also replace deprecated printk(KERN_ERR...) with pr_err() as suggested by Yinghai, attaching the function name to provide plenty info. Cc: Yinghai Lu Signed-off-by: Lin Feng --- kernel/range.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/range.c b/kernel/range.c

[PATCH] x86: mm: accurate the comments for STEP_SIZE_SHIFT macro

2013-03-18 Thread Lin Feng
For x86 PUD_SHIFT is 30 and PMD_SHIFT is 21, so the consequence of (PUD_SHIFT-PMD_SHIFT)/2 is 4. Update the comments to the code. Cc: Yinghai Lu Signed-off-by: Lin Feng --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm
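
The arithmetic in this changelog can be reproduced with a tiny sketch. The constants below are local stand-ins carrying the values quoted in the message (30 and 21), not the kernel's own macros, and step_size_shift_example is a hypothetical name. The point is only that C integer division truncates 9/2 to 4.

```c
/* Local stand-ins for the values quoted in the changelog (assumption:
 * x86 values PUD_SHIFT = 30 and PMD_SHIFT = 21, as stated there). */
enum {
    EX_PMD_SHIFT = 21,
    EX_PUD_SHIFT = 30
};

/* Hypothetical helper: evaluates (PUD_SHIFT - PMD_SHIFT) / 2 with
 * integer division, so (30 - 21) / 2 = 9 / 2 truncates to 4. */
int step_size_shift_example(void)
{
    return (EX_PUD_SHIFT - EX_PMD_SHIFT) / 2;
}
```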

[PATCH] kernel/range.c: subtract_range: return instead of continue to save some loops

2013-03-18 Thread Lin Feng
If we fall into that branch it means that there is a range fully covering the subtract range, so it's suffice to return there if there isn't any other overlapping ranges. Also fix the broken phrase issued by printk. Cc: Yinghai Lu Signed-off-by: Lin Feng --- kernel/range.c | 4 ++

[PATCH] x86: mm: add_pfn_range_mapped: use meaningful index to teach clean_sort_range()

2013-03-18 Thread Lin Feng
nd it never depends on nr_pfn_mapped. Cc: Jacob Shin Cc: Yinghai Lu Signed-off-by: Lin Feng --- arch/x86/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 59b7fc4..55ae904 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/i

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-03-06 Thread Lin Feng
Hi Yasuaki, On 03/06/2013 03:48 PM, Yasuaki Ishimatsu wrote: > Hi Lin, > > IMHO, current implementation depends on luck. So even if system has > many non movable memory, get_user_pages_non_movable() may not allocate > non movable memory. Sorry, I'm not quite understand here, since the to be pinn

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: > Ensure that newly allocated pages, which are faulted in in FOLL_DURABLE > mode comes from non-movalbe pageblocks, to workaround migration failures > with Contiguous Memory Allocator. snip > @@ -2495,7 +2498,7 @@ static inline void cow_us

Re: [RFC/PATCH 3/5] mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set

2013-03-06 Thread Lin Feng
Hi Marek, On 03/05/2013 02:57 PM, Marek Szyprowski wrote: > @@ -2495,7 +2498,7 @@ static inline void cow_user_page(struct page *dst, > struct page *src, unsigned lo > */ > static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long address, pte_t *page

Re: [PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-26 Thread Lin Feng
Hi Andrew, Mel and other guys, How about this V3 patch, any comments? thanks, linfeng On 02/21/2013 07:01 PM, Lin Feng wrote: > get_user_pages() always tries to allocate pages from movable zone, which is > not > reliable to memory hotremove framework in some case. > > This pat

[PATCH V3 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-21 Thread Lin Feng
ChangeLog v1->v2: Patch1: - Fix the negative return value bug pointed out by Andrew and other suggestions pointed out by Andrew and Jeff. Patch2: - Kill the CONFIG_MEMORY_HOTREMOVE dependence suggested by Jeff. --- Lin Feng (2): mm: hotplug: implement non-movable version of get_use

[PATCH V3 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-21 Thread Lin Feng
n Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- include/linux/mm.h | 14 ++ include/linux/mmzone.h |4 ++ mm/memory.c| 103 mm/page_isolation.c|8 4 files ch

[PATCH V3 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-21 Thread Lin Feng
ndrew Morton Cc: Jeff Moyer Cc: Minchan Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- fs/aio.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 2512232..193e145 100644 --- a/fs/aio.c +++ b/fs/

Re: [PATCH V2 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 07:37 PM, Wanpeng Li wrote: >> + * This function first calls get_user_pages() to get the candidate pages, >> and >> >+ * then check to ensure all pages are from non movable zone. Otherwise >> >migrate > How about "Otherwise migrate candidate pages which have already be

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
On 02/20/2013 07:31 PM, Simon Jeons wrote: > On 02/20/2013 06:23 PM, Lin Feng wrote: >> Hi Simon, >> >> On 02/20/2013 05:58 PM, Simon Jeons wrote: >>>> The other is that this almost certainly broken for transhuge page >>>> handling. gup returns the

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-20 Thread Lin Feng
Hi Simon, On 02/20/2013 05:58 PM, Simon Jeons wrote: > >> >> The other is that this almost certainly broken for transhuge page >> handling. gup returns the head and tail pages and ordinarily this is ok > > When need gup thp? in kvm case? gup just pins the wanted pages(for x86 is 4k size) of use

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Wanpeng, On 02/20/2013 10:44 AM, Wanpeng Li wrote: >> Sorry, I misunderstood what "tail pages" means, stupid question, just ignore >> it. >> >flee... > According to the compound page, the first page of compound page is > called head page, other sub pages are called tail pages. > > Regards, >

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
On 02/19/2013 09:37 PM, Lin Feng wrote: >> > >> > The other is that this almost certainly broken for transhuge page >> > handling. gup returns the head and tail pages and ordinarily this is ok > I can't find codes doing such things :(, could you please poin

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/05/2013 09:32 PM, Mel Gorman wrote: > On Tue, Feb 05, 2013 at 11:57:22AM +, Mel Gorman wrote: >> + migrate_pre_flag = 1; + } + + if (!isolate_lru_page(pages[i])) { + inc_z

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-19 Thread Lin Feng
Hi Mel, On 02/18/2013 11:17 PM, Mel Gorman wrote: >>> > > >>> > > >>> > > result. It's a little clumsy but the memory hot-remove failure message >>> > > could list what applications have pinned the pages that cannot be >>> > > removed >>> > > so the administrator has the option of force-killing

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-18 Thread Lin Feng
Hi Mel, See below. On 02/05/2013 07:57 PM, Mel Gorman wrote: > On Mon, Feb 04, 2013 at 04:06:24PM -0800, Andrew Morton wrote: >> The ifdefs aren't really needed here and I encourage people to omit >> them. This keeps the header files looking neater and reduces the >> chances of things later brea

[PATCH V2 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-05 Thread Lin Feng
ndrew Morton Cc: Jeff Moyer Cc: Minchan Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- fs/aio.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 71f613c..f7a0d5c 100644 --- a/fs/aio.c +++ b/fs/

[PATCH V2 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-05 Thread Lin Feng
1: - Fix the negative return value bug pointed out by Andrew and other suggestions pointed out by Andrew and Jeff. Patch2: - Kill the CONFIG_MEMORY_HOTREMOVE dependence suggested by Jeff. --- Lin Feng (2): mm: hotplug: implement non-movable version of get_user_pages() called get_user_page

[PATCH V2 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-05 Thread Lin Feng
er of get_user_pages() but it makes sure that all pages come from non-movable zone via additional page migration. Cc: Andrew Morton Cc: Mel Gorman Cc: KAMEZAWA Hiroyuki Cc: Yasuaki Ishimatsu Cc: Jeff Moyer Cc: Minchan Kim Cc: Zach Brown Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin

Re: [PATCH 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-05 Thread Lin Feng
Hi Minchan, On 02/05/2013 03:45 PM, Minchan Kim wrote: >> So it may not a good idea that we all fall into calling the *non_movable* >> version of >> > GUP when CONFIG_MIGRATE_ISOLATE is on. What do you think? > Frankly speaking, I can't understand Mel's comment. > AFAIUC, he said GUP checks the p

Re: [PATCH 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-04 Thread Lin Feng
On 02/05/2013 01:25 PM, Minchan Kim wrote: > Hi Lin, > > On Tue, Feb 05, 2013 at 12:42:48PM +0800, Lin Feng wrote: >> Hi Minchan, >> >> On 02/05/2013 08:58 AM, Minchan Kim wrote: >>> Hello, >>> >>> On Mon, Feb 04, 2013 at 06:04:06PM +0800,

Re: [PATCH 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-04 Thread Lin Feng
Hi Zach, On 02/05/2013 07:02 AM, Zach Brown wrote: >>> index 71f613c..0e9b30a 100644 >>> --- a/fs/aio.c >>> +++ b/fs/aio.c >>> @@ -138,9 +138,15 @@ static int aio_setup_ring(struct kioctx *ctx) >>> } >>> >>> dprintk("mmap address: 0x%08lx\n", info->mmap_base); >>> +#ifdef CONFIG_MEMORY_H

Re: [PATCH 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-04 Thread Lin Feng
Hi Jeff, On 02/04/2013 11:18 PM, Jeff Moyer wrote: >> --- >> fs/aio.c | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/fs/aio.c b/fs/aio.c >> index 71f613c..0e9b30a 100644 >> --- a/fs/aio.c >> +++ b/fs/aio.c >> @@ -138,9 +138,15 @@ static int aio_setup_ring(struct kioctx *ctx) >>

Re: [PATCH 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-04 Thread Lin Feng
Hi Minchan, On 02/05/2013 08:58 AM, Minchan Kim wrote: > Hello, > > On Mon, Feb 04, 2013 at 06:04:06PM +0800, Lin Feng wrote: >> Currently get_user_pages() always tries to allocate pages from movable zone, >> as discussed in thread https://lkml.org/lkml/2012/11/29/69, i

Re: [PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-04 Thread Lin Feng
Hi Andrew, On 02/05/2013 08:06 AM, Andrew Morton wrote: > > melreadthis > > On Mon, 4 Feb 2013 18:04:07 +0800 > Lin Feng wrote: > >> get_user_pages() always tries to allocate pages from movable zone, which is >> not >> reliable to memory hotremove fram

[PATCH 1/2] mm: hotplug: implement non-movable version of get_user_pages() called get_user_pages_non_movable()

2013-02-04 Thread Lin Feng
er of get_user_pages() but it makes sure that all pages come from non-movable zone via additional page migration. Cc: Andrew Morton Cc: Mel Gorman Cc: KAMEZAWA Hiroyuki Cc: Yasuaki Ishimatsu Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- include/linux/mm.h | 5 in

[PATCH 0/2] mm: hotplug: implement non-movable version of get_user_pages() to kill long-time pin pages

2013-02-04 Thread Lin Feng
page migration. The 2nd patch gets around the aio ring pages can't be migrated bug caused by get_user_pages() via using the new function. It only works when configed with CONFIG_MEMORY_HOTREMOVE, otherwise it uses the old version of get_user_pages(). Lin Feng (2): mm: hotplug: implement non-movab

[PATCH 2/2] fs/aio.c: use get_user_pages_non_movable() to pin ring pages when support memory hotremove

2013-02-04 Thread Lin Feng
orton Reviewed-by: Tang Chen Reviewed-by: Gu Zheng Signed-off-by: Lin Feng --- fs/aio.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/fs/aio.c b/fs/aio.c index 71f613c..0e9b30a 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -138,9 +138,15 @@ static int aio_setup_ring(struct kioctx

Re: [PATCH] memory-hotplug: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed

2013-01-20 Thread Lin Feng
Hi Michal, On 01/18/2013 09:58 PM, Michal Hocko wrote: > On Fri 18-01-13 15:54:36, Lin Feng wrote: >> Besides page_isolation.c selected by MEMORY_ISOLATION under MEMORY_HOTPLUG >> is also such case, move it too. > > Yes, it seems that only HOTREMOVE needs MEMORY_ISOLATION

Re: [PATCH v3 1/2] memory-hotplug: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform not support

2013-01-17 Thread Lin Feng
Hi Michal, On 01/17/2013 09:05 PM, Michal Hocko wrote: > On Thu 17-01-13 18:37:10, Lin Feng wrote: > [...] >>> > > I am still not sure I understand the relation to MEMORY_HOTREMOVE. >>> > > Is register_page_bootmem_info_node required/helpful even if >>>

[PATCH] memory-hotplug: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed

2013-01-17 Thread Lin Feng
page_isolation.c selected by MEMORY_ISOLATION under MEMORY_HOTPLUG is also such case, move it too. Signed-off-by: Lin Feng --- mm/Kconfig |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index f8c5799..a96c010 100644 --- a/mm/Kconfig +++ b/mm

Re: [PATCH v3 1/2] memory-hotplug: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform not support

2013-01-17 Thread Lin Feng
Hi Michal, On 01/16/2013 10:14 PM, Michal Hocko wrote: > On Wed 16-01-13 16:14:18, Lin Feng wrote: > [...] >> diff --git a/mm/Kconfig b/mm/Kconfig >> index 278e3ab..f8c5799 100644 >> --- a/mm/Kconfig >> +++ b/mm/Kconfig >> @@ -162,10 +162,18 @@ config MOVABLE_N

[PATCH v3 1/2] memory-hotplug: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform not support

2013-01-16 Thread Lin Feng
It's implemented by adding a new Kconfig option named CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected by memory-hotplug feature fully supported archs(currently only on x86_64). Reported-by: Michal Hocko Signed-off-by: Lin Feng --- ChangeLog v2->v3: - Rename the pat

[PATCH v3 0/2] memory-hotplug: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform not support

2013-01-16 Thread Lin Feng
al. 2) patch 2/2: - New added, remove unimplemented functions suggested by Michal. ChangeLog v1->v2: 1) patch 1/2: - Add a Kconfig option named HAVE_BOOTMEM_INFO_NODE suggested by Michal, which will be automatically selected by supported archs(currently only on x86_64). Lin Feng (1)

[PATCH 2/2] memory-hotplug: cleanup: removing the arch specific functions without any implementation

2013-01-16 Thread Lin Feng
from top to end. Signed-off-by: Michal Hocko Signed-off-by: Lin Feng --- arch/ia64/mm/discontig.c |5 - arch/powerpc/mm/init_64.c |5 - arch/s390/mm/vmem.c |6 -- arch/sparc/mm/init_64.c |5 - 4 files changed, 0 insertions(+), 21 deletions(-) diff --git a

Re: [PATCH V2] memory-hotplug: revert register_page_bootmem_info_node() to empty when platform related code is not implemented

2013-01-15 Thread Lin Feng
Hi Michal, On 01/15/2013 10:20 PM, Michal Hocko wrote: >> +#else >> > +void register_page_bootmem_info_node(struct pglist_data *pgdat) >> > +{ >> > + /* TODO */ >> > +} > I think that TODO is misleading here because the function should be > empty if !CONFIG_HAVE_BOOTMEM_INFO_NODE. I would also su

Re: [PATCH] memory-hotplug: revert register_page_bootmem_info_node() to empty when platform related code is not implemented

2013-01-15 Thread Lin Feng
Hi Michal, I have updated to V2 version according to what you said, would you please take a look if it conforms to what you think? thanks, linfeng On 01/15/2013 02:43 AM, Michal Hocko wrote: > This is just ugly. Could you please add something like HAVE_BOOTMEM_INFO_NODE > or something with a

[PATCH V2] memory-hotplug: revert register_page_bootmem_info_node() to empty when platform related code is not implemented

2013-01-15 Thread Lin Feng
ig option named CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected by supported archs(currently only on x86_64). Reported-by: Michal Hocko Signed-off-by: Lin Feng --- ChangeLog v1->v2: - Add a Kconfig option named HAVE_BOOTMEM_INFO_NODE suggested by Michal, which will be automatically sel

[PATCH] memory-hotplug: revert register_page_bootmem_info_node() to empty when platform related code is not implemented

2013-01-14 Thread Lin Feng
which is a hotplug generic function but falls back to calling the platform related function register_page_bootmem_memmap(). On other platforms such as powerpc it's not implemented, so on such platforms, revert them to empty as they were before. Reported-by: Michal Hocko Signed-off-by: Lin Fen

Re: mmots: memory-hotplug: implement register_page_bootmem_info_section of sparse-vmemmap fix

2013-01-11 Thread Lin Feng
except for this, Tested-by: Lin Feng

Re: mmots: memory-hotplug-remove-memmap-of-sparse-vmemmap.patch compile fix

2013-01-11 Thread Lin Feng
It looks fine to me. Tested-by: Lin Feng On 01/11/2013 05:53 PM, Michal Hocko wrote: > Defconfig for x86_64 complains: > arch/x86/mm/init_64.c: In function ‘vmemmap_free’: > arch/x86/mm/init_64.c:1317: error: implicit declaration of function > ‘remove_pagetable’ > > vmemmap

Re: mmots: memory-hotplug: implement register_page_bootmem_info_section of sparse-vmemmap fix

2013-01-11 Thread Lin Feng
Hi Michal, On 01/11/2013 06:47 PM, Michal Hocko wrote: > Signed-off-by: Michal Hocko > --- > arch/x86/mm/init_64.c |3 +++ > include/linux/mm.h|2 ++ > 2 files changed, 5 insertions(+) > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index ddd3b58..d8edf52 100644 > -

Re: [PATCH v2] mm: memblock: fix wrong memmove size in memblock_merge_regions()

2013-01-07 Thread Lin Feng
On 01/08/2013 05:23 AM, Andrew Morton wrote: > On Mon, 7 Jan 2013 11:41:36 +0800 > Lin Feng wrote: > >> The memmove span covers from (next+1) to the end of the array, and the index >> of next is (i+1), so the index of (next+1) is (i+2). So the size of remaining >> a

[PATCH v2] mm: memblock: fix wrong memmove size in memblock_merge_regions()

2013-01-06 Thread Lin Feng
The memmove span covers from (next+1) to the end of the array, and the index of next is (i+1), so the index of (next+1) is (i+2). So the size of remaining array elements is (type->cnt - (i + 2)). Cc: Tejun Heo Reviewed-by: Wanpeng Li Signed-off-by: Lin Feng --- ChangeLog v1->v2: -
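The index arithmetic in this changelog can be checked with a small user-space sketch. It is not the kernel's memblock_merge_regions(); `struct region` and `merge_regions()` are hypothetical names used only for illustration. Assuming a sorted array of `cnt` regions where element `i` has just absorbed element `i + 1`, the elements still to be shifted down are those at indices `i + 2` through `cnt - 1`, which is `cnt - (i + 2)` elements:

```c
#include <string.h>

/* Hypothetical stand-in for a memblock region: [base, base + size). */
struct region {
	unsigned long base;
	unsigned long size;
};

/*
 * Merge physically adjacent neighbours in a sorted array in place.
 * Returns the new element count.
 */
static unsigned long merge_regions(struct region *r, unsigned long cnt)
{
	unsigned long i = 0;

	while (cnt > 0 && i < cnt - 1) {
		struct region *this = &r[i];
		struct region *next = &r[i + 1];

		if (this->base + this->size != next->base) {
			i++;		/* not adjacent, move on */
			continue;
		}

		this->size += next->size;
		/*
		 * next is at index i + 1, so (next + 1) is at index i + 2.
		 * Only the cnt - (i + 2) elements from there to the end
		 * of the array need to move down by one slot.
		 */
		memmove(next, next + 1, (cnt - (i + 2)) * sizeof(*next));
		cnt--;
	}
	return cnt;
}
```

With a count of `cnt - (i + 1)` instead, the copy would read one element past the last valid entry of the array, which is the kind of off-by-one the changelog describes.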

Re: [RFC PATCH] mm: memblock: optimize memblock_find_in_range_node() to minimize the search work

2013-01-06 Thread Lin Feng
On 01/04/2013 11:01 PM, Tejun Heo wrote: > On Fri, Jan 04, 2013 at 05:24:53PM +0800, Lin Feng wrote: >> The memblock array is in ascending order and we traverse the memblock array >> in >> reverse order so we can add some simple check to reduce the search work. >>

Re: [RFC PATCH] mm: memblock: fix wrong memmove size in memblock_merge_regions()

2013-01-06 Thread Lin Feng
On 01/05/2013 09:04 AM, Wanpeng Li wrote: > On Fri, Jan 04, 2013 at 05:10:50PM +0800, Lin Feng wrote: >> The memmove span covers from (next+1) to the end of the array, and the index >> of next is (i+1), so the index of (next+1) is (i+2). So the size of remaining >> array e

Re: [RFC PATCH] mm: memblock: fix wrong memmove size in memblock_merge_regions()

2013-01-06 Thread Lin Feng
On 01/04/2013 10:56 PM, Tejun Heo wrote: > On Fri, Jan 04, 2013 at 05:10:50PM +0800, Lin Feng wrote: >> The memmove span covers from (next+1) to the end of the array, and the index >> of next is (i+1), so the index of (next+1) is (i+2). So the size of remaining >> array e

[RFC PATCH] mm: memblock: optimize memblock_find_in_range_node() to minimize the search work

2013-01-04 Thread Lin Feng
The memblock array is in ascending order and we traverse the memblock array in reverse order so we can add some simple check to reduce the search work. Tejun fixed an underflow bug in 5d53cb27d8, but I think we could break there for the same reason. Cc: Tejun Heo Signed-off-by: Lin Feng --- mm
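The early-exit idea can be illustrated with a simplified user-space sketch. This is not the code from the patch or from mm/memblock.c: `struct range`, `find_top_down()` and its parameters are hypothetical, and alignment handling is left out. It assumes the free ranges are sorted in ascending order and scanned from the highest one down, so once a range ends at or below the requested lower bound, every remaining range does too and the scan can stop:

```c
#include <stddef.h>

/* Hypothetical free range: [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Search sorted free ranges from the top down for `size` bytes inside
 * [start, end). Returns the base of the highest fit, or 0 if none.
 */
static unsigned long find_top_down(const struct range *free_ranges,
				   size_t cnt, unsigned long start,
				   unsigned long end, unsigned long size)
{
	for (size_t i = cnt; i-- > 0; ) {
		unsigned long lo, hi;

		/*
		 * Ascending order, reverse walk: all remaining ranges lie
		 * even lower, so none of them can overlap [start, end).
		 */
		if (free_ranges[i].end <= start)
			break;

		/* Clamp the free range to the requested window. */
		lo = free_ranges[i].start > start ? free_ranges[i].start : start;
		hi = free_ranges[i].end < end ? free_ranges[i].end : end;

		if (hi < lo || hi - lo < size)
			continue;	/* window too small, try a lower range */

		return hi - size;
	}
	return 0;
}
```

Comparing `hi - lo` against `size` rather than computing `hi - size` first avoids the unsigned underflow that a check of this kind has to guard against.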

[RFC PATCH] mm: memblock: fix wrong memmove size in memblock_merge_regions()

2013-01-04 Thread Lin Feng
move the remaining array elements until we find a non-mergeable element, but now we memmove every time we find a neighboring compatible region. I'm not sure if the trial is worth it though. Cc: Tejun Heo Signed-off-by: Lin Feng --- mm/memblock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH] pci: remove redundant function calls in pci_reassigndev_resource_alignment()

2012-12-28 Thread Lin Feng
On 12/28/2012 03:45 PM, Yinghai Lu wrote: > On Thu, Dec 27, 2012 at 11:31 PM, Lin Feng wrote: >> pci_reassigndev_resource_alignment() potentially calls >> pci_specified_resource_alignment() twice, which is redundant. >> >> pci_is_reas

[PATCH] pci: remove redundant function calls in pci_reassigndev_resource_alignment()

2012-12-27 Thread Lin Feng
, so also make some cleanup. Signed-off-by: Lin Feng Cc: Yinghai Lu --- drivers/pci/pci.c | 16 ++-- 1 files changed, 2 insertions(+), 14 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 5cb5820..789f401 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c

[PATCH] pci-sysfs: replace mutex_lock with mutex_trylock to avoid potential deadlock situation

2012-12-26 Thread Lin Feng
read_helper+0x4/0x10 [] ? kthread_freezable_should_stop+0x70/0x70 [] ? gs_change+0x13/0x13 Reported-by: Taku Izumi Signed-off-by: Lin Feng Signed-off-by: Gu Zheng --- drivers/pci/pci-sysfs.c | 42 ++ 1 files changed, 26 insertions(+), 16 deletions(-) di

Re: [PATCH v3 3/5] page_alloc: Introduce zone_movable_limit[] to keep movable limit for nodes

2012-12-11 Thread Lin Feng
the ranges in movable_map.map[] belongs, and calculates the >>>>> low boundary of ZONE_MOVABLE for each node. >>>>> >>>>> Signed-off-by: Tang Chen >>>>> Signed-off-by: Jiang Liu >>>>> Reviewed-by: Wen Congyang >>>>>

[PATCH] mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()

2012-12-11 Thread Lin Feng
Wrapper function reserve_bootmem_generic() currently has no caller, so clean it up. Signed-off-by: Lin Feng --- include/linux/bootmem.h |3 --- mm/bootmem.c|6 -- 2 files changed, 0 insertions(+), 9 deletions(-) diff --git a/include/linux/bootmem.h b/include/linux

Re: [BUG REPORT] [mm-hotplug, aio] aio ring_pages can't be offlined

2012-12-02 Thread Lin Feng
On 11/30/2012 06:47 PM, Andrew Morton wrote: > On Fri, 30 Nov 2012 18:29:30 +0800 Lin Feng wrote: > >>> add a new library function which callers can use before (or after?) >>> calling get_user_pages[_fast](). >> Sorry, I'm not quite understand what "li

Re: [BUG REPORT] [mm-hotplug, aio] aio ring_pages can't be offlined

2012-12-02 Thread Lin Feng
On 11/30/2012 07:00 PM, Mel Gorman wrote: >> >> Well, that's a fairly low-level implementation detail. A more typical >> approach would be to add a new get_user_pages_non_movable() or such. >> That would probably have the same signature as get_user_pages(), with >> one additional argument. The

Re: [BUG REPORT] [mm-hotplug, aio] aio ring_pages can't be offlined

2012-12-02 Thread Lin Feng
hi Domenico, Sorry for my late reply and thanks for your attention, see below :) On 11/30/2012 11:24 PM, Domenico Andreoli wrote: > On Thu, Nov 29, 2012 at 02:54:58PM +0800, Lin Feng wrote: >> Hi all, > > Hi Lin, > >> We encounter a "Resource temporarily unavail

Re: [BUG REPORT] [mm-hotplug, aio] aio ring_pages can't be offlined

2012-11-30 Thread Lin Feng
On 11/30/2012 03:55 PM, Andrew Morton wrote: > On Fri, 30 Nov 2012 15:01:26 +0800 Lin Feng wrote: > >> >> >> On 11/30/2012 01:57 PM, Andrew Morton wrote: >>> On Fri, 30 Nov 2012 11:42:05 +0800 Lin Feng wrote: >>> >>>> hi Andrew,

Re: [BUG REPORT] [mm-hotplug, aio] aio ring_pages can't be offlined

2012-11-29 Thread Lin Feng
On 11/30/2012 01:57 PM, Andrew Morton wrote: > On Fri, 30 Nov 2012 11:42:05 +0800 Lin Feng wrote: > >> hi Andrew, >> >> On 11/30/2012 07:39 AM, Andrew Morton wrote: >>> Tricky. >>> >>> I expect the same problem would occur with pages which ar
