On 2016/7/19 22:14, Vlastimil Babka wrote:
> On 07/19/2016 03:48 PM, Xishi Qiu wrote:
>> On 2016/7/19 21:17, Vlastimil Babka wrote:
>>
>>> On 07/19/2016 02:43 PM, Yisheng Xie wrote:
>>>> hi all,
>>>> I'm getting a 2-order page allocation failure problem on 4.1.18.
>>>> From the Mem-info, it seems the system have much zero order free pages 
>>>> which can be used for memory compaction.
>>>> Is it possible that the memory compacted by current process used by other 
>>>> process soon, which cause page allocation failure of current process ?
>>>
>>> It's possible, but an order-2 allocation should retry compaction in such 
>>> case.
>>>
>>
>> Hi Vlastimil,
>>
>> mode:0x2000d1 means it expects to alloc from zone_dma, (on arm64 zone_dma is 
>> 0-4G)
> 
> Yes, but I don't see where the __GFP_DMA comes from. The backtrace suggests 
> it's alloc_thread_info_node() which uses THREADINFO_GFP which is GFP_KERNEL | 
> __GFP_NOTRACK. There shouldn't be __GFP_DMA, even on arm64. Are there some 
> local modifications to the kernel source?
> 
>> The page cache is very small(active_file:292kB inactive_file:240kB),
>> so did_some_progress may be zero, and will not retry, right?
> 
> Could be, and then __alloc_pages_may_oom() has this:
> 
>         /* The OOM killer does not needlessly kill tasks for lowmem */
>         if (ac->high_zoneidx < ZONE_NORMAL)
>                 goto out;
> 
> So no oom and no faking progress for non-costly order that would result in 
> retry, because of that mysterious __GFP_DMA...

hi Vlastimil,
We do make change and add __GFP_DMA flag here for our platform driver's 
problem. Another question is  why it will do retry here, for it will goto out 
with did_some_progress=0 ?

             if (!did_some_progress)
                 goto nopage;

> 
>> Thanks,
>> Xishi Qiu
>>
>>>>
>>>> --- dmesg messages ---
>>>> 07-13 08:41:51.341 
>>>> <4>[309805.658142s][pid:1361,cpu5,sManagerService]sManagerService: page 
>>>> allocation failure: order:2, mode:0x2000d1
>>>
>>> Geez, these old kernels that can't print the mode human-readably...
>>> #define ___GFP_DMA              0x01
>>> #define ___GFP_WAIT             0x10
>>> #define ___GFP_IO               0x40
>>> #define ___GFP_FS               0x80
>>> #define ___GFP_NOTRACK          0x200000
>>>
>>> Compaction indeed should be possible. And it's a non-costly allocation. It 
>>> shouldn't even be allowed to fail, unless the process was killed?
>>>
>>>> 07-13 08:41:51.346 <4>[309805.658142s][pid:1361,cpu5,sManagerService]CPU: 
>>>> 5 PID: 1361 Comm: sManagerService Tainted: G        W       
>>>> 4.1.18-g09f547b #1
>>>
>>> There's a W taint flag so there should have been a WARN message/backtrace 
>>> preceding it. What is it? It could be related.
this only warning info about "sManagerService" is a secpolicy denied, seems not 
related to this problem.
07-13 08:40:11.925 <36>[309706.248809s][pid:349,cpu0,logd.auditd]type=1400 
audit(1468370411.909:1628880): avc: denied { getattr } for pid=1361 
comm="sManagerService" path="/proc/uid_iostats/show_uid_iostats" dev="proc" 
ino=4026531864 scontext=u:r:system_server:s0 
tcontext=u:object_r:proc_uid_iomonitor_showstat:s0 tclass=file permissive=0
>>>
>>>> 07-13 08:41:51.347 <4>[309805.658142s][pid:1361,cpu5,sManagerService]TGID: 
>>>> 981 Comm: system_server
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658172s][pid:1361,cpu5,sManagerService]Hardware name: hi3650 
>>>> (DT)
>>>> 07-13 08:41:51.347 <0>[309805.658172s][pid:1361,cpu5,sManagerService]Call 
>>>> trace:
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658203s][pid:1361,cpu5,sManagerService][<ffffffc00008a0a4>] 
>>>> dump_backtrace+0x0/0x150
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658203s][pid:1361,cpu5,sManagerService][<ffffffc00008a214>] 
>>>> show_stack+0x20/0x28
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658203s][pid:1361,cpu5,sManagerService][<ffffffc000fc4034>] 
>>>> dump_stack+0x84/0xa8
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658203s][pid:1361,cpu5,sManagerService][<ffffffc00018af54>] 
>>>> warn_alloc_failed+0x10c/0x164
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658233s][pid:1361,cpu5,sManagerService][<ffffffc00018e778>] 
>>>> __alloc_pages_nodemask+0x5b4/0x888
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658233s][pid:1361,cpu5,sManagerService][<ffffffc00018eb84>] 
>>>> alloc_kmem_pages_node+0x44/0x50
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658233s][pid:1361,cpu5,sManagerService][<ffffffc00009fa78>] 
>>>> copy_process.part.46+0x140/0x15ac
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658233s][pid:1361,cpu5,sManagerService][<ffffffc0000a10a0>] 
>>>> do_fork+0xe8/0x444
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658264s][pid:1361,cpu5,sManagerService][<ffffffc0000a14e8>] 
>>>> SyS_clone+0x3c/0x48
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658264s][pid:1361,cpu5,sManagerService]Mem-Info:
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658264s][pid:1361,cpu5,sManagerService]active_anon:491074 
>>>> inactive_anon:118072 isolated_anon:0
>>>> 07-13 08:41:51.347 <4>[309805.658264s] active_file:19087 
>>>> inactive_file:9843 isolated_file:0
>>>> 07-13 08:41:51.347 <4>[309805.658264s] unevictable:322 dirty:20 
>>>> writeback:0 unstable:0
>>>> 07-13 08:41:51.347 <4>[309805.658264s] slab_reclaimable:11788 
>>>> slab_unreclaimable:28068
>>>> 07-13 08:41:51.347 <4>[309805.658264s] mapped:20633 shmem:4038 
>>>> pagetables:10865 bounce:72
>>>> 07-13 08:41:51.347 <4>[309805.658264s] free:118678 free_pcp:58 free_cma:0
>>>> 07-13 08:41:51.347 <4>[309805.658294s][pid:1361,cpu5,sManagerService]DMA 
>>>> free:470628kB min:6800kB low:29116kB high:30816kB active_anon:1868540kB 
>>>> inactive_anon:376100kB active_file:292kB inactive_file:240kB 
>>>> unevictable:1080kB isolated(anon):0kB isolated(file):0kB present:3446780kB 
>>>> managed:3307056kB mlocked:1080kB dirty:80kB writeback:0kB mapped:7604kB 
>>>> shmem:14380kB slab_reclaimable:47152kB slab_unreclaimable:112268kB 
>>>> kernel_stack:28224kB pagetables:43460kB unstable:0kB bounce:288kB 
>>>> free_pcp:204kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB 
>>>> pages_scanned:0 all_unreclaimable? no
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658294s][pid:1361,cpu5,sManagerService]lowmem_reserve[]: 0 415 
>>>> 415
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658294s][pid:1361,cpu5,sManagerService]Normal free:4084kB 
>>>> min:872kB low:3740kB high:3960kB active_anon:95756kB inactive_anon:96188kB 
>>>> active_file:76056kB inactive_file:39132kB unevictable:208kB 
>>>> isolated(anon):0kB isolated(file):0kB present:524288kB managed:425480kB 
>>>> mlocked:208kB dirty:0kB writeback:0kB mapped:74928kB shmem:1772kB 
>>>> slab_reclaimable:0kB slab_unreclaimable:4kB kernel_stack:0kB 
>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:28kB local_pcp:0kB 
>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658294s][pid:1361,cpu5,sManagerService]lowmem_reserve[]: 0 0 0
>>>> 07-13 08:41:51.347 <4>[309805.658325s][pid:1361,cpu5,sManagerService]DMA: 
>>>> 68324*4kB (UEM) 24706*8kB (UER) 2*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 
>>>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 470976kB
>>>
>>> Indeed compaction should be doing something with this...
>>>
>>>> 07-13 08:41:51.347 
>>>> <4>[309805.658355s][pid:1361,cpu5,sManagerService]Normal: 270*4kB (UMR) 
>>>> 82*8kB (UMR) 48*16kB (MR) 25*32kB (R) 12*64kB (R) 2*128kB (R) 1*256kB (R) 
>>>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4584kB
>>>> 07-13 08:41:51.347 <4>[309805.658386s][pid:1361,cpu5,sManagerService]38319 
>>>> total pagecache pages
>>>> 07-13 08:41:51.347 <4>[309805.658386s][pid:1361,cpu5,sManagerService]5384 
>>>> pages in swap cache
>>>> 07-13 08:41:51.347 <4>[309805.658386s][pid:1361,cpu5,sManagerService]Swap 
>>>> cache stats: add 628084, delete 622700, find 2187699/2264909
>>>> 07-13 08:41:51.347 <4>[309805.658386s][pid:1361,cpu5,sManagerService]Free 
>>>> swap  = 0kB
>>>> 07-13 08:41:51.348 <4>[309805.658416s][pid:1361,cpu5,sManagerService]Total 
>>>> swap = 524284kB
>>>> 07-13 08:41:51.348 
>>>> <4>[309805.658416s][pid:1361,cpu5,sManagerService]992767 pages RAM
>>>> 07-13 08:41:51.348 <4>[309805.658416s][pid:1361,cpu5,sManagerService]0 
>>>> pages HighMem/MovableOnly
>>>> 07-13 08:41:51.348 <4>[309805.658416s][pid:1361,cpu5,sManagerService]51441 
>>>> pages reserved
>>>> 07-13 08:41:51.348 <4>[309805.658416s][pid:1361,cpu5,sManagerService]8192 
>>>> pages cma reserved
>>>> 07-13 08:41:51.767 
>>>> <6>[309806.068298s][pid:2247,cpu6,notification-sq][I/sensorhub] 
>>>> shb_release ok
>>>>
>>>
>>>
>>> .
>>>
>>
>>
>>
> 
> 
> .
> 

Reply via email to