> On 13.03.2021 at 05:04, Liang, Liang (Leo) <liang.li...@amd.com> wrote:
>
> [AMD Public Use]
>
> Hi David,
>
> Which benchmark tool do you prefer? Memtest86+ or something else?
Hi Leo,

I think you want something that runs natively under Linux.

I'm planning to code up a kernel module that walks all 4MB pages in the
freelists and runs a stream benchmark on each of them individually. Then we
might be able to identify the problematic range - if there is a problematic
range :)

I guess I'll have it running by Monday and will let you know.

Cheers!

>
> BRs,
> Leo
> -----Original Message-----
> From: David Hildenbrand <da...@redhat.com>
> Sent: Saturday, March 13, 2021 12:47 AM
> To: Liang, Liang (Leo) <liang.li...@amd.com>; Deucher, Alexander
> <alexander.deuc...@amd.com>; linux-ker...@vger.kernel.org; amd-gfx list
> <amd-gfx@lists.freedesktop.org>; Andrew Morton <a...@linux-foundation.org>
> Cc: Huang, Ray <ray.hu...@amd.com>; Koenig, Christian
> <christian.koe...@amd.com>; Mike Rapoport <r...@linux.ibm.com>; Rafael J.
> Wysocki <raf...@kernel.org>; George Kennedy <george.kenn...@oracle.com>
> Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail
> in __free_pages_core()")
>
>> On 12.03.21 17:19, Liang, Liang (Leo) wrote:
>> [AMD Public Use]
>>
>> Dmesg attached.
>>
>
>
> So, looks like the "real" slowdown starts once the buddy is up and running
> (no surprise).
>
>
> [    0.044035] Memory: 6856724K/7200304K available (14345K kernel code, 9699K rwdata, 5276K rodata, 2628K init, 12104K bss, 343324K reserved, 0K cma-reserved)
> [    0.044045] random: get_random_u64 called from __kmem_cache_create+0x33/0x460 with crng_init=1
> [    0.049025] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.050036] ftrace: allocating 47158 entries in 185 pages
> [    0.097487] ftrace: allocated 185 pages with 5 groups
> [    0.109210] rcu: Hierarchical RCU implementation.
>
> vs.
>
> [    0.041115] Memory: 6869396K/7200304K available (14345K kernel code, 3433K rwdata, 5284K rodata, 2624K init, 6088K bss, 330652K reserved, 0K cma-reserved)
> [    0.041127] random: get_random_u64 called from __kmem_cache_create+0x31/0x430 with crng_init=1
> [    0.041309] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.041335] ftrace: allocating 47184 entries in 185 pages
> [    0.055719] ftrace: allocated 185 pages with 5 groups
> [    0.055863] rcu: Hierarchical RCU implementation.
>
>
> And it gets especially bad during ACPI table processing:
>
> [    4.158303] ACPI: Added _OSI(Module Device)
> [    4.158767] ACPI: Added _OSI(Processor Device)
> [    4.159230] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    4.159705] ACPI: Added _OSI(Processor Aggregator Device)
> [    4.160551] ACPI: Added _OSI(Linux-Dell-Video)
> [    4.161359] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    4.162264] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [   17.713421] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [   18.716065] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [   20.743828] ACPI: EC: EC started
> [   20.744155] ACPI: EC: interrupt blocked
> [   20.945956] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [   20.946618] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle transactions
> [   20.947348] ACPI: Interpreter enabled
> [   20.951278] ACPI: (supports S0 S3 S4 S5)
> [   20.951632] ACPI: Using IOAPIC for interrupt routing
>
> vs.
>
> [    0.216039] ACPI: Added _OSI(Module Device)
> [    0.216041] ACPI: Added _OSI(Processor Device)
> [    0.216043] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    0.216044] ACPI: Added _OSI(Processor Aggregator Device)
> [    0.216046] ACPI: Added _OSI(Linux-Dell-Video)
> [    0.216048] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    0.216049] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [    0.228259] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [    0.229527] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [    0.231663] ACPI: EC: EC started
> [    0.231666] ACPI: EC: interrupt blocked
> [    0.233664] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [    0.233667] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle transactions
> [    0.233670] ACPI: Interpreter enabled
> [    0.233685] ACPI: (supports S0 S3 S4 S5)
> [    0.233687] ACPI: Using IOAPIC for interrupt routing
>
> The jump from 4.1 -> 17.7 seconds is especially bad.
>
> This might in fact indicate that we are using some very special, slow
> (ACPI?) memory for ordinary purposes, interfering with actual ACPI users?
>
> But again, this is just a wild guess: the system is extremely slow
> afterwards as well, yet we don't see any other pauses without signs of
> life that last that long.
>
>
> It would be interesting to run a simple memory bandwidth benchmark on the
> fast kernel with differing sizes, up to running OOM, to see whether there
> really is some memory that is just horribly slow once allocated and used.
>
> --
> Thanks,
>
> David / dhildenb