** Patch added: "0001-UBUNTU-SAUCE-arm64-Kconfig-Disable-ACPI_HOTPLUG_CPU.patch" https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2088047/+attachment/5840165/+files/0001-UBUNTU-SAUCE-arm64-Kconfig-Disable-ACPI_HOTPLUG_CPU.patch
--
You received this bug notification because you are a member of Canonical Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2088047

Title:
  log_check / kernel_tainted test from ubuntu_boot failed on Oracular AWS a1.metal

Status in ubuntu-kernel-tests:
  New

Bug description:
  Found on Oracular/6.11.0-11.11 boot testing on AWS a1.metal instance.

  The relevant console log excerpts:

  -----(snip)-----
  06:55:12 INFO | 2024-11-09T06:51:17.584884+00:00 ip-172-31-6-235 kernel: cpuinfo: failed to register hotplug callbacks.
  -----(snip)-----
  06:55:12 INFO | 2024-11-09T06:51:17.584978+00:00 ip-172-31-6-235 kernel: ------------[ cut here ]------------
  06:55:12 INFO | 2024-11-09T06:51:17.584980+00:00 ip-172-31-6-235 kernel: WARNING: CPU: 7 PID: 1 at fs/sysfs/group.c:128 internal_create_group+0xc4/0x380
  06:55:12 INFO | 2024-11-09T06:51:17.584981+00:00 ip-172-31-6-235 kernel: Modules linked in:
  06:55:12 INFO | 2024-11-09T06:51:17.584983+00:00 ip-172-31-6-235 kernel: CPU: 7 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-11-generic #11-Ubuntu
  06:55:12 INFO | 2024-11-09T06:51:17.584984+00:00 ip-172-31-6-235 kernel: Hardware name: Amazon EC2 a1.metal/Not Specified, BIOS 1.0 10/16/2017
  06:55:12 INFO | 2024-11-09T06:51:17.584985+00:00 ip-172-31-6-235 kernel: pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  06:55:12 INFO | 2024-11-09T06:51:17.584987+00:00 ip-172-31-6-235 kernel: pc : internal_create_group+0xc4/0x380
  06:55:12 INFO | 2024-11-09T06:51:17.584989+00:00 ip-172-31-6-235 kernel: lr : sysfs_create_group+0x24/0x50
  06:55:12 INFO | 2024-11-09T06:51:17.584993+00:00 ip-172-31-6-235 kernel: sp : ffff80008009bb90
  06:55:12 INFO | 2024-11-09T06:51:17.584995+00:00 ip-172-31-6-235 kernel: x29: ffff80008009bba0 x28: 0000000000000000 x27: ffff19093bd33ca8
  06:55:12 INFO | 2024-11-09T06:51:17.584997+00:00 ip-172-31-6-235 kernel: x26: 0000000000000000 x25: ffff436d28704000 x24: ffffd59c11b04a88
  06:55:12 INFO | 2024-11-09T06:51:17.584998+00:00 ip-172-31-6-235 kernel: x23: 0000000000000000 x22: ffffd59c14046768 x21: ffffd59c1362fca8
  06:55:12 INFO | 2024-11-09T06:51:17.585000+00:00 ip-172-31-6-235 kernel: x20: 0000000000000036 x19: 0000000000000004 x18: ffff800080095060
  06:55:12 INFO | 2024-11-09T06:51:17.585001+00:00 ip-172-31-6-235 kernel: x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
  06:55:12 INFO | 2024-11-09T06:51:17.585003+00:00 ip-172-31-6-235 kernel: x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
  06:55:12 INFO | 2024-11-09T06:51:17.585006+00:00 ip-172-31-6-235 kernel: x11: 0000000000000000 x10: 0000000000000000 x9 : ffffd59c1128fc4c
  06:55:12 INFO | 2024-11-09T06:51:17.585008+00:00 ip-172-31-6-235 kernel: x8 : 0101010101010101 x7 : 0000000000000000 x6 : 0000000000000000
  06:55:12 INFO | 2024-11-09T06:51:17.585010+00:00 ip-172-31-6-235 kernel: x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff1902003fa280
  06:55:12 INFO | 2024-11-09T06:51:17.585011+00:00 ip-172-31-6-235 kernel: x2 : ffffd59c12648f88 x1 : 0000000000000000 x0 : 0000000000000000
  06:55:12 INFO | 2024-11-09T06:51:17.585012+00:00 ip-172-31-6-235 kernel: Call trace:
  06:55:12 INFO | 2024-11-09T06:51:17.585013+00:00 ip-172-31-6-235 kernel: internal_create_group+0xc4/0x380
  06:55:12 INFO | 2024-11-09T06:51:17.585014+00:00 ip-172-31-6-235 kernel: sysfs_create_group+0x24/0x50
  06:55:12 INFO | 2024-11-09T06:51:17.585015+00:00 ip-172-31-6-235 kernel: topology_add_dev+0x28/0x50
  06:55:12 INFO | 2024-11-09T06:51:17.585016+00:00 ip-172-31-6-235 kernel: cpuhp_invoke_callback+0x200/0x780
  06:55:12 INFO | 2024-11-09T06:51:17.585021+00:00 ip-172-31-6-235 kernel: cpuhp_issue_call+0x100/0x198
  06:55:12 INFO | 2024-11-09T06:51:17.585023+00:00 ip-172-31-6-235 kernel: __cpuhp_setup_state_cpuslocked+0x128/0x330
  06:55:12 INFO | 2024-11-09T06:51:17.585024+00:00 ip-172-31-6-235 kernel: __cpuhp_setup_state+0x5c/0xa8
  06:55:12 INFO | 2024-11-09T06:51:17.585025+00:00 ip-172-31-6-235 kernel: topology_sysfs_init+0x40/0x78
  06:55:12 INFO | 2024-11-09T06:51:17.585026+00:00 ip-172-31-6-235 kernel: do_one_initcall+0x64/0x3a0
  06:55:12 INFO | 2024-11-09T06:51:17.585027+00:00 ip-172-31-6-235 kernel: do_initcalls+0x19c/0x210
  06:55:12 INFO | 2024-11-09T06:51:17.585028+00:00 ip-172-31-6-235 kernel: kernel_init_freeable+0x18c/0x1e8
  06:55:12 INFO | 2024-11-09T06:51:17.585029+00:00 ip-172-31-6-235 kernel: kernel_init+0x3c/0x190
  06:55:12 INFO | 2024-11-09T06:51:17.585031+00:00 ip-172-31-6-235 kernel: ret_from_fork+0x10/0x20
  06:55:12 INFO | 2024-11-09T06:51:17.585035+00:00 ip-172-31-6-235 kernel: ---[ end trace 0000000000000000 ]---
  06:55:12 INFO | 2024-11-09T06:51:17.585037+00:00 ip-172-31-6-235 kernel: sysfs: cannot create duplicate filename '/devices/cache'
  06:55:12 INFO | 2024-11-09T06:51:17.585038+00:00 ip-172-31-6-235 kernel: CPU: 5 UID: 0 PID: 47 Comm: cpuhp/5 Tainted: G W 6.11.0-11-generic #11-Ubuntu
  06:55:12 INFO | 2024-11-09T06:51:17.585039+00:00 ip-172-31-6-235 kernel: Tainted: [W]=WARN
  06:55:12 INFO | 2024-11-09T06:51:17.585040+00:00 ip-172-31-6-235 kernel: Hardware name: Amazon EC2 a1.metal/Not Specified, BIOS 1.0 10/16/2017
  06:55:12 INFO | 2024-11-09T06:51:17.585041+00:00 ip-172-31-6-235 kernel: Call trace:
  06:55:12 INFO | 2024-11-09T06:51:17.585146+00:00 ip-172-31-6-235 kernel: dump_backtrace+0x104/0x160
  06:55:12 INFO | 2024-11-09T06:51:17.585149+00:00 ip-172-31-6-235 kernel: show_stack+0x24/0x50
  06:55:12 INFO | 2024-11-09T06:51:17.585150+00:00 ip-172-31-6-235 kernel: dump_stack_lvl+0x84/0xc0
  06:55:12 INFO | 2024-11-09T06:51:17.585155+00:00 ip-172-31-6-235 kernel: dump_stack+0x1c/0x40
  06:55:12 INFO | 2024-11-09T06:51:17.585191+00:00 ip-172-31-6-235 kernel: sysfs_warn_dup+0xa8/0xf0
  06:55:12 INFO | 2024-11-09T06:51:17.585193+00:00 ip-172-31-6-235 kernel: sysfs_create_dir_ns+0x124/0x150
  06:55:12 INFO | 2024-11-09T06:51:17.585194+00:00 ip-172-31-6-235 kernel: create_dir+0x30/0x120
  06:55:12 INFO | 2024-11-09T06:51:17.585215+00:00 ip-172-31-6-235 kernel: kobject_add_internal+0x90/0x240
  06:55:12 INFO | 2024-11-09T06:51:17.585218+00:00 ip-172-31-6-235 kernel: kobject_add+0xa0/0x140
  06:55:12 INFO | 2024-11-09T06:51:17.585234+00:00 ip-172-31-6-235 kernel: device_add+0xd8/0x748
  06:55:12 INFO | 2024-11-09T06:51:17.585236+00:00 ip-172-31-6-235 kernel: cpu_device_create+0x19c/0x1c0
  06:55:12 INFO | 2024-11-09T06:51:17.585238+00:00 ip-172-31-6-235 kernel: cache_add_dev+0x84/0x428
  06:55:12 INFO | 2024-11-09T06:51:17.585252+00:00 ip-172-31-6-235 kernel: cacheinfo_cpu_online+0x90/0x138
  06:55:12 INFO | 2024-11-09T06:51:17.585254+00:00 ip-172-31-6-235 kernel: cpuhp_invoke_callback+0x200/0x780
  06:55:12 INFO | 2024-11-09T06:51:17.585256+00:00 ip-172-31-6-235 kernel: cpuhp_thread_fun+0x140/0x358
  06:55:12 INFO | 2024-11-09T06:51:17.585281+00:00 ip-172-31-6-235 kernel: smpboot_thread_fn+0x224/0x250
  06:55:12 INFO | 2024-11-09T06:51:17.585287+00:00 ip-172-31-6-235 kernel: kthread+0xf4/0x108
  06:55:12 INFO | 2024-11-09T06:51:17.585289+00:00 ip-172-31-6-235 kernel: ret_from_fork+0x10/0x20
  06:55:12 INFO | 2024-11-09T06:51:17.585299+00:00 ip-172-31-6-235 kernel: kobject: kobject_add_internal failed for cache with -EEXIST, don't try to register things with the same name in the same directory.

  This also was observed on 6.11.0-1004-aws and 6.11.0-1005-aws.
  Note that Noble is not affected. See [Affected versions] section for more details.

  -------------------------------------

  [Summary]

  - This is not a regression but caused by problematic ACPI table on a1.metal.
  - If ACPI table won't be fixed soon, it might be an option to add a workaround at least in our tree.
    Please see more details in section [Solution].

  [Cause]

  According to the warning messages, the following two registrations are failing:

  * cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:online", cpuid_cpu_online, cpuid_cpu_offline)
  * cpuhp_setup_state(CPUHP_AP_BASE_CACHEINFO_ONLINE, "base/cacheinfo:online", cacheinfo_cpu_online, cacheinfo_cpu_pre_down)

  Note that other cpuhp callbacks are failing as well. Boot-time tracing of cpuhp:* events reveals them:

  4) | /* cpuhp_enter: cpu: 0004 target: 238 step: 199 (cpu_capacity_sysctl_add) */
  4) | /* cpuhp_exit: cpu: 0004 state: 238 step: 199 ret: -2 */
  4) | /* cpuhp_enter: cpu: 0004 target: 238 step: 199 (cpuid_cpu_online) */
  4) | /* cpuhp_exit: cpu: 0004 state: 238 step: 199 ret: -19 */
  5) | /* cpuhp_enter: cpu: 0004 target: 238 step: 54 (topology_add_dev) */
  5) | /* cpuhp_exit: cpu: 0004 state: 238 step: 54 ret: -22 */
  5) | /* cpuhp_enter: cpu: 0005 target: 238 step: 193 (cacheinfo_cpu_online) */
  5) | /* cpuhp_exit: cpu: 0005 state: 238 step: 193 ret: -17 */

  These failures occur because CPU#4-15 are not enabled, even though they are in cpu_possible_mask and online.

  The issue is that acpi_get_phys_id() fails to get the phys_id for the processor devices (CPU#4-15) because of discrepancies in the ACPI tables.

  -> acpi_processor_get_info
  -> acpi_get_phys_id
  -> map_mat_entry
  -> map_madt_entry

  Processor Device _UIDs are sequential numbers starting from 0, while Processor UIDs in MADT/PPTT are non-sequential (0x0, 0x1, 0x2, 0x3, 0x100, 0x101, 0x102, 0x103, 0x200, 0x201, ...). This results in the map_madt_entry() failure for CPU#4-15.

  [Affected Versions]

  * All Oracular kernels are affected at the moment.
  * All Noble kernels are not affected at the moment.
  This is because only Oracular sets CONFIG_ACPI_HOTPLUG_CPU=y, as a result of two upstream commits:

  9d0873892f4d ("arm64: Kconfig: Enable hotplug CPU on arm64 if ACPI_PROCESSOR is enabled.")
  46800e38ef0e ("arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU")

  Both are originally included in its master kernel.

  [Solution]

  There are some options:

  (a) Override the ACPI table (while waiting for a firmware update)
  (b) Apply a workaround patch for o:aws
  (c) Set CONFIG_ACPI_HOTPLUG_CPU=n in some way

  [Experiment]

  Regarding (b), I cooked up a workaround patch (dirty hack) and confirmed that acpi_processor_get_info() now succeeds for all of CPU#4-15 and that the warning messages disappear. See the attached patch.
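
  As a rough illustration of the [Cause] section, here is a small Python sketch (a simplified model, not kernel code). It assumes that the MADT lookup succeeds only when a MADT entry carries the same UID as the Processor Device _UID, and it decodes the negative return values seen in the cpuhp trace. The names MADT_UIDS, lookup_madt, unmatched_devices and errno_name are made up for this example, and only the ten MADT UIDs quoted in the report are modelled because the list is truncated there.

```python
import errno

# Processor UIDs as quoted from MADT/PPTT in the report. The report truncates
# the list with "...", so only the quoted values are modelled here.
MADT_UIDS = [0x0, 0x1, 0x2, 0x3, 0x100, 0x101, 0x102, 0x103, 0x200, 0x201]


def lookup_madt(device_uid, madt_uids=MADT_UIDS):
    """Simplified model: the lookup succeeds only if a MADT entry has this UID."""
    return device_uid in madt_uids


def unmatched_devices(device_count, madt_uids=MADT_UIDS):
    """Processor Device _UIDs are sequential from 0; list those without a match."""
    return [uid for uid in range(device_count) if not lookup_madt(uid, madt_uids)]


def errno_name(ret):
    """Map a negative kernel-style return code (e.g. -17) to its errno symbol."""
    return errno.errorcode.get(-ret, "unknown")


if __name__ == "__main__":
    # One sequential _UID per modelled MADT entry.
    print("unmatched _UIDs:", unmatched_devices(len(MADT_UIDS)))
    # Return codes taken from the cpuhp_exit lines in the trace.
    for ret in (-2, -19, -22, -17):
        print(ret, errno_name(ret))
```

  In this model only _UIDs 0 to 3 find a matching entry, which is consistent with the failure starting at CPU#4. The last return code in the trace, -17, decodes to EEXIST, the same error named in the kobject_add_internal message.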