On 27/05/2025 10:34, hyunki00....@samsung.com wrote: - old addresses
Please backport 32e92d9f6f87 ("iommu/iova: Separate out rcache init") to linux-5.15.y
If you want some work done, then you generally have to do it yourself or pay someone to do it. Or report a real problem, so someone who cares helps.
Commit de53fd7aedb1 32e92d9f6f87 ("iommu/iova: Separate out rcache init") fixes the issue below. This should be applied to all stable kernels that applied commit.

Issue
=====
As you mentioned in the commit message, failures in init_iova_rcaches() are not handled safely, and a problem actually occurs. Given the order of the 2 lines below in linux-5.15.y, the cpuhp callback may be called before the percpu variable is allocated.

cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, &iovad->cpuhp_dead);
init_iova_rcaches(iovad);

The problem occurred in Linux kernel version 5.15.144 when remove_cpu(cpu) was called between these 2 lines.
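For readers following the thread, the ordering problem described above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: every name in it (toy_domain, toy_cpu_dead, toy_init_register_first, toy_init_alloc_first) is hypothetical and not a kernel API, and the "allocate first" variant only illustrates why the order matters. It is not claimed to be what the referenced commit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for the domain: only the two pieces of state the race touches. */
struct toy_domain {
    int *percpu_cache;        /* models the per-CPU cache storage; NULL until allocated */
    bool hotplug_registered;  /* models the CPU-dead callback being registered */
};

/* Models the CPU-dead callback. Returns false where the real code would fault. */
static bool toy_cpu_dead(struct toy_domain *d, int cpu)
{
    assert(d != NULL && cpu >= 0 && cpu < TOY_NR_CPUS);
    if (!d->hotplug_registered)
        return true;              /* callback not registered yet: nothing runs */
    if (d->percpu_cache == NULL)
        return false;             /* callback would touch unallocated storage */
    d->percpu_cache[cpu] = 0;     /* "flush" this CPU's cache */
    return true;
}

static int toy_alloc_caches(struct toy_domain *d)
{
    d->percpu_cache = calloc(TOY_NR_CPUS, sizeof(*d->percpu_cache));
    return d->percpu_cache ? 0 : -1;
}

/*
 * Order from the report: register the callback, then allocate.
 * offline_cpu >= 0 injects a CPU-offline event between the two steps.
 */
static bool toy_init_register_first(struct toy_domain *d, int offline_cpu)
{
    bool ok = true;

    d->hotplug_registered = true;
    if (offline_cpu >= 0)
        ok = toy_cpu_dead(d, offline_cpu);
    if (toy_alloc_caches(d) != 0)
        return false;
    return ok;
}

/* Illustrative alternative: allocate, then register the callback. */
static bool toy_init_alloc_first(struct toy_domain *d, int offline_cpu)
{
    bool ok = true;

    if (toy_alloc_caches(d) != 0)
        return false;
    if (offline_cpu >= 0)
        ok = toy_cpu_dead(d, offline_cpu);
    d->hotplug_registered = true;
    return ok;
}
```

In this model, toy_init_register_first() reports a failure only when an offline event lands between the two steps, which matches the narrow window described in the report; toy_init_alloc_first() has no such window because the callback cannot run before the storage exists.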
So is this some artificial test you created to race CPU hotplug with adding/removing a device? Or something like that?
The following is the panic log:

[ 2.097125][ T1] Unable to handle kernel paging request at virtual address ffffffcb74a6b004
...
[ 2.097226][ T1] Call trace:
[ 2.097323][ T1] do_raw_spin_lock+0x1c/0x12c
[ 2.098469][ T1] _raw_spin_lock_irqsave+0x30/0x60
[ 2.118152][ T1] free_cpu_cached_iovas+0x50/0xb0
[ 2.118307][ T1] iova_cpuhp_dead+0x1c/0x30
[ 2.119447][ T1] cpuhp_invoke_callback+0x2d8/0x5b0
[ 2.119608][ T1] _cpu_down+0x17c/0x4a0
[ 2.139216][ T1] cpu_device_down+0x44/0x70
[ 2.139353][ T1] cpu_subsys_offline+0x10/0x20
[ 2.140503][ T1] device_offline+0xf4/0x130
[ 2.140640][ T1] remove_cpu+0x24/0x40
[ 2.160305][ T1] init_iova_domain+0xec/0x1f0

Here is my modification based on the top of the tree of linux-5.15.y