On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> >
> > lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
>
> They're still there:
>
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0

I ran into trouble on my 4-socket Broadwell server (so 8 memory
controllers, a whole pile of DIMMs, running from sb_edac.c).

Things start going wrong with:

[ 45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[ 45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
[ 45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[ 45.216667] Call Trace:
[ 45.216688] dump_stack+0x5c/0x7b
[ 45.216697] sysfs_warn_dup+0x56/0x70
[ 45.216702] sysfs_do_create_link_sd.isra.2+0x98/0xb0
[ 45.216714] bus_add_device+0x77/0x160
[ 45.216720] device_add+0x424/0x660
[ 45.216731] edac_create_sysfs_mci_device+0xb9/0x2f0
[ 45.216738] edac_mc_add_mc_with_groups+0x111/0x2b0
[ 45.216747] sbridge_init+0x13c9/0x2000 [sb_edac]
[ 45.216757] ? _raw_spin_lock+0x1d/0x20
[ 45.216765] ? free_pcppages_bulk+0x2ca/0x630
[ 45.216769] ? 0xffffffffc050f000
[ 45.216779] do_one_initcall+0x46/0x1c8
[ 45.216784] ? free_unref_page_commit+0x95/0x120
[ 45.216791] ? _cond_resched+0x15/0x40
[ 45.216798] ? kmem_cache_alloc_trace+0x153/0x1c0
[ 45.216805] do_init_module+0x5b/0x208
[ 45.216826] load_module+0x1a2d/0x1fb0
[ 45.216835] ? __do_sys_finit_module+0xe9/0x110
[ 45.216840] __do_sys_finit_module+0xe9/0x110
[ 45.216847] do_syscall_64+0x5b/0x180
[ 45.216852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 45.216856] RIP: 0033:0x7fcdec618bd9

and fell off a cliff after that.

Going back to the old code I have a "dimm0" on each of the eight controllers:

# find /sys -name dimm0
/sys/devices/system/edac/mc/mc6/dimm0
/sys/devices/system/edac/mc/mc4/dimm0
/sys/devices/system/edac/mc/mc2/dimm0
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc7/dimm0
/sys/devices/system/edac/mc/mc5/dimm0
/sys/devices/system/edac/mc/mc3/dimm0
/sys/devices/system/edac/mc/mc1/dimm0
/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0
/sys/bus/mc2/devices/dimm0
/sys/bus/mc0/devices/dimm0
/sys/bus/mc7/devices/dimm0
/sys/bus/mc5/devices/dimm0
/sys/bus/mc3/devices/dimm0
/sys/bus/mc1/devices/dimm0

# ls -l /sys/bus/mc0/devices
total 0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0

It looks like the new code isn't trying to place the dimm symlinks
in the proper subdirectories.

-Tony
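
P.S. A rough user-space illustration of the collision, not kernel code: if
every controller's devices are linked by bare name into one flat directory
such as /sys/bus/edac/devices, any name that exists on more than one
controller cannot be created twice. The helper name flat_bus_collisions is
made up for this sketch, and the path list is built by hand from the
dimm0 entries in the find output above (plus one mc0-only dimm3 for contrast).

```python
from collections import defaultdict
import posixpath


def flat_bus_collisions(device_paths):
    """Return {link_name: [paths]} for names wanted by more than one device.

    Models a single flat devices/ directory where each symlink is named
    after the last path component of the device it points to.
    """
    by_name = defaultdict(list)
    for path in device_paths:
        by_name[posixpath.basename(path)].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# Hand-built from the find output: one dimm0 per controller mc0..mc7.
devices = [f"/sys/devices/system/edac/mc/mc{i}/dimm0" for i in range(8)]
# dimm3 is listed once, under mc0 only, so it does not collide in this sketch.
devices.append("/sys/devices/system/edac/mc/mc0/dimm3")

collisions = flat_bus_collisions(devices)
for name, paths in sorted(collisions.items()):
    print(f"{name}: {len(paths)} devices want the same link name")
```

With one bus directory per controller, as in the old layout, each directory
sees a given name only once and the check comes back empty.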