On Wed, May 14, 2014 at 12:02:32PM -0700, Luck, Tony wrote: > When we suspend a laptop we offline all but one processor. But > the mce code registers on a notify chain so it can clean up > some sysfs entries. Part of that code calls device_unregister() > which will fire kobject_uevent() which might wake up some user > code that is watching for such things.
The issue being? It is not clear from the text what actually is the problem. > Patch below from Aiden Park works around the issue by avoiding > the device_unregister()/device_register() by just keeping track > of the original device registrations. > > 1) Is there a better way to avoid the kobject_uevent()? > 2) What user code actually uses all these sysfs files? > Some test code in mcelog(8) uses: > /sys/devices/system/machinecheck/machinecheck0/trigger > Some code in mce-test package adjusts > /sys/devices/system/machinecheck/machinecheck0/tolerant > But neither of these make use of all the per-cpu instances > created/destroyed on every suspend/resume (or logical processor > offline). Should we just move these out of per-cpu directories > and scrap this whole block of code? Oh, I'd love to. Especially "trigger" - it wants to be in debugfs anyway. The "tolerant" deal we probably need to discuss a bit more. Exposing MCE tolerance levels even to root is not such a good idea and this tolerant thing is primarily used in testing, AFAICT, so that the box doesn't actually panic when you inject errors and so on... > 3) checkpatch isn't happy about use of NR_CPUS I suspect the > usage here warrants for_each_possible_cpu() Right. > -Tony Luck > --- > > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c > b/arch/x86/kernel/cpu/mcheck/mce.c > old mode 100644 > new mode 100755 Something decided to make mce.c an executable script :-) > index 68317c80de7f..e49302f24ce5 > --- a/arch/x86/kernel/cpu/mcheck/mce.c > +++ b/arch/x86/kernel/cpu/mcheck/mce.c > @@ -113,6 +113,8 @@ static DEFINE_PER_CPU(struct work_struct, mce_work); > > static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs > *regs); > > +static struct device *mce_devices[NR_CPUS]; > + > /* > * CPU/chipset specific EDAC code can register a notifier call here to print > * MCE errors in a human-readable form. > @@ -2060,7 +2062,18 @@ static int mce_syscore_suspend(void) > > static void mce_syscore_shutdown(void) > { > + int i = 0; > + struct device *dev; > + > mce_disable_error_reporting(); > + > + for (i = 0; i < NR_CPUS; i++) { WARNING: usage of NR_CPUS is often wrong - consider using cpu_possible(), num_possible_cpus(), for_each_possible_cpu(), etc #64: FILE: arch/x86/kernel/cpu/mcheck/mce.c:2070: + for (i = 0; i < NR_CPUS; i++) { > + dev = mce_devices[i]; > + if (dev) { > + device_unregister(dev); > + mce_devices[i] = NULL; > + } > + } > } > > /* > @@ -2272,23 +2285,27 @@ static void mce_device_release(struct device *dev) > static int mce_device_create(unsigned int cpu) > { > struct device *dev; > - int err; > + int err = 0; > int i, j; > > if (!mce_available(&boot_cpu_data)) > return -EIO; > > - dev = kzalloc(sizeof *dev, GFP_KERNEL); > - if (!dev) > - return -ENOMEM; > - dev->id = cpu; > - dev->bus = &mce_subsys; > - dev->release = &mce_device_release; > + dev = mce_devices[cpu]; > + if (!dev) { > + dev = kzalloc(sizeof *dev, GFP_KERNEL); Please run through checkpatch: WARNING: sizeof *dev should be sizeof(*dev) #93: FILE: arch/x86/kernel/cpu/mcheck/mce.c:2296: + dev = kzalloc(sizeof *dev, GFP_KERNEL); > + if (!dev) > + return -ENOMEM; > + dev->id = cpu; > + dev->bus = &mce_subsys; > + dev->release = &mce_device_release; > > - err = device_register(dev); > - if (err) { > - put_device(dev); > - return err; > + err = device_register(dev); > + if (err) { > + put_device(dev); > + return err; > + } > + mce_devices[cpu] = dev; Ok, in any case, AFAIU it is a mechanism to cache those mce_devices and not generate them every time we come from suspend. And we have those mce_device per cpu vars which we use too. So this one adds another array of mce_devices. Dumb question: would it be possible to save us all that init/teardown everytime and simply toggle their visibility in sysfs instead? I mean, maybe play with device_create_file/device_remove_file only and hand it down those struct device *dev things we have cached per-cpu. This will definitely save us the kobject_uevent() calls. But it could be that I'm missing something obvious... -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/