Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Sun, 19 Jul 2020 12:07:14 +0200 Stephen Berman wrote: > On Tue, 14 Jul 2020 16:11:35 +0200 Sebastian Andrzej Siewior > wrote: > [...] >> Stephen, the patch attached adds a WARN_ON() statement which will >> produce a stack trace (4 or so). Could please run 'dmesg' after a while >> and send it back. There should be a >> "WARNING in drivers/acpi/acpica/evmisc.c" >> >> statement or something along the lines. > > I've now rebuilt 5.6.4 with the new patch; sorry for the delay. Output > of dmesg attached. FWIW, while you were away I got an additional datapoint: someone who saw my original post about this problem to one of the Linux From Scratch mailing lists wrote me that he had the same problem, also getting messages about the cdrom timing out ("sr 3:0:0:0" in his case); I told him about your workaround of adding 'thermal.tzp=300' to the kernel commandline, and he replied that this works for him too. And it turns out we have similar motherboards: I have a Gigabyte Z390 M Gaming Rev. 1001 board and he has Gigabyte Z390 Designare rev 1.0. Steve Berman
Re: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly
On Fri, 22 Jan 2021 17:23:36 +0100 "Rafael J. Wysocki" wrote: > On Thu, Jan 14, 2021 at 7:35 PM Rafael J. Wysocki wrote: >> >> From: Rafael J. Wysocki >> >> Calling acpi_thermal_check() from acpi_thermal_notify() directly >> is problematic if _TMP triggers Notify () on the thermal zone for >> which it has been evaluated (which happens on some systems), because >> it causes a new acpi_thermal_notify() invocation to be queued up >> every time and if that takes place too often, an indefinite number of >> pending work items may accumulate in kacpi_notify_wq over time. >> >> Besides, it is not really useful to queue up a new invocation of >> acpi_thermal_check() if one of them is pending already. >> >> For these reasons, rework acpi_thermal_notify() to queue up a thermal >> check instead of calling acpi_thermal_check() directly and only allow >> one thermal check to be pending at a time. Moreover, only allow one >> acpi_thermal_check_fn() instance at a time to run >> thermal_zone_device_update() for one thermal zone and make it return >> early if it sees other instances running for the same thermal zone. >> >> While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(), >> as it is only called from there after the other changes made here. >> >> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877 >> Reported-by: Stephen Berman >> Diagnosed-by: Sebastian Andrzej Siewior >> Signed-off-by: Rafael J. Wysocki > > Well, it's been over a week since this was posted. > > Does anyone have any comments? Sorry, I haven't been able to make time to test the patch yet, but I'll try to do so this weekend. Is it just the patch below that I should apply, ignoring the previous patches you sent? And can I apply it to the current mainline kernel? Thanks, Steve Berman >> --- >> drivers/acpi/thermal.c | 46 +- >> 1 file changed, 33 insertions(+), 13 deletions(-) >> >> Index: linux-pm/drivers/acpi/thermal.c >> === >> --- linux-pm.orig/drivers/acpi/thermal.c >> +++ linux-pm/drivers/acpi/thermal.c >> @@ -174,6 +174,8 @@ struct acpi_thermal { >> struct thermal_zone_device *thermal_zone; >> int kelvin_offset; /* in millidegrees */ >> struct work_struct thermal_check_work; >> + struct mutex thermal_check_lock; >> + refcount_t thermal_check_count; >> }; >> >> /* >> -- >> @@ -495,14 +497,6 @@ static int acpi_thermal_get_trip_points( >> return 0; >> } >> >> -static void acpi_thermal_check(void *data) >> -{ >> - struct acpi_thermal *tz = data; >> - >> - thermal_zone_device_update(tz->thermal_zone, >> - THERMAL_EVENT_UNSPECIFIED); >> -} >> - >> /* sys I/F for generic thermal sysfs support */ >> >> static int thermal_get_temp(struct thermal_zone_device *thermal, int *temp) >> @@ -900,6 +894,12 @@ static void acpi_thermal_unregister_ther >> Driver Interface >> >> -- */ >> >> +static void acpi_queue_thermal_check(struct acpi_thermal *tz) >> +{ >> + if (!work_pending(&tz->thermal_check_work)) >> + queue_work(acpi_thermal_pm_queue, &tz->thermal_check_work); >> +} >> + >> static void acpi_thermal_notify(struct acpi_device *device, u32 event) >> { >> struct acpi_thermal *tz = acpi_driver_data(device); >> @@ -910,17 +910,17 @@ static void acpi_thermal_notify(struct a >> >> switch (event) { >> case ACPI_THERMAL_NOTIFY_TEMPERATURE: >> - acpi_thermal_check(tz); >> + acpi_queue_thermal_check(tz); >> break; >> case ACPI_THERMAL_NOTIFY_THRESHOLDS: >> acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_THRESHOLDS); >> - acpi_thermal_check(tz); >> + acpi_queue_thermal_check(tz); >> acpi_bus_generate_netlink_event(device->pnp.device_class, >> dev_name(&device->dev), >> event, 0); >> break; >> case ACPI_THERMAL_NOTIFY_DEVICES: >> acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_DEVICES); >> -
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Thu, 31 Dec 2020 21:46:11 +0100 "Rafael J. Wysocki" wrote: > ATM, I'm tempted to do something like the patch below (with the rationale > that it shouldn't be necessary to read the temperature right after updating > the trip points if polling is in use, because the next update through polling > will cause it to be read anyway and it will trigger trip point actions as > needed). > > Stephen, can you give it a go, please? On Sat, 02 Jan 2021 12:03:17 +0100 "Rafael J. Wysocki" wrote: > There is one more way to address this, probably better: instead of checking > the > temperature right away in acpi_thermal_notify(), queue that on > acpi_thermal_pm_queue > and so only if another thermal check is not pending. > > This way there will be at most one temperature check coming from > acpi_thermal_notify() queued up at any time which should prevent the > build-up of work items from taking place. > > So something like this: Thanks for the patches. I'll try them as soon as I can. Steve Berman
Re: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly
On Fri, 22 Jan 2021 17:42:59 +0100 "Rafael J. Wysocki" wrote: > On Fri, Jan 22, 2021 at 5:39 PM Stephen Berman wrote: >> >> On Fri, 22 Jan 2021 17:23:36 +0100 "Rafael J. Wysocki" >> wrote: >> >> > On Thu, Jan 14, 2021 at 7:35 PM Rafael J. Wysocki >> > wrote: >> >> >> >> From: Rafael J. Wysocki >> >> >> >> Calling acpi_thermal_check() from acpi_thermal_notify() directly >> >> is problematic if _TMP triggers Notify () on the thermal zone for >> >> which it has been evaluated (which happens on some systems), because >> >> it causes a new acpi_thermal_notify() invocation to be queued up >> >> every time and if that takes place too often, an indefinite number of >> >> pending work items may accumulate in kacpi_notify_wq over time. >> >> >> >> Besides, it is not really useful to queue up a new invocation of >> >> acpi_thermal_check() if one of them is pending already. >> >> >> >> For these reasons, rework acpi_thermal_notify() to queue up a thermal >> >> check instead of calling acpi_thermal_check() directly and only allow >> >> one thermal check to be pending at a time. Moreover, only allow one >> >> acpi_thermal_check_fn() instance at a time to run >> >> thermal_zone_device_update() for one thermal zone and make it return >> >> early if it sees other instances running for the same thermal zone. >> >> >> >> While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(), >> >> as it is only called from there after the other changes made here. >> >> >> >> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877 >> >> Reported-by: Stephen Berman >> >> Diagnosed-by: Sebastian Andrzej Siewior >> >> Signed-off-by: Rafael J. Wysocki >> > >> > Well, it's been over a week since this was posted. >> > >> > Does anyone have any comments? >> >> Sorry, I haven't been able to make time to test the patch yet, but I'll >> try to do so this weekend. Is it just the patch below that I should >> apply, ignoring the previous patches you sent? > > Yes. > >> And can I apply it to the current mainline kernel? > > Yes, it should be applicable to the current mainline (at least as of > 5.11-rc4). > > Thanks! I've now updated my local repo to 5.11.0-rc4+, installed your patch, rebuilt and installed the kernel, rebooted (without adding 'thermal.tzp=300' to the kernel command line), did some normal activity, then ran 'shutdown -h now', and the machine did just that. So your patch seems to have fixed the problem I reported. Many thanks! Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Mon, 04 Jan 2021 16:38:43 +0100 Stephen Berman wrote: > On Thu, 31 Dec 2020 21:46:11 +0100 "Rafael J. Wysocki" > wrote: > >> ATM, I'm tempted to do something like the patch below (with the rationale >> that it shouldn't be necessary to read the temperature right after updating >> the trip points if polling is in use, because the next update through polling >> will cause it to be read anyway and it will trigger trip point actions as >> needed). >> >> Stephen, can you give it a go, please? > > On Sat, 02 Jan 2021 12:03:17 +0100 "Rafael J. Wysocki" > wrote: > >> There is one more way to address this, probably better: instead of checking >> the >> temperature right away in acpi_thermal_notify(), queue that on >> acpi_thermal_pm_queue >> and so only if another thermal check is not pending. >> >> This way there will be at most one temperature check coming from >> acpi_thermal_notify() queued up at any time which should prevent the >> build-up of work items from taking place. >> >> So something like this: > > Thanks for the patches. I'll try them as soon as I can. FTR, since this is the thread I started for this bug, I've confirmed in https://lore.kernel.org/lkml/87y2gi78sg@gmx.net/T/#t that the latest patch fixes the bug. Steve Berman
power-off delay/hang due to commit 6d25be57 (mainline)
I'm experiencing a delay or hang in powering off my computer after `halt -d -f -i -p' and I've bisected it to this commit in the mainline tree: commit 6d25be5782e482eb93e3de0c94d0a517879377d0 Author: Thomas Gleixner Date: Wed Mar 13 17:55:48 2019 +0100 sched/core, workqueues: Distangle worker accounting from rq lock The delays have varied in length from ~20 seconds to seeminingly indefinitely long (the longest I've waited before pressing the start button on the machine is 6 minutes). With kernels prior to this commit, my machine powers off within 4 seconds after the halt invocation. I first noticed this problem with the stable kernel 5.5.9 that I built with Linux From Scratch (LFS). My previous LFS system had stable kernel 4.20.12 and there was no such delay. I also tried stable kernel 5.6.4 and the delay/hang also happens with it. After consulting with LFS developers I cloned the mainline kernel repository and determined that problem happens with commit with v5.2 (0ecfebd2) but not with v5.1 (e93c9c99), then ran the bisection with those boundaries. With mainline kernels (I only tried with v.5.3 and then commits between v5.1 and v5.2 during the bisection) there are no kernel messages during the delay/hang, but with 5.5.9 and 5.6.4, when the delay is longer than two minutes, a message like the following is displayed: sr 5:0:0:0: tag#21 timing out command, waited 120s repeating every 2 minutes with a different tag#. This made me think the problem was due to the cdrom device in the machine, but when I compiled 5.3.0 without /dev/cdrom (by not setting CONFIG_BLK_DEV_SR, which also left CONFIG_CDROM unset), the delay/hang still happened. The length of the delay seems to depend not only on the kernel but also the number or kind of programs used. The 20 second delay has happened only went I invoke `shutdown -h now' (which runs `halt -d -f -i -p' in LFS) immediately after booting. When I ran `more' in one tty and `shutdown -h now' in another, the machine powered off in less than two minutes. But after running emacs in a tty (and then killing it), the delay was seemingly indefinite, as it is with my usual workflow with X, emacs, firefox, etc. running (but I always kill them before invoking `shutdown -h now' from a tty). During the bisection I tested each kernel build by running startx, emacs, firefox, killing these, then `shutdown -h now'. I've attached the git log of the bisection augmented with power-off timing notes. (FWIW, when I reboot the machine with `shutdown -r now' there is no delay with the problematic kernels.) One piece of advice I got from the LFS mailing list was to try and revert the above commit pinpointed by the bisection in a recent mainline or stable kernel. But it fails to revert cleanly (using `patch -R') in such kernels and I am too unfamiliar with the code to try and revert it manually. Another pointer I got was to the recent commit 62849a96, which fixes a bug due to the commit that causes my problem. I applied 62849a96 to 5.6.4, but it did not prevent the delay/hang. I've found nothing on the web about this problem, so it seems to be rare. That's why I'm seeking your help (on the advice of LFS developers). In addition to the bisection log I've also attached the kernel 5.1.0 config file (I used that during the bisection, accepting all defaults of later options, as with the configs of 5.5.9 and 5.6.4) and the contents of /proc/cpuinfo. Please let me know if there's any other information I can provide or anything else I can do to help with debugging this issue. I hope you can fix it. Thanks, Steve Berman bisect.log Description: application/text/plain config-5.1.0 Description: application/text/plain processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 158 model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz stepping: 10 microcode : 0xaa cpu MHz : 1225.910 cache size : 12288 KB physical id : 0 siblings: 12 core id : 0 cpu cores : 6 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d bugs: cpu_meltdown spectre_v1 sp
Re: power-off delay/hang due to commit 6d25be57 (mainline)
(GMX put your email into my spam folder, so I didn't see it before I sent my followup about removing the wifi firmware.) On Tue, 9 Jun 2020 22:23:39 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-09 12:06:23 [+0200], Stephen Berman wrote: >> I recompiled kernel 5.6.4 with the printk() call you suggested, then >> booted the kernel with "ignore_loglevel initcall_debug" (but leaving the >> CDROM and wifi intact for now). After working as I normally do, I >> called `shutdown -h now', again as usual. After the "Bringing down the >> loopback interface" message there were these two messages: >> >> reboot: __do_sys_reboot(317)CMD: 89abcdef >> reboot: __do_sys_reboot(317)CMD: 4321fedc >> >> Then nothing more for two minutes, then, as previously: >> >> sr 5:0:0:0: tag#10 timing out command, waited 120 seconds. >> >> Then I did a hard reboot. >> >> This morning I detached the cables to the CDROM (but left the disk in >> the box) and again booted 5.6.4 with "ignore_loglevel initcall_debug". >> After working for a bit, I called `shutdown -h now', and now there were >> quite a few more messages, but again the machine did not power off. >> Here is the ouput, which I transcribed carefully, hopefully without >> error (I omitted the fractional parts of the timestamps and some of the >> usb1-portX messages): > > If it helps you could delay printks on shutdown (via > /proc/sys/kernel/printk_delay) and record a video clip. I could make > storage available for an upload. So far this hasn't been necessary, since the new output until the first two-minute pause fit on one screen and I could photograph it. > | * Bringing down the loopback interface... > | [1123.***] reboot: __do_sys_reboot(317)CMD: 89abcdef > | [1123.***] reboot: __do_sys_reboot(317)CMD: 4321fedc > This happens quicker than I expected. Initially I assumed that you are > not there yet and userland is still poking at your cdrom. Accordin > … > | [1124.***] sd 4:0:0:0: shutdown > | [1124.***] sd 4:0:0:0: shutdown [sda] Synchronizing SCSI cache > | [1124.***] sd 4:0:0:0: shutdown [sda] Stopping disk > > After this, there shouldn't be any outstanding disk requests. > … > | [1125.***] ahci :00:17.0: shutdown > > This disables the ahci controller which means disk/cdrom requests won't > be answered by the hardware. > … > | [1125.***] ACPI: Preparing to enter system sleep state S5 > | [1352.***] INFO: task halt:5187 blocked for more than 122 seconds. > | [1352.***] Not tainted 5.6.4 #4 > | [1352.***] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this > message > | [1352.***] haltD13856 5187 5186 0x4000 > | [1352.***] Call Trace: > | [1352.***] ? __schedule+0x272/0x5b0 > | [1352.***] schedule+0x45/0xb0 > | [1352.***] schedule_timeout+0x204/0x2f0 > | [1352.***] ? acpi_os_release_object+0x5/0x10 > | [1352.***] ? acpi_ut_update_object_reference+0x14e/0x1d2 > | [1352.***] wait_for_completion+0xa3/0x100 > | [1352.***] ? wake_up_q+0x90/0x90 > | [1352.***] flush_workqueue+0x130/0x420 > > I have no idea where this flush_workqueue() is coming from. The command > scripts/decode_stacktrace.sh vmlinux $(pwd)/ kernel_power_off+0x3d/0x70 > > should reveal that. I saved the call trace lines to a file and did this (with `$(pwd)/' the script did not run): steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux . kernel_power_off+0x3d/0x70 < /home/steve/temp/call_trace [1352.***] Call Trace: [1352.***] ? __schedule (core.c:?) [1352.***] schedule (??:?) [1352.***] schedule_timeout (??:?) [1352.***] ? acpi_os_release_object (??:?) [1352.***] ? acpi_ut_update_object_reference (??:?) [1352.***] wait_for_completion (??:?) [1352.***] ? wake_up_q (??:?) [1352.***] flush_workqueue (??:?) [1352.***] kernel_power_off (??:?) [1352.***] __do_sys_reboot (reboot.c:?) [1352.***] ? do_send_sig_info (??:?) [1352.***] ? kill_pid_info (??:?) [1352.***] ? kill_something_info (signal.c:?) [1352.***] ? __x64_sys_kill (??:?) [1352.***] do_syscall_64 (??:?) [1352.***] entry_SYSCALL_64_after_hwframe (??:?) [1352.***] RIP: 0033:0x7f95dd6992c3 [1352.***] Code: Bad RIP value. objdump: '/tmp/tmp.BlpNDdSQpG.o': No such file Did I do something wrong or do I need to pass certain compiler flags when building the kernel (when I build Emacs I pass CFLAGS='-Og -g3')? > The cdrom is polled by the kernel every two seconds. I *think* the > kernel is blocked for some reason and then the cdrom polled. This can't > complete and you see the srX warning. The window is quite small but not > impossible. > I managed to fabricate this case bu
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Wed, 10 Jun 2020 12:25:14 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-10 10:21:29 [+0200], Stephen Berman wrote: >> (GMX put your email into my spam folder, so I didn't see it before I >> sent my followup about removing the wifi firmware.) > > okay. > >> On Tue, 9 Jun 2020 22:23:39 +0200 Sebastian Andrzej Siewior >> wrote: >> > scripts/decode_stacktrace.sh vmlinux $(pwd)/ kernel_power_off+0x3d/0x70 >> > >> > should reveal that. >> >> I saved the call trace lines to a file and did this (with `$(pwd)/' the >> script did not run): >> >> steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux >> . kernel_power_off+0x3d/0x70 < /home/steve/temp/call_trace >> [1352.***] Call Trace: >> [1352.***] ? __schedule (core.c:?) >> [1352.***] schedule (??:?) >> [1352.***] schedule_timeout (??:?) >> [1352.***] ? acpi_os_release_object (??:?) >> [1352.***] ? acpi_ut_update_object_reference (??:?) >> [1352.***] wait_for_completion (??:?) >> [1352.***] ? wake_up_q (??:?) >> [1352.***] flush_workqueue (??:?) >> [1352.***] kernel_power_off (??:?) >> [1352.***] __do_sys_reboot (reboot.c:?) >> [1352.***] ? do_send_sig_info (??:?) >> [1352.***] ? kill_pid_info (??:?) >> [1352.***] ? kill_something_info (signal.c:?) >> [1352.***] ? __x64_sys_kill (??:?) >> [1352.***] do_syscall_64 (??:?) >> [1352.***] entry_SYSCALL_64_after_hwframe (??:?) >> [1352.***] RIP: 0033:0x7f95dd6992c3 >> [1352.***] Code: Bad RIP value. >> objdump: '/tmp/tmp.BlpNDdSQpG.o': No such file >> >> Did I do something wrong or do I need to pass certain compiler flags >> when building the kernel (when I build Emacs I pass CFLAGS='-Og -g3')? > > the arguments last argument should not be kernel_power_off+0x3d/0x70. I > should have been entered after "enter" but you can also pipe the whole > file as you did. > If it gives you the ?? then might need > CONFIG_DEBUG_INFO Thanks. I set that, recompiled and now get this: decode_stacktrace Description: application/text/plain [...] >> By the other patch do you mean the following? (This email was also put >> into my spam by GMX and AFAICT has not yet shown up on the vger list.) > > Yes. But now that I played a little with it, I made dis: [...] > So you should have output like: > |[ 14.131696] Flushing events_freezable_power_ > the flush that hangs > > |[ 17.203246] Flushing events_freezable_power_ over, 0 > It has been noticed (0 at the end) > > |[ 17.203849] Showing busy workqueues and worker pools: > |[ 17.204400] workqueue events_freezable_power_: flags=0x84 > |[ 17.205009] pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 > |[ 17.205774] in-flight: 2147:disk_events_workfn > |[ 17.206268] workqueue kblockd: flags=0x18 > |[ 17.206682] pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 > refcnt=2 > |[ 17.207428] pending: blk_mq_requeue_work > |[ 17.207958] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=2s workers=3 idle: > | 535 42 > > A dump of all pending worqueue items. > > |[ 20.275239] Flushing events_freezable_power_ over, 0 > |[ 20.275812] Showing busy workqueues and worker pools: > |[ 20.276385] workqueue events_freezable_power_: flags=0x84 > |[ 20.276972] pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 > |[ 20.277712] in-flight: 2147:disk_events_workfn > |[ 20.278245] workqueue kblockd: flags=0x18 > |[ 20.278685] pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 > refcnt=2 > |[ 20.279471] pending: blk_mq_requeue_work > |[ 20.279989] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=5s workers=3 idle: > | 535 42 > > Another atempt. As you see `blk_mq_requeue_work' is references in both > outputs but it makes progress (it is scheduled multiple times). The item > `disk_events_workfn' stucks. I rebuilt kernel 5.6.4 with that patch, but before calling `shutdown -h now' I guess I should have set printk_delay, because this time there was a lot of output and it went by too fast. I'll try to delay it the next time. After the initial output (which went by too fast for me to photograph) there were endlessly repeated screenfuls of output like this: [ 2578.402398] , acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred and at the bottom of the screen this: [ 2578.427680] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 1331 After this line there was a brief pause (so I was able to photograph it) and then it repeated, always the same. In the last line pool, cpus, node, flags, nice and hung were always 0, and workers was always 4. Only idle changed, but seemed to cycle through a few values. After watching this repetition for several minutes, I finally did a hard reboot. Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Thu, 11 Jun 2020 00:49:26 +0200 Stephen Berman wrote: > On Wed, 10 Jun 2020 12:25:14 +0200 Sebastian Andrzej Siewior > wrote: > [...] >>> By the other patch do you mean the following? (This email was also put >>> into my spam by GMX and AFAICT has not yet shown up on the vger list.) >> >> Yes. But now that I played a little with it, I made dis: > [...] >> So you should have output like: >> |[ 14.131696] Flushing events_freezable_power_ >> the flush that hangs >> >> |[ 17.203246] Flushing events_freezable_power_ over, 0 >> It has been noticed (0 at the end) >> >> |[ 17.203849] Showing busy workqueues and worker pools: >> |[ 17.204400] workqueue events_freezable_power_: flags=0x84 >> |[ 17.205009] pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 >> refcnt=2 >> |[ 17.205774] in-flight: 2147:disk_events_workfn >> |[ 17.206268] workqueue kblockd: flags=0x18 >> |[ 17.206682] pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 >> refcnt=2 >> |[ 17.207428] pending: blk_mq_requeue_work >> |[ 17.207958] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=2s workers=3 idle: >> | 535 42 >> >> A dump of all pending worqueue items. >> >> |[ 20.275239] Flushing events_freezable_power_ over, 0 >> |[ 20.275812] Showing busy workqueues and worker pools: >> |[ 20.276385] workqueue events_freezable_power_: flags=0x84 >> |[ 20.276972] pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 >> refcnt=2 >> |[ 20.277712] in-flight: 2147:disk_events_workfn >> |[ 20.278245] workqueue kblockd: flags=0x18 >> |[ 20.278685] pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 >> refcnt=2 >> |[ 20.279471] pending: blk_mq_requeue_work >> |[ 20.279989] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=5s workers=3 idle: >> | 535 42 >> >> Another atempt. As you see `blk_mq_requeue_work' is references in both >> outputs but it makes progress (it is scheduled multiple times). The item >> `disk_events_workfn' stucks. > > I rebuilt kernel 5.6.4 with that patch, but before calling `shutdown -h > now' I guess I should have set printk_delay, because this time there was > a lot of output and it went by too fast. I'll try to delay it the next > time. I've done that now. I've sent you screenshots offlist. Here's a brief summary: The initial shutdown log output is essentially the same as the transcription I posted upthread, except that it is interspersed with messages from your patch like "Flush hid_reset+0x0/0x80". Then comes: ACPI: Preparing to enter system sleep state S5 Flushing kacpid Flushing kacpid_notify Flushing kacpid_notify over, 0 Showing busy workqueues and worker pools: workqueue events: flags=0x0 pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 pending: drm_fb_helper_diry_work workqueue kacpid_notify: flags=0x0 pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 refcnt=4185 in-flight: 1578:acpi_os_execute_deferred delayed: acpi_os_execute_deferred, acpi_os_execute_deferred, ... The "acpi_os_execute_deferred" messages were repeated many times in the above line, then every 20-30 seconds again for several minutes. Then suddenly a call trace appeared which was similar but not identical to the one I posted upthread, and each line of the trace was followed by the line ", acpi_os_execute_deferred". This went by quite quickly even with the printk_delay I added, and I was unable to photograph the start of it and couldn't get all of the subsequent output, but the screenshots show some of it. After several minutes of this output I again did a hard reboot. Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Wed, 24 Jun 2020 22:11:56 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-17 23:09:44 [+0200], Stephen Berman wrote: >> >> Attached. > > I did not forget about this but had recently little time to look into > this. No problem! Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Fri, 12 Jun 2020 13:01:22 +0200 Sebastian Andrzej Siewior wrote: > + ACPI in case the ACPI folks see something obvious. [...] >> The "acpi_os_execute_deferred" messages were repeated many times in the >> above line, then every 20-30 seconds again for several minutes. Then >> suddenly a call trace appeared which was similar but not identical to >> the one I posted upthread, and each line of the trace was followed by >> the line ", acpi_os_execute_deferred". This went by quite quickly even >> with the printk_delay I added, and I was unable to photograph the start >> of it and couldn't get all of the subsequent output, but the screenshots >> show some of it. After several minutes of this output I again did a >> hard reboot. > > All good. I though that you will have one worker that is blocking but > you have a lot of them. It appears that one is active and stuck and more > are waiting. > > Could you look at acpi in /proc/interrupts 10 secs apart to see if it > increments? > >grep -E 'acpi|smbus' /proc/interrupts I tried this several times, 10 seconds apart and longer, but saw no change, it was always this: steve [ ~ ]$ grep -E 'acpi|smbus' /proc/interrupts 9: 0 5 0 0 0 0 0 0 0 0 0 0 IO-APIC 9-fasteoi acpi 16: 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC 16-fasteoi i801_smbus > You could also do "echo t > /proc/sysrq-trigger" which gives you a lot > of task state information, but at the end you will also see "howing busy > workqueues and worker pools:" regarding the workqueue state. I am > curious to see if you already have worker stuck in kacpid_notify with > acpi_os_execute_deferred. What am I supposed to do after "echo t > /proc/sysrq-trigger"? Both before and after doing that I get an error trying to open the file: root [ ~ ]# cat /proc/sysrq-trigger cat: /proc/sysrq-trigger: Input/output error > Now that we know that know that acpi_os_execute_deferred() is stuck, > lets shed some light in what it is trying to do. The patch at the end > will dump this information into the console buffer (The `dmesg' command > will print the whole kernel buffer). I have no idea if this starts > printing while the system is running or during shutdown. I would expect > to see the Start line in acpi_os_execute_deferred() but not the End one. > > diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c > index 41168c027a5a4..0e983c558bcb5 100644 > --- a/drivers/acpi/osl.c > +++ b/drivers/acpi/osl.c > @@ -840,7 +840,9 @@ static void acpi_os_execute_deferred(struct work_struct > *work) > { > struct acpi_os_dpc *dpc = container_of(work, struct acpi_os_dpc, work); > > + pr_err("%s(%d) Start %px %pF(%px)\n", __func__, __LINE__, dpc, > dpc->function, dpc->context); > dpc->function(dpc->context); > + pr_err("%s(%d) End %px %pF(%px)\n", __func__, __LINE__, dpc, > dpc->function, dpc->context); > kfree(dpc); > } > > @@ -1096,6 +1098,8 @@ acpi_status acpi_os_execute(acpi_execute_type type, >*/ > if (type == OSL_NOTIFY_HANDLER) { > queue = kacpi_notify_wq; > + pr_err("%s(%d) Adding %pS %px\n", __func__, __LINE__, function, > +context); > INIT_WORK(&dpc->work, acpi_os_execute_deferred); > } else if (type == OSL_GPE_HANDLER) { > queue = kacpid_wq; I applied this patch to 5.6.4 and recompiled, and on the next boot with that kernel, the kernel buffer (and kernel and system logs) began to get flooded with these messages: Jun 14 10:37:13 strobe-jhalfs kernel: [5.943987] acpi_os_execute_deferred(843) Start 8fb82c7b6500 3edf1e05(8fb82c492990) Jun 14 10:37:13 strobe-jhalfs kernel: [5.944102] acpi_os_execute(1101) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990 Jun 14 10:37:13 strobe-jhalfs kernel: [5.944104] acpi_os_execute_deferred(845) End 8fb82c7b6500 3edf1e05(8fb82c492990) Jun 14 10:37:13 strobe-jhalfs kernel: [5.944105] acpi_os_execute_deferred(843) Start 8fb82b844800 2ba560ea(8fb82c492990) Jun 14 10:37:13 strobe-jhalfs kernel: [5.944124] acpi_os_execute_deferred(845) End 8fb82b844800 2ba560ea(8fb82c492990) Jun 14 10:37:13 strobe-jhalfs kernel: [5.944288] acpi_os_execute_deferred(843) Start 8fb82c7b6540 3edf1e05(8fb82c492990) Jun 14 10:37:13 strobe-jhalfs kernel: [5.944387] acpi_os_execute(1101) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990 and so on without stopping. I could start X and avoid seeing the messages, but was afraid the logs would fill up the root partition if I let it keep going, so I rebooted with another kernel. Was this message flood because I booted with "ignore_loglevel initcall_debug"? In the logs there are also
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Sun, 14 Jun 2020 19:10:05 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-14 14:12:18 [+0200], Stephen Berman wrote: [...] >> What am I supposed to do after "echo t > /proc/sysrq-trigger"? Both >> before and after doing that I get an error trying to open the file: >> >> root [ ~ ]# cat /proc/sysrq-trigger >> cat: /proc/sysrq-trigger: Input/output error > > echo "t > /proc/sysrq-trigger" > > not cat. Ok, sorry, I had misunderstood, but now I've looked at the documentation. I had in fact already done `echo t > /proc/sysrq-trigger' in an xterm (as root) and there was no output. Later, after booting kernel 5.1.0 because of the message flooding with 5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as root) and the only output was: `[ ] sysrq: Show State'. Is this expected? (In /proc/sys/kernel/sysrq there is `1'.) I couldn't try it in a 5.6.4 virtual tty because of the message flooding (see below). [...] >> I applied this patch to 5.6.4 and recompiled, and on the next boot with >> that kernel, the kernel buffer (and kernel and system logs) began to get >> flooded with these messages: >> >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.943987] >> acpi_os_execute_deferred(843) Start 8fb82c7b6500 >> 3edf1e05(8fb82c492990) >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944102] acpi_os_execute(1101) >> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990 >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944104] >> acpi_os_execute_deferred(845) End 8fb82c7b6500 >> 3edf1e05(8fb82c492990) >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944105] >> acpi_os_execute_deferred(843) Start 8fb82b844800 >> 2ba560ea(8fb82c492990) >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944124] >> acpi_os_execute_deferred(845) End 8fb82b844800 >> 2ba560ea(8fb82c492990) >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944288] >> acpi_os_execute_deferred(843) Start 8fb82c7b6540 >> 3edf1e05(8fb82c492990) >> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944387] acpi_os_execute(1101) >> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990 >> >> and so on without stopping. I could start X and avoid seeing the >> messages, but was afraid the logs would fill up the root partition if I >> let it keep going, so I rebooted with another kernel. >> >> Was this message flood because I booted with "ignore_loglevel >> initcall_debug"? In the logs there are also lots of messages like this: > > Is there a acpi_os_execute_* flood? The *few* at what appears to system > startup and might be normal. If there appear *many* more and are > constantly printing (check with dmesg) then we might be to something. The kernel log shows 305 of these messages in the 4 minutes and 17 seconds between the start of klogd and when I rebooted. [...] > I attached the updated acpi patch. It limits the prints to the > kacpi_notify_wq queue which appears to stuck at shutdown. I applied this patch and rebuilt the kernel. After booting 5.6.4 this morning there was again message flooding and this time I was unable to log in because of it, so had to do a hard reboot and booted with 5.1.0. The kernel log shows 207 acpi_os_execute_deferred_notify messages in 3 minutes and 6 seconds; here are the first: Jun 15 08:55:22 strobe-jhalfs kernel: [5.943848] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 Jun 15 08:55:22 strobe-jhalfs kernel: [5.943851] acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990) Jun 15 08:55:22 strobe-jhalfs kernel: [5.943865] acpi_os_execute_deferred_notify(853) End 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990) Jun 15 08:55:22 strobe-jhalfs kernel: [5.944104] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 Jun 15 08:55:22 strobe-jhalfs kernel: [5.944106] acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990) Jun 15 08:55:22 strobe-jhalfs kernel: [5.944115] acpi_os_execute_deferred_notify(853) End 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990) Jun 15 08:55:22 strobe-jhalfs kernel: [5.944406] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 Jun 15 08:55:22 strobe-jhalfs kernel: [5.944408] acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990) Jun 15 08:55:22 strobe-jhalfs kernel: [5.944417] acpi_os_execute_deferred_notify(853) End 8d7aab157fc0 acpi_ev_asynch_enable_gpe+0
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Mon, 15 Jun 2020 16:51:30 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-15 09:58:00 [+0200], Stephen Berman wrote: >> Ok, sorry, I had misunderstood, but now I've looked at the >> documentation. I had in fact already done `echo t > >> /proc/sysrq-trigger' in an xterm (as root) and there was no output. >> Later, after booting kernel 5.1.0 because of the message flooding with >> 5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as >> root) and the only output was: `[ ] sysrq: Show State'. Is >> this expected? (In /proc/sys/kernel/sysrq there is `1'.) I couldn't >> try it in a 5.6.4 virtual tty because of the message flooding (see >> below). > > If you do this "t" then there should be a lot of output on your console. > If you do this from an xterm then you can see the output after typing > "dmesg". The output should appear also in your system log. Ah, ok, I do see it in the log, it looks basically the same as the call trace I posted upthread. I wonder why there was no such output in the console, could there be some setting that suppresses it even though /proc/sys/kernel/sysrq has `1'? >> The kernel log shows 305 of these messages in the 4 minutes and 17 >> seconds between the start of klogd and when I rebooted. > > Okay, this is a lot. > >> > The interesting part is to see if there is a acpi_os_execute() adding a >> > specific event multiple times which does not complete. Maybe at runtime, >> > maybe at shutdown time. If that is the case then ignoring this specific >> > event might fix the shutdown problem. With all this information so far, >> > I don't see a relation with this problem and the commit… >> >> In those 3 minutes and 8 seconds there were 5 "Adding >> acpi_ev_asynch_enable_gpe" messages at the beginning, then 185 "Adding >> acpi_ev_notify_dispatch" messages, which kept coming until I rebooted. > > 5 acpi_ev_asynch_enable_gpe() in three minutes since boot isn't much. > 185 of "Adding acpi_ev_notify_dispatch" is a lot. > > But to be clear, only > acpi_os_execute(1109) Adding acpi_ev_notify_dispatch" messages? > No > "acpi_os_execute_deferred_notify() Start/End acpi_ev_notify_dispatch"? > > The syslog should have captured the log on disk. There were in fact 99 "Start .* acpi_ev_notify_dispatch" messages and 98 "End .* acpi_ev_notify_dispatch" messages. Here's the last of them before I rebooted: Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.315014] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a66e0 Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.319929] acpi_os_execute_deferred_notify(853) End 8d7aa8758d00 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7eb0) Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.321242] acpi_os_execute_deferred_notify(851) Start 8d7aa8758a80 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0) Jun 15 08:58:26 strobe-jhalfs kernel: [ 194.339017] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6730 Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.363005] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6780 Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.367978] acpi_os_execute_deferred_notify(853) End 8d7aa8758a80 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0) Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.369234] acpi_os_execute_deferred_notify(851) Start 8d7aa8758240 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7a00) Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.387017] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a68c0 Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.388508] acpi_os_execute_deferred_notify(853) End 8d7aa8758240 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7a00) Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.390128] acpi_os_execute_deferred_notify(851) Start 8d7aa8758b80 acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7460) Jun 15 08:58:29 strobe-jhalfs kernel: [ 197.208714] wlan1: deauthenticating from 7c:ff:4d:08:df:22 by local choice (Reason: 3=DEAUTH_LEAVING) Jun 15 08:58:29 strobe-jhalfs kernel: [ 197.232214] ip (4614) used greatest stack depth: 11272 bytes left Jun 15 08:58:29 strobe-jhalfs kernel: Kernel logging (proc) stopped. Jun 15 08:58:29 strobe-jhalfs kernel: Kernel log daemon terminating. > I attached a modified acpi_dbg.patch. Please enable: > - CONFIG_ACPI_DEBUG=y > > Looking at your 5.1 you have tracing enabled (hope it still is). > > The attached patch will dump the date into the tracing buffer, so you > console should remain "clean". Once it records 300 of those "requests > for acpi_ev_notify_dispatch" it will stop recording. > After 4-5 minutes please do > cat /sys/kernel/debug/tracing/trace > file.txt > > compress it and send it. Will do as soon as I can, hopefully later today or tomorrow. Thanks. Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Mon, 15 Jun 2020 17:58:46 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-15 17:41:06 [+0200], Stephen Berman wrote: >> > If you do this "t" then there should be a lot of output on your console. >> > If you do this from an xterm then you can see the output after typing >> > "dmesg". The output should appear also in your system log. >> >> Ah, ok, I do see it in the log, it looks basically the same as the call >> trace I posted upthread. I wonder why there was no such output in the >> console, could there be some setting that suppresses it even though >> /proc/sys/kernel/sysrq has `1'? > > You have ignore_loglevel soo everything should appear on your console. > The is true for your tty / ctrl+alt+f1 but your xterm shell. Yet that appears not to be the case here, or I'm still doing something wrong 🤷 >> There were in fact 99 "Start .* acpi_ev_notify_dispatch" messages and 98 >> "End .* acpi_ev_notify_dispatch" messages. Here's the last of them >> before I rebooted: [...] >> acpi_os_execute_deferred_notify(851) Start 8d7aa8758a80 >> acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0) >> Jun 15 08:58:26 strobe-jhalfs kernel: [ 194.339017] acpi_os_execute(1109) >> Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6730 >> Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.363005] acpi_os_execute(1109) >> Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6780 >> Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.367978] >> acpi_os_execute_deferred_notify(853) End 8d7aa8758a80 [...] > Okay. So there is "Adding" of four events/work items in total, each > almost every second. > Processing of one work-item took two seconds, the other one took a > second. These events are "old" so I don't see the "adding" line for the > Start/End. > You could look in the log for 8d7aa8758a80 to seen when it was > enqueued but it feels like ACPI enqueues more events then it is capable > of processing. It's there, 196 lines above the Start 8d7aa8758a80 line: Jun 15 08:56:58 strobe-jhalfs kernel: [ 106.275356] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa84e70a0 Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Thu, 14 May 2020 00:04:28 +0200 Sebastian Andrzej Siewior wrote: > On 2020-05-08 23:30:45 [+0200], Stephen Berman wrote: >> > Can you log the output on the serial console? >> >> How do I do that? > > The spec for your mainboard says "serial port header". You would need to > connect a cable there to another computer and log its output. > The alternative would be to delay the output on the console and use a > camera. It's easiest for me to take a picture, since there isn't much output and in any case the delay happens on it's own ;-). I'm sending you the image (from kernel 5.6.4) off-list since even after reducing it it's 1.2 MB large. >> > If the commit you cited is really the problem then it would mean that a >> > worker isn't scheduled for some reason. Could you please enable >> > CONFIG_WQ_WATCHDOG to see if workqueue core code notices that a worker >> > isn't making progress? I enabled that and also CONFIG_SOFTLOCKUP_DETECTOR, CONFIG_HARDLOCKUP_DETECTOR and CONFIG_DETECT_HUNG_TASK, which had all been unset previously. >> How will I know if that happens, is there a specific message in the tty? > > On the tty console where you see the "timing out command, waited" > message, there should be something starting with > |BUG: workqueue lockup - pool > > following with the pool information that got stuck. That code checks the > workqueues every 30secs by default. So if you waited >= 60secs then > system is not detecting a stall. As you can see in the photo, there was no message about a workqueue lockup, only "task halt:5320 blocked for more than seconds" every two minutes. I suppose that comes from one of the other options I enabled. Does it reveal anything about the problem? > As far as I can tell, there is nothing special on your system. The CD > and disk drives are served by the AHCI controller. There is no special > SCSI/SATA/SAS controller. > Right now I have no idea how the workqueues fit in the picture. Could > you please check if the stall-dector says something? Is that the message I repeated above or do you mean the workqueue? > Is it possible to show me output when the timeout message comes? My > guess is that the system is going down and before unounting/remount RO > the filesystem it flushes its last data. But this is done before issuing > the "halt-syscall". The entire output from `shutdown -h now' is in the picture; after the fourth "timing out command" message, I pressed the reset button. Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Tue, 14 Jul 2020 16:11:35 +0200 Sebastian Andrzej Siewior wrote: > On 2020-07-14 15:54:57 [+0200], Rafael J. Wysocki wrote: >> On Tue, Jul 14, 2020 at 3:44 PM Sebastian Andrzej Siewior >> wrote:> >> > On 2020-06-24 23:49:52 [+0200], Stephen Berman wrote: >> > >> > Let me summarize the thread here: >> > >> > On Stephen's system, ACPI informs the thermal zone driver to poll the >> > temperature every second and the driver does so. >> > The driver queries the temperature by invoking acpi_evaluate_integer() >> > which invokes (at some point) acpi_ev_queue_notify_request(). >> >> Well, I don't quite see how acpi_ev_queue_notify_request() can be >> invoked from the acpi_evaluate_integer() code path. >> >> Do you have a call trace showing that? > > So the trace in > https://lore.kernel.org/linux-acpi/87o8pjh1i0@gmx.net/ > > shows the pattern and we nailed it down that it comes from > thermal_get_temp(). I assumed acpi_ex_opcode_2A_0T_0R() since the other > candidate was acpi_ev_asynch_execute_gpe_method(). > > Stephen, the patch attached adds a WARN_ON() statement which will > produce a stack trace (4 or so). Could please run 'dmesg' after a while > and send it back. There should be a > "WARNING in drivers/acpi/acpica/evmisc.c" > > statement or something along the lines. I've now rebuilt 5.6.4 with the new patch; sorry for the delay. Output of dmesg attached. Steve Berman dmesg-5.6.4.gz Description: Binary data
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Wed, 17 Jun 2020 16:27:34 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-16 22:28:43 [+0200], Stephen Berman wrote: >> Your assessment and predictions are right on the mark! > perfect. > >> I'm fine with the thermal.tzp=300 workaround, but it would be good to >> find out why this problem started with commit 6d25be57, if my git >> bisection was correct, or if it wasn't, then at least somewhere between >> 5.1.0 and 5.2.0. Or can you already deduce why? If not, I'd be more >> than happy to continue applying any patches or trying any suggestions >> you have, if you want to continue debugging this issue. In any case, >> thanks for pursuing it to this point. > > I have no idea why the commit in question should make any difference. > Could please apply the tracing patch on v5.1 and send the trace? Attached. Steve Berman trace-5.1.0.txt.gz Description: Binary data
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Mon, 15 Jun 2020 16:51:30 +0200 Sebastian Andrzej Siewior wrote: > I attached a modified acpi_dbg.patch. Please enable: > - CONFIG_ACPI_DEBUG=y > > Looking at your 5.1 you have tracing enabled (hope it still is). > > The attached patch will dump the date into the tracing buffer, so you > console should remain "clean". Once it records 300 of those "requests > for acpi_ev_notify_dispatch" it will stop recording. > After 4-5 minutes please do > cat /sys/kernel/debug/tracing/trace > file.txt > > compress it and send it. I set CONFIG_ACPI_DEBUG=y, applied the patch and rebuilt 5.6.4, but: $ cat /sys/kernel/debug/tracing/trace > trace.txt cat: /sys/kernel/debug/tracing/trace: No such file or directory Here are all the 5.6.4 config options with "TRACING" or "TRACE"; do I need to set any that aren't set? CONFIG_SYSCTL_EXCEPTION_TRACE=y CONFIG_TRACEPOINTS=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_PM_TRACE=y CONFIG_PM_TRACE_RTC=y CONFIG_HAVE_KPROBES_ON_FTRACE=y CONFIG_HAVE_ARCH_TRACEHOOK=y CONFIG_HAVE_RELIABLE_STACKTRACE=y # CONFIG_MAC80211_MESSAGE_TRACING is not set # CONFIG_DMA_FENCE_TRACE is not set CONFIG_IWLWIFI_DEVICE_TRACING=y # CONFIG_TRACE_SINK is not set # HW tracing support # end of HW tracing support CONFIG_STACKTRACE=y CONFIG_RCU_TRACE=y CONFIG_USER_STACKTRACE_SUPPORT=y CONFIG_NOP_TRACER=y CONFIG_HAVE_FUNCTION_TRACER=y CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y CONFIG_HAVE_DYNAMIC_FTRACE=y CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y CONFIG_HAVE_SYSCALL_TRACEPOINTS=y CONFIG_TRACE_CLOCK=y CONFIG_EVENT_TRACING=y CONFIG_CONTEXT_SWITCH_TRACER=y CONFIG_TRACING=y CONFIG_GENERIC_TRACER=y CONFIG_TRACING_SUPPORT=y CONFIG_FTRACE=y # CONFIG_BOOTTIME_TRACING is not set # CONFIG_FUNCTION_TRACER is not set # CONFIG_STACK_TRACER is not set # CONFIG_IRQSOFF_TRACER is not set # CONFIG_SCHED_TRACER is not set # CONFIG_HWLAT_TRACER is not set # CONFIG_MMIOTRACE is not set # CONFIG_FTRACE_SYSCALLS is not set # CONFIG_TRACER_SNAPSHOT is not set CONFIG_BLK_DEV_IO_TRACE=y # CONFIG_TRACE_EVENT_INJECT is not set # CONFIG_TRACEPOINT_BENCHMARK is not set # CONFIG_TRACE_EVAL_MAP_FILE is not set # CONFIG_FTRACE_STARTUP_TEST is not set CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_HAVE_MMIOTRACE_SUPPORT=y CONFIG_CC_HAS_SANCOV_TRACE_PC=y # CONFIG_BACKTRACE_SELF_TEST is not set Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Tue, 16 Jun 2020 09:38:27 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-16 09:14:37 [+0200], Stephen Berman wrote: >> >> I set CONFIG_ACPI_DEBUG=y, applied the patch and rebuilt 5.6.4, but: >> >> $ cat /sys/kernel/debug/tracing/trace > trace.txt >> cat: /sys/kernel/debug/tracing/trace: No such file or directory >> >> Here are all the 5.6.4 config options with "TRACING" or "TRACE"; do I >> need to set any that aren't set? > > I see that "Kernel hacking ---> Tracers" is enabled. You should have > one tracer enabled for that trace_printk() to work. You have context > switch tracer so it should be enough. > > You might need CONFIG_DEBUG_FS (but it is set in your 5.1 config) or it > is just not mounted. So I have here: > > | debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime) > | tracefs on /sys/kernel/debug/tracing type tracefs > (rw,nosuid,nodev,noexec,relatime) > > Does it work for you if you mount it? # mount -t debugfs debugfs /sys/kernel/debug # mount -t tracefs tracefs /sys/kernel/debug/tracing # mount ... debugfs on /sys/kernel/debug type debugfs (rw,relatime) tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime) Yes, thanks, that did it. Trace attached. Steve Berman trace.txt.gz Description: Binary data
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Tue, 16 Jun 2020 17:55:01 +0200 Sebastian Andrzej Siewior wrote: > On 2020-06-16 10:13:27 [+0200], Stephen Berman wrote: >> Yes, thanks, that did it. Trace attached. > > So TZ10 is a temperature sensor of some kind on your motherboard. In > your v5.6 dmesg there is: > | thermal LNXTHERM:00: registered as thermal_zone0 > | ACPI: Thermal Zone [TZ10] (17 C) > > So. In /sys/class/thermal/thermal_zone0/device/path you should also see > TZ10. And /sys/class/thermal/thermal_zone0/temp should show the actual > value. > This comes from the "thermal" module. Yes, TZ10 was in the thermal_zone0/device/path and the value in thermal_zone0/temp was 16800. > Looking at the trace, might query the temperature every second which > somehow results in "Dispatching Notify on". I don't understand how it > gets from reading of the temperature to the notify part, maybe it is > part of the ACPI… > > However. Could you please make sure that the thermal module is not > loaded at system startup? Adding > thermal.off=1 > > to the kernel commandline should do the trick. And you should see >thermal control disabled > > in dmesg. Confirmed. And the value in thermal_zone0/temp was now 33000. > That means your thermal_zone0 with TZ10 does not show up in > /sys and nothing should schedule the work-items. This in turn should > allow you to shutdown your system without the delay. It did! > If this works, could you please try to load the module with tzp=300? > If you add this > thermal.tzp=300 > > to the kernel commandline then it should do the trick. You can verify it > by >cat /sys/module/thermal/parameters/tzp > > This should change the polling interval from what ACPI says to 30secs. > This should ensure that you don't have so many worker waiting. So you > should also be able to shutdown the system. Your assessment and predictions are right on the mark! I'm fine with the thermal.tzp=300 workaround, but it would be good to find out why this problem started with commit 6d25be57, if my git bisection was correct, or if it wasn't, then at least somewhere between 5.1.0 and 5.2.0. Or can you already deduce why? If not, I'd be more than happy to continue applying any patches or trying any suggestions you have, if you want to continue debugging this issue. In any case, thanks for pursuing it to this point. Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Fri, 22 May 2020 18:40:12 +0200 Sebastian Andrzej Siewior wrote: > Sorry for the late reply. No problem, since as it turned out, I didn't have to time till now to follow up on your latest suggestions. Details below. > On 2020-05-14 23:39:40 [+0200], Stephen Berman wrote: >> >> How will I know if that happens, is there a specific message in the tty? >> > >> > On the tty console where you see the "timing out command, waited" >> > message, there should be something starting with >> > |BUG: workqueue lockup - pool >> > >> > following with the pool information that got stuck. That code checks the >> > workqueues every 30secs by default. So if you waited >= 60secs then >> > system is not detecting a stall. >> >> As you can see in the photo, there was no message about a workqueue >> lockup, only "task halt:5320 blocked for more than seconds" every >> two minutes. I suppose that comes from one of the other options I >> enabled. Does it reveal anything about the problem? > >>From the picture, you are on your way to level 0, which would issue the > final shutdown command, but you are not quite there yet. > > If you add a printk() to the reboot syscall, then I wouldn't expect you > to see it. (something like that): > > diff --git a/kernel/reboot.c b/kernel/reboot.c > index c4d472b7f1b42..19bc35bc0cda0 100644 > --- a/kernel/reboot.c > +++ b/kernel/reboot.c > @@ -314,6 +314,7 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, > unsigned int, cmd, > char buffer[256]; > int ret = 0; > > + pr_err("%s(%d)CMD: %lx\n", __func__, __LINE__, cmd); > /* We only trust the superuser with rebooting the system. */ > if (!ns_capable(pid_ns->user_ns, CAP_SYS_BOOT)) > return -EPERM; > > If you add "ignore_loglevel initcall_debug" to the command line then you > should see the init callbacks of each driver. But there will be nothing > on your shutdown (as I expect it). > > The "task X blocked for more than 120 secs" is part of the hung task > detector. With the "ignore_loglevel" above you should be able to see the > callchain of the task. I suspect that the task poked the cd-drive which > isn't answering. So from detector's point of view, the task issued a > system call which appears to hang an makes no progress. > >> > Could >> > you please check if the stall-dector says something? >> >> Is that the message I repeated above or do you mean the workqueue? > > The hung message is not workqueue related. It is the task `halt' that > makes no progress. There is not stall of the workqueue as far as the > system can tell. > The two boot options, I suggested above, may reveal additional > information that are printed but suppressed due to the loglevel. > > My guess now is that maybe shutting down wifi also paused the AHCI > controller which makes no progress now. So booting without cdrom/disk > should not cause any problems. > > Could you please: > - try booting with "ignore_loglevel initcall_debug" and see if > additional information is printed on the console. > > - Remove cd / ATA-disk to check if anything else causes a stall. As by > your report you only mentioned those two (and if I see it correctly, > you rootFS is on nvme so removing the disk might be doable). > > - Could you remove the Wifi (just the driver, no the physical hw) to see > if it makes any difference? I recompiled kernel 5.6.4 with the printk() call you suggested, then booted the kernel with "ignore_loglevel initcall_debug" (but leaving the CDROM and wifi intact for now). After working as I normally do, I called `shutdown -h now', again as usual. After the "Bringing down the loopback interface" message there were these two messages: reboot: __do_sys_reboot(317)CMD: 89abcdef reboot: __do_sys_reboot(317)CMD: 4321fedc Then nothing more for two minutes, then, as previously: sr 5:0:0:0: tag#10 timing out command, waited 120 seconds. Then I did a hard reboot. This morning I detached the cables to the CDROM (but left the disk in the box) and again booted 5.6.4 with "ignore_loglevel initcall_debug". After working for a bit, I called `shutdown -h now', and now there were quite a few more messages, but again the machine did not power off. Here is the ouput, which I transcribed carefully, hopefully without error (I omitted the fractional parts of the timestamps and some of the usb1-portX messages): no-cdrom-shutdown-5.6.4 Description: application/text/plain After the last message here, the "echo", Call Trace, RIP and register messages were repeated exactly as above. At this point I did a hard reboot. A web search found something similar to the above output ("System hangs (bad RIP value) when disk used in pool is removed"), but didn't otherwise seem to be related to the problem I'm having. Do you still want me to check whether removing the iwlwifi driver makes a differece? And with the CDROM still detached, or does that not matter? Steve Berman
Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Tue, 09 Jun 2020 12:06:23 +0200 Stephen Berman wrote: > Do you still want me to check whether removing the iwlwifi driver makes > a differece? And with the CDROM still detached, or does that not > matter? I'm not actually sure just what you mean by removing the wifi driver, but I just now tried moving the iwlwifi firmware out of /lib/firmware, then booted 5.6.4 as before (with "ignore_loglevel initcall_debug" and with the CDROM still detached), and of course there was now no wireless interface. I started X, Emacs, Firefox (of course with no internet), VLC, used them a bit then closed them all, then called `shutdown -h now' and the message output was essentially identical to what I posted previously (but of course without "iwlwifi :02:00.0: shutdown") and the machine did not power off. So if this was the test of removing the wifi driver, then it made no difference. Steve Berman