Mike Larkin <mlar...@nested.page> writes:

> On Sun, Nov 03, 2024 at 02:03:15PM +0100, Kirill A. Korinsky wrote:
>> On Sun, 03 Nov 2024 13:28:16 +0100,
>> Mike Larkin <mlar...@nested.page> wrote:
>> >
>> > This is exactly what many of us do, every day. So I'm not sure what's
>> > triggering your scenario. Any way to narrow it down more than "just use the
>> > system for a day or two"? E.g., "here's a script you can run inside an
>> > alpine VM that triggers the issue"?
>> >
>>
>> Frankly, I have no idea how to narrow it down further. I use this VM to run
>> docker-compose and it works fine, until the system is degraded.
>>
>> > I'm guessing this isn't a vmd/vmm thing, as those components don't interact
>> > with acpi. We have seen stuck acpi threads on other machines after
>> > un-zzz/un-ZZZ in some cases. Were you doing suspends or hibernates?
>>
>> Well, when the system degraded that time, in addition to extremely slow
>> IO, I had seen srdis consuming a lot of resources, and quitting toxic made
>> it stop. So I am not completely sure that this is only acpi related.
>>
>
> That's softraid and vmd is not likely causing this unless you're writing tons
> of data suddenly and even then it shouldn't even register as anything crazy
> high.
>
>> I do suspend often, but during my experiment with the snapshot's kernel I
>> kept the system on AC power to avoid any suspend/hibernate and it still hit it.
>>

The combination of AC power (so your CPUs will be running at max
frequency) and Docker on Linux...how hot does this machine get out of
curiosity?

>> If you can suggest anything that I can do to collect some useful data when
>> I hit it next time, I'd appreciate it and will do it.
>>

Try to check the thermal sensors output. I'm wondering if this is a
hardware issue and you're pushing the temperature of the NVMe controller
past its comfort point and shit is going sideways. I think some can
throttle themselves in an effort to prevent turning into glass.
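
If it helps, here is a rough sketch of how one might log the temperature
readings while the problem is happening. It assumes the usual OpenBSD
`sysctl hw.sensors` line format (e.g. `hw.sensors.cpu0.temp0=52.00 degC`);
the function names and the 80 degC limit are made up for illustration, and
whether your NVMe controller exposes a sensor at all is something you'd have
to check on your machine.

```python
import re
import subprocess

# Matches lines such as "hw.sensors.cpu0.temp0=52.00 degC ..."
# (assumed OpenBSD sysctl output format; other sensor types are ignored).
TEMP_LINE = re.compile(r"^hw\.sensors\.(\w+)\.(temp\d+)=(-?\d+(?:\.\d+)?) degC")


def parse_temps(sysctl_output):
    """Return {"device.sensor": degrees_celsius} for every temperature line."""
    temps = {}
    for line in sysctl_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            device, sensor, value = match.groups()
            temps[f"{device}.{sensor}"] = float(value)
    return temps


def hot_sensors(temps, limit_c=80.0):
    """Pick out sensors at or above limit_c (an arbitrary illustrative limit)."""
    return {name: t for name, t in temps.items() if t >= limit_c}


def read_sensor_output():
    """Run `sysctl hw.sensors` and return its raw text output."""
    result = subprocess.run(
        ["sysctl", "hw.sensors"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `hot_sensors(parse_temps(read_sensor_output()))` every few seconds
while the VM is under load, and saving the result with a timestamp, would
show whether anything is running hot when the IO slows down.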

>
> dunno, your system seems to be behaving weird. I don't think vmd/vmm is
> causing excessive acpi interrupts or softraid overhead though. Those things
> are pretty much unrelated.
>
> The only thing I can suggest is try and see if some specific action causes the
> problem and try to narrow it down.
>
>> --
>> wbr, Kirill
