Mischa <open...@mlst.nl> writes:

> On 2023-09-06 05:36, Dave Voutila wrote:
>> Mischa <open...@mlst.nl> writes:
>>> On 2023-09-05 14:27, Dave Voutila wrote:
>>>> Mike Larkin <mlar...@nested.page> writes:
>>>>
>>>>> On Mon, Sep 04, 2023 at 07:57:18PM +0200, Mischa wrote:
>>>>>> On 2023-09-04 18:58, Mischa wrote:
>>>>>> > On 2023-09-04 18:55, Mischa wrote:
>>>> /snip
>>>>
>>>>>> > > Adding the sleep 2 does indeed help. I managed to get 20 VMs started
>>>>>> > > this way, before it would choke on 2-3.
>>>>>> > >
>>>>>> > > Do I only need the unpatched kernel or also the vmd/vmctl from snap?
>>>>>> >
>>>>>> > I do still get the same message on the console, but the machine isn't
>>>>>> > freezing up.
>>>>>> >
>>>>>> > [umd173152/210775 sp=7a5f577a1780 inside 702698535000-702698d34fff: not
>>>>>> > MAP_STACK
>>>>>> Starting 30 VMs this way caused the machine to become
>>>>>> unresponsive again, but nothing on the console. :(
>>>>>> Mischa
>>>>> Were you seeing these uvm errors before this diff? If so, this
>>>>> isn't causing the problem and something else is.
>>>> I don't believe we solved any of the underlying uvm issues in Bruges
>>>> last year. Mischa, can you test with just the latest
>>>> snapshot/-current?
>>>> I'd imagine starting and stopping many vm's now is exacerbating the
>>>> issue because of the fork/exec for devices plus the ioctl to do a uvm
>>>> share into the device process address space.
>>>>
>>>>> If this diff causes the errors to occur, and without the diff
>>>>> it's fine, then we need to look into that.
>>>>> Also I think a pid number in that printf might be useful, I'll
>>>>> see what I can find. If it's not vmd causing this and rather
>>>>> some other process then that would be good to know also.
>>>> Sadly it looks like that printf doesn't spit out the offending
>>>> pid. :(
>>> Just to confirm I am seeing this behavior on the latest snap
>>> without the patch as well.
>> Since this diff isn't the cause, I've committed it. Thanks for
>> testing. I'll see if I can reproduce your MAP_STACK issues.
>>
>>> Just started 10 VMs with sleep 2, machine freezes, but nothing on the
>>> console. :(
>> For now, I'd recommend spacing out vm launches. I'm pretty sure it's
>> related to the uvm corruption we saw last year when creating, starting,
>> and destroying vm's rapidly in a loop.
>
> That could very well be the case. I will adjust my start script, so
> far I've got good results with a 10 second sleep.
>
> Is there some additional debugging I can turn on that makes sense for
> this? I can easily replicate.
>

Highly doubtful, if the issue is what I think it is. The only thing
would be making sure you're running in a way that lets you see any
panic and drop into ddb. If you're using X, or you're not on the
primary console or a serial connection, a panic might just look like a
deadlocked system.
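
For the spacing workaround, a minimal sketch of a staggered start script
could look like the following. It assumes the VMs are already defined
and can be started by name with "vmctl start"; the function name
start_vms and the VM names are made up for illustration, and the 10
second default only mirrors the delay reported to work in this thread.

```python
import subprocess
import time


def start_vms(names, delay=10.0, run=subprocess.run, sleep=time.sleep):
    """Start each named VM with `vmctl start`, pausing between launches.

    `run` and `sleep` are injectable so the loop can be exercised
    without touching vmd. Returns the names whose start command failed.
    """
    failed = []
    for index, name in enumerate(names):
        # Hypothetical VM names; `vmctl start <name>` is the only
        # external command this sketch relies on.
        result = run(["vmctl", "start", name])
        if result.returncode != 0:
            failed.append(name)
        # No need to wait after the last VM has been launched.
        if index < len(names) - 1:
            sleep(delay)
    return failed
```

Calling start_vms(["vm01", "vm02", "vm03"]) as root would launch the
three VMs roughly 10 seconds apart and return any that vmctl refused to
start. This only spaces out the launches; it does nothing about the
underlying uvm problem.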

-dv
