> From: Eric Grosse
> Date: Mon, 10 Jun 2024 11:17:23 -0700
A bit of progress: in a kernel built with George Koehler's suggested
replacement of isync by sync and with uncommented GENERIC.MP
option MP_LOCKDEBUG
option WITNESS
and (getting rid of any dependence on Go) generating load
by running make -j64 build in /usr/src, I fairly quickly get a panic:
panic
Disregard this speculation about a possible 32-bit int issue. I've
reproduced the panic with a smaller pr_nget.
On Thu, Jun 6, 2024 at 11:37 PM Eric Grosse wrote:
Is the large (greater than 2^32) value of pr_nget below
exceptional? My crashes only happen for long-running
heavy workloads so a big value seems plausible but
maybe there is some limit I'm supposed to reconfigure
for such workloads?
panic: pmap_enter: failed to allocate pted
Stopped at panic
> There's a corruption...
>
> > ddb{7}> show panic
> > cpu6: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 &&
> > rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed:
> > file "/sys/uvm/uvm_vnode.c", line 953
> >
> > *cpu7: assertwaitok: non-zero mutex count
George, thank you for the suggestion of changing membar_enter and
membar_consumer from isync to sync. I did that and the frequency of
crashes went way down, admittedly on a workload that is not solidly
reproducible. But last night there was finally another crash (see
below) so that's not the full s
On Thu, 30 May 2024 13:11:41 -0700
Eric Grosse wrote:
> ddb{7}> show panic
>
> cpu6: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 &&
> rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed:
> file "/sys/uvm/uvm_vnode.c", line 953
>
> *cpu7: assertwaitok: non-ze
And, fairly quickly, another one. The load depends on what's in the Go
team build queue, which is not under my control. To avoid further
spamming the list I won't report any more of these until I can get
something reproducible under my control. Of course, anyone interested
may contact me directly if
openbsd-ppc64-n2vi got another crash:
UVM_PSEG_INUSE failed uvm_pager.c:227
panic
uvm_pseg_release
uvn_io
uvn_get
uvm_fault_lower
uvm_fault
trap
trapagain
type 300
during a bunch of go compiles.
On Mon, May 27, 2024 at 5:34 PM Jeremie Courreges-Anglas
wrote:
On Sat, May 25, 2024 at 12:35:16AM -0400, George Koehler wrote:
On Tue, 21 May 2024 03:08:49 +0200
Jeremie Courreges-Anglas wrote:
> On Tue, May 21, 2024 at 02:51:39AM +0200, Jeremie Courreges-Anglas wrote:
> > This doesn't look powerpc64-specific. It feels like
> > uvm_km_kmemalloc_pla() should call pmap_enter() with PMAP_CANFAIL and
> > unwind in case of a
The -stable version "crash1" was reproducible almost every run; each
run is about an hour on this 8-processor Power9 running a load average
about 30. The -current version "crash2" has only happened once so far,
though because of other issues (hitting a user process limit of 126)
it was failing earl
On Tue, May 21, 2024 at 02:51:39AM +0200, Jeremie Courreges-Anglas wrote:
On Sat, May 18, 2024 at 01:11:56PM -0700, Eric Grosse wrote:
> The openbsd-ppc64-n2vi Go builder machine is converting over to LUCI
> build infrastructure and the new workload may have stepped on a
> pagedaemon corner case. While running 7.5-stable I reproducibly get
> kernel panics "pmap_enter: fa