On May 22, 2023, at 2:27 AM, Claudio Jeker <clau...@openbsd.org> wrote:
> I have seen these WITNESS warnings on other systems as well. I doubt this
> is the problem. IIRC this warning is because sys_mount() is doing it wrong
> but it is not really an issue since sys_mount is not called often.

Yup. I see that now that I have tested witness on several arches. They all
show this lock order reversal right after booting the system. I guess this
means what I am seeing isn’t something that witness detects.

On -current with my T4-1, I can reliably reproduce the issues I am seeing.
While the problem is intermittent I can’t get very far into the jdk build 
without
tripping it. Instructions for reproducing the issue are:

Add wxallowed to /usr/local/ and /usr/ports (or wherever WRKOBJDIR has
been changed to)

doas pkg_add jdk zip unzip cups-libs bash gmake libiconv giflib

cd /usr/ports/devel/jdk/1.8
FLAVOR=native_bootstrap make

There are two stages to the problem. A java command (or javac or javah)
gets stuck making forward progress and nearly all of its cpu time is in
sys time category. You can see this in top as 1500-3000% CPU time on
the java process. ktrace of the process in this state shows endless
sched_yield() calls. Debugging shows many threads in
pthread_cond_wait(3). The condition vars are configured to use
CLOCK_MONOTONIC.

The second stage of the problem is when things lock up. While java is
spinning in this sched_yield() state, if you display the process arguments in
top (pressing the right arrow) you trip the lockups. top stops responding.
getty will reprompt if enter is pressed, but locks up if a username is entered.
Most processes lock up when doing anything after this point. ddb ps at this
stage shows top waiting on vmmaplk and the rest of the stuck processes
waiting on sysctllk (sshd, systat, login).

I tried bisecting when this was introduced but as I go back in time with
kernels it becomes more intermittent and I didn’t notice that so I would need
to redo the bisecting. I can say I have seen the problem reproduce as far
back as Feb 16th kernel. When I updated the jdk in late January I didn’t notice
it but it could have been a lucky build as I tend to only do one 
native_bootstrap
build of the jdk when updating as a way to test the resulting package.

Here is some sample output of top, systat and ddb ps output on -current in
my last reproduction of the problem.

load averages: 15.27,  4.25,  1.68                                              
                                                    
oracle.intricatesoftware.com 16:13:10
64 processes: 62 idle, 2 on processor                                           
                                                                       up 0 
days 00:04:58
64  CPUs:  0.0% user,  1.8% nice, 55.5% sys,  0.4% spin,  0.1% intr, 42.2% idle
Memory: Real: 143M/2676M act/tot Free: 13G Cache: 2334M Swap: 0K/16G
  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
38582 _pbuild   10    5  290M   64M onproc/19 fsleep   17:16 2892.14% javac

   3 users Load 15.27 4.25 1.68                    oracle.intricatesof 16:13:09

            memory totals (in KB)            PAGING   SWAPPING     Interrupts
           real   virtual     free           in  out   in  out    12777 total
Active   146752    146752 13198032   ops                                vcons0
All     2740160   2740160 29713104   pages                            5 vpci0:0
                                                                      2 mpii0
Proc:r  d  s  w    Csw   Trp   Sys   Int   Sof  Flt       forks         mpii1
    18   148     78089 18666  8090 94097      18891       fkppw       2 em0
                                                          fksvm      31 ehci0
   0.1%Int   0.3%Spn  55.1%Sys   1.8%Usr  42.7%Idle       pwait   12737 clock
|    |    |    |    |    |    |    |    |    |    |       relck
============================>                             rlkok
                                                          noram
Namei         Sys-cache    Proc-cache    No-cache         ndcpy
    Calls     hits    %    hits     %    miss   %         fltcp
                                                          zfod
                                                          cow
Disks   cd0   sd0   sd1   sd2   cd1   sd3           66409 fmin
seeks                                               88545 ftarg
xfers                       3                             itarg
speed                     18K                           2 wired
  sec                     0.0                             pdfre
                                                          pdscn
                                                          pzidl       1 IPKTS
                                                       13 kmape       1 OPKTS

ddb{0}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 94135  299634   9252      0  3        0x82  sysctllk      sshd
 38582  227559  44647     55  3    0x200083  fsleep        javac
 38582  430054  44647     55  3   0x4200083  fsleep        javac
 38582  454002  44647     55  7   0x4200003                javac
 38582   92205  44647     55  3   0x4200083  fsleep        javac
 38582  391595  44647     55  7   0x4200003                javac
 38582  306671  44647     55  7   0x4200003                javac
 38582  444915  44647     55  7   0x4200003                javac
 38582   68842  44647     55  7   0x4200003                javac
 38582  210759  44647     55  7   0x4200003                javac
 38582  295002  44647     55  3   0x4200003  uobjlk        javac
 38582   20089  44647     55  7   0x4200003                javac
 38582  426972  44647     55  7   0x4200003                javac
 38582  485651  44647     55  7   0x4200003                javac
 38582  198526  44647     55  7   0x4200003                javac
 38582  140392  44647     55  7   0x4200003                javac
 38582  426536  44647     55  7   0x4200003                javac
 38582  237845  44647     55  3   0x4200003  vmmaplk       javac
 38582   11519  44647     55  7   0x4200003                javac
 38582  493801  44647     55  7   0x4200003                javac
 38582  431112  44647     55  7   0x4200003                javac
 38582    5954  44647     55  3   0x4200003  uobjlk        javac
 38582  334521  44647     55  7   0x4200003                javac
 38582   63717  44647     55  7   0x4200003                javac
 38582  278840  44647     55  3   0x4200003  uobjlk        javac
 38582  288942  44647     55  3   0x4200003  vmmaplk       javac
 38582  358539  44647     55  3   0x4200003  vmmaplk       javac
 38582  292341  44647     55  3   0x4200003  uobjlk        javac
 38582   71353  44647     55  7   0x4200003                javac
 38582  173462  44647     55  7   0x4200003                javac
 38582  369162  44647     55  7   0x4200003                javac
 38582  103608  44647     55  3   0x4200003  uobjlk        javac
 38582  283692  44647     55  7   0x4200003                javac
 38582   73296  44647     55  7   0x4200003                javac
 38582  107123  44647     55  7   0x4200003                javac
 38582  190345  44647     55  7   0x4200003                javac
 38582  519032  44647     55  7   0x4200003                javac
 38582  512468  44647     55  7   0x4200003                javac
 38582   30652  44647     55  7   0x4200003                javac
 38582  379236  44647     55  3   0x4200003  uobjlk        javac
 38582  269178  44647     55  7   0x4200003                javac
 38582  268232  44647     55  3   0x4200003  uobjlk        javac
 38582  352770  44647     55  7   0x4200003                javac
 38582  377632  44647     55  7   0x4200003                javac
 38582  244819  44647     55  7   0x4200003                javac
 38582   72610  44647     55  7   0x4200003                javac
 38582  469953  44647     55  3   0x4200083  fsleep        javac
 38582  371383  44647     55  3   0x4200083  fsleep        javac
 38582  158857  44647     55  3   0x4200083  fsleep        javac
 38582   11735  44647     55  3   0x4200083  fsleep        javac
 38582   84102  44647     55  3   0x4200083  fsleep        javac
 38582   75248  44647     55  3   0x4200083  fsleep        javac
 38582  375420  44647     55  3   0x4200083  fsleep        javac
 38582  518610  44647     55  3   0x4200083  fsleep        javac
 38582  127230  44647     55  3   0x4200003  vmmaplk       javac
 38582   83423  44647     55  3   0x4200083  fsleep        javac
 38582  159603  44647     55  3   0x4200083  fsleep        javac
 38582   31091  44647     55  3   0x4200083  fsleep        javac
 38582  390888  44647     55  3   0x4200083  fsleep        javac
 38582  166886  44647     55  3   0x4200083  fsleep        javac
 38582  377933  44647     55  3   0x4200083  fsleep        javac
 38582  376867  44647     55  3   0x4200083  fsleep        javac
 38582  110280  44647     55  3   0x4200083  fsleep        javac
 38582  396501  44647     55  3   0x4200083  fsleep        javac
 38582  337157  44647     55  3   0x4200083  fsleep        javac
 38582  312539  44647     55  3   0x4200083  fsleep        javac
 38582  439725  44647     55  3   0x4200083  fsleep        javac
 38582  174728  44647     55  3   0x4200083  fsleep        javac
 38582  350117  44647     55  3   0x4200083  fsleep        javac
 38582  336583  44647     55  3   0x4200083  fsleep        javac
 44647   53285   6389     55  3    0x100089  sigsusp       sh
  6389    9627   3586     55  3    0x10008b  sigsusp       sh
  3586  519556  31484     55  3        0x83  wait          gmake
 29401  282226  87417     55  3    0x100083  piperd        tee
 31484  384291  87417     55  3        0x81  wait          bash
 10686  220950  84161     55  3    0x100083  piperd        tee
 87417   52900  84161     55  3        0x81  wait          bash
 84161  354717  90894     55  3        0x83  wait          bash
 90894  299418  54844     55  3    0x100089  sigsusp       sh
 54844   38202  23322     55  3    0x10008b  sigsusp       sh
 23322  499463  12410     55  3        0x83  wait          gmake
  2137  306483   5971   1000  3        0x83  sysctllk      systat
 96354  374280      1   1000  3    0x1000b0  kqread        ssh-agent
  5971  326285  53265   1000  3    0x10008b  sigsusp       ksh
 53265   95696  81543   1000  3        0x98  kqread        sshd
 81543  262878   9252      0  3        0x82  kqread        sshd
 12410   25695  53684   1000  3    0x10008b  sigsusp       make
 53684  389550   2559   1000  3    0x10008b  sigsusp       sh
 91269  101622  48618   1000  3    0x100003  vmmaplk       top
 26730   50864      1   1000  3    0x1000b0  kqread        ssh-agent
 48618  194268  38192   1000  3    0x10008b  sigsusp       ksh
 38192   94599  15809   1000  3        0x98  kqread        sshd
 15809  420302   9252      0  3        0x82  kqread        sshd
  2559  172546  11253   1000  3    0x10008b  sigsusp       make
 48347  451749      1   1000  3    0x1000b0  kqread        ssh-agent
 11253  329919  75474   1000  3    0x10008b  sigsusp       ksh
 75474  434695  24864   1000  3        0x98  kqread        sshd
 24864  460675   9252      0  3        0x82  kqread        sshd
 80339  457654      1      0  3        0x83  sysctllk      login
  2783  471804      1      0  3    0x100098  kqread        cron
 13986  329572      1     99  3   0x1100090  kqread        sndiod
 17161  216329      1    110  3    0x100090  kqread        sndiod
  7125  459497  81420     95  3   0x1100092  kqread        smtpd
 34588  306264  81420    103  3   0x1100092  kqread        smtpd
 35539  431962  81420     95  3   0x1100092  kqread        smtpd
 95046  110434  81420     95  3    0x100092  kqread        smtpd
 75832  395854  81420     95  3   0x1100092  kqread        smtpd
 90905  131444  81420     95  3   0x1100092  kqread        smtpd
 81420   40917      1      0  3    0x100080  kqread        smtpd
  9252  412804      1      0  3        0x88  kqread        sshd
 77215  300303  47896      0  3        0x80  nfsd          nfsd
 39163  328218  47896      0  3        0x80  nfsd          nfsd
 18940   88416  47896      0  3        0x80  nfsd          nfsd
 30770  448875  47896      0  3        0x80  nfsd          nfsd
 47896   35491      1      0  3        0x80  netcon        nfsd
 69834  381834   1172      0  3        0x80  kqread        mountd
  1172    8898      1      0  3    0x100080  kqread        mountd
 34849  159509      1     28  3   0x1100090  kqread        portmap
 99111  374653      1      0  3    0x100080  kqread        ntpd
 96514  463552  69056     83  3    0x100092  sysctllk      ntpd
 69056  185686      1     83  3   0x1100092  sysctllk      ntpd
 76215   96020  37258     74  3   0x1100092  bpf           pflogd
 37258  158509      1      0  3        0x80  netio         pflogd
 74134   93758  35337     73  3   0x1100090  kqread        syslogd
 35337  436947      1      0  3    0x100082  netio         syslogd
 49461  134918      1      0  3    0x100080  kqread        resolvd
 15802  138340  35598     77  3    0x100092  kqread        dhcpleased
 85374  145907  35598     77  3    0x100092  kqread        dhcpleased
 35598   69070      1      0  3        0x80  kqread        dhcpleased
 78965  485143  22355    115  3    0x100092  kqread        slaacd
 49372  473630  22355    115  3    0x100092  kqread        slaacd
 22355  252711      1      0  3    0x100080  kqread        slaacd
 77472  293410      0      0  3     0x14200  bored         smr
 96287    2177      0      0  3     0x14200  pgzero        zerothread
 22397  479660      0      0  3     0x14200  aiodoned      aiodoned
 40550  268537      0      0  3     0x14200  syncer        update
 30711  491254      0      0  3     0x14200  cleaner       cleaner
 19845  282487      0      0  3     0x14200  reaper        reaper
 85959  130144      0      0  3     0x14200  pgdaemon      pagedaemon
 70483  182566      0      0  3     0x14200  usbtsk        usbtask
 51296  152380      0      0  3     0x14200  usbatsk       usbatsk
 16787  444929      0      0  3     0x14200  bored         sensors
 63429   82409      0      0  7  0x40014200                idle63
 92645  493403      0      0  7  0x40014200                idle62
 20350  375563      0      0  7  0x40014200                idle61
 35222   74521      0      0  7  0x40014200                idle60
 33426  375581      0      0  7  0x40014200                idle59
 22679  177213      0      0  7  0x40014200                idle58
 14639  386227      0      0  7  0x40014200                idle57
 39400  120262      0      0  7  0x40014200                idle56
 12654    7800      0      0  7  0x40014200                idle55
 48608  174405      0      0  3  0x40014200                idle54
 94678  113021      0      0  7  0x40014200                idle53
  6170  359112      0      0  3  0x40014200                idle52
 92056  451703      0      0  7  0x40014200                idle51
 94062  149790      0      0  3  0x40014200                idle50
  7490  352645      0      0  7  0x40014200                idle49
 58618  321162      0      0  3  0x40014200                idle48
 34596   30496      0      0  3  0x40014200                idle47
 42019  493114      0      0  7  0x40014200                idle46
 19753  346089      0      0  3  0x40014200                idle45
 35289  255668      0      0  7  0x40014200                idle44
 71157  135665      0      0  7  0x40014200                idle43
  2489  156561      0      0  7  0x40014200                idle42
 69267  449921      0      0  3  0x40014200                idle41
 17078   23764      0      0  3  0x40014200                idle40
 18986  140559      0      0  7  0x40014200                idle39
 31565  508859      0      0  3  0x40014200                idle38
 83892  503397      0      0  3  0x40014200                idle37
  3971  218332      0      0  3  0x40014200                idle36
 56087  426177      0      0  3  0x40014200                idle35
  8255  404604      0      0  3  0x40014200                idle34
 88769  308362      0      0  7  0x40014200                idle33
 17805  369954      0      0  7  0x40014200                idle32
 44764  388232      0      0  7  0x40014200                idle31
  6184  164272      0      0  7  0x40014200                idle30
 57793  246870      0      0  3  0x40014200                idle29
 94847  282657      0      0  3  0x40014200                idle28
 20730  274685      0      0  7  0x40014200                idle27
 49977  358582      0      0  3  0x40014200                idle26
 76662  491808      0      0  3  0x40014200                idle25
   339  381654      0      0  7  0x40014200                idle24
 95798  435838      0      0  7  0x40014200                idle23
 17729  220014      0      0  3  0x40014200                idle22
 32933  457100      0      0  3  0x40014200                idle21
 93991  300506      0      0  7  0x40014200                idle20
 83270   35155      0      0  3  0x40014200                idle19
   113  190949      0      0  7  0x40014200                idle18
 47202   13764      0      0  3  0x40014200                idle17
 26572   32690      0      0  7  0x40014200                idle16
 79156   94305      0      0  3  0x40014200                idle15
 91078  227738      0      0  3  0x40014200                idle14
 92007  190955      0      0  3  0x40014200                idle13
 30258  465800      0      0  7  0x40014200                idle12
 21174  286794      0      0  7  0x40014200                idle11
 38572  384863      0      0  3  0x40014200                idle10
 39753  266722      0      0  7  0x40014200                idle9
  2402  328673      0      0  3  0x40014200                idle8
  7204  135132      0      0  3  0x40014200                idle7
 50857  147720      0      0  3  0x40014200                idle6
 98253  405788      0      0  3  0x40014200                idle5
 45666  511079      0      0  3  0x40014200                idle4
 31437  507896      0      0  3  0x40014200                idle3
 22223  410682      0      0  7  0x40014200                idle2
 66020  259824      0      0  3  0x40014200                idle1
  1702  438104      0      0  3     0x14200  bored         softnet3
 79408  470282      0      0  3     0x14200  bored         softnet2
 65184  187165      0      0  3     0x14200  bored         softnet1
 19984  214261      0      0  3     0x14200  bored         softnet0
 96155  318872      0      0  3     0x14200  bored         systqmp
 56400  116597      0      0  3     0x14200  bored         systq
 90431  390796      0      0  3  0x40014200  bored         softclock
*64111  191058      0      0  7  0x40014200                idle0
 18862  308472      0      0  3     0x14200  kmalloc       kmthread
     1  421752      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper


Reply via email to