Just in case this is an ARM64 specific thing anyone else has seen. Or
an ethernet-on-USB thing anyway.

TL;DR: the USB device under the smsc ethernet device started crashing
hard under load with the snapshot in the dmesg below. The easiest way
to reproduce was to try and sysupgrade -s (or boot with the snapshot
bsd.rd and attempt an upgrade), and every time: "stalled", and an
apparent hang. I could also duplicate it by simply ftping a single
source set from the snapshot directory.

Corrections to original posting:
- My comments below about the serial console being on USB are dumb
because the USB device is on the host machine I'm using as the serial
console.
- My assertion about the power use is based on a Kill A Watt device,
which shows current based on the 120VAC going into the power supply.
Though, even scaled up by 10, I'm still only drawing milliamps.

I'm willing to sysupgrade to the latest snapshot for more
comprehensive testing. I can't tell from the recent changes around usb
in the tree what might have caused this, or if it's just my machine
somehow. If there is some specific instrumentation or setup I can
apply before upgrading to the recent snapshot, that might help,
because the failure mode doesn't really allow me to collect info
afterwards. Or, at least, I don't know how to do that.
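
One thing I could try on my own is a small loop that copies the tail
of dmesg to a file and syncs every few seconds while the download
runs, so that whatever was logged before the hang has a chance of
surviving the reboot. This is only a minimal sketch, assuming a plain
sh; the function name, the LOGCMD override, the log path and the
interval are placeholders of mine, not anything from the tree:

```sh
#!/bin/sh
# Hypothetical helper: append the tail of the kernel message buffer to
# a file COUNT times, INTERVAL seconds apart, calling sync after each
# write. LOGCMD defaults to dmesg; it can be overridden so the loop
# can be exercised without reading the real kernel buffer.
snap_dmesg() {
	out=$1
	interval=$2
	count=$3
	i=0
	while [ "$i" -lt "$count" ]; do
		{
			echo "=== snapshot $i: $(date) ==="
			${LOGCMD:-dmesg} | tail -n 20
		} >> "$out"
		sync
		i=$((i + 1))
		if [ "$i" -lt "$count" ]; then
			sleep "$interval"
		fi
	done
}
```

I would start it in the background with something like
"snap_dmesg /some/path/dmesg.hang 5 120 &" and then run the ftp fetch
in the foreground. Going by the dmesg below, sd0 hangs off sdmmc0
rather than the USB bus, so pointing the log at a filesystem on the SD
card might give it a better chance than the USB disk. I have not
verified that this captures anything useful before the hang.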

---------- Forwarded message ---------
From: John Verne <john.ve...@gmail.com>
Date: Mon, 12 Sept 2022 at 23:56
Subject: Freeze/hang on arm64 when pushing smsc/usb
To: <m...@openbsd.org>


(Sending this to misc@ because I'm not sure it falls within the
purview of arm@, but I can repost to there if that would be more
appropriate.)

I've been tracking snapshots on a RaspPi 3+ and it looks like a recent
snapshot may have introduced some issues with the USB bus.

I don't have a lot of detail to offer, unfortunately, because the
failure mode is so sudden and complete. The symptoms are that when
pushing the smsc0 ethernet device a little (for example, when fetching
sets as part of a sysupgrade -s), ftp will report "-- stalled--", the
kernel barfs some messages, and then eventually all ability to
interact with the system halts.

E.g., this is me simulating a sysupgrade download by directly fetching
the current snapshot base72.tgz package:

[...]
fnord# ftp https://openbsd.cs.toronto.edu/pub/OpenBSD/snapshots/arm64/base72.tgz
Trying 128.100.17.240...
Requesting https://openbsd.cs.toronto.edu/pub/OpenBSD/snapshots/arm64/base72.tgz
  3% |**
        |  9984 KB    02:03 ETAusbd_start_next: error=5
usbd_free_xfer: xfer=0xffffff8004b6b7e8 not free
smsc0: warning: Failed to write register 0x14
  4% |**
        | 12032 KB    02:22 ETA^C
fetch aborted.
fnord# sync
[...]

That ctrl-c and sync are a desperate attempt to tell the filesystems
to get their lives in order before their untimely demise. There was no
more console activity after that sync command. In many other attempts
(either with sysupgrade -s, or booting the snapshot bsd.rd and trying
[U]pgrade that way) a ctrl-c does nothing. At most, the fetch can be
aborted but we never get a prompt again.

I'm using the serial console via a USB serial device, and both the
disk and the ethernet are on the USB bus. So I don't know if the
device is actually hung or just inaccessible. It certainly has no
network link to the local DHCP server. A forced reboot always comes
with a long fsck.

If it matters, smsc0 and bwfm0 were trunked, and bwfm0 was forced to
use the lladdr of bwfm0. At one point I pulled the ethernet and tried
a sysupgrade -s with Wi-Fi only. No surprises that this worked without
any errors. But it was too slow to actually try and sysupgrade myself
out of this mess. I'm not sure this info matters, but I'm trying to
disclose anything that was different. This Pi has a third-party touch
HDMI screen attached, but xenodm is disabled (though both work when I
enable X). There is an Apple keyboard and mouse attached as well.

When I booted from the snapshot bsd.rd I also disconnected all the
unnecessary USB devices just in case one of the devices was causing a
problem. But this didn't change anything; once we got to part of the
upgrade where it wanted to fetch sets, the download would stall and
then the entire system appeared to hang.

Since the filesystem was getting a bit ragged from all the unscheduled
restarts (at one point man started returning the wrong pages; "man
ping" returned the page for pflogd, etc.) I decided to boot from
install71 and do a fresh install and all went well with no usb or smsc
log messages.

As a data point, I'm only drawing 8uA/5.6W (not including the USB
drive, which has its own power supply) on a known-good 3A supply. So I
don't think I'm pushing the power limits here.

I saved the last dmesg.boot it made with the snapshot before rebooting
to install a release:

OpenBSD 7.2 (GENERIC) #1747: Sun Sep 11 18:58:53 MDT 2022
    dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC
real mem  = 970924032 (925MB)
avail mem = 906727424 (864MB)
random: good seed from bootblocks
mainbus0 at root: Raspberry Pi 3 Model B Rev 1.2
cpu0 at mainbus0 mpidr 0: ARM Cortex-A53 r0p4
cpu0: 32KB 64b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu0: 512KB 64b/line 16-way L2 cache
cpu0: CRC32,ASID16
apm0 at mainbus0
efi0 at mainbus0: UEFI 2.8
efi0: Das U-Boot rev 0x20211000
simplefb0 at mainbus0: 640x480, 32bpp
wsdisplay0 at simplefb0 mux 1
wsdisplay0: screen 0-5 added (std, vt100 emulation)
"system" at mainbus0 not configured
"axi" at mainbus0 not configured
simplebus0 at mainbus0: "soc"
bcmclock0 at simplebus0
bcmmbox0 at simplebus0
bcmgpio0 at simplebus0
bcmaux0 at simplebus0
bcmdmac0 at simplebus0: DMA0 DMA2 DMA4 DMA5 DMA8 DMA9 DMA10 DMA11
bcmintc0 at simplebus0
pluart0 at simplebus0: rev 2, 16 byte fifo
pluart0: console
bcmsdhost0 at simplebus0: 250 MHz base clock
sdmmc0 at bcmsdhost0: 4-bit, sd high-speed, mmc high-speed, dma
dwctwo0 at simplebus0
bcmdog0 at simplebus0
bcmrng0 at simplebus0
bcmtemp0 at simplebus0
"local_intc" at simplebus0 not configured
sdhc0 at simplebus0
sdhc0: SDHC 3.0, 200 MHz base clock
sdmmc1 at sdhc0: 4-bit, sd high-speed, mmc high-speed
"firmware" at simplebus0 not configured
"power" at simplebus0 not configured
"mailbox" at simplebus0 not configured
"gpiomem" at simplebus0 not configured
"fb" at simplebus0 not configured
"vcsm" at simplebus0 not configured
"virtgpio" at simplebus0 not configured
"clocks" at mainbus0 not configured
"phy" at mainbus0 not configured
"arm-pmu" at mainbus0 not configured
agtimer0 at mainbus0: 19200 kHz
gpioleds0 at mainbus0: "led0"
"fixedregulator_3v3" at mainbus0 not configured
"fixedregulator_5v0" at mainbus0 not configured
"bootloader" at mainbus0 not configured
usb0 at dwctwo0: USB revision 2.0
scsibus0 at sdmmc0: 2 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: <SD/MMC, NCard, 0010> removable
sd0: 29818MB, 512 bytes/sector, 61067264 sectors
uhub0 at usb0 configuration 1 interface 0 "Broadcom DWC2 root hub" rev 2.00/1.00 addr 1
uhub1 at uhub0 port 1 configuration 1 interface 0 "Standard Microsystems product 0x9514" rev 2.00/2.00 addr 2
bwfm0 at sdmmc1 function 1
manufacturer 0x02d0, product 0xa9a6 at sdmmc1 function 2 not configured
smsc0 at uhub1 port 1 configuration 1 interface 0 "Standard Microsystems SMSC9512/14" rev 2.00/2.00 addr 3
smsc0: address b8:27:eb:83:2b:99
ukphy0 at smsc0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x0001f0, model 0x000c
umass0 at uhub1 port 3 configuration 1 interface 0 "USB2.0 & IEEE1394 Combo Device ATAPI-6 Bridge Controller" rev 2.00/0.01 addr 4
umass0: using SCSI over Bulk-Only
scsibus1 at umass0: 2 targets, initiator 0
sd1 at scsibus1 targ 1 lun 0: <ST325062, 0A, 3.AA>
sd1: 238475MB, 512 bytes/sector, 488397169 sectors
uhub2 at uhub1 port 4 configuration 1 interface 0 "Apple Inc. Keyboard Hub" rev 2.00/96.15 addr 5
uhidev0 at uhub2 port 1 configuration 1 interface 0 "Mitsumi Electric Apple Optical USB Mouse" rev 1.10/1.10 addr 6
uhidev0: iclass 3/1
ums0 at uhidev0: 4 buttons, Z and W dir
wsmouse0 at ums0 mux 0
uhidev1 at uhub2 port 2 configuration 1 interface 0 "Apple Inc. Apple Keyboard" rev 2.00/0.70 addr 7
uhidev1: iclass 3/1
ukbd0 at uhidev1: 8 variable keys, 5 key codes, country code 33
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev2 at uhub2 port 2 configuration 1 interface 1 "Apple Inc. Apple Keyboard" rev 2.00/0.70 addr 7
uhidev2: iclass 3/0
ucc0 at uhidev2: 7 usages, 7 keys, enum
wskbd1 at ucc0 mux 1
wskbd1: connecting to wsdisplay0
uhidev3 at uhub1 port 5 configuration 1 interface 0 "wch.cn USB2IIC_CTP_CONTROL" rev 0.01/0.00 addr 8
uhidev3: iclass 3/0, 3 report ids
ums1 at uhidev3 reportid 1: 1 button, tip
wsmouse1 at ums1 mux 0
uhid0 at uhidev3 reportid 2: input=0, output=0, feature=1
ums2 at uhidev3 reportid 3: 0 buttons
wsmouse2 at ums2 mux 0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd1a (9d45e2bf18327ad1.a) swap on sd1b dump on sd1b
WARNING: /mnt was not properly unmounted
WARNING: CHECK AND RESET THE DATE!
gpio0 at bcmgpio0: 54 pins
bwfm0: address b8:27:eb:d6:7e:cc


-- 
John Verne <john.ve...@gmail.com>