On Thu, May 16, 2024 at 08:52:24AM -0400, Johan Huldtgren wrote:
> hello,
> 
> On 2024-05-16  8:14, Dave Voutila wrote:
> > 
> > Johan Huldtgren <johan+openbsd-b...@huldtgren.com> writes:
> > 
> > > hello,
> > >
> > > On 2024-05-15 17:31, Dave Voutila wrote:
> > >>
> > >> Johan Huldtgren <johan+openbsd-b...@huldtgren.com> writes:
> > >>
> > >> >> Synopsis:     vmm guest does not get IP after upgrade to 7.5
> > >> >> Category:     vmd
> > >> >> Environment:
> > >> >        System      : OpenBSD 7.5
> > >> >        Details     : OpenBSD 7.5 (GENERIC.MP) #82: Wed Mar 20 15:48:40 MDT 2024
> > >> >                         dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > >> >
> > >> >        Architecture: OpenBSD.amd64
> > >> >        Machine     : amd64
> > >> >> Description:
> > >> > I recently upgraded one of my machines from 7.4 to 7.5, and noticed
> > >> > that the vmm guest I run on there wasn't getting an IP. I did
> > >> > some rudimentary tcpdumping on each side but nothing jumped out, I
> > >> > saw the dhcp request go out on the guest and I saw it being received
> > >> > on the host but that was it. Configuring the guest with a static IP
> > >> > resolves the issue, so the issue seems to be directly related to dhcp.
> > >> >
> > >> > The guest I'm running is quite old and cannot be upgraded, however it's
> > >> > been working fine as a guest for a long time and hasn't been changed.
> > >> >
> > >> > For completeness' sake I did try creating a switch stanza for bridge0
> > >> > and directing interface tap0 to use that, but it made no discernible
> > >> > difference.
> > >> >
> > >> > Relevant configs:
> > >> >
> > >> > # host (OpenBSD 7.5 + syspatches)
> > >> >
> > >> > $ doas cat /etc/vm.conf
> > >> > vm "guest.vm" {
> > >> >         disk "/home/vm/guest.img"
> > >> >         owner johan
> > >> >         memory 4G
> > >> >         local interface tap0
> > >>
> > >> Why are you using "local interface tap0" and then putting tap0 in a
> > >> bridge(4) with a trunk(4)? I'm not a networking person but that seems
> > >> odd to me.
> > >
> > > Entirely possible I'm doing this wrong. This is the only setup I have
> > > where I tried using local interface, everywhere else I define the switch
> > > so I probably just carried that part of the config over. I modified it
> > > to normalize my config so it's similar to all my others.
> > >
> > > $ doas cat /etc/vm.conf
> > >
> > > switch "uplink" {
> > >         interface bridge0
> > > }
> > >
> > > vm "guest.vm" {
> > >         disk "/home/vm/gallery.img"
> > >         owner johan
> > >         memory 3.5G
> > >         interface tap0 {
> > >                 switch "uplink"
> > >         }
> > > }
> > >
> > >> The major change in 7.5 is the emulated virtio network device is now
> > >> multi-threaded. If removing tap0 from your bridge doesn't fix it, can
> > >> you run vmd with debug logging and check the output for that particular
> > >> guest's vionet process?
> > >>
> > >> It will potentially be pretty chatty, but you should see messages about
> > >> dhcp packet interception and reply injection.
> > >>
> > >> # rcctl stop vmd
> > >> # $(which vmd) -dvv
> > >>
> > >> You might need to tweak the guest memory to 3.5G to get around memory
> > >> limits when running vmd in the foreground.
> > >
> > > # $(which vmd) -dvv
> > > vmd: startup
> > > vmd: /etc/vm.conf:11: switch "uplink" registered
> > > vmd: vm_register: registering vm 1
> > > vmd: /etc/vm.conf:27: vm "guest.vm" registered (enabled)
> > > warning: macro 'sets' not used
> > > vmd: vm_priv_brconfig: interface bridge0 description switch1-uplink
> > > vmd: vmd_configure: setting staggered start configuration to parallelism: 
> > > 4 and delay: 30
> > > vmd: vmd_configure: starting vms in staggered fashion
> > > vmd: start_vm_batch: starting batch of 4 vms
> > > vmd: vm_opentty: vm guest.vm tty /dev/ttyp0 uid 1000 gid 4 mode 620
> > > vmd: start_vm_batch: done starting vms
> > > vmm: config_getconfig: vmm retrieving config
> > > vmm: vm_register: registering vm 1
> > > priv: config_getconfig: priv retrieving config
> > > control: config_getconfig: control retrieving config
> > > agentx: config_getconfig: agentx retrieving config
> > > vmd: vm_priv_ifconfig: interface tap0 description vm1-if0-guest.vm
> > > vmd: vm_priv_ifconfig: switch "uplink" interface bridge0 add tap0
> > > vmd: started guest.vm (vm 1) successfully, tty /dev/ttyp0
> > > vm/guest.vm: loadfile_bios: loaded BIOS image
> > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 3
> > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 5
> > > vm/guest.vm: virtio_init: vm "guest.vm" vio0 lladdr fe:e1:bb:d1:ae:e3
> > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 6
> > > vm/guest.vm: guest.vm: launching vioblk0
> > > vm/guest.vm: virtio_dev_launch: sending 'd' type device struct
> > > vm/guest.vm: virtio_dev_launch: sending vm message for 'guest.vm'
> > > vm/guest.vm/vioblk: vioblk_main: got viblk dev. num disk fds = 1, sync fd 
> > > = 16, async fd = 18, capacity = 0 seg_max = 126, vmm fd = 5
> > > vm/guest.vm/vioblk0: vioblk_main: initialized vioblk0 with raw image 
> > > (capacity=83886080)
> > > vm/guest.vm/vioblk0: vioblk_main: wiring in async vm event handler (fd=18)
> > > vm/guest.vm/vioblk0: vm_device_pipe: initializing 'd' device pipe (fd=18)
> > > vm/guest.vm/vioblk0: vioblk_main: wiring in sync channel handler (fd=16)
> > > vm/guest.vm/vioblk0: vioblk_main: telling vm guest.vm device is ready
> > > vm/guest.vm/vioblk0: vioblk_main: sending heartbeat
> > > vm/guest.vm: virtio_dev_launch: receiving reply
> > > vm/guest.vm: virtio_dev_launch: device reports ready via sync channel
> > > vm/guest.vm: vm_device_pipe: initializing 'd' device pipe (fd=17)
> > > vm/guest.vm: guest.vm: launching vionet0
> > > vm/guest.vm: virtio_dev_launch: sending 'n' type device struct
> > > vm/guest.vm: virtio_dev_launch: sending vm message for 'guest.vm'
> > > vm/guest.vm/vionet: vionet_main: got vionet dev. tap fd = 8, syncfd = 16, 
> > > asyncfd = 19, vmm fd = 5
> > > vm/guest.vm/vionet0: vionet_main: wiring in async vm event handler (fd=19)
> > > vm/guest.vm/vionet0: vm_device_pipe: initializing 'n' device pipe (fd=19)
> > > vm/guest.vm/vionet0: vionet_main: wiring in sync channel handler (fd=16)
> > > vm/guest.vm/vionet0: vionet_main: telling vm guest.vm device is ready
> > > vm/guest.vm/vionet0: vionet_main: sending async ready message
> > > vm/guest.vm: virtio_dev_launch: receiving reply
> > > vm/guest.vm: virtio_dev_launch: device reports ready via sync channel
> > > vm/guest.vm: vm_device_pipe: initializing 'n' device pipe (fd=18)
> > > vm/guest.vm: pic_set_elcr: setting level triggered mode for irq 7
> > > vm/guest.vm: run_vm: starting 1 vcpu thread(s) for vm guest.vm
> > > vm/guest.vm: vcpu_reset: resetting vcpu 0 for vm 8
> > > vm/guest.vm: run_vm: waiting on events for VM guest.vm
> > > vm/guest.vm: guest.vm: received tap addr fe:e1:ba:d0:78:97 for nic 0
> > > vm/guest.vm: handle_dev_msg: device reports ready
> > > vm/guest.vm: handle_dev_msg: device reports ready
> > > vm/guest.vm/vionet0: dev_dispatch_vm: set hostmac
> > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0000
> > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0001
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0019
> > > vm/guest.vm: fw_cfg_file_dir: file directory with 2 files
> > > vm/guest.vm:      100B 0020 etc/e820
> > > vm/guest.vm:        4B 0021 etc/screen-and-debug
> > > vm/guest.vm: vcpu_exit_fw_cfg: selector 0x0020
> > > vm/guest.vm: fw_cfg_select_file: accessing file etc/e820
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000d
> > > vm/guest.vm: fw_cfg_select: unhandled selector d
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000f
> > > vm/guest.vm: fw_cfg_select: unhandled selector f
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x8000
> > > vm/guest.vm: fw_cfg_select: unhandled selector 8000
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x8001
> > > vm/guest.vm: fw_cfg_select: unhandled selector 8001
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0019
> > > vm/guest.vm: fw_cfg_file_dir: file directory with 2 files
> > > vm/guest.vm:      100B 0020 etc/e820
> > > vm/guest.vm:        4B 0021 etc/screen-and-debug
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0004
> > > vm/guest.vm: i8259_write_datareg: master pic, reset IRQ vector to 0x8
> > > vm/guest.vm: i8259_write_datareg: slave pic, reset IRQ vector to 0x70
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x000f
> > > vm/guest.vm: fw_cfg_select: unhandled selector f
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0005
> > > vm/guest.vm: fw_cfg_select: unhandled selector 5
> > > vm/guest.vm: fw_cfg_handle_dma: selector 0x0021
> > > vm/guest.vm: fw_cfg_select_file: accessing file etc/screen-and-debug
> > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > vm/guest.vm: i8259_write_datareg: master pic, reset IRQ vector to 0x20
> > > vm/guest.vm: i8259_write_datareg: slave pic, reset IRQ vector to 0x28
> > > vm/guest.vm/vionet0: read_pipe_main: resetting virtio network device 0
> > > vm/guest.vm: vcpu_process_com_lcr: set baudrate = 115200
> > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > vm/guest.vm: vcpu_exit_i8253_misc: discarding data written to PIT misc 
> > > port
> > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > vm/guest.vm: vcpu_exit_i8253_misc: discarding data written to PIT misc 
> > > port
> > > vm/guest.vm: vcpu_exit_i8253_misc: counter 2 clear, returning 0x0
> > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > >
> > > <snip>This repeats many times<snip>
> > >
> > > vm/guest.vm/vionet0: read_pipe_main: resetting virtio network device 0
> > >
> > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > > vm/guest.vm: vcpu_exit_eptviolation: fault already handled
> > >
> > > <snip>This continues for hundreds of lines<snip>
> > >
> > > vmd: vmd_dispatch_vmm: running vm: 1, vm_state: 0x1
> > >
> > 
> > So it looks like the guest isn't sending a DHCP lease request. See my
> > next comment below.
> > 
> > >> > }
> > >> >
> > >> > $ doas cat /etc/hostname.tap0
> > >> > up
> > >> >
> > >> > $ doas cat /etc/hostname.bridge0
> > >> > add trunk0
> > >> > add tap0
> > >> >
> > >> > $ doas ifconfig tap0
> > >> > tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 
> > >> > 1500
> > >> >         lladdr fe:e1:ba:d0:78:97
> > >> >         description: vm1-if0-guest.vm
> > >> >         index 6 priority 0 llprio 3
> > >> >         groups: tap
> > >> >         status: active
> > >> >         inet 100.64.1.2 netmask 0xfffffffe
> > >> >
> > >> > $ doas ifconfig bridge0
> > >> > bridge0: flags=41<UP,RUNNING> mtu 1500
> > >> >         description: switch1-uplink
> > >> >         index 5 llprio 3
> > >> >         groups: bridge
> > >> >         priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 
> > >> > proto rstp
> > >> >         designated: id 00:00:00:00:00:00 priority 0
> > >> >         tap0 flags=3<LEARNING,DISCOVER>
> > >> >                 port 6 ifpriority 0 ifcost 0
> > >> >         trunk0 flags=3<LEARNING,DISCOVER>
> > >> >                 port 8 ifpriority 0 ifcost 0
> > >> >         Addresses (max cache: 100, timeout: 240):
> > >> >                 fe:e1:bb:d1:d2:bb tap0 1 flags=0<>
> > >> >                 64:9e:f3:ec:fc:7f trunk0 1 flags=0<>
> > >> >
> > >> > # guest (OpenBSD 6.4)
> > >> >
> > >> > $ doas cat /etc/hostname.vio0
> > >> > dhcp
> > 
> > Just realized this doesn't look correct. It should be:
> > 
> > inet autoconf
> > 
> > 
> > I believe "dhcp" was deprecated during the dhclient deprecation.
> 
> This client predates that change. From a quick glance at the changelog
> it seems 'inet autoconf' started being a valid replacement for 'dhcp'
> somewhere around the 6.9 release timeframe.
> 
> $ doas cat /etc/hostname.vio0
> inet autoconf
> 
> # /bin/sh /etc/netstart vio0
> ifconfig: autoconf not allowed for this AF
> 
> thanks,
> 
> .jh
>  
> > >> >
> > >> > $ doas ifconfig vio0
> > >> > vio0: 
> > >> > flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> 
> > >> > mtu 1500
> > You should see AUTOCONF4 in this list of flags ^
> > 
> > >> >         lladdr fe:e1:bb:d1:7d:0d
> > >> >         index 1 priority 0 llprio 3
> > >> >         media: Ethernet autoselect
> > >> >         status: active
> > >> >
> > >> > Example tcpdump on guest (limited it to the dhcp requests, there are 
> > >> > also lots of "icmp6:neighbor sol: who has" messages)
> > >> >
> > >> > May 14 18:37:51.132856 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x1f15c47d secs:14 vend-rfc1048 
> > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > >> > May 14 18:38:17.202879 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de vend-rfc1048 
> > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > >> > May 14 18:38:19.212820 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:2 vend-rfc1048 
> > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > >> > May 14 18:38:21.222848 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:4 vend-rfc1048 
> > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > >> > May 14 18:38:25.222831 fe:e1:bb:d1:7d:0d ff:ff:ff:ff:ff:ff 0800 342: 
> > >> > 0.0.0.0.68 > 255.255.255.255.67:  xid:0x876492de secs:8 vend-rfc1048 
> > >> > DHCP:DISCOVER HN:"guest" PR:SM+BR+TZ+121+DG+DN+119+NS+HN+BF+TFTP 
> > >> > CID:1.254.225.187.209.125.13 [tos 0x10]
> > >> >
> > >> > On the host we see it received
> > >> >
> > >> > May 14 18:10:21.073328 rule 189/(match) pass out on trunk0: 0.0.0.0.68 
> > >> > > 255.255.255.255.67:  xid:0x34bf962a secs:4 [|bootp] [tos 0x10]
> > >> > May 14 18:10:41.073407 rule 183/(match) pass in on tap0: 0.0.0.0.68 > 
> > >> > 255.255.255.255.67:  xid:0x34bf962a secs:24 [|bootp] [tos 0x10]
> > >> >

I'm confused. You changed the config away from local dhcp intercept to
using bridge0. So are you running a dhcp server on an interface connected
to bridge0?

It seems there is an issue with the vmm internal dhcp (which is more
bootp) server. The debug output would be helpful for that case, since
the working assumption is that the dhcp packets are somehow getting lost.
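
To make the two cases concrete, here is a rough, untested sketch of the
two vm.conf variants discussed in this thread. It reuses the names from
the configs quoted above; the comments are my reading of vm.conf(5), and
the two fragments are alternatives, not one file.

```
# Variant A: "local interface". vmd answers the guest's DHCP
# request itself, so this is the case where the vionet debug output
# should show the interception and the injected reply.
vm "guest.vm" {
        disk "/home/vm/guest.img"
        owner johan
        memory 4G
        local interface tap0
}

# Variant B: switch on bridge0. vmd does not answer DHCP here; a
# DHCP server has to be reachable through bridge0 (assumed to exist,
# it is not shown anywhere in this thread).
switch "uplink" {
        interface bridge0
}

vm "guest.vm" {
        disk "/home/vm/guest.img"
        owner johan
        memory 4G
        interface tap0 {
                switch "uplink"
        }
}
```

With variant B, the absence of dhcp messages in the vionet log would be
expected, so the debug run only tells us something with variant A.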
-- 
:wq Claudio
