>-----Original Message-----
>From: Kyle Evans <kev...@freebsd.org>
>Sent: Wednesday, April 26, 2023 3:39 AM
>To: Souradeep Chakrabarti <schakraba...@microsoft.com>
>Cc: Kyle Evans <kev...@freebsd.org>; Wei Hu <w...@freebsd.org>; src-
>committ...@freebsd.org; dev-commits-src-all@freebsd.org; dev-commits-src-
>m...@freebsd.org
>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>enablement for ARM64 in Hyper-V (Part 3, final)
>
>On Tue, Apr 25, 2023 at 2:42 AM Souradeep Chakrabarti
><schakraba...@microsoft.com> wrote:
>>
>>
>>
>>
>> >-----Original Message-----
>> >From: Souradeep Chakrabarti
>> >Sent: Monday, April 24, 2023 6:24 PM
>> >To: Kyle Evans <kev...@freebsd.org>; Wei Hu <w...@freebsd.org>
>> >Cc: src-committ...@freebsd.org; dev-commits-src-all@freebsd.org;
>> >dev-commits- src-m...@freebsd.org
>> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >enablement for ARM64 in Hyper-V (Part 3, final)
>> >
>> >
>> >
>> >
>> >>-----Original Message-----
>> >>From: Kyle Evans <kev...@freebsd.org>
>> >>Sent: Wednesday, April 19, 2023 11:00 AM
>> >>To: Wei Hu <w...@freebsd.org>; Souradeep Chakrabarti
>> >><schakraba...@microsoft.com>
>> >>Cc: src-committ...@freebsd.org; dev-commits-src-all@freebsd.org;
>> >>dev-commits- src-m...@freebsd.org
>> >>Subject: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>enablement for ARM64 in Hyper-V (Part 3, final)
>> >>
>> >>On Thu, Oct 27, 2022 at 8:54 AM Wei Hu <w...@freebsd.org> wrote:
>> >>>
>> >>> The branch main has been updated by whu:
>> >>>
>> >>> URL:
>> >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2F
>> >>>
>cgi%2F&data=05%7C01%7Cschakrabarti%40microsoft.com%7Ce1cfc3aff7f94
>> >>>
>d5c2ca408db45d9c4c5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7
>C6
>> >>>
>38180573867118793%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAi
>LCJQ
>> >>>
>IjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=k
>> >>>
>MntMYzegWgF%2B90vq6FERBLjwfM%2FVb1GUNwUP9pv90Q%3D&reserved=0
>> >>> t
>> >>>
>>
>>>.freebsd.org%2Fsrc%2Fcommit%2F%3Fid%3D9729f076e4d93c5a37e78d427bf
>e
>> >0
>> >>f1a
>> >>>
>>
>>>b99bbcc6&data=05%7C01%7Cschakrabarti%40microsoft.com%7C393e855f13c
>6
>> >>49a
>> >>>
>>
>>>8883708db4097211e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
>7
>> >C
>> >>6381747
>> >>>
>>
>>>90106786449%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQI
>j
>> >oi
>> >>V2luMz
>> >>>
>>
>>>IiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=huiaOBSB2
>Z
>> >>QDn5Q
>> >>> cw%2FLWnmI%2FqbbriQd7HkYtpxnGLGE%3D&reserved=0
>> >>>
>> >>> commit 9729f076e4d93c5a37e78d427bfe0f1ab99bbcc6
>> >>> Author: Souradeep Chakrabarti <schakraba...@microsoft.com>
>> >>> AuthorDate: 2022-10-27 13:46:08 +0000
>> >>> Commit: Wei Hu <w...@freebsd.org>
>> >>> CommitDate: 2022-10-27 13:53:22 +0000
>> >>>
>> >>> arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3,
>> >>> final)
>> >>>
>> >>> This is the last part for ARM64 Hyper-V enablement. This includes
>> >>> commone files and make file changes to enable the ARM64 FreeBSD
>> >>> guest on Hyper-V. With this patch, it should be able to build
>> >>> the ARM64 image and install it on Hyper-V.
>> >>>
>> >>
>> >>Hi,
>> >>
>> >>First off- thanks for doing this work! I can't seem to boot a
>> >>-CURRENT image under Hyper-V on a Volterra machine, seemingly due to
>> >>vmbus. It stalls right
>> >after "vmbus:
>> >>the irq 18" but before emitting a version number, I have some other
>> >>comments here...
>> >>
>> >>> [... snip ...]
>> >>> diff --git a/sys/dev/hyperv/vmbus/vmbus.c
>> >>> b/sys/dev/hyperv/vmbus/vmbus.c index b0cd750b26c8..f370f2a75b99
>> >>> 100644
>> >>> --- a/sys/dev/hyperv/vmbus/vmbus.c
>> >>> +++ b/sys/dev/hyperv/vmbus/vmbus.c
>> >>> [... snip ...]
>> >>> @@ -107,7 +113,7 @@ static uint32_t
>> >>vmbus_get_vcpu_id_method(device_t bus,
>> >>> device_t dev, int cpu);
>> >>> static struct taskqueue
>> >>> *vmbus_get_eventtq_method(device_t,
>device_t,
>> >>> int); -#ifdef EARLY_AP_STARTUP
>> >>> +#if defined(EARLY_AP_STARTUP) || defined(__aarch64__)
>> >>> static void vmbus_intrhook(void *);
>> >>> #endif
>> >>>
>> >>
>> >>My gut reaction to this is that this is a red flag. EARLY_AP_STARTUP
>> >>implies characteristics that aarch64 doesn't exhibit; it's hard to
>> >>see why this conditional is OK, or whether it's testing
>> >>EARLY_AP_STARTUP as a bad way to write __i386__ || __amd64__.
>> >[Souradeep] We had or'd the aarch64 here, as we wanted to defer the
>> >vmbus attachment after gic and arm generic timer is attached. As we
>> >need PAUSE and DELAY for vmbus and hyper-v drivers. For that we
>> >wanted to use vmbus_intrhook().
>> >>
>> >>> [... snip ...]
>> >>> @@ -760,7 +736,7 @@ vmbus_synic_setup(void *xsc)
>> >>>
>> >>> if (hyperv_features & CPUID_HV_MSR_VP_INDEX) {
>> >>> /* Save virtual processor id. */
>> >>> - VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX);
>> >>> + VMBUS_PCPU_GET(sc, vcpuid, cpu) =
>> >>> + RDMSR(MSR_HV_VP_INDEX);
>> >>> } else {
>> >>> /* Set virtual processor id to 0 for compatibility. */
>> >>> VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0;
>> >>
>> >>This one, vmbus_synic_setup(), is invoked via vmbus_intrhook ->
>> >>vmbus_doattach -
>> >>> smp_rendezvous. On !EARLY_AP_STARTUP (e.g., aarch64), SMP isn't
>> >>> functional in
>> >>intrhooks and smp_rendezvous() will just call vmbus_synic_setup() on
>> >>the boot
>> >AP.
>> >>There's nothing that will initialize the pcpu data on every other AP,
>> >>AFAICT.
>> >>
>> >>That said, the !EARLY_AP_STARTUP path is also wrong. Quoting lines
>> >>that weren't in this e-mail from vmbus_attach():
>> >[Souradeep] Thanks! This is a really good catch, and yes, we are
>> >missing the smp initialization during vmbus synic setup.
>> >>
>> >>1527 #else /* !EARLY_AP_STARTUP */
>> >>1528 /*
>> >>1529 * If the system has already booted and thread
>> >>1530 * scheduling is possible indicated by the global
>> >>1531 * cold set to zero, we just call the driver
>> >>1532 * initialization directly.
>> >>1533 */
>> >>1534 if (!cold)
>> >>1535 vmbus_doattach(vmbus_sc);
>> >>1536 #endif /* EARLY_AP_STARTUP and aarch64 */
>> >>
>> >>The two immediate issues I see is that in a device_attach, SMP won't
>> >>be started. It's also going to be before SI_SUB_CONFIGURE, so `cold`
>> >>will never be 0 here -- this is effectively dead code, and if it
>> >>weren't dead code it would suffer the same problem as above where
>> >>other APs aren't started yet so we won't set their pcpu data. This
>> >>is OK, though, because then we have this sysinit later on that does
>> >>the same thing for !EARLY_AP_STARTUP && !__aarch64__`, but that
>> >>should really just be `!EARLY_AP_STARTUP` as aarch64 also needs to
>> >>invoke
>> >>vmbus_doattach() at SI_SUB_SMP (much later than SI_SUB_DRIVERS).
>> >[Souradeep] I have tried to use the suggestion but in multi processor
>> >system the boot is getting stuck at the end and "vmbus0: device scan, probe
>> >and
>attach done"
>> >is not happening.
>> >I am trying to figure out what is going wrong here, as the same
>> >change is working with single cpu system. Any help or pointer will be really
>helpful.
>> [Souradeep] I can see the problem the task vmbus_scandone_task is
>> getting enqueued but it is not getting executed.
>>
>> 525 static void
>> 526 vmbus_scan_done(struct vmbus_softc *sc,
>> 527 const struct vmbus_message *msg __unused)
>> 528 {
>> 529 device_printf(sc->vmbus_dev, " vmbus_scan_done is called\n");
>> 530
>> 531 taskqueue_enqueue(sc->vmbus_devtq, &sc->vmbus_scandone_task);
>> 532 }
>> vmbus scan done is getting called.
>>
>> But
>> 514 static void
>> 515 vmbus_scan_done_task(void *xsc, int pending __unused)
>> 516 {
>> 517 struct vmbus_softc *sc = xsc;
>> 518 device_printf(sc->vmbus_dev, " vmbus_scan_done_task is called\n");
>> 519 bus_topo_lock();
>> 520 sc->vmbus_scandone = true;
>> 521 bus_topo_unlock();
>> 522 wakeup(&sc->vmbus_scandone);
>> 523 }
>> vmbus_scan_done_task is not getting called the task .
>>
>> But the same is happening when single cpu is running, for multi cpu it is
>> getting
>stuck.
>
>Hi,
>
>That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP
>+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready
>for scheduling until after smp_after_idle_runnable(), which is also at
>SI_ORDER_ANY
>-- maybe there's just something going horribly wrong.
>That would perhaps explain why it's fine on a single processor system, which
>won't
>do anything useful (at least in later parts of SI_SUB_SMP).
[Souradeep]
In ARM64 SMP(VM with two cpu), storvsc attach is happening two times for
single scsi controller.
But for intel similar VM (two cpu), it is happening once.
For the dummy/fake storvsc in arm64, we are getting stuck at device_attach.
Details:
vmbus_scan_done(), not getting invoked because vmbus_add_child() is not
complete for a channel 15,
because of which vmbus_devtq is having one task pending.
Now
By passing NMI in the hung system, after examining all threads:
sched_switch() at sched_switch+0x4dc
mi_switch() at mi_switch+0x194
sleepq_switch() at sleepq_switch+0xfc
_cv_wait() at _cv_wait+0x160
_sema_wait() at _sema_wait+0x50
storvsc_attach() at storvsc_attach+0x610
device_attach() at device_attach+0x3f8
device_probe_and_attach() at device_probe_and_attach+0x7c
vmbus_add_child() at vmbus_add_child+0x64
Now ,
It is stuck at waiting on sema_wait() on request->synch_sema in
hv_storvsc_channel_init() because
sema_post() on request->synch_sema is not getting invoked. Which unlocks it.
This is because we are waiting on sema_wait on synch_sema
hv_storvsc_channel_init(), for storvsc1 ,
but there is no storvsc1 device. So not getting a callback called for storvsc1.
From ARM64 debug log:
If you see at line 545 again SCI device got detected.
Line 370: storvsc0: Enlightened SCSI device detected
Line 371: storvsc0: <Hyper-V SCSI> on vmbus0
Line 406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status
= 6
Line 421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
Line 436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI
device
Line 443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3
SCSI device
Line 447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI
device
Line 545: storvsc1: Enlightened SCSI device detected
Line 547: storvsc1: Enlightened SCSI device detected
Line 549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is
called
From Log:
unknown: device_add_child for chan15
storvsc1: Enlightened SCSI device detected
storvsc1: Enlightened SCSI device detected
storvsc1: <Hyper-V SCSI> on vmbus0
storvsc ringbuffer size: 262144, max_io: 512
storvsc1: chan15 assigned to cpu1 [vcpu1]
hn0: link state changed to UP
vmbus0: vmbus_chanmsg_handle type 0xa
storvsc1: gpadl_conn(chan15) succeeded
vmbus0: vmbus_chanmsg_handle type 0x6
storvsc1: chan15 opened
waiting on sema wait synch_sema hv_storvsc_channel_init
>
>Thanks,
>
>Kyle Evans