On 2018-Oct-21, at 8:30 PM, Warner Losh <imp at bsdimp.com> wrote:
> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh <imp at bsdimp.com> wrote:
>
> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable
> <freebsd-sta...@freebsd.org> wrote:
>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> after installing the build, Hyper-V based boots are
>> working.]
>>
>> On 2018-Oct-20, at 2:09 AM, Mark Millard <marklmi at yahoo.com> wrote:
>>
>> > On 2018-Oct-20, at 1:39 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> >
>> >> I attempted to jump from head -r334014 to -r339076
>> >> on a threadripper 1950X board and the boot fails.
>> >> This is both native booting and under Hyper-V,
>> >> same machine and root file system in both cases.
>> >
>> > I did my investigation under Hyper-V after seeing
>> > a boot failure native.
>> >
>> > Looks like the native failure is even earlier,
>> > before db> is even possible, possibly during
>> > early loader activity.
>> >
>> > So this report is really for running under
>> > Hyper-V: -r338804 boots and -r338810 does
>> > not. By contrast -r334804 does not boot native.
>> > (But I've little information for that context.)
>> >
>> > Sorry for the confusion. I rushed the report
>> > in hopes of getting to sleep. It was not to be.
>> >
>> >> It fails just after the FreeBSD/SMP lines,
>> >> reporting "kernel trap 9 with interrupts disabled".
>> >>
>> >> It fails in pmap_force_invalidate_cache_range at
>> >> a clflush (%rax) instruction that produces a
>> >> "Fatal trap 9: general protection fault while
>> >> in kernel mode". cpuid=0 apic id= 00
>> >>
>> >> I used kernel.txz files from:
>> >>
>> >> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/
>> >>
>> >> to narrow the range of kernel builds for working -> failing
>> >> and got:
>> >>
>> >> -r338804 boots fine
>> >> (no amd64 kernel builds between to try)
>> >> -r338810+ fails (any that I tried, anyway)
>> >>
>> >> In that range is -r338807 :
>> >>
>> >> QUOTE
>> >> Author: kib
>> >> Date: Wed Sep 19 19:35:02 2018
>> >> New Revision: 338807
>> >> URL:
>> >> https://svnweb.freebsd.org/changeset/base/338807
>> >>
>> >>
>> >> Log:
>> >> Convert x86 cache invalidation functions to ifuncs.
>> >>
>> >> This simplifies the runtime logic and reduces the number of
>> >> runtime-constant branches.
>> >>
>> >> Reviewed by: alc, markj
>> >> Sponsored by: The FreeBSD Foundation
>> >> Approved by: re (gjb)
>> >> Differential revision:
>> >> https://reviews.freebsd.org/D16736
>> >>
>> >> Modified:
>> >> head/sys/amd64/amd64/pmap.c
>> >> head/sys/amd64/include/pmap.h
>> >> head/sys/dev/drm2/drm_os_freebsd.c
>> >> head/sys/dev/drm2/i915/intel_ringbuffer.c
>> >> head/sys/i386/i386/pmap.c
>> >> head/sys/i386/i386/vm_machdep.c
>> >> head/sys/i386/include/pmap.h
>> >> head/sys/x86/iommu/intel_utils.c
>> >> END QUOTE
>> >>
>> >> There do seem to be changes associated with
>> >> clflush(...) use. Looking at:
>> >>
>> >> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/pmap.c?annotate=339432
>> >>
>> >> it appears that pmap_force_invalidate_cache_range has not
>> >> changed since -r338807.
>> >>
>> >> It seems that -r338806 and -r338810 would be unlikely
>> >> contributors.
>> >
>>
>> I went after my native-boot loader problem first because I
>> could switch kernels via the loader for booting FreeBSD under
>> Hyper-V. Switching loaders is more of a problem.
>>
>> In order to avoid the loader-time crash I switched to building
>> and installing based on WITHOUT_ZFS= . I've had no active use of
>> ZFS in years. (The old official-build loaders that worked were
>> non-ZFS ones.)
>>
>> This took care of the native-boot loader-crash --and, to my
>> surprise, also the Hyper-V-boot kernel-time crash.
>>
>> My private builds now boot the 1950X in both contexts just
>> fine.
>>
>> During my early investigation I did pick up specific changes
>> from after -r339076 that seemed to be tied to Ryzen and such.
>> (They made no difference to the boot problems at the time
>> but I saw no reason to remove them.)
>>
>> # uname -apKU
>> FreeBSD FBSDFSSD 12.0-ALPHA8 FreeBSD 12.0-ALPHA8 #5 r339076:339432M: Sun Oct
>> 21 16:44:25 PDT 2018
>> markmi@FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG
>> amd64 amd64 1200084 1200084
>>
>> (stupid gmail)
>
> The phrase "no active use" bothers me. What does that mean? Are there any ZFS
> pools, or any disks that have any whiff of a ZFS-ish thing on them at all?
> Clearly, there's something in the ZFS boot loader that's freaking out over
> something on your system, but absent that information I can't help you.

No ZFS pools: strictly UFS for FreeBSD file systems
for the last few years, UFS before I had access to
the 1950X system.

I've never before bothered to use WITHOUT_ZFS= in
my builds. So the system had the ZFS support,
such as kernel modules, over all the time that
this system had been in use.

Prior to the recent versions I saw no such problems.
But the default loader was not ZFS capable.

As seen in the under-Hyper-V use-context:

# gpart show -p
=>         40   937703008  da0    GPT           (447G)
           40        1024  da0p1  freebsd-boot  (512K)
         1064   746586112  da0p2  freebsd-ufs   (356G)
    746587176    31457280  da0p3  freebsd-swap  (15G)
    778044456   159383552  da0p4  freebsd-swap  (76G)
    937428008      275040  - free -             (134M)

=>         40   937703008  da1    GPT           (447G)
           40        1024  da1p1  freebsd-boot  (512K)
         1064   369098752  da1p2  freebsd-ufs   (176G)
    369099816   406846424  da1p3  freebsd-swap  (194G)
    775946240   130024488  - free -             (62G)
    905970728    31457280  da1p4  freebsd-swap  (15G)
    937428008      275040  - free -             (134M)

=>         40   419430320  da2    GPT           (200G)
           40        4056  - free -             (2.0M)
         4096   419426263  da2p1  freebsd-ufs   (200G)
    419430359           1  - free -             (512B)

=>         40  2000409184  da3    GPT           (954G)
           40        1024  da3p1  freebsd-boot  (512K)
         1064  2000408159  da3p2  freebsd-ufs   (954G)
   2000409223           1  - free -             (512B)

So no ZFS pools.

The above context never had the ZFS-capable loader
problem but did have the kernel problem. I was
booting the 356G freebsd-ufs partition: the only
one that I have updated the FreeBSD version on
so far.

With FreeBSD booted natively, more drives are seen in
gpart show, some not from/for FreeBSD. But the
above drives are present and I was booting from
the same partition of the same drive: the 356G
freebsd-ufs partition. Still no ZFS pools
anywhere:

# gpart show -p
=>         34  4000797293  nvd0    GPT            (1.9T)
           34      262144  nvd0p1  ms-reserved    (128M)
       262178        2014  - free -               (1.0M)
       264192  3600451584  nvd0p2  ms-basic-data  (1.7T)
   3600715776   400081551  - free -               (191G)

=>         40   937703008  nvd1    GPT            (447G)
           40        1024  nvd1p1  freebsd-boot   (512K)
         1064   746586112  nvd1p2  freebsd-ufs    (356G)
    746587176    31457280  nvd1p3  freebsd-swap   (15G)
    778044456   159383552  nvd1p4  freebsd-swap   (76G)
    937428008      275040  - free -               (134M)

=>         40   937703008  nvd2    GPT            (447G)
           40        1024  nvd2p1  freebsd-boot   (512K)
         1064   369098752  nvd2p2  freebsd-ufs    (176G)
    369099816   406846424  nvd2p3  freebsd-swap   (194G)
    775946240   130024488  - free -               (62G)
    905970728    31457280  nvd2p4  freebsd-swap   (15G)
    937428008      275040  - free -               (134M)

=>         34  2000409197  nvd3    GPT            (954G)
           34        2014  - free -               (1.0M)
         2048     1021952  nvd3p1  ms-recovery    (499M)
      1024000      202752  nvd3p2  efi            (99M)
      1226752       32768  nvd3p3  ms-reserved    (16M)
      1259520  1859119104  nvd3p4  ms-basic-data  (886G)
   1860378624   140030607  - free -               (67G)

=>         40  2000409184  nvd4    GPT            (954G)
           40        1024  nvd4p1  freebsd-boot   (512K)
         1064  2000408159  nvd4p2  freebsd-ufs    (954G)
   2000409223           1  - free -               (512B)

=>         63  2000409201  ada0    MBR            (954G)
           63        1985  - free -               (993K)
         2048        4096  ada0s1  linux-data     (2.0M)
         6144     2093056  - free -               (1.0G)
      2099200  1998309376  ada0s2  linux-lvm      (953G)
   2000408576         688  - free -               (344K)

=>         34  2000409197  ada1    GPT            (954G)
           34      262144  ada1p1  ms-reserved    (128M)
       262178  2000147053  - free -               (954G)

=>         34  2000409197  ada2    GPT            (954G)
           34      262144  ada2p1  ms-reserved    (128M)
       262178  2000147053  - free -               (954G)

=>         34  1953497022  da0     GPT            (932G)
           34      262144  da0p1   ms-reserved    (128M)
       262178        2014  - free -               (1.0M)
       264192  1953230848  da0p2   ms-basic-data  (931G)
   1953495040        2016  - free -               (1.0M)

=>          1    60062499  da1     MBR            (29G)
            1          31  - free -               (16K)
           32    60062468  da1s1   fat32lba       (29G)

The 356G freebsd-ufs partition is the only one
of the freebsd-ufs partitions updated so far.

This is the context that had the problem with
the ZFS-capable loaders --but no later kernel
problem when a not-ZFS-capable loader was used
(via copying over an older one --until I did the
WITHOUT_ZFS= build/install).

As for the ZFS-capable loader: maybe it has
problems when it sees one or more of:

ms-reserved (on GPT)
ms-basic-data (on GPT) (NTFS file system)
ms-recovery (on GPT)
efi (on GPT)
linux-data (on MBR)
linux-lvm (on MBR)
fat32lba (on MBR)

(given that none of these is available in
the Hyper-V context as the virtual machine
has been configured).
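
The comparison behind that guess can be sketched as a short script. This is
a minimal Python sketch, assuming the `gpart show -p` output from each
context has been saved as text; it lists the partition types present in the
native listing but absent from the Hyper-V one. The function name
`partition_types` and the abbreviated sample listings are illustrative only
and are not part of any FreeBSD tool. It merely narrows down candidates; it
does not show that any of these types actually trips the loader.

```python
def partition_types(gpart_output):
    """Return the set of partition types named in `gpart show -p` text."""
    types = set()
    for line in gpart_output.splitlines():
        fields = line.split()
        # Skip blank lines, "=>" per-disk header rows, and "- free -" gaps.
        if len(fields) < 5 or fields[0] == "=>" or fields[2] == "-":
            continue
        # Partition rows read: start, size, name, type, (human-readable size).
        types.add(fields[3])
    return types


# Abbreviated samples taken from the two listings (not the full output).
HYPERV = """\
=> 40 937703008 da0 GPT (447G)
40 1024 da0p1 freebsd-boot (512K)
1064 746586112 da0p2 freebsd-ufs (356G)
746587176 31457280 da0p3 freebsd-swap (15G)
937428008 275040 - free - (134M)
"""

NATIVE = """\
=> 40 937703008 nvd1 GPT (447G)
40 1024 nvd1p1 freebsd-boot (512K)
1064 746586112 nvd1p2 freebsd-ufs (356G)
746587176 31457280 nvd1p3 freebsd-swap (15G)
=> 34 2000409197 nvd3 GPT (954G)
34 2014 - free - (1.0M)
2048 1021952 nvd3p1 ms-recovery (499M)
1024000 202752 nvd3p2 efi (99M)
1226752 32768 nvd3p3 ms-reserved (16M)
1259520 1859119104 nvd3p4 ms-basic-data (886G)
=> 63 2000409201 ada0 MBR (954G)
2048 4096 ada0s1 linux-data (2.0M)
2099200 1998309376 ada0s2 linux-lvm (953G)
=> 1 60062499 da1 MBR (29G)
32 60062468 da1s1 fat32lba (29G)
"""

# Types the loader can only encounter when booted natively.
only_native = sorted(partition_types(NATIVE) - partition_types(HYPERV))
print(only_native)
```

Run against the full listings, the result should match the list of
candidate types given above.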
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"