Re: source upgrade FBSD10 -> 11 (then 12)

2019-03-01 Thread Miroslav Lachman

Lee Damon wrote on 2019/02/28 17:18:
> I have three old FBSD 10 boxes that I need to upgrade. Ordinarily I do
> this by building a new box with the latest OS then migrating services
> and data. Unfortunately I don't have that option this time, the upgrade
> has to happen in-place. My plan is to go from 10 to 11 then from 11 to 12.
>
> I was looking at the "Upgrading FreeBSD" part of
> https://www.freebsd.org/releases/11.1R/installation.html#upgrade but
> unfortunately it seems to be missing a critical URL or two:
>
> "The procedure for doing a source code based update is described in and ."
>
> I tracked down where I think it points and the instructions appear to be
> just for same-version updates. Can I safely move /usr/src out of the way
> then check out the stable/11 and compile/install it or are there other
> things I need to do?


The upgrade procedure is the same for a major upgrade (10.x to 11.x) as
for a minor upgrade (10.3 to 10.4).

It is recommended to upgrade to the latest 10.x first and then to 11.x,
but I have sometimes done a source upgrade that skipped a whole branch,
from 8.4 to 10.1 or 10.2. It worked with one exception, but this must be
tested first.

If you plan to upgrade from 10.x to 11.2 then everything you need is
described in /usr/src/Makefile.

Just check out the sources into /usr/src as usual and then follow the
steps quoted from that file:

For individuals wanting to upgrade their sources (even if only a
delta of a few days):

 1.  `cd /usr/src'   (or to the directory containing your source tree).
 2.  `make buildworld'
 3.  `make buildkernel KERNCONF=YOUR_KERNEL_HERE' (default is GENERIC).
 4.  `make installkernel KERNCONF=YOUR_KERNEL_HERE'   (default is GENERIC).
  [steps 3. & 4. can be combined by using the "kernel" target]
 5.  `reboot'  (in single user mode: boot -s from the loader prompt).
 6.  `mergemaster -p'
 7.  `make installworld'
 8.  `mergemaster'  (you may wish to use -i, along with -U or -F).
 9.  `make delete-old'
10.  `reboot'
11.  `make delete-old-libs' (in case no 3rd party program uses them anymore)


I always skip step 5 and do everything without a reboot, but beware of dragons!
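
The sequence quoted from /usr/src/Makefile can be written down as a small sh sketch. It only prints the commands (a dry run) so they can be reviewed and then run by hand, one step at a time. The function name upgrade_plan and the KERNCONF default of GENERIC are assumptions made for illustration; the optional mergemaster flags are left out.

```sh
#!/bin/sh
# Dry-run sketch of the source-upgrade steps quoted from /usr/src/Makefile.
# Nothing is executed; the commands are only printed for review.
# Assumption: KERNCONF falls back to GENERIC when it is not set.

KERNCONF="${KERNCONF:-GENERIC}"

# upgrade_plan (hypothetical helper): print the steps in order.
upgrade_plan() {
    cat <<EOF
cd /usr/src
make buildworld
make buildkernel KERNCONF=${KERNCONF}
make installkernel KERNCONF=${KERNCONF}
reboot    # single user mode: boot -s from the loader prompt
mergemaster -p
make installworld
mergemaster
make delete-old
reboot
make delete-old-libs
EOF
}

upgrade_plan
```

Running it with KERNCONF=MYKERNEL in the environment prints the same plan for a custom kernel configuration.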

When upgrading from the latest 10 (10-STABLE or 10.4-RELEASE) you may see
notices like these during the "make installkernel" phase:


kldxref: unknown metadata record 4 in file atacard.ko
kldxref: unknown metadata record 4 in file atp.ko

They can be safely ignored.

More details about the upgrade process:
https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html

Kind regards
Miroslav Lachman
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


12.0-RELEASE zfs/vnode deadlock issue

2019-03-01 Thread Nick Rogers
Recently a number of my production 12.0 systems have experienced what I can
only gather is a ZFS deadlock related to vnodes. It seems similar to the
relatively recent FreeBSD-EN-18:18.zfs (ZFS vnode reclaim deadlock)
problem. Previously the same systems were running 11.1-RELEASE without
problem.

Threads are always stuck with the stack around
vn_lock->zfs_root->lookup->namei. When the system is in this state, a
simple `ls /` or `ls /tmp` always hangs, but other datasets seem
unaffected. I have a fairly straightforward ZFS root setup on a single pool
with one SSD. The workload is a ruby/rails/nginx/postgresql backed web
application combined with some data warehousing and other periodic tasks.

Sometimes I can SSH in remotely; other times that fails because the user
shell fails to load, but I can still run commands via `ssh ... command`.
Sometimes the system is not accessible remotely at all, or it eventually
becomes inaccessible if left long enough in this state. I always have to
physically reboot the device because the shutdown procedure fails. The
network stack (e.g. ping) seems to work completely fine whilst this is
going on, until you try to interact with or spawn a process that hits the
deadlock.

Like previous similar ZFS deadlock issues, increasing kern.maxvnodes seems
to make the system last longer by up to a few weeks, but is still a
band-aid. However, I have yet to witness vnode usage actually getting close
to the maximum.
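
A minimal sh sketch of how vnode usage can be compared against the limit, assuming the FreeBSD sysctls vfs.numvnodes (vnodes currently in use) and kern.maxvnodes (the limit). The helper name vnode_pct is hypothetical; on a system without these sysctls the script prints nothing.

```sh
#!/bin/sh
# Sketch: report how close vnode usage is to the configured maximum.
# Assumes the FreeBSD sysctls vfs.numvnodes and kern.maxvnodes.

# vnode_pct USED MAX (hypothetical helper): integer percentage of MAX in use.
vnode_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Only report when both sysctls can be read (i.e. on FreeBSD).
if command -v sysctl >/dev/null 2>&1 &&
   used=$(sysctl -n vfs.numvnodes 2>/dev/null) &&
   max=$(sysctl -n kern.maxvnodes 2>/dev/null); then
    echo "vnodes: ${used}/${max} ($(vnode_pct "$used" "$max")%)"
fi
```

Logging this from cron every few minutes would show whether usage ever approaches the limit before a hang.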

I haven't had any luck reproducing this reliably, but eventually it happens
after a few days or a few weeks... I managed to connect to a system in this
state and grab a procstat and get (hopefully) something useful out of kgdb.
I will note that although I was able to install debug symbols, I couldn't
manage to get the source files onto it for kgdb purposes before I lost SSH
access.
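
For the next occurrence, a sketch of how the stuck threads could be picked out of a procstat dump. It assumes `procstat -kk -a` is available (FreeBSD) and simply keeps lines whose kernel stack mentions both _vn_lock and zfs_root, the frames seen in the backtrace; the function name hung_in_zfs_root is hypothetical.

```sh
#!/bin/sh
# Sketch: filter kernel stacks for threads blocked in _vn_lock under zfs_root.

# hung_in_zfs_root (hypothetical helper): read `procstat -kk -a` output on
# stdin and keep only lines whose stack contains both frames.
hung_in_zfs_root() {
    grep '_vn_lock' | grep 'zfs_root'
}

# Run against the live system only where procstat exists.
if command -v procstat >/dev/null 2>&1; then
    procstat -kk -a | hung_in_zfs_root || true
fi
```

The filter is deliberately loose: it matches on frame names only, so the output still needs to be read by hand.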

I am hoping someone can help me figure out if this is a legitimate bug, or
something already fixed in 12-STABLE. I wish I could reproduce it reliably
to try against STABLE, but there don't appear to be any related ZFS fixes
in STABLE that are not in RELEASE. Thanks.

Below is an abbreviated procstat and what I was able to get out of kgdb for
an affected thread. Note that the thread backtrace is from a simple `ls`
command. The procstat dump below is truncated because my last attempt to
send this was rejected by this list for being too long - so a number of
sh/cron processes and some zfs threads in a hung state were removed.

ld# kgdb
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.
sched_switch (td=0xf8002452a000, newtd=0xf80003625580, flags=<optimized out>)
    at /usr/src/sys/kern/sched_ule.c:2112
2112 /usr/src/sys/kern/sched_ule.c: No such file or directory.
(kgdb) tid 102023
(kgdb) bt
#0  sched_switch (td=0xf801a83dc580, newtd=0xf80003550580, flags=<optimized out>)
    at /usr/src/sys/kern/sched_ule.c:2112
#1  0x80d0e0a1 in mi_switch (flags=<optimized out>, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:439
#2  0x80d5c80c in sleepq_wait (wchan=<optimized out>, pri=<optimized out>)
    at /usr/src/sys/kern/subr_sleepqueue.c:692
#3  0x80cd9105 in sleeplk (lk=0xf800247307e8, flags=<optimized out>, ilk=<optimized out>,
    wmesg=<optimized out>, pri=<optimized out>, timo=51, queue=1)
    at /usr/src/sys/kern/kern_lock.c:300
#4  0x80cd7f85 in lockmgr_slock_hard (lk=<optimized out>, flags=<optimized out>, ilk=<optimized out>,
    file=<optimized out>, line=0, lwa=<optimized out>)
    at /usr/src/sys/kern/kern_lock.c:646
#5  0x813acc5e in VOP_LOCK1_APV (vop=<optimized out>, a=0xfe00f89dd450) at vnode_if.c:2087
#6  0x80de2820 in VOP_LOCK1 (vp=0xf80024730780, flags=2105344,
    file=0x814d4f74 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c",
    line=2074) at ./vnode_if.h:859
#7  _vn_lock (vp=0xf80024730780, flags=2105344,
    file=0x814d4f74 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c",
    line=2074) at /usr/src/sys/kern/vfs_vnops.c:1533
#8  0x8049f68d in zfs_root (vfsp=<optimized out>, flags=2105344, vpp=0xfe00f89dd558)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:2074
#9  0x80dc5d43 in lookup (ndp=0xfe00f89dd780) at /usr/src/sys/kern/vfs_lookup.c:961
#10 0x80dc4f9b in namei (ndp=0xfe00f89dd780) at /usr/src/sys/kern/vfs_lookup.c:444
#11 0x80ddc637 in kern

Re: FreeBSD 12.0 RELEASE i386 can not build a kernel?

2019-03-01 Thread Lucas Nali de Magalhães
> On Feb 28, 2019, at 7:40 PM, Warner Losh  wrote:
> 
> On Thu, Feb 28, 2019 at 10:00 AM Rodney W. Grimes <
> free...@pdx.rh.cn85.dnsmgr.net> wrote:
> 
> .if defined(KERNEL_LD_OVERRIDE)
> LD=${KERNEL_LD_OVERRIDE}
> .else
> LD=ld.lld
> .endif

My suggestion would be to test $LD == ld and then use LD=ld.lld, with a
comment about the reason, but my makefile syntax knowledge is limited.

Lc

-- 
rollingbits — 📧 rollingb...@gmail.com 📧 rollingb...@terra.com.br 📧 
rollingb...@yahoo.com 📧 rollingb...@globo.com 📧 rollingb...@icloud.com



more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

2019-03-01 Thread Lee Damon via freebsd-stable
After discussion with Bob Bishop (thanks for the help!) I've tried to do 
the following to upgrade one of the old boxes I mentioned previously.


cd /usr/src
tar ... .
rm -rf .??* *
svn checkout https://svn.freebsd.org/base/releng/10.3 /usr/src
compile, installkernel, installworld...

Now that the host is running RELENG the next step was to update from 
10.4 to 11.2 via freebsd-update


freebsd-update fetch
freebsd-update install
freebsd-update upgrade -r 11.2-RELEASE
freebsd-update install

So far, so good. Now it all falls apart.

shutdown -r now
... why isn't the host coming back? Oh look, kernel panic.

  Fatal trap 12: page fault while in kernel mode
  cpuid = 1; apic id = 01
  fault virtual address = 0x84
  fault code = supervisor read data, page not present

Google searches find references to the same panic type in VMs running 
11.1, including https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220923


The differences are that the report is for 11.1, not 11.2 (I would presume
the fix made it into 11.2, but maybe not) and, most notably, that it
concerns VMs while the host I'm doing this on is bare iron (Sun x4500).


Still, I gave the two entries in /boot/loader.conf a try, no joy. 
Exactly the same panic. Recording the boot with slow-mo shows the panic 
happening just after the USB devices are enumerated by the kernel. It 
never even tries to mount root.


I am able to boot to kernel.old, which appears to be my old 10.4-STABLE 
kernel. So now I'm kind of stuck. The update has already modified the 
config files as part of the first pass so rolling back may be a problem 
and moving forward seems unwise.


I have only one x4500 but I have three x4540s running 11.2-STABLE (also 
installed from source) just fine.


Anyone have any brilliant suggestions? I'm thinking of trying to compile 
11.2-RELENG in /usr/src so I can try installing that kernel but that'll 
take several hours at least (it's an old box).


nomad


Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

2019-03-01 Thread Miroslav Lachman

Lee Damon via freebsd-stable wrote on 2019/03/01 22:53:
> After discussion with Bob Bishop (thanks for the help!) I've tried to do
> the following to upgrade one of the old boxes I mentioned previously.
>
> cd /usr/src
> tar ... .
> rm -rf .??* *
> svn checkout https://svn.freebsd.org/base/releng/10.3 /usr/src
> compile, installkernel, installworld...
>
> Now that the host is running RELENG the next step was to update from
> 10.4 to 11.2 via freebsd-update
>
> freebsd-update fetch
> freebsd-update install
> freebsd-update upgrade -r 11.2-RELEASE
> freebsd-update install
>
> so far, so good. Now it all falls apart
>
> shutdown -r now
> ... why isn't the host coming back? Oh look, kernel panic.
>
>    Fatal trap 12: page fault while in kernel mode
>    cpuid = 1; apic id = 01
>    fault virtual address = 0x84
>    fault code = supervisor read data, page not present


I went back from freebsd-update to source upgrades a few years ago and now
use source builds exclusively (I build on a powerful build machine and
distribute the result to clients over NFS, so clients can just run make
installkernel and make installworld) because I was bitten by failed
freebsd-update upgrades many times...


> Google searches find references to the same panic type in VMs running
> 11.1, including https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220923
>
> The differences are, that's 11.1 not 11.2 (I would presume the fix made
> it into 11.2 but maybe not) and most notably, that's against VMs and the
> host I'm doing this on is bare iron (Sun x4500).
>
> Still, I gave the two entries in /boot/loader.conf a try, no joy.
> Exactly the same panic. Recording the boot with slow-mo shows the panic
> happening just after the USB devices are enumerated by the kernel. It
> never even tries to mount root.
>
> I am able to boot to kernel.old, which appears to be my old 10.4-STABLE
> kernel. So now I'm kind of stuck. The update has already modified the
> config files as part of the first pass so rolling back may be a problem
> and moving forward seems unwise.
>
> I have only one x4500 but I have three x4540s running 11.2-STABLE (also
> installed from source) just fine.
>
> Anyone have any brilliant suggestions? I'm thinking of trying to compile
> 11.2-RELENG in /usr/src so I can try installing that kernel but that'll
> take several hours at least (it's an old box).


If you can boot with the old 10.4 kernel and go online, just fetch
kernel.txz from the net:
http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/kernel.txz
and unpack it to /boot/kernel112. Then you can reboot and manually
select this kernel at the loader instead of the default /boot/kernel.

If you cannot access the boot loader prompt you can try the "nextboot" command:
1) unpack the kernel
2) set nextboot: nextboot -k kernel112
3) shutdown -r now and hope for luck

If your machine boots fine with the 11.2 kernel, you can fetch the sources
and rebuild kernel and userland for 11.2 as usual.
Or you can try to fetch and unpack base.txz
http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/base.txz
over your current files. It can make a mess, but you can always clean it up
with "make delete-old && make delete-old-libs".
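
The test-kernel steps can be sketched as a dry-run sh script that only prints the commands. Two things here are assumptions, not part of the advice above: the staging directory /tmp/kernel112 is a hypothetical name, and kernel.txz is assumed to unpack to ./boot/kernel, which is why it is extracted into a staging directory and then moved. The function name kernel_test_plan is also hypothetical.

```sh
#!/bin/sh
# Dry-run sketch: fetch the 11.2-RELEASE kernel, install it beside the
# default one and boot it once via nextboot. Commands are only printed.
# Assumption: kernel.txz unpacks to ./boot/kernel inside the staging dir.

URL="http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/kernel.txz"
STAGE="/tmp/kernel112"   # hypothetical staging directory
KDIR="kernel112"         # directory name under /boot, as in the message

# kernel_test_plan (hypothetical helper): print the steps in order.
kernel_test_plan() {
    cat <<EOF
fetch -o /tmp/kernel.txz ${URL}
mkdir -p ${STAGE}
tar -xpf /tmp/kernel.txz -C ${STAGE}
mv ${STAGE}/boot/kernel /boot/${KDIR}
nextboot -k ${KDIR}
shutdown -r now
EOF
}

kernel_test_plan
```

Because nextboot applies to a single boot only, a panic followed by a power cycle should bring the machine back up on the default kernel.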


Miroslav Lachman


Would someone commit the revised patch in usb/234503, please

2019-03-01 Thread Mel Pilgrim
usb/234503 contains a patch to broaden the scope of an existing scsi_da 
quirk (attachment labeled "Patch to broaden quirk coverage to all 
Chipfancier devices").  It's a minor change, but it's a showstopper for 
me and I need to see it get into 11-stable ahead of 11.3-R and 12-stable 
ahead of 12.1-R.


Would someone have a look at it and commit it?

Thank you


Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

2019-03-01 Thread Lee Damon via freebsd-stable

On 3/1/19 14:19 , Miroslav Lachman wrote:
> If you can boot with the old 10.4 kernel and go online, just fetch
> kernel.txz from the net:
> http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/kernel.txz and
> unpack it to /boot/kernel112 then you can try to reboot and manually
> select to boot this kernel instead of default /boot/kernel.


Darn it. I get the same kernel panic with that one.

I'm compiling locally but I don't expect that to make any difference. 
I'll need to go pawing through the release notes and see if there are 
any references to deprecated hardware that might be involved.


I'm attaching a copy of dmesg output from a successful boot into 
10.4-STABLE. The kernel panic appears to happen around 15% of the way 
into the output, around


...
mvsch13:  at channel 5 on mvs1
mvsch14:  at channel 6 on mvs1
mvsch15:  at channel 7 on mvs1
pcib3:  at device 6.0 on pci0
pci3:  on pcib3
ohci0:  mem 0xfd1fe000-0xfd1fefff irq 19 at device 0.0 on pci3
usbus0 on ohci0
ohci1:  mem 0xfd1fd000-0xfd1fdfff irq 19 at device 0.1 on pci3
usbus1 on ohci1
...

(Just before it enumerates vgapci0)

but I can't be sure because the screen moves so fast that even slow-mo 
video is just a blur.


nomad
Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.4-STABLE #25 r342947: Fri Jan 11 14:17:40 PST 2019
l...@goose.ee.washington.edu:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Dual Core AMD Opteron(tm) Processor 290 (2792.11-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x20f12  Family=0xf  Model=0x21  Stepping=2
  Features=0x178bfbff
  Features2=0x1
  AMD Features=0xe2500800
  AMD Features2=0x3
real memory  = 17179869184 (16384 MB)
avail memory = 16418484224 (15657 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 2 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
random:  initialized
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-30 on motherboard
ioapic2  irqs 31-37 on motherboard
ioapic3  irqs 38-44 on motherboard
ioapic4  irqs 45-51 on motherboard
ioapic5  irqs 52-58 on motherboard
ioapic6  irqs 59-65 on motherboard
ioapic7  irqs 66-72 on motherboard
ioapic8  irqs 73-79 on motherboard
ioapic9  irqs 80-86 on motherboard
ioapic10  irqs 87-93 on motherboard
kbd1 at kbdmux0
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
attimer0:  port 0x40-0x43 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0:  port 0x70-0x71 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
hpet0:  iomem 0xfec01000-0xfec013ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0
pcib0:  on acpi0
pci0:  on pcib0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
mvs0:  port 0x7c00-0x7cff mem 0xfae0-0xfaef irq 24 at device 1.0 on pci1
mvs0: Gen-II, 8 3Gbps ports, Port Multiplier supported
mvsch0:  at channel 0 on mvs0
mvsch1:  at channel 1 on mvs0
mvsch2:  at channel 2 on mvs0
mvsch3:  at channel 3 on mvs0
mvsch4:  at channel 4 on mvs0
mvsch5:  at channel 5 on mvs0
mvsch6:  at channel 6 on mvs0
mvsch7:  at channel 7 on mvs0
pcib2:  at device 2.0 on pci0
pci2:  on pcib2
mvs1:  port 0x8c00-0x8cff mem 0xfb00-0xfb0f irq 32 at device 1.0 on pci2
mvs1: Gen-II, 8 3Gbps ports, Port Multiplier supported
mvsch8:  at channel 0 on mvs1
mvsch9:  at channel 1 on mvs1
mvsch10:  at channel 2 on mvs1
mvsch11:  at channel 3 on mvs1
mvsch12:  at channel 4 on mvs1
mvsch13:  at channel 5 on mvs1
mvsch14:  at channel 6 on mvs1
mvsch15:  at channel 7 on mvs1
pcib3:  at device 6.0 on pci0
pci3:  on pcib3
ohci0:  mem 0xfd1fe000-0xfd1fefff irq 19 at device 0.0 on pci3
usbus0 on ohci0
ohci1:  mem 0xfd1fd000-0xfd1fdfff irq 19 at device 0.1 on pci3
usbus1 on ohci1
vgapci0:  port 0x9800-0x98ff mem 0xfc00-0xfcff,0xfd1ff000-0xfd1f irq 16 at device 3.0 on pci3
vgapci0: Boot video device
ohci2:  mem 0xfd1fc000-0xfd1fcfff irq 17 at device 4.0 on pci3
usbus2 on ohci2
ohci3:  mem 0xfd1fb000-0xfd1fbfff irq 18 at device 4.1 on pci3
usbus3 on ohci3
ehci0:  mem 0xfd1fac00-0xfd1facff irq 19 at device 4.2 on pci3
usbus4: EHCI version 1.0
usbus4 on ehci0
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
ata0:  at channel 0 on atapci0
ata1:  at channel 1 on atapci0
pci0:  at device 7.3 (no driver attached)
pcib4:

Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

2019-03-01 Thread Miroslav Lachman

Lee Damon wrote on 2019/03/02 00:06:


> Darn it. I get the same kernel panic with that one.
>
> I'm compiling locally but I don't expect that to make any difference.
> I'll need to go pawing through the release notes and see if there are
> any references to deprecated hardware that might be involved.
>
> I'm attaching a copy of dmesg output from a successful boot into
> 10.4-STABLE. The kernel panic appears to happen around 15% of the way
> into the output, around


I am running 11.2 on a SunFire X2100 M2, but according to your dmesg your
machine uses different chips. The X2100 M2 has the nVidia nForce MCP55
chipset for ATA devices, nfe for 2 NICs and Broadcom bge for the other 2 NICs.


Did you try to boot in "safe mode"? (selectable in boot menu).
Or you can try to disable or enable some settings in the BIOS. Something
related to USB or onboard VGA etc. may help.


Miroslav Lachman


Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

2019-03-01 Thread Lee Damon via freebsd-stable

On 3/1/19 15:38 , Miroslav Lachman wrote:

> Did you try to boot in "safe mode"? (selectable in boot menu).


I completely forgot about safe mode.

Yep. It boots. I'm going to finish the freebsd-update process then 
reboot into safe mode again. I'm out of time to work on this today and 
am only in this lab on Fridays so I'll have to pick up working on this 
problem next Friday.


Thanks for the help,
nomad