Re: krpc: unbootable ZFS-on-root after major upgrade to 11.2

2018-10-19 Thread Eugene Grosbein
On 19.10.2018 13:28, Andriy Gapon wrote:

>> It was brought to my attention that 10.x did not require krpc to be
>> available for a ZFS-on-root system to boot, but 11.x does.
>>
>> That is, a major upgrade of a 10.x ZFS-on-root system to 11.x results in
>> a non-bootable system if it uses a custom kernel without the NFS bits
>> that automatically pull in krpc, and the system was built with
>> MODULES_OVERRIDE="zfs opensolaris"
>> and no mention of krpc.
>
> Could you please also describe the specifics of the problem?
> It's kind of strange that root-on-zfs requires krpc.

https://svnweb.freebsd.org/base/stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c?revision=339111&view=markup#l7146

This code uses some xdr(3) functions to parse zpool.cache, and the
kernel-side implementation of xdr(3) is contained in krpc.ko.

Out of curiosity, I commented out the mentioned
MODULE_DEPEND(zfsctrl, krpc, 1, 1, 1) line, rebuilt zfs.ko and tried to
kldload it on a UFS-only system with no NFS code in the kernel.
It failed with this note in dmesg:

link_elf: symbol xdrmem_create undefined

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


FreeBSD 11.2 kernel crash when dd

2018-10-19 Thread Sebastian Wojtczak
Hi,

I would like to report a kernel crash while running dd on an SSD drive.

My PC has crashed several times while running the command below:

dd if=/dev/ada2 of=file_name bs=10m

I was trying to make an image of my SSD drive. Once the image file reaches
41G or 52G, the kernel crashes and the system reboots.

Oct 18 12:30:11 username syslogd: kernel boot file is /boot/kernel/kernel
Oct 18 12:30:11 username kernel:
Oct 18 12:30:11 username kernel:
Oct 18 12:30:11 username kernel: Fatal trap 12: page fault while in kernel mode
Oct 18 12:30:11 username kernel: cpuid = 1; apic id = 01
Oct 18 12:30:11 username kernel: fault virtual address = 0x5a
Oct 18 12:30:11 username kernel: fault code = supervisor read data, page not present
Oct 18 12:30:11 username kernel: instruction pointer = 0x20:0x80e67f6d
Oct 18 12:30:11 username kernel: stack pointer = 0x28:0xfe084b408f40
Oct 18 12:30:11 username kernel: frame pointer = 0x28:0xfe084b408f80
Oct 18 12:30:11 username kernel: code segment = base 0x0, limit 0xf, type 0x1b
Oct 18 12:30:11 username kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Oct 18 12:30:11 username kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Oct 18 12:30:11 username kernel: current process = 0 (zio_write_issue_8)
Oct 18 12:30:11 username kernel: trap number = 12
Oct 18 12:30:11 username kernel: panic: page fault
Oct 18 12:30:11 username kernel: cpuid = 1
Oct 18 12:30:11 username kernel: KDB: stack backtrace:
Oct 18 12:30:11 username kernel: #0 0x80b50087 at kdb_backtrace+0x67
Oct 18 12:30:11 username kernel: #1 0x80b099f7 at vpanic+0x177
Oct 18 12:30:11 username kernel: #2 0x80b09873 at panic+0x43
Oct 18 12:30:11 username kernel: #3 0x80fe105f at trap_fatal+0x35f
Oct 18 12:30:11 username kernel: #4 0x80fe10b9 at trap_pfault+0x49
Oct 18 12:30:11 username kernel: #5 0x80fe0887 at trap+0x2c7
Oct 18 12:30:11 username kernel: #6 0x80fc04cc at calltrap+0x8
Oct 18 12:30:11 username kernel: #7 0x80e56df2 at kmem_back+0xf2
Oct 18 12:30:11 username kernel: #8 0x80e56cd0 at kmem_malloc+0x60
Oct 18 12:30:11 username kernel: #9 0x80e4e752 at keg_alloc_slab+0xe2
Oct 18 12:30:11 username kernel: #10 0x80e5118e at keg_fetch_slab+0x14e
Oct 18 12:30:11 username kernel: #11 0x80e509a4 at zone_fetch_slab+0x64
Oct 18 12:30:11 username kernel: #12 0x80e50a7f at zone_import+0x3f
Oct 18 12:30:11 username kernel: #13 0x80e4d199 at uma_zalloc_arg+0x3d9
Oct 18 12:30:11 username kernel: #14 0x832d2ab2 at zio_write_compress+0x1e2
Oct 18 12:30:11 username kernel: #15 0x832d174c at zio_execute+0xac
Oct 18 12:30:11 username kernel: #16 0x80b617e4 at taskqueue_run_locked+0x154
Oct 18 12:30:11 username kernel: #17 0x80b62918 at taskqueue_thread_loop+0x98
Oct 18 12:30:11 username kernel: Uptime: 5m50s

One virtual machine is started with bhyve at boot, but the same crash
happens even if I shut it down. Disabling vmm does not help; it only
extends the time until the crash during the SSD dump.

The current ZFS setup is zraid on 3 (500GB) HDD drives with compress=on.
Drive ada0 is not part of zraid and is not attached or mounted.

Any help with investigating this is appreciated.


Re: krpc: unbootable ZFS-on-root after major upgrade to 11.2

2018-10-19 Thread Andriy Gapon
On 19/10/2018 12:24, Eugene Grosbein wrote:
> On 19.10.2018 13:28, Andriy Gapon wrote:
> 
>>> It was brought to my attention that 10.x did not require krpc to be
>>> available for a ZFS-on-root system to boot, but 11.x does.
>>>
>>> That is, a major upgrade of a 10.x ZFS-on-root system to 11.x results in
>>> a non-bootable system if it uses a custom kernel without the NFS bits
>>> that automatically pull in krpc, and the system was built with
>>> MODULES_OVERRIDE="zfs opensolaris"
>>> and no mention of krpc.
>>
>> Could you please also describe the specifics of the problem?
>> It's kind of strange that root-on-zfs requires krpc.
>
> https://svnweb.freebsd.org/base/stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c?revision=339111&view=markup#l7146
>
> This code uses some xdr(3) functions to parse zpool.cache, and the
> kernel-side implementation of xdr(3) is contained in krpc.ko.
>
> Out of curiosity, I commented out the mentioned
> MODULE_DEPEND(zfsctrl, krpc, 1, 1, 1) line, rebuilt zfs.ko and tried to
> kldload it on a UFS-only system with no NFS code in the kernel.
> It failed with this note in dmesg:
>
> link_elf: symbol xdrmem_create undefined


It's strange that this is a 10.x vs 11.x issue.
I see that zfs has had the krpc dependency since r193128,
and the call to xdrmem_create has been there since r168404.


-- 
Andriy Gapon


Re: FreeBSD 11.2 kernel crash when dd

2018-10-19 Thread rainer

On 2018-10-19 13:10, Sebastian Wojtczak wrote:

> Hi,
>
> I would like to report a kernel crash while running dd on an SSD drive.

Reducing ARC may help:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231296
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231794

See here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=163461


Re: FreeBSD 11.2 kernel crash when dd

2018-10-19 Thread Mark Johnston
On Fri, Oct 19, 2018 at 01:10:15PM +0200, Sebastian Wojtczak wrote:
> Hi,
> 
> I would like to report a kernel crash while running dd on an SSD drive.
>
> My PC has crashed several times while running the command below:
>
> dd if=/dev/ada2 of=file_name bs=10m
>
> I was trying to make an image of my SSD drive. Once the image file reaches
> 41G or 52G, the kernel crashes and the system reboots.
> 
> Oct 18 12:30:11 username syslogd: kernel boot file is /boot/kernel/kernel
> Oct 18 12:30:11 username kernel:
> Oct 18 12:30:11 username kernel:
> Oct 18 12:30:11 username kernel: Fatal trap 12: page fault while in kernel mode
> Oct 18 12:30:11 username kernel: cpuid = 1; apic id = 01
> Oct 18 12:30:11 username kernel: fault virtual address = 0x5a
> Oct 18 12:30:11 username kernel: fault code = supervisor read data, page not present
> Oct 18 12:30:11 username kernel: instruction pointer = 0x20:0x80e67f6d
> Oct 18 12:30:11 username kernel: stack pointer = 0x28:0xfe084b408f40
> Oct 18 12:30:11 username kernel: frame pointer = 0x28:0xfe084b408f80
> Oct 18 12:30:11 username kernel: code segment = base 0x0, limit 0xf, type 0x1b
> Oct 18 12:30:11 username kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
> Oct 18 12:30:11 username kernel: processor eflags = interrupt enabled, resume, IOPL = 0
> Oct 18 12:30:11 username kernel: current process = 0 (zio_write_issue_8)
> Oct 18 12:30:11 username kernel: trap number = 12
> Oct 18 12:30:11 username kernel: panic: page fault
> Oct 18 12:30:11 username kernel: cpuid = 1
> Oct 18 12:30:11 username kernel: KDB: stack backtrace:
> Oct 18 12:30:11 username kernel: #0 0x80b50087 at kdb_backtrace+0x67
> Oct 18 12:30:11 username kernel: #1 0x80b099f7 at vpanic+0x177
> Oct 18 12:30:11 username kernel: #2 0x80b09873 at panic+0x43
> Oct 18 12:30:11 username kernel: #3 0x80fe105f at trap_fatal+0x35f
> Oct 18 12:30:11 username kernel: #4 0x80fe10b9 at trap_pfault+0x49
> Oct 18 12:30:11 username kernel: #5 0x80fe0887 at trap+0x2c7
> Oct 18 12:30:11 username kernel: #6 0x80fc04cc at calltrap+0x8
> Oct 18 12:30:11 username kernel: #7 0x80e56df2 at kmem_back+0xf2
> Oct 18 12:30:11 username kernel: #8 0x80e56cd0 at kmem_malloc+0x60
> Oct 18 12:30:11 username kernel: #9 0x80e4e752 at keg_alloc_slab+0xe2
> Oct 18 12:30:11 username kernel: #10 0x80e5118e at keg_fetch_slab+0x14e
> Oct 18 12:30:11 username kernel: #11 0x80e509a4 at zone_fetch_slab+0x64
> Oct 18 12:30:11 username kernel: #12 0x80e50a7f at zone_import+0x3f
> Oct 18 12:30:11 username kernel: #13 0x80e4d199 at uma_zalloc_arg+0x3d9
> Oct 18 12:30:11 username kernel: #14 0x832d2ab2 at zio_write_compress+0x1e2
> Oct 18 12:30:11 username kernel: #15 0x832d174c at zio_execute+0xac
> Oct 18 12:30:11 username kernel: #16 0x80b617e4 at taskqueue_run_locked+0x154
> Oct 18 12:30:11 username kernel: #17 0x80b62918 at taskqueue_thread_loop+0x98
> Oct 18 12:30:11 username kernel: Uptime: 5m50s
> 
> One virtual machine is started with bhyve at boot, but the same crash
> happens even if I shut it down. Disabling vmm does not help; it only
> extends the time until the crash during the SSD dump.
>
> The current ZFS setup is zraid on 3 (500GB) HDD drives with compress=on.
> Drive ada0 is not part of zraid and is not attached or mounted.
>
> Any help with investigating this is appreciated.

The stack suggests a bug in the kmem_* KPI, but I'm having trouble
seeing the problem.  In particular, the fault address suggests that we
crashed while testing (m->flags & PG_ZERO) == 0, but it shouldn't be
possible for m to be NULL there.  My attempts to reproduce this on
12-CURRENT haven't yielded anything yet.  Would you (or anyone else
seeing the problem) be willing to share a kernel dump?  I'd need the
vmcore, the contents of /boot/kernel and /usr/lib/debug/boot/kernel.
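
To illustrate the reasoning about the fault address: when a field is read
through a NULL struct pointer, the faulting address equals the offset of
that field within the struct. Below is a minimal userland sketch in C. It
assumes a hypothetical struct fake_page whose padding is chosen only so
that flags sits at offset 0x5a; the real struct vm_page layout is not
reproduced here, and flags_addr() is an illustrative helper, not a kernel
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct vm_page. The padding size is an
 * assumption picked so that flags lands at offset 0x5a, matching the
 * "fault virtual address" in the report.
 */
struct fake_page {
	unsigned char	pad[0x5a];	/* placeholder for preceding fields */
	uint16_t	flags;		/* stand-in for m->flags */
};

/* Compile-time check that the assumed layout really puts flags at 0x5a. */
static_assert(offsetof(struct fake_page, flags) == 0x5a,
    "flags is expected at offset 0x5a in this sketch");

/*
 * Address the CPU would touch when evaluating m->flags, given the
 * numeric value of the pointer m. No memory is dereferenced here.
 */
static uintptr_t
flags_addr(uintptr_t m)
{
	return (m + offsetof(struct fake_page, flags));
}
```

With m equal to NULL, flags_addr(0) evaluates to 0x5a, which is why a small
fault address such as this one points at a member access through a NULL
pointer rather than at a wild pointer.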