Re: releng/13 release/13.0.0 : odd/incorrect diff result over nfs (in a zfs file systems context)

2021-05-23 Thread Mark Millard via freebsd-stable
On 2021-May-21, at 17:56, Rick Macklem  wrote:

> Mark Millard wrote:
> [stuff snipped]
>> Well, why is it that ls -R, find, and diff -r all get file
>> name problems via genet0 but diff -r gets no problems
>> comparing the content of files that it does match up (the
>> vast majority)? Any clue how could the problems possibly
>> be unique to the handling of file names/paths? Does it
>> suggest anything else to look into for getting some more
>> potentially useful evidence?
> Well, all I can do is describe the most common TSO related
> failure:
> - When a read RPC reply (including NFS/RPC/TCP/IP headers)
>  is slightly less than 64K bytes (many TSO implementations are
>  limited to 64K or 32 discontiguous segments, think 32 2K
>  mbuf clusters), the driver decides it is ok, but when the MAC
>  header is added it exceeds what the hardware can handle correctly...
> --> This will happen when reading a regular file that is slightly less
>   than a multiple of 64K in size.
> or
> --> This will happen when reading just about any large directory,
>  since the directory reply for a 64K request is converted to Sun XDR
>  format and clipped at the last full directory entry that will fit within 
> 64K.
> For ports, where most files are small, I think you can tell which is more
> likely to happen.
> --> If TSO is disabled, I have no idea how this might matter, but??
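[The TSO failure mode described above can be sketched numerically. This is a hedged illustration only, not the genet driver's actual code; the constants and function names here are assumptions for the sake of the example:]

```python
# Sketch of the TSO length-check bug described above: the driver accepts
# a TCP payload chain by checking it against the 64K TSO limit, but the
# MAC (Ethernet) header is prepended afterwards, so a reply "slightly
# less than 64K" can still exceed what the hardware handles correctly.
TSO_MAX = 65535          # assumed hardware limit for one TSO burst
ETHER_HDR_LEN = 14       # MAC header added after the length check

def driver_accepts(payload_len: int) -> bool:
    """The (buggy) pre-header check: payload alone fits the TSO limit."""
    return payload_len <= TSO_MAX

def hardware_ok(payload_len: int) -> bool:
    """What actually matters: payload plus MAC header fits the limit."""
    return payload_len + ETHER_HDR_LEN <= TSO_MAX

# A read RPC reply slightly under 64K: the driver says yes,
# but the hardware silently mishandles the frame.
reply = TSO_MAX - 4
assert driver_accepts(reply) and not hardware_ok(reply)
```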
> 
>> I'll note that netstat -I ue0 -d and netstat -I genet0 -d
>> do not report changes in Ierrs or Idrop in a before vs.
>> after failures comparison. (There may be better figures
>> to look at for all I know.)
>> 
>> I tried "ifconfig genet0 -rxcsum -txcsum -rxcsum6 -txcsum6"
>> and got no obvious change in behavior.
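[The before/after counter comparison described above can be sketched as follows. The parsing assumes a simplified name/value layout rather than real netstat column output, so treat it as illustrative only:]

```python
# Minimal sketch of the before-vs-after comparison of interface error
# counters (Ierrs/Idrop, as reported by netstat -I <ifname> -d).
def parse_counters(text: str) -> dict:
    """Extract counters from a simplified name/value snapshot."""
    counters = {}
    for line in text.splitlines():
        name, value = line.split()
        counters[name] = int(value)
    return counters

def deltas(before: str, after: str) -> dict:
    """Report only the counters that changed between snapshots."""
    b, a = parse_counters(before), parse_counters(after)
    return {k: a[k] - b[k] for k in ("Ierrs", "Idrop") if a[k] != b[k]}

before = "Ierrs 0\nIdrop 0"
after = "Ierrs 0\nIdrop 0"   # matches the observation: no change
assert deltas(before, after) == {}
```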
> All we know is that the data is getting corrupted somehow.
> 
> NFS traffic looks very different from typical TCP traffic. It is
> mostly small messages travelling in both directions concurrently,
> with some large messages thrown into the mix.
> All I'm saying is that testing a net interface with something like
> a bulk data transfer in one direction doesn't verify that it works
> for NFS traffic.
> 
> Also, the large RPC messages are a chain of about 33 mbufs of
> various lengths, including a mix of partial clusters and regular
> data mbufs, whereas a bulk send on a socket will typically
> result in an mbuf chain of a lot of full 2K clusters.
> --> As such, NFS can be good at tickling subtle bugs in the
>  net driver related to mbuf handling.
> 
> rick
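[The contrast between the two mbuf chain shapes described above can be sketched as follows. The chain lengths and counts are illustrative assumptions, not measured values:]

```python
# Hedged sketch of the two transmit chain shapes: an NFS read reply is
# ~33 mbufs of varied lengths (partial clusters mixed with small header
# mbufs), while a bulk socket send is mostly full 2K clusters.
import random

def nfs_read_reply_chain(total=65000, nmbufs=33):
    """Model an NFS reply: nmbufs segments of irregular lengths."""
    random.seed(1)  # deterministic for the example
    cuts = sorted(random.sample(range(1, total), nmbufs - 1))
    return [b - a for a, b in zip([0] + cuts, cuts + [total])]

def bulk_send_chain(total=65000, cluster=2048):
    """Model a bulk send: full 2K clusters plus one remainder mbuf."""
    full, rem = divmod(total, cluster)
    return [cluster] * full + ([rem] if rem else [])

nfs = nfs_read_reply_chain()
bulk = bulk_send_chain()
assert sum(nfs) == sum(bulk) == 65000
assert len(set(nfs)) > 1            # varied segment lengths
assert set(bulk[:-1]) == {2048}     # uniform full clusters
```

Same total payload, very different segment geometry, which is why a driver bug in mbuf handling can surface under NFS while bulk-transfer tests pass.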
> 
>>> W.r.t. reverting r367492...the patch to replace r367492 was just
>>> committed to "main" by rscheff@ with a two week MFC, so it
>>> should be in stable/13 soon. Not sure if an errata can be done
>>> for it for releng13.0?
>> 
>> That update is reported to be causing "rack" related panics:
>> 
>> https://lists.freebsd.org/pipermail/dev-commits-src-main/2021-May/004440.html
>> 
>> reports (via links):
>> 
>> panic: _mtx_lock_sleep: recursed on non-recursive mutex so_snd @ 
>> /syzkaller/managers/i386/kernel/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:10632
>> 
>> Still, I have a non-debug update to main building and will
>> likely do a debug build as well. llvm is rebuilding, so
>> the builds will take a notable time.

I got the following built and installed on the two
machines:

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 
main-n246854-03b0505b8fe8-dirty: Sat May 22 16:25:04 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-dbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-DBG-CA72
  arm64 aarch64 1400013 1400013

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 
main-n246854-03b0505b8fe8-dirty: Sat May 22 16:25:04 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-dbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-DBG-CA72
  arm64 aarch64 1400013 1400013

Note that both are booted with debug builds of main.

Using the context with the alternate EtherNet device that has not
had an associated diff -r, find, or ls -R failure yet,
I got a panic that looks likely to be unrelated:

# mount -onoatime 192.168.1.187:/usr/ports/ /mnt/
# diff -r /usr/ports/ /mnt/ | more
nvme0: cpl does not map to outstanding cmd
cdw0: sqhd:0020 sqid:0003 cid:007e p:1 sc:00 sct:0 m:0 dnr:0
panic: received completion for unknown cmd
cpuid = 3
time = 1621743752
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x188
panic() at panic+0x44
nvme_qpair_process_completions() at nvme_qpair_process_completions+0x1fc
nvme_timeout() at nvme_timeout+0x3c
softclock_call_cc() at softclock_call_cc+0x124
softclock() at softclock+0x60
ithread_loop() at ithread_loop+0x2a8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100028 ]
Stopped at  kdb_enter+0x48: undefined   f904411f
db> 

Based on the "nvme" references, I expect this is tied to
handling the Optane 480 GiByte that is in the PCIe s
