[main test example and main/releng/13 mixed example]

On 2021-May-20, at 20:36, Mark Millard <marklmi at yahoo.com> wrote:

> [stable/13 test: example ends up being odder. That might
> allow eliminating some potential alternatives.]
>
> On 2021-May-20, at 19:38, Mark Millard <marklmi at yahoo.com> wrote:
>>
>> On 2021-May-20, at 18:09, Rick Macklem <rmack...@uoguelph.ca> wrote:
>>>
>>> Oh, one additional thing that I'll dare to top post...
>>> r367492 broke the TCP upcalls that the NFS server uses, such
>>> that intermittent hangs of NFS mounts to FreeBSD13 servers can occur.
>>> This has not yet been resolved in "main" etc and could explain
>>> why an RPC could time out for a soft mount.
>>
>> See later notes that I added: soft mount is not required
>> to see the problem.
>>
>>> You can revert the patch in r367492 to avoid the problem.
>>
>> If I understand right, you are indicating that this would
>> not apply to the non-soft mount case that I got.
>>
>>> Disabling TSO, LRO are also de-facto standard things to do when
>>> you observe weird NFS behaviour, because they are often broken
>>> in various network device drivers.
>>
>> I'll have to figure out how to experiment with such. Things
>> are at defaults rather generally on the systems. I'm not
>> literate in the subject areas.
>>
>> I'm the only user of the machines and network. It is not
>> outward facing. It is a rather small EtherNet network.
>>
>>> rick
>>>
>>> ________________________________________
>>> From: owner-freebsd-sta...@freebsd.org <owner-freebsd-sta...@freebsd.org>
>>> on behalf of Rick Macklem <rmack...@uoguelph.ca>
>>> Sent: Thursday, May 20, 2021 8:55 PM
>>> To: FreeBSD-STABLE Mailing List; Mark Millard
>>> Subject: Re: releng/13 release/13.0.0 : odd/incorrect diff result over nfs
>>> (in a zfs file systems context)
>>>
>>> Mark Millard wrote:
>>>> [I warn that I'm a fairly minimal user of NFS
>>>> mounts, not knowing all that much. I'm mostly
>>>> reporting this in case it ends up as evidence
>>>> via eventually matching up with others observing
>>>> possibly related oddities.]
>>>>
>>>> I got the following odd sequence (that I've
>>>> mixed notes into). It involved a diff -r over NFS
>>>> showing differences (files missing) and then a
>>>> later diff finding matches for the same files,
>>>> no file system changes made on either machine.
>>>> I'm unable to reproduce the oddity on demand.
>>>>
>>>> Note: A larger scope diff -r originally returned the
>>>> below as well, but doing the narrower diff -r did
>>>> repeat the result and that is what I show. (I
>>>> make no use of devel/ice .)
>>>>
>>>> # diff -r /usr/ports/devel/ice/files /mnt/devel/ice/files | more
>>>> Only in /usr/ports/devel/ice/files: Make.rules.FreeBSD
>> . . .
>>>> Only in /usr/ports/devel/ice/files: patch-scripts-TestUtil.py
>>>>
>>>> Note: The above was not expected. So I tried:
>>>>
>>>> # ls -Tld /mnt/devel/ice/files/*
>>>> -rw-r--r-- 1 root wheel 755 Apr 21 21:07:54 2021 /mnt/devel/ice/files/Make.rules.FreeBSD
>> . . .
>>>> -rw-r--r-- 1 root wheel 2588 Apr 21 21:07:54 2021 /mnt/devel/ice/files/patch-scripts-TestUtil.py
>>>>
>>>> Note: So that indicated that the files were there on the
>>>> machine that /mnt references. So attempting the original
>>>> diff -r again:
>>>>
>>>> # diff -r /usr/ports/devel/ice/files /mnt/devel/ice/files | more
>>>> #
>>>>
>>>> (Empty difference.)
>>>>
>>>> Note: So after the explicit "ls -Tld /mnt/devel/ice/files/*"
>>>> the odd result of the diff -r no longer happened: no
>>>> differences reported.
>>>>
>>>>
>>>>
>>>> For reference (both machines reported):
>>>>
>>>> . . .
>>>> The original mount command was on CA72_16Gp_ZFS:
>>>>
>>>> # mount -onoatime,soft 192.168.1.170:/usr/ports/ /mnt/
>>> The likely explanation for this is your use of a "soft" mount.
>>> - If the NFS server is slow to respond or there is a temporary network issue,
>>> the RPC request can time out and then the
>>> syscall can fail with EINTR/ETIMEDOUT. Since almost nothing, including the
>>> readdir(3) libc functions expect syscalls to fail this way...
>>> Then the cached directory is messed up.
>>> Doing the "ls" read the directory again and fixed the problem.
>>>
>>> Try to reproduce it for a mount without the "soft" option.
>>> (If a mount point is hung, due to an unresponsive server "umount -N /mnt"
>>> can usually get rid of it.)
>>> Personally, I thought "soft" was a bad idea when Sun introduced it in NFS in 1985
>>> and I still feel that way.
>>> --> If you can reproduce it without "soft" then I can't explain it.
>>> To be honest, the directory reading/caching code in the NFSv3 client
>>> hasn't changed significantly in literally decades, as far as I can remember.
>>
>> Well . . . trying an even wider scope diff than
>> the original . . .
>>
>> # umount /mnt/
>> # mount -onoatime 192.168.1.170:/usr/ports/ /mnt/
>> # diff -r /usr/ports/ /mnt/ | more
>> Only in /mnt/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_
>> Only in /usr/ports/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src25.cpp
>> Only in /usr/ports/devel/ice/files: Make.rules.FreeBSD
>> Only in /usr/ports/devel/ice/files: patch-config-Make.common.rules
>> Only in /usr/ports/devel/ice/files: patch-cpp-Makefile
>> . . .
>> Only in /usr/ports/devel/ice/files: patch-python-test-Slice-unicodePaths-run.py
>> Only in /usr/ports/devel/ice/files: patch-scripts-Expect.py
>> Only in /usr/ports/devel/ice/files: patch-scripts-IceGridAdmin.py
>> Only in /usr/ports/devel/ice/files: patch-scripts-TestUtil.py
>>
>> So the devel/ice files showed up again.
>>
>> But 2 other lines show up, one finding a file supposedly only
>> on /mnt/. . .
>>
>> QUOTE
>> Only in /mnt/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_
>> END QUOTE
>>
>> That seems to be a truncated file name.
>> Looking directly on the machine that
>> /mnt/ references (hitting tab at the end of the partial name to show a list):
>>
>> # ls -Tld /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_gen-config.sh
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_js-confdefs.h
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src0.cpp
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src1.cpp
>> . . .
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src9.cpp
>> /usr/ports/databases/mongodb42/files/aarch64/patch-src_third__party_mozjs-60_platform_aarch64_freebsd_include_js-config.h
>>
>>
>> The other machine agrees (machine-local usage).
>>
>> The other of the 2 new names is one of the matches to the prefix:
>>
>> QUOTE
>> Only in /usr/ports/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src25.cpp
>> END QUOTE
>>
>> For reference: I've not gotten any console messages about
>> anything during these.
>>
>>> One additional thing to note is that cached directory contents are invalidated
>>> when the directory's ctime changes.
>>
>> I'm not aware of anything that should have been touching the
>> /usr/ports file systems on either machine any time near my
>> diff activities. (I'm the only system user.)
>>
>>> I am not sure how/if/when ZFS changes a
>>> directory's ctime. However, if it was badly broken, I'd hear about this a lot.
>>> (If the ZFS change to ZoL has changed its ctime handling, that might also explain it
>>> and I'll be hearing a lot more soon as FreeBSD13 becomes adopted.
>>> I never use ZFS and,
>>> as such, never test with it.)
>>
>> I recently decided to try using bectl, which led to my recent
>> ZFS-based system experiments.
>>
>> This means I can boot the stable/13 or main [so: 14] that
>> I last built and try the same experiments with the same
>> /usr/ports file systems. releng/13 's release/13.0.0 ,
>> stable/13 , and main are all non-debug builds as stands. I
>> could add debug builds of any or all, but it would take
>> a while. (aarch64 4-core Cortex-A72 contexts.)
>>
>>> --> For UFS, if you use mtime, directory caching does not work as well, which is
>>> why the client directory caching code uses ctime and not mtime to detect that
>>> a directory has changed and cached directory blocks need to be invalidated.
>>>
>>> Jason Bacon did report a directory reading issue some months ago that never
>>> quite got resolved, although I recall he said he couldn't reproduce it after a
>>> system update, so he thought it was related to some local change he had made.
>>> (I can't remember his email or I'd add him to the cc so he could remind me what
>>> his case was. I do recall it being somewhat reproducible and happened for both
>>> UFS and ZFS.)
>>>> The network is just a local EtherNet.
>>>
>>
>
>
> stable/13 got similar "diff -r /usr/ports/ /mnt/ | more" results but
> with /mnt/devel/electron12/files indications instead of the
> /usr/ports/devel/ice/files ones. It did again start with:
>
> Only in /mnt/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_
> Only in /usr/ports/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src25.cpp
>
> for this rather wide range diff -r .
> It continued with:
>
> Only in /mnt/devel/electron12/files:
> Only in /mnt/devel/electron12/files: package.json
> Only in /mnt/devel/electron12/files: patch-apps_ui_views_app__window__frame__view.cc
> Only in /mnt/devel/electron12/files: patch-ash_display_mirror__window__controller.cc
> Only in /mnt/devel/electron12/files: patch-base_BUILD.gn
> . . .
>
> It finished with:
>
> Only in /mnt/devel/electron12/files: yarn.lock
> Only in /mnt/devel/electron12/files: <A0><CE><C8>֏<DC>ܦ2<B2><E2><AA>^H
> Only in /mnt/www/chromium/files: patch-chrome_browser_chrome__browser
> Only in /usr/ports/www/chromium/files: patch-chrome_browser_chrome__browser__main__posix.cc
>
>
> That last is the only /usr/ports/ prefixed path this time: the
> only one where it was under /mnt/ that something appeared to
> be missing.
>
> It appears that the file name on the line after the yarn.lock
> line is garbage with no matching file present when using ls
> on the system that /mnt/ references.
>
> Locally on each machine the devel/electron12/files/* files
> are shown by ls as present ( through yarn.lock ).
>
> NOTE:
> I find it odd that the local /usr/ports/ ended up being where
> most of the files were reported as missing, instead of under
> /mnt/ : Wrong side for a network/network-protocol issue?
>
>
> For reference (David W. indicated I should look at ifconfig
> for figuring out controlling TSO and such so I figured I'd
> show the default ifconfig output):
>
> # ifconfig
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
>         inet 127.0.0.1 netmask 0xff000000
>         groups: lo
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>         ether REPLACED
>         inet 192.168.1.148 netmask 0xffffff00 broadcast 192.168.1.255
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>
> # ifconfig
> genet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=68000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>         ether REPLACED
>         inet6 REPLACED%genet0 prefixlen 64 scopeid 0x1
>         inet6 REPLACED prefixlen 64 autoconf
>         inet 192.168.1.170 netmask 0xffffff00 broadcast 192.168.1.255
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
>         nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
>         inet 127.0.0.1 netmask 0xff000000
>         groups: lo
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>
>
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 13.0-STABLE FreeBSD 13.0-STABLE #1 stable/13-n245474-fb34817c686c-dirty: Sat May 1 02:27:02 PDT 2021 root@CA72_4c8G_ZFS:/usr/obj/BUILDs/13S-CA72-nodbg-clang/usr/13S-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1300504 1300504
>
> # ~/fbsd-based-on-what-commit.sh
> branch: stable/13
> merge-base: fb34817c686cc130449325499870e36979899801
> merge-base: CommitDate: 2021-05-01 00:56:57 +0000
> fb34817c686c (HEAD -> stable/13, freebsd/stable/13) param.h: bump __FreeBSD_version for commits efe7f12cd37b and 9781105bea58
> n245474 (--first-parent --count for merge-base)
>
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 13.0-STABLE FreeBSD 13.0-STABLE #1 stable/13-n245474-fb34817c686c-dirty: Sat May 1 02:27:02 PDT 2021 root@CA72_4c8G_ZFS:/usr/obj/BUILDs/13S-CA72-nodbg-clang/usr/13S-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1300504 1300504
>
> # ~/fbsd-based-on-what-commit.sh
> branch: stable/13
> merge-base: fb34817c686cc130449325499870e36979899801
> merge-base: CommitDate: 2021-05-01 00:56:57 +0000
> fb34817c686c (HEAD -> stable/13, freebsd/stable/13) param.h: bump __FreeBSD_version for commits efe7f12cd37b and 9781105bea58
> n245474 (--first-parent --count for merge-base)

Both systems running main:

# diff -r /usr/ports/ /mnt/ | more
Only in /mnt/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_
Only in /usr/ports/databases/mongodb42/files/aarch64: patch-src_third__party_mozjs-60_platform_aarch64_freebsd_build_Unified__cpp__js__src25.cpp
Only in /mnt/devel/electron12/files:
Only in /mnt/devel/electron12/files:
Only in /mnt/devel/electron12/files: patch-chrome2
Only in /usr/ports/devel/electron12/files: patch-chrome_browser_media_webrtc_webrtc__logging__controller.cc
Only in /usr/ports/devel/electron12/files: patch-chrome_browser_ui_webui_settings_appearance__handler.h
Only in /usr/ports/devel/electron12/files: patch-components_previews_core_previews__features.cc
Only in /usr/ports/devel/electron12/files: patch-ui_compositor_compositor.cc
Only in /mnt/devel/electron12/files: <A0><CE><C8>֏<DC>ܦ2<B2><E2><AA>^H

(That was all that was listed.)

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n246411-a6ca7519f89c-dirty: Sat May 1 19:07:50 PDT 2021 root@CA72_4c8G_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400013 1400013

# ~/fbsd-based-on-what-commit.sh
branch: main
merge-base: a6ca7519f89c52e9fab205cded0f2bf32d914cd6
merge-base: CommitDate: 2021-05-01 00:58:11 +0000
a6ca7519f89c (HEAD -> main, freebsd/main, freebsd/HEAD) powerpc64: Optimize radix trap handling a little more
n246411 (--first-parent --count for merge-base)

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n246411-a6ca7519f89c-dirty: Sat May 1 19:07:50 PDT 2021 root@CA72_4c8G_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400013 1400013

# ~/fbsd-based-on-what-commit.sh
branch: main
merge-base: a6ca7519f89c52e9fab205cded0f2bf32d914cd6
merge-base: CommitDate: 2021-05-01 00:58:11 +0000
a6ca7519f89c (HEAD -> main, freebsd/main, freebsd/HEAD) powerpc64: Optimize radix trap handling a little more
n246411 (--first-parent --count for merge-base)

I tried main on the /usr/ side with releng/13 's release/13.0.0
where /mnt/ references and got:

# diff -r /usr/ports/ /mnt/ | more
Only in /mnt/devel/electron12/files: package.json
Only in /mnt/devel/electron12/files: patch-apps_ui_views_app__window__frame__view.cc
Only in /mnt/devel/electron12/files: patch-ash_display_mirror__window__controller.cc
Only in /mnt/devel/electron12/files: patch-base_BUILD.gn
. . .
Only in /mnt/devel/electron12/files: patch-weblayer_browser_system__network__context__manager.cc
Only in /mnt/devel/electron12/files: patch-weblayer_common_weblayer__paths.cc
Only in /mnt/devel/electron12/files: yarn.lock
Only in /usr/ports/devel/ice/files: Make.rules.FreeBSD
Only in /usr/ports/devel/ice/files: patch-config-Make.common.rules
Only in /usr/ports/devel/ice/files: patch-cpp-Makefile
. . .
Only in /usr/ports/devel/ice/files: patch-scripts-Expect.py
Only in /usr/ports/devel/ice/files: patch-scripts-IceGridAdmin.py
Only in /usr/ports/devel/ice/files: patch-scripts-TestUtil.py
Only in /mnt/games: 0ad
Only in /mnt/games: 0verkill
Only in /mnt/games: 2048
. . .
Only in /mnt/games: zaz
Only in /mnt/games: zhlt
Only in /mnt/games: ztrack

No obvious garbage or truncated names. Another mix of
/mnt/ vs. /usr/ being the "missing" side.

NOTE:
So far I do not see an obvious reason to prefer any
specific one of releng/13 vs. stable/13 vs. main at
either end of the connection for the vintages that
I happen to have in place for them.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
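
For re-checking one of the reported directories without a full diff -r,
something like the following might be useful. It is only a sketch under
assumptions: compare_names is a hypothetical helper name, not anything used
in this thread, and it compares just the entry names that ls reports on each
side (no file contents). Since an explicit ls appeared to change later diff
results earlier in this thread, running it may itself perturb the cached
directory state, so a clean result does not prove the earlier output wrong.

```sh
#!/bin/sh
# Hypothetical helper, not from the thread: compare only the entry names
# reported for one directory locally ($1) and through the NFS mount ($2).
# Prints "< name" for names seen only in $1, "> name" for names only in $2.
# Returns 0 when the name lists match, 1 when they differ, 2 on setup failure.
compare_names() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    # Sort in the C locale so odd or garbage bytes order the same on both sides.
    ls -A "$1" | LC_ALL=C sort > "$a"
    ls -A "$2" | LC_ALL=C sort > "$b"
    # Keep only diff's "<" and ">" lines; "|| true" because grep exits 1
    # when there is nothing to report.
    out=$(diff "$a" "$b" | grep '^[<>]') || true
    rm -f "$a" "$b"
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
        return 1
    fi
    return 0
}

# Example use (paths taken from the reports above), repeated because the
# results were seen to change between runs:
#   for i in 1 2 3; do
#       compare_names /usr/ports/devel/ice/files /mnt/devel/ice/files
#   done
```

A nonzero return on one pass followed by a zero return on the next would
match the "differences, then no differences" pattern described above.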