Re: HEADS UP: NFS changes coming into CURRENT early February
On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff wrote:
>
> Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected. You would need to recompile & reinstall both the world and the
> kernel together. Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned. I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check-in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket. The most difficult ones are the NFS
> related RPC services, that act as RPC clients talking to an RPC servers in
> userland. Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket they need to work with the socket internal
> structures bypassing all the normal KPIs and conventions. Of course they
> didn't tolerate the new implementation that totally eliminated intermediate
> buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to make
> kernel to userland RPC better. You judge if I succeeded or not :) Here are
> some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
>                     4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen. Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall. Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in single place, and they all have
>   timeouts. I believe NFS services are now much more resilient to hangs.
>   A deadlock when NFS kernel thread is blocked on unix socket buffer, and
>   the socket can't go away because its application is blocked in some other
>   syscall is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github. It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!

Ok, I can now do minimal testing and crashed it...

I did a mount with option "tls" and then partitioned it from the NFS server
by doing "ifconfig bridge0 down". Waited until the TCP connection closed
and then did "ifconfig bridge0 up".

The crash is a NULL pointer at rpctls_impl.c:255 (in rpctls_connect(),
called from nfscl_renewthread()).
The problem is that you made rpctls_connect_handle a vnet'd variable.
The client side (aka an NFS mount) does not happen inside a jail and
cannot use any vnet'd variables.
Why? Well, any number of threads enter the NFS client via VOP_xxx()
calls etc.
Any one of them might end up doing a TCP reconnect when the
underlying TCP connection is broken and then heals.

I don't know why you made rpctls_connect_handle a vnet'd variable,
but it cannot be that way.
(I once looked at making NFS mounts work inside a vnet prison and
gave up when I realized any old thread ends up in the code and it
would have taken many, many CURVNET_SET() calls to make it work.)

In summary, no global variable on the client side can be vnet'd and no
global variable on the server side that is vnet'd can be shared with the
client side code.

I realize you are enthusiastic about this, but I'd suggest you back off to
the minimal changes required to make this stuff work with netlink instead
of unix domain sockets and stick with that, at least for the initial
commit cycle.

One thing to note is that few (if any) people who run main test this stuff.
It may be 1-2 years before it sees third party testing and I can only do
minimal testing until at least April.

Anyhow, thanks for all the good work you are doing with this, rick

>
> --
> Gleb Smirnoff
>
Re: Difference in "netstat -rn" output in the last 2 months
Hi Alexander,

It looks like the source of this change can be found here:
https://reviews.freebsd.org/rG9206c79961986c2114a9a2cfccf009ac010ad259

Additional context can be found in the differential:
https://reviews.freebsd.org/D10320

On Sun, Jan 26, 2025 at 3:42 PM Alexander Leidinger wrote:

> Hi,
>
> something has changed in the output of "netstat -rn" between
> 2024-11-23-195545 and 2025-01-22-151306. The default route is not listed
> as "default" anymore, but with "0.0.0.0" resp. "::/0". This breaks some
> tools (e.g. iocage). Iocage uses python, I'm not sure if it uses netstat
> or some other interface, so it may not be directly related to netstat
> itself but could be related to some other stuff (netlink maybe?).
>
> Does this ring a bell for someone?
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org netch...@freebsd.org : PGP 0x8F31830F9F2772BF
>

--
Warmly,
Pier-Luc
Re: "don't know how to make /usr/main-src/sys/contrib/dev/iwm/iwm-3160-17.fw.uu. Stop"
On Jan 25, 2025, at 02:24, Mark Millard wrote:

> On Jan 25, 2025, at 02:10, Stefan Esser wrote:
>
>> On 25.01.25 at 10:54, Mark Millard wrote:
>>> Unfortunately, for now my reporting is based on my personal build
>>> environment, not on an official FreeBSD build.
>>> Context doing the building:
>>
>> Hi Mark,
>>
>> probably an issue due to commit af0a81b6470 from 2024-12-12:
>>
>> commit af0a81b6470aba4af4a24ae9804053722846ded4
>> Author: Emmanuel Vadot
>> Date:   Thu Dec 12 17:13:58 2024 +0100
>>
>>     iwm: Stop shipping firmware as kernel module
>>
>>     Since we can load raw firmware start shipping them as is.
>>     This also remove the uuencode format that don't add any value and garbage
>>     collect old firmwares version.
>>     For pkgbase users they are now in the FreeBSD-firmware-iwm package.
>>
>>     Sponsored by:   Beckhoff Automation GmbH & Co. KG
>>
>> Maybe your sources are out of sync with regard to that commit?
>
> https://cgit.freebsd.org/src/blame/sys/conf/files shows the lines that
> I quoted that indicate dependencies on various *.fw.uu files.
>
> It is true that a pre 2024-Dec-12 installation is attempting to
> build what I reported: 2025-01-25 00:07:01 + ( i.e., n275030
> 46a9fb7287f41eedf321d81a68a826f231d11bfe ). I had not updated
> at all between those times and was finally trying to update.
>

It appears that the following is enough to have the build failures:

# grep -r iwm /usr/main-src/sys/*/conf*/
/usr/main-src/sys/amd64/conf/GENERIC-NODBG:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-NODBG:device iwmfw
/usr/main-src/sys/amd64/conf/GENERIC-DBG:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-DBG:device iwmfw
/usr/main-src/sys/amd64/conf/GENERIC-NODBG-NONUMA:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-NODBG-NONUMA:device iwmfw

Commenting out the "device iwmfw" lines allowed the build to
complete --but based on my /usr/main-src/ source tree for
what was being built, not on building an official tree.
To my knowledge, the builds shown by the likes of:

https://pkg-status.freebsd.org/builds?type=package&all=1

do not involve "device iwmfw" --and so do not make for
a valid comparison to my context. (I normally check
if I'm getting unusual results vs. official builds.)

Those GENERIC-*DBG* files are based on GENERIC . For
example:

# more /usr/main-src/sys/amd64/conf/GENERIC-NODBG
#
# GENERIC -- Custom configuration for the amd64/amd64
#

include "GENERIC"

ident GENERIC-NODBG

makeoptions DEBUG=-g            # Build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1          # Run ctfconvert(1) for DTrace support

options NUMA

#options ALT_BREAK_TO_DEBUGGER

options KDB                     # Enable kernel debugger support

# For minimum debugger support (stable branch) use:
options KDB_TRACE               # Print a stack trace for a panic
options DDB                     # Enable the kernel debugger

# Extra stuff:
#options VERBOSE_SYSINIT=0      # Enable verbose sysinit messages
#options BOOTVERBOSE=1
#options BOOTHOWTO=RB_VERBOSE
#options KTR
#options KTR_MASK=KTR_TRAP
##options KTR_CPUMASK=0xF
#options KTR_VERBOSE
#options ACPI_DEBUG

# Disable any extra checking for. . .
nooptions DEADLKRES             # Would enable the deadlock resolver
nooptions INVARIANTS            # Would enable calls of extra sanity checking
nooptions INVARIANT_SUPPORT     # Would enable extra sanity checks of internal structures, required by INVARIANTS
nooptions WITNESS               # Would enable checks to detect deadlocks and cycles
nooptions WITNESS_SKIPSPIN      # Would enable running witness on spinlocks for speed
nooptions DIAGNOSTIC
nooptions MALLOC_DEBUG_MAXZONES

# Kernel Sanitizers
nooptions COVERAGE              # Would enable generic kernel coverage. Used by KCOV
nooptions KCOV                  # Would enable Kernel Coverage Sanitizer
# Warning: KUBSAN can result in a kernel too large for loader to load
nooptions KUBSAN                # Would enable Kernel Undefined Behavior Sanitizer

device iwm
device iwmfw

Maybe the IWM(4) man page is out of date and the above
need to be changed in some way and so I should follow
updated instructions?
IWM(4)                 FreeBSD Kernel Interfaces Manual                 IWM(4)

NAME
     iwm – Intel IEEE 802.11ac wireless network driver

SYNOPSIS
     To compile this driver into the kernel, include the following lines in
     your kernel configuration file:

           device iwm
           device pci
           device wlan
           device firmware

     You also need to select a firmware for your device. Choose one from:

           device iwm3160fw
           device iwm3168fw
           device iwm7260fw
           device iwm7265fw
           device iwm7265Dfw
           device iwm8000Cfw
           device iwm8265fw
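
The workaround mentioned in this message (commenting out the "device iwmfw" lines in the custom kernel configuration files) could be scripted roughly as follows. This is only a sketch in Python: the helper name comment_out_iwmfw and the sample text are hypothetical, and it simply edits configuration text in memory. It is not part of the FreeBSD build system and does not address the underlying firmware loading issue.

```python
import re

# Matches an active (uncommented) "device iwmfw" line; "device iwm" alone
# and already-commented lines are left untouched.
_IWMFW_LINE = re.compile(r"^\s*device\s+iwmfw\b")


def comment_out_iwmfw(config_text):
    """Return config_text with every active 'device iwmfw' line commented out."""
    lines = config_text.splitlines(keepends=True)
    return "".join("#" + line if _IWMFW_LINE.match(line) else line
                   for line in lines)


# Illustrative fragment modelled on the tail of the config quoted above.
SAMPLE = 'include "GENERIC"\nident GENERIC-NODBG\ndevice iwm\ndevice iwmfw\n'

if __name__ == "__main__":
    print(comment_out_iwmfw(SAMPLE), end="")
```

Running the helper a second time leaves the text unchanged, because a line that already starts with "#" no longer matches the pattern.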
Re: HEADS UP: NFS changes coming into CURRENT early February
On Sun, Jan 26, 2025 at 1:44 PM Rick Macklem wrote:
>
> Ok, I can now do minimal testing and crashed it...
>
> I did a mount with option "tls" and then partitioned it from the NFS server
> by doing "ifconfig bridge0 down". Waited until the TCP connection closed
> and then did "ifconfig bridge0 up".
>
> The crash is a NULL pointer at rpctls_impl.c:255 (in rpctls_connect(),
> called from nfscl_renewthread()).
> The problem is that you made rpctls_connect_handle a vnet'd variable.
> The client side (aka an NFS mount) does not happen inside a jail and
> cannot use any vnet'd variables.
> Why? Well, any number of threads enter the NFS client via VOP_xxx()
> calls etc. Any one of them might end up doing a TCP reconnect when the
> underlying TCP connection is broken and then heals.
>
> I don't know why you made rpctls_connect_handle a vnet'd variable,
> but it cannot be that way.
> (I once looked at making NFS mounts work inside a vnet prison and
> gave up when I realized any old thread ends up in the code and it
> would have taken many, many CURVNET_SET() calls to make it work.)
>
> In summary, no global variable on the client side can be vnet'd and no
> global variable on the server side that is vnet'd can be shared with the
> client side code.

Ok, I now see you've fixed this crash.
I'd still like to limit commits to main to the ones that are required to
use netlink for the upcalls at this time.

rick

>
> I realize you are enthusiastic about this, but I'd suggest you back off to
> the minimal changes required to make this stuff work with netlink instead
> of
Re: "don't know how to make /usr/main-src/sys/contrib/dev/iwm/iwm-3160-17.fw.uu. Stop"
Hi!

So, there's no longer a build target that turns the uuencoded firmware
files into a kernel module. As a result, building iwm into the kernel
rather than as a module is broken.

Now, the real issue is that iwm needs firmware to initialise, and the
firmware needs to exist. firmware_get() now has to find the raw binary
files in /boot/firmware instead of getting them from a kernel module the
old way, so it needs access to the rootfs. That whole pipeline is broken
if the driver is loaded at boot time or included in the kernel directly.

There isn't a nice way to defer the firmware load attempt until /after/
rootfs is up.

-adrian
Re: WIFI compiling error: Can anyone fix it on main branch?
Hi,

I'm sorry, I only just saw this. Is it still broken?

I'm cc'ing bz@ as it looks like it's in the linuxkpi stuff.

-adrian

On Tue, 14 Jan 2025 at 22:10, Chen, Alvin W wrote:

> For my case, it is due to different default compiling options for
> different clang versions as I build it on FreeBSD 14.0 ENV.
>
> Please check the attached discussion thread.
>
> The macros defined for auto pointer free borrowed from Linux kernel would
> be improved.
>
> *From:* Adrian Chadd
> *Sent:* Wednesday, January 15, 2025 4:55 AM
> *To:* Dave Cottlehuber
> *Cc:* Chen, Alvin W; freebsd-current <freebsd-current@freebsd.org>
> *Subject:* Re: WIFI compiling error: Can anyone fix it on main branch?
>
> I did temporarily break -head on some platforms for like 30 minutes like a
> week ago? Is everything ok now?
>
> What was the error?
>
> -adrian
>
> On Tue, 7 Jan 2025 at 04:17, Dave Cottlehuber wrote:
>
> On Tue, 7 Jan 2025, at 09:39, Chen, Alvin W wrote:
> > Thanks!
>
> Hi Alvin,
>
> Can you share your HEAD commit, & an error message?
>
> 749b3b2c0629 works fine on my machine, it's current as of 2 hours ago.
>
> A+
> Dave
>
Difference in "netstat -rn" output in the last 2 months
Hi,

something has changed in the output of "netstat -rn" between
2024-11-23-195545 and 2025-01-22-151306. The default route is no longer
listed as "default", but as "0.0.0.0" or "::/0" respectively. This breaks
some tools (e.g. iocage). Iocage uses Python; I'm not sure if it uses
netstat or some other interface, so it may not be directly related to
netstat itself but could be related to some other stuff (netlink maybe?).

Does this ring a bell for someone?

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netch...@freebsd.org : PGP 0x8F31830F9F2772BF
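
For tools that scrape "netstat -rn" text, one defensive option is to accept both spellings of the default route. The sketch below is in Python and makes several assumptions: the destination is the first column and the gateway the second, a bare "0.0.0.0" destination means the IPv4 default route (as described in this report), and the sample tables are made up for illustration rather than captured from a real system. The function name default_gateways is hypothetical, and this is not how iocage itself reads routes.

```python
# Destinations treated as "default route". Accepting a bare "0.0.0.0" is an
# assumption based on the behaviour described in this thread.
DEFAULT_DESTINATIONS = {"default", "0.0.0.0", "0.0.0.0/0", "::/0"}


def default_gateways(netstat_output):
    """Return the gateway column of every row whose destination is a default route."""
    gateways = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Blank lines and one-word section titles ("Internet:") have no gateway.
        if len(fields) < 2:
            continue
        # Header rows ("Destination Gateway ...") never match the set below.
        if fields[0] in DEFAULT_DESTINATIONS:
            gateways.append(fields[1])
    return gateways


# Made-up sample tables using documentation address ranges.
OLD_STYLE = """\
Internet:
Destination        Gateway            Flags     Netif Expire
default            192.0.2.1          UGS         em0
"""

NEW_STYLE = """\
Internet:
Destination        Gateway            Flags     Netif Expire
0.0.0.0            192.0.2.1          UGS         em0

Internet6:
Destination        Gateway            Flags     Netif Expire
::/0               2001:db8::1        UGS         em0
"""

if __name__ == "__main__":
    print(default_gateways(OLD_STYLE))
    print(default_gateways(NEW_STYLE))
```

Matching on a small set of known spellings keeps the parser working across both output styles, at the cost of having to extend the set if the format changes again.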