Re: HEADS UP: NFS changes coming into CURRENT early February

2025-01-26 Thread Rick Macklem
On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff  wrote:
>
>
>
>   Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected.  You would need to recompile & reinstall both the world and the
> kernel together.  Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check-in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket.  The most difficult ones are the NFS
> related RPC services, that act as RPC clients talking to an RPC servers in
> userland.  Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket they need to work with the socket internal
> structures bypassing all the normal KPIs and conventions.  Of course they
> didn't tolerate the new implementation that totally eliminated intermediate
> buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to
> make kernel to userland RPC better.  You judge if I succeeded or not :) Here
> are some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
>            4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen.  Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in single place, and they all have
>   timeouts.  I believe NFS services are now much more resilient to hangs.
>   A deadlock when NFS kernel thread is blocked on unix socket buffer, and
>   the socket can't go away because its application is blocked in some other
>   syscall is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github. It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!
Ok, I can now do minimal testing and crashed it...

I did a mount with option "tls" and then partitioned it from the NFS server
by doing "ifconfig bridge0 down". Waited until the TCP connection closed
and then did "ifconfig bridge0 up".

The crash is a NULL pointer dereference at rpctls_impl.c:255 (in
rpctls_connect(), called from nfscl_renewthread()).
The problem is that you made rpctls_connect_handle a vnet'd variable.
The client side (aka an NFS mount) does not happen inside a jail and
cannot use any vnet'd variables.
Why? Well, any number of threads enter the NFS client via VOP_xxx()
calls etc. Any one of them might end up doing a TCP reconnect when the
underlying TCP connection is broken and then heals.

I don't know why you made rpctls_connect_handle  a vnet'd variable,
but it cannot be that way.
(I once looked at making NFS mounts work inside a vnet prison and
gave up when I realized any old thread ends up in the code and it
would have taken many, many CURVNET_SET() calls to make it work.)

In summary, no global variable on the client side can be vnet'd and no
global variable on the server side that is vnet'd can be shared with the
client side code.

I realize you are enthusiastic about this, but I'd suggest you back off to
the minimal changes required to make this stuff work with netlink instead
of unix domain sockets and stick with that, at least for the initial
commit cycle.

One thing to note is that few (if any) people who run main test this stuff.
It may be 1-2 years before it sees third party testing and I can only do minimal
testing until at least April.

Anyhow, thanks for all the good work you are doing with this, rick

>
> --
> Gleb Smirnoff
>



Re: Difference in "netstat -rn" output in the last 2 months

2025-01-26 Thread Pier-Luc Caron St-Pierre
Hi Alexander,

It looks like the source of this change can be found here:
https://reviews.freebsd.org/rG9206c79961986c2114a9a2cfccf009ac010ad259

Additional context can be found in the differential:
https://reviews.freebsd.org/D10320

On Sun, Jan 26, 2025 at 3:42 PM Alexander Leidinger 
wrote:

> Hi,
>
> something has changed in the output of "netstat -rn" between
> 2024-11-23-195545 and 2025-01-22-151306. The default route is not listed
> as "default" anymore, but as "0.0.0.0" or "::/0". This breaks some
> tools (e.g. iocage). Iocage uses python, I'm not sure if it uses netstat
> or some other interface, so it may not be directly related to netstat
> itself but could be related to some other stuff (netlink maybe?).
>
> Does this ring a bell for someone?
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org   netch...@freebsd.org   : PGP 0x8F31830F9F2772BF
>


-- 
Chaleureusement,
Pier-Luc


Re: "don't know how to make /usr/main-src/sys/contrib/dev/iwm/iwm-3160-17.fw.uu. Stop"

2025-01-26 Thread Mark Millard
On Jan 25, 2025, at 02:24, Mark Millard  wrote:

> On Jan 25, 2025, at 02:10, Stefan Esser  wrote:
> 
>> Am 25.01.25 um 10:54 schrieb Mark Millard:
>>> Unfortunately, for now my reporting is based on my personal build
>>> environment, not on an official FreeBSD build.
>>> Context doing the building:
>> 
>> Hi Mark,
>> 
>> probably an issue due to commit af0a81b6470 from 2024-12-12:
>> 
>> commit af0a81b6470aba4af4a24ae9804053722846ded4
>> Author: Emmanuel Vadot 
>> Date:   Thu Dec 12 17:13:58 2024 +0100
>> 
>>   iwm: Stop shipping firmware as kernel module
>> 
>>   Since we can load raw firmware start shipping them as is.
>>   This also remove the uuencode format that don't add any value and garbage
>>   collect old firmwares version.
>>   For pkgbase users they are now in the FreeBSD-firmware-iwm package.
>> 
>>   Sponsored by:   Beckhoff Automation GmbH & Co. KG
>> 
>> Maybe your sources are out of sync with regard to that commit?
> 
> https://cgit.freebsd.org/src/blame/sys/conf/files shows the lines that
> I quoted that indicate dependencies on various *.fw.uu files.
> 
> It is true that a pre 2024-Dec-12 installation is attempting to
> build what I reported: 2025-01-25 00:07:01 + ( i.e., n275030
> 46a9fb7287f41eedf321d81a68a826f231d11bfe ). I had not updated
> at all between those times and was finally trying to update.
> 

It appears that the following is enough to have the build
failures:

# grep -r iwm /usr/main-src/sys/*/conf*/
/usr/main-src/sys/amd64/conf/GENERIC-NODBG:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-NODBG:device iwmfw
/usr/main-src/sys/amd64/conf/GENERIC-DBG:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-DBG:device iwmfw
/usr/main-src/sys/amd64/conf/GENERIC-NODBG-NONUMA:device iwm
/usr/main-src/sys/amd64/conf/GENERIC-NODBG-NONUMA:device iwmfw

Commenting out the "device iwmfw" lines allowed the build to
complete --but based on my /usr/main-src/ source tree for
what was being built, not on building an official tree.
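
The grep-and-comment-out step above can also be scripted. The following is a minimal Python sketch, assuming kernel config files use plain "device <name>" lines as in the grep output. The helper names find_device_lines and comment_out_device and the sample text are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches an active (uncommented) "device <name>" line in a kernel config.
DEVICE_RE = re.compile(r"^\s*device\s+(\S+)")


def find_device_lines(config_text, device):
    """Return (line_number, line) pairs for active 'device <device>' entries."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = DEVICE_RE.match(line)
        if match and match.group(1) == device:
            hits.append((lineno, line))
    return hits


def comment_out_device(config_text, device):
    """Return the config text with active 'device <device>' lines commented out."""
    out = []
    for line in config_text.splitlines():
        match = DEVICE_RE.match(line)
        if match and match.group(1) == device:
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative sample only; a real run would read the file under sys/amd64/conf/.
sample = 'include "GENERIC"\nident   GENERIC-NODBG\ndevice  iwm\ndevice  iwmfw\n'

print(find_device_lines(sample, "iwmfw"))
print(comment_out_device(sample, "iwmfw"), end="")
```

The match is on the exact device name, so "device iwm" is left alone when "iwmfw" is commented out.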

To my knowledge, the builds shown by the likes of:

https://pkg-status.freebsd.org/builds?type=package&all=1

do not involve "device iwmfw" --and so do not make for
a valid comparison to my context. (I normally check if
I'm getting unusual results vs. official builds.)


Those GENERIC-*DBG* files are based on GENERIC . For example:

# more /usr/main-src/sys/amd64/conf/GENERIC-NODBG
#
# GENERIC -- Custom configuration for the amd64/amd64
#

include "GENERIC"

ident   GENERIC-NODBG

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
makeoptions     WITH_CTF=1              # Run ctfconvert(1) for DTrace support

options         NUMA

#options        ALT_BREAK_TO_DEBUGGER

options         KDB                     # Enable kernel debugger support

# For minimum debugger support (stable branch) use:
options         KDB_TRACE               # Print a stack trace for a panic
options         DDB                     # Enable the kernel debugger

# Extra stuff:
#options        VERBOSE_SYSINIT=0       # Enable verbose sysinit messages
#options        BOOTVERBOSE=1
#options        BOOTHOWTO=RB_VERBOSE
#options        KTR
#options        KTR_MASK=KTR_TRAP
##options       KTR_CPUMASK=0xF
#options        KTR_VERBOSE
#options        ACPI_DEBUG

# Disable any extra checking for. . .
nooptions       DEADLKRES               # Would enable the deadlock resolver
nooptions       INVARIANTS              # Would enable calls of extra sanity checking
nooptions       INVARIANT_SUPPORT       # Would enable extra sanity checks of internal structures, required by INVARIANTS
nooptions       WITNESS                 # Would enable checks to detect deadlocks and cycles
nooptions       WITNESS_SKIPSPIN        # Would enable running witness on spinlocks for speed
nooptions       DIAGNOSTIC
nooptions       MALLOC_DEBUG_MAXZONES

# Kernel Sanitizers
nooptions       COVERAGE                # Would enable generic kernel coverage. Used by KCOV
nooptions       KCOV                    # Would enable Kernel Coverage Sanitizer
# Warning: KUBSAN can result in a kernel too large for loader to load
nooptions       KUBSAN                  # Would enable Kernel Undefined Behavior Sanitizer

device  iwm
device  iwmfw


Maybe the IWM(4) man page is out of date and the above need to be
changed in some way and so I should follow updated instructions?

IWM(4) FreeBSD Kernel Interfaces Manual IWM(4)

NAME
 iwm – Intel IEEE 802.11ac wireless network driver

SYNOPSIS
 To compile this driver into the kernel, include the following lines in
 your kernel configuration file:

   device iwm
   device pci
   device wlan
   device firmware

 You also need to select a firmware for your device.  Choose one from:

   device iwm3160fw
   device iwm3168fw
   device iwm7260fw
   device iwm7265fw
   device iwm7265Dfw
   device iwm8000Cfw
   device iwm8265fw
   

Re: HEADS UP: NFS changes coming into CURRENT early February

2025-01-26 Thread Rick Macklem
On Sun, Jan 26, 2025 at 1:44 PM Rick Macklem  wrote:
>
> On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff  wrote:
> >
> >
> >
> >   Hi,
> >
> > TLDR version:
> > users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS
> > with TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users
> > of network lock manager (e.g. having 'options NFSLOCKD' and running
> > rpcbind(8)) are affected.  You would need to recompile & reinstall both the
> > world and the kernel together.  Of course this is what you'd normally do
> > when you track FreeBSD CURRENT, but better be warned.  I will post hashes of
> > the specific revisions that break API/ABI when they are pushed.
> >
> > Longer version:
> > last year I tried to check-in a new implementation of unix(4) SOCK_STREAM
> > and SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to
> > several kernel side abusers of a unix(4) socket.  The most difficult ones
> > are the NFS related RPC services, that act as RPC clients talking to an RPC
> > servers in userland.  Since it is impossible to fully emulate a userland
> > process connection to a unix(4) socket they need to work with the socket
> > internal structures bypassing all the normal KPIs and conventions.  Of
> > course they didn't tolerate the new implementation that totally eliminated
> > intermediate buffer on the sending side.
> >
> > While the original motivation for the upcoming changes is the fact that I
> > want to go forward with the new unix/stream and unix/seqpacket, I also tried
> > to make kernel to userland RPC better.  You judge if I succeeded or not :)
> > Here are some highlights:
> >
> > - Code footprint both in kernel clients and in userland daemons is reduced.
> >   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
> >            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
> >            4 files changed, 1 insertion(+), 11 deletions(-)
> > - You can easily see all RPC calls from kernel to userland with genl(1):
> >   # genl monitor rpcnl
> > - The new transport is multithreaded in kernel by default, so kernel clients
> >   can send a bunch of RPCs without any serialization and if the userland
> >   figures out how to parallelize their execution, such parallelization would
> >   happen.  Note: new rpc.tlsservd(8) will use threads.
> > - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
> >   rpctls syscall remains, but I have some ideas on how to improve that, too.
> >   Not at this step though.
> > - All sleeps of kernel RPC calls are now in single place, and they all have
> >   timeouts.  I believe NFS services are now much more resilient to hangs.
> >   A deadlock when NFS kernel thread is blocked on unix socket buffer, and
> >   the socket can't go away because its application is blocked in some other
> >   syscall is no longer possible.
> >
> > The code is posted on phabricator, reviews D48547 through D48552.
> > Reviewers are very welcome!
> >
> > I share my branch on Github. It is usually rebased on today's CURRENT:
> >
> > https://github.com/glebius/FreeBSD/commits/gss-netlink/
> >
> > Early testers are very welcome!
> Ok, I can now do minimal testing and crashed it...
>
> I did a mount with option "tls" and then partitioned it from the NFS server
> by doing "ifconfig bridge0 down". Waited until the TCP connection closed
> and then did "ifconfig bridge0 up".
>
> The crash is a NULL pointer dereference at rpctls_impl.c:255 (in
> rpctls_connect(), called from nfscl_renewthread()).
> The problem is that you made rpctls_connect_handle a vnet'd variable.
> The client side (aka an NFS mount) does not happen inside a jail and
> cannot use any vnet'd variables.
> Why? Well, any number of threads enter the NFS client via VOP_xxx()
> calls etc. Any one of them might end up doing a TCP reconnect when the
> underlying TCP connection is broken and then heals.
>
> I don't know why you made rpctls_connect_handle  a vnet'd variable,
> but it cannot be that way.
> (I once looked at making NFS mounts work inside a vnet prison and
> gave up when I realized any old thread ends up in the code and it
> would have taken many, many CURVNET_SET() calls to make it work.)
>
> In summary, no global variable on the client side can be vnet'd and no
> global variable on the server side that is vnet'd can be shared with the
> client side code.
Ok, I now see you've fixed this crash.
I'd still like to limit commits to main to the ones that are required to use
netlink for the upcalls at this time.

rick

>
> I realize you are enthusiastic about this, but I'd suggest you back off to
> the minimal changes required to make this stuff work with netlink instead
> of 

Re: "don't know how to make /usr/main-src/sys/contrib/dev/iwm/iwm-3160-17.fw.uu. Stop"

2025-01-26 Thread Adrian Chadd
Hi!

So, there's no longer a build target that turns the uuencoded firmware files
into a kernel module.

Building iwm into the kernel rather than as a module is broken.

Now, the real issue is that iwm needs firmware to initialise, so the firmware
has to be available. firmware_get() now has to find the raw binary files in
/boot/firmware instead of getting them from a kernel module as before, and
that requires access to the rootfs. That whole pipeline is broken if the
driver is loaded at boot time or included in the kernel directly. There isn't
a nice way to defer the firmware load attempt until /after/ rootfs is up.




-adrian


Re: WIFI compiling error: Can anyone fix it on main branch?

2025-01-26 Thread Adrian Chadd
Hi, I'm sorry I only just saw this.

Is it still broken? I'm cc'ing bz@ as it looks like it's in the linuxkpi
stuff.


-adrian


On Tue, 14 Jan 2025 at 22:10, Chen, Alvin W  wrote:

> In my case, it is due to different default compile options in different
> clang versions, as I build it in a FreeBSD 14.0 environment.
>
> Please check the attached discussion thread.
>
> The macros for automatic pointer freeing, borrowed from the Linux kernel,
> could be improved.
>
>
>
>
> *From:* Adrian Chadd 
> *Sent:* Wednesday, January 15, 2025 4:55 AM
> *To:* Dave Cottlehuber 
> *Cc:* Chen, Alvin W ; freebsd-current <
> freebsd-current@freebsd.org>
> *Subject:* Re: WIFI compiling error: Can anyone fix it on main branch?
>
>
>
> I did temporarily break -head on some platforms for like 30 minutes like a
> week ago? Is everything ok now?
>
>
>
> What was the error?
>
>
>
> -adrian
>
>
>
>
>
> On Tue, 7 Jan 2025 at 04:17, Dave Cottlehuber  wrote:
>
> On Tue, 7 Jan 2025, at 09:39, Chen, Alvin W wrote:
> > Thanks!
> >
>
> Hi Alvin,
>
> Can you share your HEAD commit, & an error message?
>
> 749b3b2c0629 works fine on my machine, it's current as of 2 hours ago.
>
> A+
> Dave
>
>


Difference in "netstat -rn" output in the last 2 months

2025-01-26 Thread Alexander Leidinger

Hi,

something has changed in the output of "netstat -rn" between 
2024-11-23-195545 and 2025-01-22-151306. The default route is not listed
as "default" anymore, but as "0.0.0.0" or "::/0". This breaks some
tools (e.g. iocage). Iocage uses python, I'm not sure if it uses netstat 
or some other interface, so it may not be directly related to netstat 
itself but could be related to some other stuff (netlink maybe?).
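
For tools that parse this output, a tolerant check is one way to cope with either spelling. The following is a minimal Python sketch, assuming the destination is the first whitespace-separated field of each route line and the gateway is the second. The helper names is_default_route and default_gateways and the sample lines are made up for illustration. They are not taken from iocage or from real netstat output.

```python
import ipaddress


def is_default_route(destination):
    """Return True if a route destination field denotes a default route.

    Accepts the literal "default", a zero-length prefix such as "0.0.0.0/0"
    or "::/0", and (as an assumption) a bare unspecified address such as
    "0.0.0.0".
    """
    dest = destination.strip()
    if dest == "default":
        return True
    try:
        if "/" in dest:
            return ipaddress.ip_network(dest, strict=False).prefixlen == 0
        return ipaddress.ip_address(dest).is_unspecified
    except ValueError:
        # Headers, link names and anything else that is not an address.
        return False


def default_gateways(netstat_output):
    """Collect the gateway field of every default-route line."""
    gateways = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and is_default_route(fields[0]):
            gateways.append(fields[1])
    return gateways


# Illustrative lines using documentation addresses, not captured output.
old_style = "default            192.0.2.1          UGS        em0"
new_style = (
    "0.0.0.0/0          192.0.2.1          UGS        em0\n"
    "::/0               2001:db8::1        UGS        em0"
)

print(default_gateways(old_style))
print(default_gateways(new_style))
```

Matching on the parsed prefix length rather than on a fixed string keeps such a parser working whichever form is printed.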


Does this ring a bell for someone?

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org   netch...@freebsd.org   : PGP 0x8F31830F9F2772BF

