On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff <gleb...@freebsd.org> wrote:
>
>   Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected.  You would need to recompile & reinstall both the world and the
> kernel together.  Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check-in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket.  The most difficult ones are the NFS
> related RPC services, which act as RPC clients talking to RPC servers in
> userland.  Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket, they need to work with the socket's internal
> structures, bypassing all the normal KPIs and conventions.  Of course they
> didn't tolerate the new implementation, which totally eliminated the
> intermediate buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to
> make kernel-to-userland RPC better.  You judge if I succeeded or not :)
> Here are some highlights:
> some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
>                     4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen.  Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in single place, and they all have
>   timeouts.  I believe NFS services are now much more resilient to hangs.
>   A deadlock where an NFS kernel thread is blocked on a unix socket buffer,
>   while the socket can't go away because its application is blocked in some
>   other syscall, is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github. It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!
I think I've found a memory leak, but it shouldn't be a show stopper.

What I did on the NFS client side is:
# vmstat -m | fgrep -i rpc
# mount -t nfs -o nfsv4,tls nfsv4-server:/ /mnt
# ls -lR /mnt
--> Then I network partitioned it from the server a few times, until
      the TCP connection closed.
      (My client is in bhyve and the server is the host the bhyve
       instance runs on.  I did "ifconfig bridge0 down", waited for
       the TCP connection to close via "netstat -a", then "ifconfig bridge0 up".)
Once that was done, I
# umount /mnt
# vmstat -m | fgrep -i rpc
and saw a somewhat larger allocation count.
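For anyone wanting to repeat the comparison, the before/after vmstat check can be scripted. This is just a sketch (the sum_rpc helper name and the canned sample lines are mine, not from the thread; it assumes column 2 of FreeBSD's "vmstat -m" output is the in-use allocation count):

```shell
#!/bin/sh
# Sum the "InUse" column for RPC-related malloc(9) types.
# Assumption: column 2 of "vmstat -m" is the in-use count; sum_rpc is a
# hypothetical helper name, not an existing tool.
sum_rpc() {
    awk 'tolower($1) ~ /rpc/ { sum += $2 } END { print sum + 0 }'
}

# On a live system you would run:  vmstat -m | sum_rpc
# before and after the mount/partition/umount cycle and compare the numbers.
# Demo with canned vmstat -m style lines:
printf '%s\n' \
    '         Type InUse MemUse Requests' \
    '      rpcclnt    12     4K      120' \
    '      krpcbuf     3     1K       30' \
    '       devbuf    99    64K      999' | sum_rpc
```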

The allocation count only goes up if I do the network partitioning
and only on the NFS client side.

Since the leak is slow and only happens when the TCP connection
breaks, I do not think it is a show stopper; one of us can track it down
someday.

Other than that, I have not found any problems that you had not already
fixed, rick

>
> --
> Gleb Smirnoff
>
