On Sat, Jan 7, 2023 at 6:04 PM Benjamin Kaduk <bjkf...@gmail.com> wrote:
>
> On Sat, Jan 7, 2023 at 1:50 PM Rick Macklem <rmack...@freebsd.org> wrote:
>>
>> The branch main has been updated by rmacklem:
>>
>> URL: https://cgit.FreeBSD.org/src/commit/?id=c33509d49a6fdcf86ef280a78f428d3cb7012c4a
>>
>> commit c33509d49a6fdcf86ef280a78f428d3cb7012c4a
>> Author:     Rick Macklem <rmack...@freebsd.org>
>> AuthorDate: 2023-01-07 21:49:25 +0000
>> Commit:     Rick Macklem <rmack...@freebsd.org>
>> CommitDate: 2023-01-07 21:49:25 +0000
>>
>>     gssd: Fix handling of the gssname=<name> NFS mount option
>>
>>     If an NFS mount using "sec=krb5[ip],gssname=<name>" is
>>     done, the gssd daemon fails.  There is a long delay
>>     (several seconds) in the gss_acquire_cred() call and then
>>     it returns success, but the credentials returned are
>>     junk.
>>
>>     I have no idea how long this has been broken, due to some
>>     change in the Heimdal gssapi library call, but I suspect
>>     it has been quite some time.
>>
>>     Anyhow, it turns out that replacing the "desired_name"
>>     argument with GSS_C_NO_NAME fixes the problem.
>>     Replacing the argument should not be a problem, since the
>>     TGT for the host based initiator credential in the default
>>     keytab file should be the only TGT in gssd's credential
>>     cache (which is not the one for uid 0).
>>
>>     I will try and determine if FreeBSD13 and/or FreeBSD12
>>     needs this same fix and will MFC if they need the fix.
>>
>>     This problem only affected Kerberized NFS mounts when the
>>     "gssname" mount option was used.  Other Kerberized NFS
>>     mount cases already used GSS_C_NO_NAME and work ok.
>>     A workaround if you do not have this patch is to do a
>>     "kinit -k host/FQDN" as root on the machine, followed by
>>     the Kerberized NFS mount without the "gssname" mount
>>     option.
>>
>
>
> Hi Rick,
>
> This doesn't seem like a good long-term fix.
> If we're going to have a gssname argument, we should actually make
> it take effect, rather than silently ignoring it, which is what using
> GSS_C_NO_NAME does (it indicates the use of "any credential", which
> ends up meaning the default credential when used on a GSS initiator).
>
> It should be possible to inspect the "junk" credential from
> gss_acquire_cred() and learn more about what happened (perhaps a
> non-Kerberos mechanism was picked, or the name was in the wrong format)
> using various gss_inquire_*() calls, as a diagnostic measure.
> Unfortunately I don't anticipate having a huge amount of time
> to put into it anytime soon...
>
I found the underlying problem. The upcall RPC from the kernel was timing out
at 25sec, and the gss_acquire_cred() call had not completed by then.
(It was close: gss_acquire_cred() took about 27sec.) The kernel code would then
assume that the gssd(8) daemon had gone away and close the upcall socket. This
caused the gssd(8) daemon to terminate, due to a SIGPIPE signal.
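
The mechanism described above (the peer closes the socket, and the writer is
then hit by SIGPIPE) can be reproduced in isolation. What follows is a minimal
sketch, not code from gssd(8): the helper name write_after_peer_close() is
hypothetical, and a socketpair(2) stands in for the kernel/daemon upcall
socket. It ignores SIGPIPE for the duration of the write so that the failure
shows up as errno EPIPE instead of terminating the process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from gssd(8): write to a socket whose
 * peer has already closed its end and report what happened.
 * Returns the errno from the failed write(2), 0 if the write
 * unexpectedly succeeded, or -1 if the socketpair could not be created.
 */
int
write_after_peer_close(void)
{
	int sv[2];
	const char msg[] = "reply";
	ssize_t n;
	int saved_errno;
	void (*old)(int);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);

	/* With the default disposition, the write below would kill us. */
	old = signal(SIGPIPE, SIG_IGN);

	close(sv[1]);		/* the "kernel" side gives up and closes */
	n = write(sv[0], msg, sizeof(msg));
	saved_errno = (n == -1) ? errno : 0;

	close(sv[0]);
	signal(SIGPIPE, old);	/* restore the previous disposition */
	return (saved_errno);
}
```

With SIGPIPE left at its default disposition, the same write(2) would
terminate the calling process, which matches the behaviour seen in the daemon.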

Increasing the timeout makes it work.

I am now "on the fence" w.r.t. leaving this patch in.  As I noted, I think
it is safe to do, since the credential cache used by the gssd(8) daemon
should only have a TGT for the host-based client credential.
Without the patch, the mount takes almost 30sec, instead of a fraction of a
second with the patch (assuming the timeout has been increased, which turns
out to be needed for the case where a user's TGT has expired and they
attempt to access the mount).

If you really think it should be reverted, I can do that.

Thanks for your comments, rick
ps: I will be committing a change to increase the timeout.

> Thanks,
>
> Ben
