package: nscd
version: 2.24-11+deb9u1
severity: minor

Hello,

While troubleshooting my boot sequence with some custom systemd units that rely 
on NSS LDAP user <-> UID resolution, I found that the combination of NSCD 
negative caching and NSCD starting before the network is up caused those units 
to fail (even though they themselves start after the network is up). In 
detail, as witnessed on Debian Stretch:

1. NSCD starts before the network
2. an attempt at NSS LDAP user <-> UID resolution fails, since the network is 
not up (and LDAP is not available)
3. NSCD caches this negative result as per its "negative-time-to-live" 
configuration
4. network is up
5. further attempts at the same NSS LDAP user <-> UID resolution keep 
failing because of the NSCD negative cache, although LDAP is now fully available
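
To make the sequence above concrete, here is a toy Python simulation of a 
cache that remembers failed lookups for a fixed time. It is only a sketch of 
the behaviour described in steps 1-5, not NSCD's actual implementation: the 
names NegativeCache, FakeDirectory and simulate_boot, the user "alice", the 
UID 1001 and the 20-second TTL are all made up for illustration.

```python
import time


class NegativeCache:
    """Toy lookup cache that remembers failures for a fixed TTL.

    Loosely modelled on the behaviour described in this report; positive
    entries never expire here, which is a simplification.
    """

    def __init__(self, backend, negative_ttl, clock=time.monotonic):
        self.backend = backend            # callable: name -> uid or None
        self.negative_ttl = negative_ttl  # seconds a failure is remembered
        self.clock = clock
        self.positive = {}                # name -> uid
        self.negative = {}                # name -> expiry timestamp

    def lookup(self, name):
        if name in self.positive:
            return self.positive[name]
        expiry = self.negative.get(name)
        if expiry is not None:
            if self.clock() < expiry:
                return None               # cached failure, backend not asked
            del self.negative[name]       # negative entry has expired
        uid = self.backend(name)
        if uid is None:
            self.negative[name] = self.clock() + self.negative_ttl
        else:
            self.positive[name] = uid
        return uid


class FakeDirectory:
    """Stand-in for LDAP: every lookup fails while the network is down."""

    def __init__(self):
        self.network_up = False
        self.users = {"alice": 1001}      # hypothetical user and UID

    def __call__(self, name):
        if not self.network_up:
            return None
        return self.users.get(name)


def simulate_boot(negative_ttl=20):
    now = [0.0]                           # fake clock, advanced by hand
    directory = FakeDirectory()
    cache = NegativeCache(directory, negative_ttl, clock=lambda: now[0])
    results = []
    results.append(cache.lookup("alice"))  # step 2: network down, fails
    directory.network_up = True            # step 4: network comes up
    now[0] = 5.0
    results.append(cache.lookup("alice"))  # step 5: served from negative cache
    now[0] = negative_ttl + 1.0
    results.append(cache.lookup("alice"))  # after the TTL: backend is asked
    return results


if __name__ == "__main__":
    print(simulate_boot())
```

With the default arguments, the second lookup still fails even though the 
directory is reachable by then; only the lookup made after the negative 
entry has expired succeeds.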

As could be expected, adding "After=network.target" to the NSCD systemd 
service file resolves the issue.
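
For reference, one way to try this locally without editing the packaged unit 
file is a systemd drop-in. The sketch below assumes the unit is named 
nscd.service; the file name override.conf is arbitrary. Running "systemctl 
edit nscd.service" should create an equivalent drop-in.

```ini
# /etc/systemd/system/nscd.service.d/override.conf
# (path assumes the unit is called nscd.service; file name is arbitrary)
[Unit]
After=network.target
```

If the file is created by hand, "systemctl daemon-reload" is needed for 
systemd to pick it up.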

Pros:
- no negative caching of transient failures
- all subsequent LDAP NSS calls are successful (and positively cached)

Cons:
- if some service attempts many networked NSS queries, boot may be delayed 
considerably, depending on the /etc/ldap.conf NSS configuration (timeouts, 
retries, etc.)
- BUT the service issuing those networked NSS queries ought to have its 
dependency fixed too (rather than using NSCD negative caching as a way to 
quickly bail out of failing networked NSS queries)

Notes: except for the above "con", I don't see where adding this new network 
dependency would be a problem. NSS queries will still be carried out even if 
NSCD is not yet active. Since we're talking about the boot sequence, keeping 
NSCD inactive until the network is up seems reasonable, given that its main 
benefit is to cache *networked* NSS queries (the slow ones).

Just a suggestion, though.

Thank you for your Debian work and best,

Cédric 

-- 
Cédric Dufour @ Idiap Research Institute
