On Fri, Jul 6, 2018 at 1:50 AM, Stephen Frost <sfr...@snowman.net> wrote:
> What if we (optionally, of course) had an always-running background
> worker which connects to AD and streams down the changes happening by
> using AD's Change Notification system:
>
> https://docs.microsoft.com/en-us/windows/desktop/AD/change-notifications-in-active-directory-domain-services
>
> Then integrate those changes into PG as they come in, avoiding the need
> for a cronjob.

Hi Stephen,

Yeah, that's a good idea.  There are several change tracking
techniques available[1], some of which might also be useful for the
just-in-time sync technique I'm talking about (a cheap way to know, at
the same time as you authenticate, whether anything interesting has
changed since last time, so you can avoid doing any work at all most
of the time).  Maybe other LDAP vendors can do this type of stuff too
(it looks like there was an attempt to standardise it, but it appears
to be resting[2]...).

Another idea would be for someone to take a tool like ldap2pg and give
it a --daemon mode where it does that, and then package it up with
rc.d/systemd/whatever glue so it's easy to deploy.

Another idea, somewhere between that and your idea, is to guess that
the main reason you really want to put this stuff in a background
worker is that you want a long-running process and you can't be
bothered with the daemon management for an external thing... so... we
could make a generic background worker extension that'll run arbitrary
external long-running programs while the database is up, and then
something like ldap2pg could have a --stream-changes mode that'd do it
until asked to shut down.

I'm assuming it'd be better to keep the actual messy synchronisation
problem outside the core PostgreSQL project, where a thousand tools
can bloom.  We just need the right hooks and cache invalidation logic
so they can, and we're pretty close.

> Again, I'm not expressing an opinion for or against the change you
> propose, just mentioning another approach to the general problem.
> I can see some advantages to waiting until an actual connection
> attempt to go create the role (you don't end up with roles in the
> system which never actually log into it) and advantages to using a
> background worker (the role will already be created, avoiding possible
> delay during the authentication and setup work of the role; more clear
> what roles have access on the system; ability to GRANT access to roles
> which haven't logged in yet or to set other attributes on the role
> prior to login).

Another advantage to asynchronous schemes is that you can use group
role names in pg_hba.conf (whereas that is checked before the auth
module is invoked, so won't work with my synchronous just-in-time
scheme; you need to use "all" in that case, though there is an easy
workaround using LDAP search_filter).

A disadvantage of your specific scheme is that it doesn't exist yet
and probably needs to be written in C, and my scheme (despite its
disadvantages) probably only requires slinging a few lines of
Python/fooscript/barscript around.

[1] https://docs.microsoft.com/en-us/windows/desktop/AD/overview-of-change-tracking-techniques
[2] https://datatracker.ietf.org/doc/draft-dawkins-ldapext-subnot/

--
Thomas Munro
http://www.enterprisedb.com