Hi, On Wed, Feb 21, 2024 at 10:26:11PM +0100, Tomas Vondra wrote: > Thanks for the patch. I took a closer look at v3, so let me share some > review comments. Please push back if you happen to disagree with some of > it, some of this is going to be more a matter of personal preference.
Thanks! As my patch was based on a previous patch, some of decisions were carry-overs I am not overly attached to. > 1) I think it's a bit weird to have two options specifying amount of > time, but one is in seconds and one in milliseconds. Won't that be > unnecessarily confusing? Could we do both in milliseconds? Alright, I changed that. See below for a discussion about the GUCs in general. > 2) The C code defines the GUC as auth_delay.exponential_backoff, but the > SGML docs say <varname>auth_delay.exp_backoff</varname>. Right, an oversight from the last version where the GUC name got changed but I forgot to change the documentation, fixed. > 3) Do we actually need the exponential_backoff GUC? Wouldn't it be > sufficient to have the maximum value, and if it's -1 treat that as no > backoff? That is a good question, I guess that makes sense. The next question then is: what should the default for (now) auth_delay.max_milliseconds be in this case, -1? Or do we say that as the default for auth_delay.milliseconds is 0 anyway, why would somebody not want exponential backoff when they switch it on and keep it at the current 10s/10000ms)? I have not changed that for now, pending further input. > 4) I think the SGML docs should more clearly explain that the delay is > initialized to auth_delay.milliseconds, and reset to this value after > successful authentication. The wording kinda implies that, but it's not > quite clear I think. Ok, I added some text to that end. I also added a not that auth_delay.max_milliseconds will mean that the delay doubling never stops. > 4) I've been really confused by the change from > > if (status != STATUS_OK) > to > if (status == STATUS_ERROR) > > in auth_delay_checks, until I realized those two codes are not the only > ones, and we intentionally ignore STATUS_EOF. I think it'd be good to > mention that in a comment, it's not quite obvious (I only realized it > because the e-mail mentioned it). Yeah I agree, I tried to explain that now. > 5) I kinda like the custom that functions in a contrib module start with > a clear common prefix, say auth_delay_ in this case. Matter of personal > preference, ofc. Ok, I changed the functions to have an auth_delay_ prefix throughout.. > 6) I'm a bit skeptical about some acr_array details. Firstly, why would > 50 entries be enough? Seems like a pretty low and arbitrary number ... > Also, what happens if someone attempts to authenticate, fails to do so, > and then disconnects and never tries again? Or just changes IP? Seems > like the entry will remain in the array forever, no? Yeah, that is how v3 of this patch worked. I have changed that now, see below. > Seems like a way to cause a "limited" DoS - do auth failure from 50 > different hosts, to fill the table, and now everyone has to wait the > maximum amount of time (if they fail to authenticate). Right, though the problem would only exist on authentication failures, so it is really rather limited. > I think it'd be good to consider: > > - Making the acr_array a hash table, and larger than 50 entries (I > wonder if that should be dynamic / customizable by GUC?). I would say a GUC should be overkill for this as this would mostly be an implementation detail. More generally, I think now that entries are expired (see below), this should be less of a concern, so I have not changed this to a hash table for now but doubled MAX_CONN_RECORDS to 100 entries. > - Make sure the entries are eventually expired, based on time (for > example after 2*max_delay?). I went with 5*max_milliseconds - the AuthConnRecord struct now has a last_failed_auth timestamp member; if we increase the delay for a host, we check if any other host expired in the meantime and remove it if so. > - It would be a good idea to log something when we get into the "full > table" and start delaying everyone by max_delay_seconds. (How would > anyone know that's what's happening, seems rather confusing.) Right, I added a log line for that. However, I made it LOG instead of WARNING as I don't think the client would ever see it, would he? Attached is v4 with the above changes. Cheers, Michael
>From 6520fb1e768bb8bc26cafad014d08d7e9ac7f12a Mon Sep 17 00:00:00 2001 From: Michael Banck <michael.ba...@credativ.de> Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v4] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds two new GUCs for auth_delay, exponential_backoff and max_milliseconds. The former controls whether exponential backoff should be used or not, the latter sets an maximum delay (default is 10s) in case exponential backoff is active. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication or when no authentication attempts have been made for 5*max_milliseconds from that host. Authors: Michael Banck, based on an earlier patch by ζδΉη Reviewed-by: Abhijit Menon-Sen, Tomas Vondra Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 231 ++++++++++++++++++++++++++++++- doc/src/sgml/auth-delay.sgml | 45 +++++- src/tools/pgindent/typedefs.list | 1 + 3 files changed, 273 insertions(+), 4 deletions(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index ff0e1fd461..647aa4e1cd 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,51 @@ #include <limits.h> #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/dsm_registry.h" +#include "storage/ipc.h" +#include "storage/lwlock.h" +#include "storage/shmem.h" #include "utils/guc.h" #include "utils/timestamp.h" PG_MODULE_MAGIC; +#define MAX_CONN_RECORDS 100 + /* GUC Variables */ static int auth_delay_milliseconds = 0; +static bool auth_delay_exp_backoff = false; +static int auth_delay_max_milliseconds = 0; /* Original Hook */ static ClientAuthentication_hook_type original_client_auth_hook = NULL; +typedef struct AuthConnRecord +{ + char remote_host[NI_MAXHOST]; + double sleep_time; /* in milliseconds */ + TimestampTz last_failed_auth; +} AuthConnRecord; + +static shmem_startup_hook_type shmem_startup_next = NULL; +static AuthConnRecord *acr_array = NULL; + +static AuthConnRecord *auth_delay_find_acr_for_host(char *remote_host); +static AuthConnRecord *auth_delay_find_free_acr(void); +static double auth_delay_increase_delay_after_failed_conn_auth(Port *port); +static void auth_delay_cleanup_conn_record(Port *port); +static void auth_delay_expire_conn_records(Port *port); + /* * Check authentication */ static void auth_delay_checks(Port *port, int status) { + double delay = auth_delay_milliseconds; + /* * Any other plugins which use ClientAuthentication_hook. */ @@ -39,20 +66,193 @@ auth_delay_checks(Port *port, int status) original_client_auth_hook(port, status); /* - * Inject a short delay if authentication failed. + * We handle both STATUS_ERROR and STATUS_OK - the third option + * (STATUS_EOF) is disregarded. + * + * In case of STATUS_ERROR we inject a short delay, optionally with + * exponential backoff. + */ + if (status == STATUS_ERROR) + { + if (auth_delay_exp_backoff) + { + /* + * Delay by 2^n seconds after each authentication failure from a + * particular host, where n is the number of consecutive + * authentication failures. + */ + delay = auth_delay_increase_delay_after_failed_conn_auth(port); + + /* + * Clamp delay to a maximum of auth_delay_max_milliseconds. + */ + if (auth_delay_max_milliseconds > 0) + { + delay = Min(delay, auth_delay_max_milliseconds); + } + } + + if (delay > 0) + { + elog(DEBUG1, "Authentication delayed for %g seconds due to auth_delay", delay / 1000.0); + pg_usleep(1000L * (long) delay); + } + + /* + * Expire delays from other hosts after auth_delay_max_milliseconds * + * 5. + */ + auth_delay_expire_conn_records(port); + } + + /* + * Remove host-specific delay if authentication succeeded. + */ + if (status == STATUS_OK) + auth_delay_cleanup_conn_record(port); +} + +static double +auth_delay_increase_delay_after_failed_conn_auth(Port *port) +{ + AuthConnRecord *acr = NULL; + + acr = auth_delay_find_acr_for_host(port->remote_host); + + if (!acr) + { + acr = auth_delay_find_free_acr(); + + if (!acr) + { + /* + * No free space, MAX_CONN_RECORDS reached. Wait for the + * configured maximum amount. + */ + elog(LOG, "auth_delay: host connection list full, waiting maximum amount"); + return auth_delay_max_milliseconds; + } + strcpy(acr->remote_host, port->remote_host); + } + if (acr->sleep_time == 0) + acr->sleep_time = (double) auth_delay_milliseconds; + else + acr->sleep_time *= 2; + + /* + * Set current timestamp for later expiry. */ - if (status != STATUS_OK) + acr->last_failed_auth = GetCurrentTimestamp(); + + return acr->sleep_time; +} + +static AuthConnRecord * +auth_delay_find_acr_for_host(char *remote_host) +{ + int i; + + for (i = 0; i < MAX_CONN_RECORDS; i++) + { + if (strcmp(acr_array[i].remote_host, remote_host) == 0) + return &acr_array[i]; + } + + return NULL; +} + +static AuthConnRecord * +auth_delay_find_free_acr(void) +{ + int i; + + for (i = 0; i < MAX_CONN_RECORDS; i++) + { + if (!acr_array[i].remote_host[0]) + return &acr_array[i]; + } + + return 0; +} + +static void +auth_delay_cleanup_conn_record(Port *port) +{ + AuthConnRecord *acr = NULL; + + acr = auth_delay_find_acr_for_host(port->remote_host); + if (acr == NULL) + return; + + port->remote_host[0] = '\0'; + + acr->sleep_time = 0.0; + acr->last_failed_auth = 0.0; +} + +static void +auth_delay_expire_conn_records(Port *port) +{ + int i; + TimestampTz now = GetCurrentTimestamp(); + + for (i = 0; i < MAX_CONN_RECORDS; i++) { - pg_usleep(1000L * auth_delay_milliseconds); + /* + * Do not expire the host from which the current authentication + * failure originated. + */ + if (strcmp(acr_array[i].remote_host, port->remote_host) == 0) + continue; + + if (acr_array[i].last_failed_auth > 0 && (long) ((now - acr_array[i].last_failed_auth) / 1000) > 5 * auth_delay_max_milliseconds) + { + acr_array[i].remote_host[0] = '\0'; + acr_array[i].sleep_time = 0.0; + acr_array[i].last_failed_auth = 0.0; + } } } +/* + * Set up shared memory + */ + +static void +auth_delay_init_state(void *ptr) +{ + Size shm_size; + AuthConnRecord *array = (AuthConnRecord *) ptr; + + shm_size = sizeof(AuthConnRecord) * MAX_CONN_RECORDS; + + memset(array, 0, shm_size); +} + +static void +auth_delay_shmem_startup(void) +{ + bool found; + Size shm_size; + + if (shmem_startup_next) + shmem_startup_next(); + + shm_size = sizeof(AuthConnRecord) * MAX_CONN_RECORDS; + acr_array = GetNamedDSMSegment("auth_delay", shm_size, auth_delay_init_state, &found); +} + /* * Module Load Callback */ void _PG_init(void) { + if (!process_shared_preload_libraries_in_progress) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("auth_delay must be loaded via shared_preload_libraries"))); + /* Define custom GUC variables */ DefineCustomIntVariable("auth_delay.milliseconds", "Milliseconds to delay before reporting authentication failure", @@ -66,9 +266,34 @@ _PG_init(void) NULL, NULL); + DefineCustomBoolVariable("auth_delay.exponential_backoff", + "Double the delay after each authentication failure from a particular host", + NULL, + &auth_delay_exp_backoff, + false, + PGC_SIGHUP, + 0, + NULL, + NULL, + NULL); + + DefineCustomIntVariable("auth_delay.max_milliseconds", + "Maximum delay when exponential backoff is enabled", + NULL, + &auth_delay_max_milliseconds, + 10000, + 0, INT_MAX / 1000, + PGC_SIGHUP, + GUC_UNIT_MS, + NULL, NULL, NULL); + MarkGUCPrefixReserved("auth_delay"); /* Install Hooks */ original_client_auth_hook = ClientAuthentication_hook; ClientAuthentication_hook = auth_delay_checks; + + /* Set up shared memory */ + shmem_startup_next = shmem_startup_hook; + shmem_startup_hook = auth_delay_shmem_startup; } diff --git a/doc/src/sgml/auth-delay.sgml b/doc/src/sgml/auth-delay.sgml index 0571f2a99d..b02b6b0b30 100644 --- a/doc/src/sgml/auth-delay.sgml +++ b/doc/src/sgml/auth-delay.sgml @@ -16,6 +16,18 @@ connection slots. </para> + <para> + It is optionally possible to let <filename>auth_delay</filename> wait longer + on each successive authentication failure if the configuration parameter + <varname>auth_delay.exponential_backoff</varname> is active. If enabled, + <filename>auth_delay</filename> will start with a delay of + <varname>auth_delay.milliseconds</varname> and double the delay after each + consecutive authentication failure from a particular host, up to the given + <varname>auth_delay.max_milliseconds</varname> (default: 10s). If the host + authenticates successfully or after a timeout of five times + <varname>auth_delay.max_milliseconds</varname>, the delay is reset. + </para> + <para> In order to function, this module must be loaded via <xref linkend="guc-shared-preload-libraries"/> in <filename>postgresql.conf</filename>. @@ -39,6 +51,35 @@ </para> </listitem> </varlistentry> + <varlistentry> + <term> + <varname>auth_delay.exponential_backoff</varname> (<type>bool</type>) + <indexterm> + <primary><varname>auth_delay.exponential_backoff</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Whether to use exponential backoff per remote host on authentication + failure. The default is off. + </para> + </listitem> + </varlistentry> + <varlistentry> + <term> + <varname>auth_delay.max_milliseconds</varname> (<type>integer</type>) + <indexterm> + <primary><varname>auth_delay.max_milliseconds</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The maximum delay, in milliseconds, when exponential backoff is + enabled. The default is 10 seconds. If set to 0, authentication delays + will increase without limit. + </para> + </listitem> + </varlistentry> </variablelist> <para> @@ -50,7 +91,9 @@ # postgresql.conf shared_preload_libraries = 'auth_delay' -auth_delay.milliseconds = '500' +auth_delay.milliseconds = '125' +auth_delay.exponential_backoff = 'on' +auth_delay.max_milliseconds = '20000' </programlisting> </sect2> diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 95ae7845d8..dd9fb9e530 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -166,6 +166,7 @@ AttrMap AttrMissing AttrNumber AttributeOpts +AuthConnRecord AuthRequest AuthToken AutoPrewarmSharedState -- 2.39.2