Hi,

On 10/14/22 7:30 AM, Michael Paquier wrote:
On Wed, Oct 12, 2022 at 08:17:14AM +0200, Drouvot, Bertrand wrote:
Indeed, ;-)


The code could be split to tackle things step-by-step:
- One refactoring patch to introduce token_regcomp() and
token_regexec(), with the introduction of a new structure that
includes the compiled regexes.  (Feel free to counterargue about the
use of AuthToken for this purpose, of course!)
- Plug in the refactored logic for the lists of role names and
database names in pg_hba.conf.

Please find attached v1-0001-regex-handling-for-db-and-roles-in-hba.patch to implement regexes for databases and roles in hba.

It does also contain new regexes related TAP tests and doc updates.

It relies on the refactoring made in fc579e11c6 (but changes the regcomp_auth_token() parameters so that it is now responsible for emitting the compilation error message (if any), to avoid code duplication in parse_hba_line() and parse_ident_line() for roles, databases and user name mapping).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index c6f1b70fd3..30753003ba 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -234,10 +234,13 @@ hostnogssenc  <replaceable>database</replaceable>  
<replaceable>user</replaceabl
        replication connections do not specify any particular database whereas
        logical replication connections do specify it.
        Otherwise, this is the name of
-       a specific <productname>PostgreSQL</productname> database.
-       Multiple database names can be supplied by separating them with
-       commas.  A separate file containing database names can be specified by
-       preceding the file name with <literal>@</literal>.
+       a specific <productname>PostgreSQL</productname> database or a regular
+       expression preceded by <literal>/</literal>.
+       Multiple database names and/or regular expressions preceded by 
<literal>/</literal>
+       can be supplied by separating them with commas.
+       A separate file containing database names and/or regular expressions 
preceded
+       by <literal>/</literal> can be specified by preceding the file name
+       with <literal>@</literal>.
       </para>
      </listitem>
     </varlistentry>
@@ -249,18 +252,20 @@ hostnogssenc  <replaceable>database</replaceable>  
<replaceable>user</replaceabl
        Specifies which database user name(s) this record
        matches. The value <literal>all</literal> specifies that it
        matches all users.  Otherwise, this is either the name of a specific
-       database user, or a group name preceded by <literal>+</literal>.
+       database user, a regular expression preceded by <literal>/</literal>,
+       or a group name preceded by <literal>+</literal>.
        (Recall that there is no real distinction between users and groups
        in <productname>PostgreSQL</productname>; a <literal>+</literal> mark 
really means
        <quote>match any of the roles that are directly or indirectly members
        of this role</quote>, while a name without a <literal>+</literal> mark 
matches
-       only that specific role.) For this purpose, a superuser is only
+       only that specific role or regular expression.) For this purpose, a 
superuser is only
        considered to be a member of a role if they are explicitly a member
        of the role, directly or indirectly, and not just by virtue of
        being a superuser.
-       Multiple user names can be supplied by separating them with commas.
-       A separate file containing user names can be specified by preceding the
-       file name with <literal>@</literal>.
+       Multiple user names and/or regular expressions preceded by 
<literal>/</literal>
+       can be supplied by separating them with commas. A separate file 
containing
+       user names and/or regular expressions preceded by <literal>/</literal>
+       can be specified by preceding the file name with <literal>@</literal>.
       </para>
      </listitem>
     </varlistentry>
@@ -739,6 +744,18 @@ host    all             all             ::1/128            
     trust
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 host    all             all             localhost               trust
 
+# The same using regular expression for DATABASE, which allows connection to 
the
+# db1 and testdb databases and any database with a name ending with "test".
+#
+# TYPE  DATABASE               USER            ADDRESS          METHOD
+local   db1,/^.*test$,testdb   all             localhost        trust
+
+# The same using regular expression for USER, which allows connection to the
+# user1 and testuser users and any user with a name ending with "test".
+#
+# TYPE  DATABASE                 USER                              ADDRESS     
     METHOD
+local   db1,/^.*test$,testdb     user1,/^.*test$,testuser          localhost   
     trust
+
 # Allow any user from any host with IP address 192.168.93.x to connect
 # to database "postgres" as the same user name that ident reports for
 # the connection (typically the operating system user name).
@@ -785,15 +802,17 @@ host    all             all             192.168.12.10/32  
      gss
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 host    all             all             192.168.0.0/16          ident 
map=omicron
 
-# If these are the only three lines for local connections, they will
+# If these are the only four lines for local connections, they will
 # allow local users to connect only to their own databases (databases
-# with the same name as their database user name) except for administrators
-# and members of role "support", who can connect to all databases.  The file
-# $PGDATA/admins contains a list of names of administrators.  Passwords
+# with the same name as their database user name) except for users ending
+# with "helpdesk", administrators and members of role "support",
+# who can connect to all databases.
+# The file $PGDATA/admins contains a list of names of administrators.  
Passwords
 # are required in all cases.
 #
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 local   sameuser        all                                     md5
+local   all             /^.*helpdesk$                           md5
 local   all             @admins                                 md5
 local   all             +support                                md5
 
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 947a1edcef..165ef4d397 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -119,7 +119,8 @@ static List *tokenize_inc_file(List *tokens, const char 
*outer_filename,
                                                           const char 
*inc_filename, int elevel, char **err_msg);
 static bool parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline,
                                                           int elevel, char 
**err_msg);
-static int     regcomp_auth_token(AuthToken *token);
+static int     regcomp_auth_token(AuthToken *token, char *filename, int 
line_num,
+                                                          char **err_msg, int 
elevel);
 static int     regexec_auth_token(const char *match, AuthToken *token,
                                                           size_t nmatch, 
regmatch_t pmatch[]);
 
@@ -308,7 +309,8 @@ copy_auth_token(AuthToken *in)
  * input.  Returns the result of pg_regcomp().
  */
 static int
-regcomp_auth_token(AuthToken *token)
+regcomp_auth_token(AuthToken *token, char *filename, int line_num,
+                                  char **err_msg, int elevel)
 {
        pg_wchar   *wstr;
        int                     wlen;
@@ -325,6 +327,22 @@ regcomp_auth_token(AuthToken *token)
                                                                wstr, 
strlen(token->string + 1));
 
        rc = pg_regcomp(token->regex, wstr, wlen, REG_ADVANCED, 
C_COLLATION_OID);
+       if (rc)
+       {
+               char            errstr[100];
+
+               pg_regerror(rc, token->regex, errstr, sizeof(errstr));
+               ereport(elevel,
+                               (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
+                                errmsg("invalid regular expression \"%s\": %s",
+                                               token->string + 1, errstr),
+                                errcontext("line %d of configuration file 
\"%s\"",
+                                                       line_num, filename)));
+
+               *err_msg = psprintf("invalid regular expression \"%s\": %s",
+                                                       token->string + 1, 
errstr);
+
+       }
 
        pfree(wstr);
        return rc;
@@ -652,13 +670,18 @@ check_role(const char *role, Oid roleid, List *tokens)
        foreach(cell, tokens)
        {
                tok = lfirst(cell);
-               if (!tok->quoted && tok->string[0] == '+')
+               if (!token_has_regexp(tok))
                {
-                       if (is_member(roleid, tok->string + 1))
+                       if (!tok->quoted && tok->string[0] == '+')
+                       {
+                               if (is_member(roleid, tok->string + 1))
+                                       return true;
+                       }
+                       else if (token_matches(tok, role) ||
+                                        token_is_keyword(tok, "all"))
                                return true;
                }
-               else if (token_matches(tok, role) ||
-                                token_is_keyword(tok, "all"))
+               else if (regexec_auth_token(role, tok, 0, NULL) == REG_OKAY)
                        return true;
        }
        return false;
@@ -685,22 +708,27 @@ check_db(const char *dbname, const char *role, Oid 
roleid, List *tokens)
                        if (token_is_keyword(tok, "replication"))
                                return true;
                }
-               else if (token_is_keyword(tok, "all"))
-                       return true;
-               else if (token_is_keyword(tok, "sameuser"))
+               else if (token_is_keyword(tok, "replication"))
+                       continue;                       /* never match this if 
not walsender */
+               else if (!token_has_regexp(tok))
                {
-                       if (strcmp(dbname, role) == 0)
+                       if (token_is_keyword(tok, "all"))
                                return true;
-               }
-               else if (token_is_keyword(tok, "samegroup") ||
-                                token_is_keyword(tok, "samerole"))
-               {
-                       if (is_member(roleid, dbname))
+                       else if (token_is_keyword(tok, "sameuser"))
+                       {
+                               if (strcmp(dbname, role) == 0)
+                                       return true;
+                       }
+                       else if (token_is_keyword(tok, "samegroup") ||
+                                        token_is_keyword(tok, "samerole"))
+                       {
+                               if (is_member(roleid, dbname))
+                                       return true;
+                       }
+                       else if (token_matches(tok, dbname))
                                return true;
                }
-               else if (token_is_keyword(tok, "replication"))
-                       continue;                       /* never match this if 
not walsender */
-               else if (token_matches(tok, dbname))
+               else if (regexec_auth_token(dbname, tok, 0, NULL) == REG_OKAY)
                        return true;
        }
        return false;
@@ -1119,8 +1147,15 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
        tokens = lfirst(field);
        foreach(tokencell, tokens)
        {
-               parsedline->databases = lappend(parsedline->databases,
-                                                                               
copy_auth_token(lfirst(tokencell)));
+               AuthToken  *tok = copy_auth_token(lfirst(tokencell));
+
+               /*
+                * Compile a regex from the database token, if necessary.
+                */
+               if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, 
elevel))
+                       return NULL;
+
+               parsedline->databases = lappend(parsedline->databases, tok);
        }
 
        /* Get the roles. */
@@ -1139,8 +1174,15 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
        tokens = lfirst(field);
        foreach(tokencell, tokens)
        {
-               parsedline->roles = lappend(parsedline->roles,
-                                                                       
copy_auth_token(lfirst(tokencell)));
+               AuthToken  *tok = copy_auth_token(lfirst(tokencell));
+
+               /*
+                * Compile a regex from the role token, if necessary.
+                */
+               if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, 
elevel))
+                       return NULL;
+
+               parsedline->roles = lappend(parsedline->roles, tok);
        }
 
        if (parsedline->conntype != ctLocal)
@@ -2374,7 +2416,6 @@ parse_ident_line(TokenizedAuthLine *tok_line, int elevel)
        List       *tokens;
        AuthToken  *token;
        IdentLine  *parsedline;
-       int                     rc;
 
        Assert(tok_line->fields != NIL);
        field = list_head(tok_line->fields);
@@ -2410,24 +2451,8 @@ parse_ident_line(TokenizedAuthLine *tok_line, int elevel)
         * Now that the field validation is done, compile a regex from the user
         * token, if necessary.
         */
-       rc = regcomp_auth_token(parsedline->token);
-       if (rc)
-       {
-               char            errstr[100];
-
-               pg_regerror(rc, parsedline->token->regex, errstr, 
sizeof(errstr));
-               ereport(elevel,
-                               (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
-                                errmsg("invalid regular expression \"%s\": %s",
-                                               parsedline->token->string + 1, 
errstr),
-                                errcontext("line %d of configuration file 
\"%s\"",
-                                                       line_num, 
IdentFileName)));
-
-               *err_msg = psprintf("invalid regular expression \"%s\": %s",
-                                                       
parsedline->token->string + 1, errstr);
-
+       if (regcomp_auth_token(parsedline->token, IdentFileName, line_num, 
err_msg, elevel))
                return NULL;
-       }
 
        return parsedline;
 }
diff --git a/src/test/authentication/t/001_password.pl 
b/src/test/authentication/t/001_password.pl
index ea664d18f5..acb8bfaac8 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -81,6 +81,14 @@ $node->safe_psql(
         GRANT ALL ON sysuser_data TO md5_role;");
 $ENV{"PGPASSWORD"} = 'pass';
 
+# Create a role that contains a comma to stress the parsing.
+$node->safe_psql('postgres',
+       q{SET password_encryption='md5'; CREATE ROLE "md5,role" LOGIN PASSWORD 
'pass';}
+);
+
+# Create a database to test regular expression.
+$node->safe_psql('postgres', "CREATE database regex_testdb;");
+
 # For "trust" method, all users should be able to connect. These users are not
 # considered to be authenticated.
 reset_pg_hba($node, 'all', 'all', 'trust');
@@ -200,6 +208,39 @@ append_to_file(
 
 test_conn($node, 'user=md5_role', 'password from pgpass', 0);
 
+# Testing with regular expression for username. Note that the third regex
+# matches in this case.
+reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^md.*$', 'password');
+test_conn($node, 'user=md5_role', 'password, matching regexp for username',
+       0);
+
+# The third regex does not match anymore.
+reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^m_d.*$', 'password');
+test_conn($node, 'user=md5_role',
+       'password, non matching regexp for username',
+       2, log_unlike => [qr/connection authenticated:/]);
+
+# Test with a comma in the regular expression
+reset_pg_hba($node, 'all', '"/^.*5,.*e$"', 'password');
+test_conn($node, 'user=md5,role', 'password', 'matching regexp for username',
+       0);
+
+# Testing with regular expression for dbname. The third regex matches.
+reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*b$', 'all',
+       'password');
+test_conn(
+       $node, 'user=md5_role dbname=regex_testdb', 'password,
+   matching regexp for dbname', 0);
+
+# The third regex does not match anymore.
+reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*ba$',
+       'all', 'password');
+test_conn(
+       $node,
+       'user=md5_role dbname=regex_testdb',
+       'password, non matching regexp for dbname',
+       2, log_unlike => [qr/connection authenticated:/]);
+
 unlink($pgpassfile);
 delete $ENV{"PGPASSFILE"};
 

Reply via email to