On Thu, Mar 21, 2024 at 08:59:54PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandboss...@gmail.com> writes:
>> On Thu, Mar 21, 2024 at 03:40:12PM -0500, Nathan Bossart wrote:
>>> On Thu, Mar 21, 2024 at 04:31:45PM -0400, Tom Lane wrote:
>>>> I don't think we have any really cheap way to de-duplicate the role
>>>> OIDs, especially seeing that it has to be done on-the-fly within the
>>>> collection loop, and the order of roles_list is at least potentially
>>>> interesting.  Not sure how to make further progress without a lot of
>>>> work.
> 
>>> Assuming these are larger lists, this might benefit from optimizations
>>> involving SIMD intrinsics.
> 
>> Never mind.  With the reproduction script, I'm only seeing a ~2%
>> improvement with my patches.
> 
> Yeah, you cannot beat an O(N^2) problem by throwing SIMD at it.

I apparently had some sort of major brain fade when I did this because I
didn't apply your hashing patch when I ran this SIMD test.  With it
applied, I see a speedup of ~39%, which makes a whole lot more sense to me.
If I add the Bloom patch (updated with your suggestions), I get another
~73% improvement from there, and a much smaller regression in the role
creation portion.

            hash     hash+simd hash+simd+bloom
    create  1.27     1.27      1.28
    grant   0.18     0.11      0.03

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index cf5d08576a..4952dc9c1e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -36,6 +36,7 @@
 #include "common/hashfn.h"
 #include "foreign/foreign.h"
 #include "funcapi.h"
+#include "lib/bloomfilter.h"
 #include "lib/qunique.h"
 #include "miscadmin.h"
 #include "utils/acl.h"
@@ -4946,6 +4947,7 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 	ListCell   *l;
 	List	   *new_cached_roles;
 	MemoryContext oldctx;
+	bloom_filter *bf = NULL;
 
 	Assert(OidIsValid(admin_of) == PointerIsValid(admin_role));
 	if (admin_role != NULL)
@@ -5023,16 +5025,46 @@ roles_is_member_of(Oid roleid, enum RoleRecurseType type,
 			 * graph, we must test for having already seen this role. It is
 			 * legal for instance to have both A->B and A->C->B.
 			 */
-			roles_list = list_append_unique_oid(roles_list, otherid);
+			if ((bf && bloom_lacks_element(bf, (unsigned char *) &otherid, sizeof(Oid))) ||
+				!list_member_oid(roles_list, otherid))
+			{
+				if (bf == NULL && list_length(roles_list) > 1000)
+				{
+					bf = bloom_create(1024 * 1024, work_mem, 0);
+					foreach_oid(role, roles_list)
+						bloom_add_element(bf, (unsigned char *) &role, sizeof(Oid));
+				}
+				roles_list = lappend_oid(roles_list, otherid);
+				if (bf)
+					bloom_add_element(bf, (unsigned char *) &otherid, sizeof(Oid));
+			}
 		}
 		ReleaseSysCacheList(memlist);
 
 		/* implement pg_database_owner implicit membership */
 		if (memberid == dba && OidIsValid(dba))
-			roles_list = list_append_unique_oid(roles_list,
-												ROLE_PG_DATABASE_OWNER);
+		{
+			Oid			dbowner = ROLE_PG_DATABASE_OWNER;
+
+			if ((bf && bloom_lacks_element(bf, (unsigned char *) &dbowner, sizeof(Oid))) ||
+				!list_member_oid(roles_list, dbowner))
+			{
+				if (bf == NULL && list_length(roles_list) > 1000)
+				{
+					bf = bloom_create(1024 * 1024, work_mem, 0);
+					foreach_oid(role, roles_list)
+						bloom_add_element(bf, (unsigned char *) &role, sizeof(Oid));
+				}
+				roles_list = lappend_oid(roles_list, dbowner);
+				if (bf)
+					bloom_add_element(bf, (unsigned char *) &dbowner, sizeof(Oid));
+			}
+		}
 	}
 
+	if (bf)
+		bloom_free(bf);
+
 	/*
 	 * Copy the completed list into TopMemoryContext so it will persist.
 	 */

Reply via email to