On 30/3/2023 12:57, Julien Rouhaud wrote:
Extensions could need to pass some query-related data through all stages of
the query planning and execution. As a trivial example, pg_stat_statements
uses queryid at the end of execution to save some statistics. One more
reason - extensions now conflict on queryid value and the logic of its
changing. With this patch, it can be managed.

I just had a quick lookc at the patch, and IIUC it doesn't really help on that
side, as there's still a single official "queryid" that's computed, stored
everywhere and later used by pg_stat_statements (which does then store in
additionally to that official queryid).
Thank you for the attention!
This patch doesn't try to solve the problem of oneness of queryId. In this patch we change pg_stat_statements and it doesn't set 0 into queryId field according to its internal logic. And another extension should do the same - use queryId on your own but not erase it - erase your private copy in the ext_field.

Ruthless pgbench benchmark shows that we got some overhead:
1.6% - in default mode
4% - in prepared mode
~0.1% in extended mode.

That's a quite significant overhead.  But the only reason to accept such a
change is to actually use it to store additional data, so it would be
interesting to see what the overhead is like once you store at least 2
different values there.
Yeah, but as I said earlier, it can be reduced to much smaller value just with simple optimization. Here I intentionally avoid it to discuss the core concept.

- Are we need to invent a registration procedure to do away with the names
of entries and use some compact integer IDs?

Note that the patch as proposed doesn't have any defense for two extensions
trying to register something with the same name, or update a stored value, as
AddExtensionDataToNode() simply prepend the new value to the list.  You can
actually update the value by just storing the new value, but it will add a
significant overhead to every other extension that want to read another value.
Thanks a lot! Patch in attachment implements such an idea - extension can allocate some entries and use these private IDs to add entries. I hope, an extension would prefer to use only one entry for all the data to manage overhead, but still.

The API is also quite limited as each stored value has a single identifier.
What if your extension needs to store multiple things?  Since it's all based on
Node you can't really store some custom struct, so you probably have to end up
with things like "my_extension.my_val1", "my_extension.my_val2" which isn't
great.
Main idea here - if an extension holds custom struct and want to pass it along all planning and execution stages it should use extensible node with custom read/write/copy routines.

--
regards,
Andrey Lepikhov
Postgres Professional
From ab101322330684e9839e46c26f70ad5462e40dac Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" <a.lepik...@postgrespro.ru>
Date: Thu, 30 Mar 2023 21:49:32 +0500
Subject: [PATCH 2/2] Add ExtendedData entry registration routines to use
 entryId instead of symbolic name.

---
 .../pg_stat_statements/pg_stat_statements.c   |  7 +-
 src/backend/commands/extension.c              | 69 +++++++++++++++----
 src/include/commands/extension.h              |  6 +-
 src/include/nodes/parsenodes.h                |  2 +-
 4 files changed, 64 insertions(+), 20 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index fc47661c86..5b21163365 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -109,11 +109,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = 
PG_VERSION_NUM / 100;
                                                                        !IsA(n, 
DeallocateStmt))
 
 #define GET_QUERYID(node) \
-       (Bigint *) GetExtensionData(node->ext_field, "pg_stat_statements")
+       (Bigint *) GetExtensionData(node->ext_field, extendedEntryID)
 
 #define INSERT_QUERYID(node, queryid) \
        castNode(Bigint, AddExtensionDataToNode((Node *) node, \
-                                                                               
        "pg_stat_statements", \
+                                                                               
        extendedEntryID, \
                                                                                
        (Node *) makeBigint((int64) queryid), \
                                                                                
        true))
 /*
@@ -298,6 +298,7 @@ static bool pgss_track_utility = true;      /* whether to 
track utility commands */
 static bool pgss_track_planning = false;       /* whether to track planning
                                                                                
         * duration */
 static bool pgss_save = true;  /* whether to save stats across shutdown */
+static int extendedEntryID = -1; /* Use to add queryId into the Query struct */
 
 
 #define pgss_enabled(level) \
@@ -482,6 +483,8 @@ _PG_init(void)
        ExecutorEnd_hook = pgss_ExecutorEnd;
        prev_ProcessUtility = ProcessUtility_hook;
        ProcessUtility_hook = pgss_ProcessUtility;
+
+       extendedEntryID = RegisterExtensionDataEntry("pg_stat_statements");
 }
 
 /*
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 74f8a820cb..effaeada24 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -3396,8 +3396,53 @@ read_whole_file(const char *filename, int *length)
        return buf;
 }
 
+#define EXTDATA_ENTRIES_NUM    (64)
+
+/*
+ * Storage for registered ExtensionData entries.
+ * Most common use case should be - registering one entry on extension startup.
+ * It would be better to store complex structures as a node tree in single
+ * extension entry, because it can affect performance and the extension only
+ * knows its data and should be responsible for additional overhead.
+ * So, we need quite limited number of entries.
+ */
+struct
+{
+       char *label;
+} ExtensionDataEntries [EXTDATA_ENTRIES_NUM] = {0};
+
+int
+RegisterExtensionDataEntry(const char *label)
+{
+       int num;
+
+       for (num = 0; num < EXTDATA_ENTRIES_NUM; num++)
+       {
+               MemoryContext oldmemctx;
+
+               if (ExtensionDataEntries[num].label != NULL)
+                       continue;
+
+               oldmemctx = MemoryContextSwitchTo(TopMemoryContext);
+               ExtensionDataEntries[num].label = pstrdup(label);
+               MemoryContextSwitchTo(oldmemctx);
+               break;
+       }
+
+       return (num < EXTDATA_ENTRIES_NUM) ? num : -1;
+}
+
+void
+UnRegisterExtensionDataEntry(int num)
+{
+       if (ExtensionDataEntries[num].label == NULL)
+               elog(ERROR, "Extension data entry already freed");
+
+       pfree(ExtensionDataEntries[num].label);
+}
+
 static ExtensionData *
-_initial_actions(Node *node, const char *entryname)
+_initial_actions(Node *node, int entryId)
 {
        ExtensionData **ext_field;
        ExtensionData  *scanner;
@@ -3417,19 +3462,13 @@ _initial_actions(Node *node, const char *entryname)
        }
 
        Assert(strlen(entryname) > 0);
-#ifdef USE_ASSERT_CHECKING
-       {
-               const char *c;
-               /* check name doesn't contain escapable symbols */
-               for (c = entryname; *c; c++)
-                       Assert(isgraph(*c) && strchr("(){}\\\"", *c) == NULL);
-       }
-#endif
+       Assert(entryId >= 0 && entryId < EXTDATA_ENTRIES_NUM &&
+                  ExtensionDataEntries[entryId].label != NULL);
 
        /* Check existing entry */
        for (scanner = *ext_field; scanner != NULL; scanner = scanner->next)
        {
-               if (strcmp(scanner->name, entryname) == 0)
+               if (scanner->eid == entryId)
                {
                                elem = scanner;
                                break;
@@ -3439,8 +3478,8 @@ _initial_actions(Node *node, const char *entryname)
        if (elem == NULL)
        {
                elem = makeNode(ExtensionData);
+               elem->eid = entryId;
                elem->next = *ext_field;
-               elem->name = pstrdup(entryname);
                *ext_field = elem;
        }
 
@@ -3448,7 +3487,7 @@ _initial_actions(Node *node, const char *entryname)
 }
 
 Node *
-AddExtensionDataToNode(Node *node, const char *entryname, const Node *data,
+AddExtensionDataToNode(Node *node, int entryId, const Node *data,
                                           bool noCopy)
 {
        MemoryContext   destCtx = GetMemoryChunkContext(node);
@@ -3458,7 +3497,7 @@ AddExtensionDataToNode(Node *node, const char *entryname, 
const Node *data,
        /* Safely store in the Query memory context */
        oldctx = MemoryContextSwitchTo(destCtx);
 
-       elem = _initial_actions(node, entryname);
+       elem = _initial_actions(node, entryId);
        elem->node = (Node *) (noCopy ? data : copyObject(data));
 
        MemoryContextSwitchTo(oldctx);
@@ -3487,11 +3526,11 @@ AddExtensionDataToNode(Node *node, const char 
*entryname, const Node *data,
  * NULL, if not found.
  */
 Node *
-GetExtensionData(ExtensionData *extdata, const char *entryname)
+GetExtensionData(ExtensionData *extdata, int entryId)
 {
        while (extdata != NULL)
        {
-               if (strcmp(entryname, extdata->name) == 0)
+               if (extdata->eid == entryId)
                        return extdata->node;
 
                extdata = extdata->next;
diff --git a/src/include/commands/extension.h b/src/include/commands/extension.h
index a652fdbf1b..c13ceee478 100644
--- a/src/include/commands/extension.h
+++ b/src/include/commands/extension.h
@@ -53,8 +53,10 @@ extern bool extension_file_exists(const char *extensionName);
 extern ObjectAddress AlterExtensionNamespace(const char *extensionName, const 
char *newschema,
                                                                                
         Oid *oldschema);
 
-extern Node *AddExtensionDataToNode(Node *node, const char *entryname,
+extern int RegisterExtensionDataEntry(const char *label);
+extern void UnRegisterExtensionDataEntry(int num);
+extern Node *AddExtensionDataToNode(Node *node, int entryId,
                                                                        const 
Node *data, bool noCopy);
-extern Node *GetExtensionData(ExtensionData *extdata, const char *entryname);
+extern Node *GetExtensionData(ExtensionData *extdata, int entryId);
 
 #endif                                                 /* EXTENSION_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 59a41f398d..8996780450 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -117,7 +117,7 @@ typedef struct ExtensionData
        pg_node_attr(no_equal, no_read, no_query_jumble)
 
        NodeTag                                 type;
-       char                               *name;
+       int16                                   eid; /* entry id. See 
RegisterExtensionDataEntry */
        Node                               *node;
        struct ExtensionData   *next;
 } ExtensionData;
-- 
2.40.0

Reply via email to