OFFSET

David Rowley Mon, 10 Mar 2025 04:46:28 -0700

On Mon, 10 Mar 2025 at 23:17, Bykov Ivan <[email protected]> wrote:
> I have attached the updated Variant A patch with the following changes:
> - Implemented a pg_stat_statements regression test (see similar test results
>   on REL_17_3 in Query-ID-collision-at_pg_stat_statements-1ea6e890b22.txt).
> - Added extra description about the Query ID collision
>   in src/include/nodes/parsenodes.h.


I've not paid much attention to the jumble code before, but I did
glance at this patch:

+ *
+ *   The query jumbling process may trigger a Query ID collision when
+ *   two Query fields located sequentially contain the same type of nodes
+ *   (a list of nodes), and both of them may be NULL.
+ *   List of such groups of fields:
+ *     - limitOffset and limitCount (which may be an int8 expression or NULL);
+ *     - distinctClause, and sortClause (which may be a list of
+ *       SortGroupClause entries or NULL);
+ *   To prevent this collision, we should insert at least one scalar field
+ *   without the attribute query_jumble_ignore between such fields.
  */
 typedef struct Query
 {
@@ -208,11 +218,11 @@ typedef struct Query

  List    *distinctClause; /* a list of SortGroupClause's */

- List    *sortClause; /* a list of SortGroupClause's */
-
  Node    *limitOffset; /* # of result tuples to skip (int8 expr) */
- Node    *limitCount; /* # of result tuples to return (int8 expr) */
  LimitOption limitOption; /* limit type */
+ Node    *limitCount; /* # of result tuples to return (int8 expr) */
+
+ List    *sortClause; /* a list of SortGroupClause's */

  List    *rowMarks; /* a list of RowMarkClause's */

It seems to me that if this fixes the issue, that the next similar one
is already lurking in the shadows waiting to jump out on us.

Couldn't we adjust all this code so that we pass a seed to
AppendJumble() that's the offsetof(Struct, field) that we're jumbling
and incorporate that seed into the hash?  I don't think we could
possibly change the offset in a minor version, so I don't think
there's a risk that could cause jumble value instability. Possibly
different CPU architectures would come up with different offsets
through having different struct alignment requirements, but we don't
make any guarantees about the jumble values matching across different
CPU architectures anyway.

I admit to only having spent a few minutes looking at this, so there
may well be a good reason for not doing this. Maybe Michael can tell
me what that is...?

A very quickly put together patch is attached. I intend this only to
demonstrate what I mean, not because I want to work on it myself.

David

v1-0001-A-very-rough-proof-of-concept-to-fix-the-problem-.patch
Description: Binary data

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

Reply via email to