> On 14 Aug 2025, at 10:18, Michael Paquier <mich...@paquier.xyz> wrote:
> 
> That seems kind of limited to me in scope.  The executor is only one
> part of the system.  I would have considered using an xact callback
> when a transaction is aborted if I were to do a patch like the one you
> are proposing, to know how many times a transaction is failing at a
> specific phase, because you should know the latest query_id in this
> case to be able to put a counter update in the correct slot (right?).
> 
> +-- aborted calls tracking
> +SELECT pg_sleep(0.5);
> + pg_sleep 
> +----------
> + 
> +(1 row)
> 
> Using hardcoded sleep times for deterministic tests is never a good
> idea.  On fast machines, they eat time for nothing.  And if not
> written correctly, they may not achieve their goal on slow machines
> because the sleep threshold may be reached before the custom action is
> taken.  If you want to force a failure, you should just use a SQL that
> you know would fail at execution time (based on your implementation
> expects).

Hello,

Thanks for the review.

On scope and callbacks
- I prototyped a version with RegisterXactCallback and RegisterSubXactCallback 
but I’m not very happy with the result. Same limitation as v4: we only have a 
stable identity (queryId) after parse/analyze. The xact/subxact callbacks fire 
with no statement context, so you still need to know “which statement was 
active” at abort time.
- To deal with that, I keep a small backend-local tracker of the current 
statement’s key (userid, dbid, queryId, toplevel) when we enter the executor, 
mark it completed on success, and bump calls_aborted in the callbacks only if 
active && !completed.
- This reliably captures executor-time failures and timeouts. It cannot 
attribute errors that happen before post-parse (syntax/name resolution), 
because queryId isn’t set yet. That’s the hard boundary. Planner failures 
aren’t attributed in this version because the tracker is set in 
ExecutorRun..still.

On the value of callbacks
- It seems callbacks alone are insufficient (no per-statement context).
- If we want to preserve pgss semantics (only report statements that reached 
post-parse and have a queryId), the v4 approach is simpler and safer; I could 
look into planner error handling in a next patch version.
- If we also want to count very-early errors, that implies raw-text keyed 
entries (no queryId), which changes pg_stat_statements’ semantics 
(normalization, PII risk, cardinality). That probably should be separate 
extension as Julien Rouhaud suggested: 
https://www.postgresql.org/message-id/aJ2ov4KYM2vX4uAs%40jrouhaud. Not sure I 
would be able to make this, also I need to have something that will be 
supported by managed PostgreSQL with extension restrictions.

Happy to adjust if you see a better way to thread “latest queryId” into the 
abort path without the local tracker.

On pre-creating entries earlier
- I considered creating entries from raw text to count very-early errors, but 
I’ve avoided it: pgss is keyed on the normalized parse tree (queryId). 
Pre-hashing raw text weakens normalization, risks storing sensitive literals 
before jumbling, and adds lock churn in hot paths.

Tests
- Agreed on sleep flakiness. I switched to deterministic executor-time 
failures: duplicate key (primary key violation) and a check constraint 
violation. Both trip calls_aborted without timing assumptions.

Also, are we okay tracking statements that never reach post-parse (i.e., no 
queryId)? That would effectively key on raw text, which changes 
pg_stat_statements’ semantics.

Benoit

Attachment: patch_pgss_calls_aborted_v5.patch
Description: Binary data

Reply via email to