Hi,

Thank you for the kind comments.

I've updated the previous patch.

Below is a summary of the changes:
1. The code path and documentation have been corrected based on your feedback.
2. The following message is now suppressed by default. Instead, an error message
is added when a client aborts during SQL execution. (v6-0003-Suppress-xxx.patch)

```
if (verbose_errors)
        pg_log_error("client %d script %d aborted in command %d query %d: %s",
                     st->id, st->use_file, st->command, qrynum,
                     PQerrorMessage(st->con));
```


On 2025/07/04 22:01, Hayato Kuroda (Fujitsu) wrote:

>>> Could I confirm what you mean by "start new one"?
>>>
>>> In the current pgbench, when a query raises an error (a deadlock or
>>> serialization failure), it can be retried using the same random state.
>>> This typically means the query will be retried with the same parameter
>>> values.
>>>
>>> On the other hand, when the query ultimately fails (possibly after some
>>> retries), the transaction is marked as a "failure", and the next
>>> transaction starts with a new random state (i.e., with new parameter
>>> values).
>>>
>>> Therefore, if a query fails due to a unique constraint violation and is
>>> retried with the same parameters, it will keep failing on each retry.
>>
>> Thank you for your explanation. I understand it as you described. I've also
>> attached a schematic diagram of the state machine. I hope it will help
>> clarify the behavior of pgbench. Red arrows represent the state transitions
>> when an SQL command fails and the --continue-on-error option is specified.
> 
> Thanks for the diagram, it's quite helpful. Let me share my understanding and
> opinion.
> 
> The term "retry" is used for the transition CSTATE_ERROR->CSTATE_RETRY,
> and here the random state is restored to what it was at the beginning:
> 
> ```
>                               /*
>                                * Reset the random state as they were at the
>                                * beginning of the transaction.
>                                */
>                               st->cs_func_rs = st->random_state;
> ```
> 
> In the --continue-on-error case, the transition CSTATE_WAIT_RESULT->CSTATE_ERROR
> can happen even if the reason for the failure is not a serialization or deadlock
> error. Ultimately the path will reach ...->CSTATE_END_TX->CSTATE_CHOOSE_SCRIPT,
> the beginning of the state machine. cs_func_rs is not overwritten along that
> route, so a different random value would be generated, or even another script
> may be chosen. Is that correct?

Yes, I believe that’s correct.

> 
> 01.
> ```
> $ git am ../patches/pgbench/v5-0001-Add-continue-on-error-option-to-pgbench.patch
> Applying: When the option is set, client rolls back the failed transaction and...
> .git/rebase-apply/patch:65: trailing whitespace.
>    <literal>serialization</literal>, <literal>deadlock</literal>, or 
> .git/rebase-apply/patch:139: trailing whitespace.
>    <option>--max-tries</option> option is not equal to 1 and 
> warning: 2 lines add whitespace errors.
> ```
> 
> I got warnings when I applied the patch. Please fix it.

It's been fixed.

> 
> 02. 
> ```
> +        *       'serialization_failures' + 'deadlock_failures' +
> +        *   'other_sql_failures' (they got a error when continue-on-error option
> ```
> The first line has the tab, but it should be normal blank.

I hadn't noticed it. It's fixed.


> 03.
> ```
> +                               else if (continue_on_error | canRetryError(st->estatus))
> ```
> 
> I feel "|" should be "||".

Thank you for pointing that out. I've fixed it.
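
For reference, a small standalone sketch of why the operator matters. The names `sketch_probe` and `sketch_calls_with_*` are hypothetical and only stand in for a call such as canRetryError(). With boolean operands the truth value is the same, but `|` always evaluates the right operand while `||` skips it when the left one is already true.

```c
#include <assert.h>
#include <stdbool.h>

static int  probe_calls = 0;

/* Hypothetical stand-in for a predicate call; counts how often it runs. */
static bool
sketch_probe(bool result)
{
    probe_calls++;
    return result;
}

/* Bitwise OR: the right operand is always evaluated. */
static int
sketch_calls_with_bitwise_or(bool flag)
{
    bool        r;

    probe_calls = 0;
    r = flag | sketch_probe(false);
    assert(r == flag);          /* same truth value as with || */
    return probe_calls;
}

/* Logical OR: the right operand is skipped when the left one is true. */
static int
sketch_calls_with_logical_or(bool flag)
{
    bool        r;

    probe_calls = 0;
    r = flag || sketch_probe(false);
    assert(r == flag);
    return probe_calls;
}
```

So the original code would have behaved the same here, but `||` is the conventional choice for a boolean condition and avoids the needless call.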

> 04.
> ```
>      <term><replaceable>retries</replaceable></term>
>      <listitem>
>       <para>
>        number of retries after serialization or deadlock errors
>        (zero unless <option>--max-tries</option> is not equal to one)
>       </para>
>      </listitem>
> ```
> 
> To confirm; --continue-on-error won't be counted here because it is not "retry",
> in other words, it does not reach CSTATE_RETRY, right?

Yes. I agree with Nagata-san [1]: --continue-on-error is not considered a
"retry" because it doesn't reach CSTATE_RETRY.


On 2025/07/05 0:03, Yugo Nagata wrote:
>>>              case PGRES_NONFATAL_ERROR:
>>>              case PGRES_FATAL_ERROR:
>>>                  st->estatus = getSQLErrorStatus(PQresultErrorField(res,
>>>                                                                     PG_DIAG_SQLSTATE));
>>>                  if (canRetryError(st->estatus))
>>>                  {
>>>                      if (verbose_errors)
>>>                          commandError(st, PQerrorMessage(st->con));
>>>                      goto error;
>>>                  }
>>>                  /* fall through */
>>>
>>>              default:
>>>                  /* anything else is unexpected */
>>>                  pg_log_error("client %d script %d aborted in command %d query %d: %s",
>>>                               st->id, st->use_file, st->command, qrynum,
>>>                               PQerrorMessage(st->con));
>>>                  goto error;
>>>          }
>>>
>>> When an SQL error other than a serialization or deadlock error occurs, an
>>> error message is output via pg_log_error in this code path. However, I think
>>> this should be reported only when verbose_errors is set, similar to how
>>> serialization and deadlock errors are handled when --continue-on-error is
>>> enabled
>>
>> I think the error message logged via pg_log_error is useful when verbose_errors
>> is not specified, because it informs users that the client has exited. Without
>> it, users may not notice that something went wrong.
> 
> However, if a large number of errors occur, this could result in a significant
> increase in stderr output during the benchmark.
> 
> Users can still notice that something went wrong by checking the “number of
> other failures” reported after the run, and I assume that in most cases, when
> --continue-on-error is enabled, users aren’t particularly interested in seeing
> individual error messages as they happen.
> 
> It’s true that seeing error messages during the benchmark might be useful in
> some cases, but the same could be said for serialization or deadlock errors,
> and that’s exactly what the --verbose-errors option is for.


I understand your concern. I've changed the condition so that pg_log_error() is
called only when verbose_errors is set, which reduces stderr output.
Additionally, I've added an error message for the case where a client aborts
while executing an SQL command, similar to other code paths that set
st->state = CSTATE_ABORTED, as in the example below:

```
pg_log_error("client %d aborted while establishing connection", st->id);
st->state = CSTATE_ABORTED;
```


> Here are some comments on the patch.
> 
> (1)
> 
>                 }
> -               else if (canRetryError(st->estatus))
> +               else if (continue_on_error | canRetryError(st->estatus))
>                     st->state = CSTATE_ERROR;
>                 else
>                     st->state = CSTATE_ABORTED;
> 
> Due to this change, when --continue-on-error is enabled, st->state is set to
> CSTATE_ERROR regardless of the type of error returned by readCommandResponse.
> This happens even when the error is not ESTATUS_OTHER_SQL_ERROR, e.g.
> ESTATUS_META_COMMAND_ERROR due to a failure of \gset with a query returning
> more than one row.
> 
> Therefore, this should be like:
> 
>                else if ((st->estatus == ESTATUS_OTHER_SQL_ERROR && continue_on_error) ||
>                          canRetryError(st->estatus))
> 

Thanks for pointing that out — I’ve corrected it.
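
To double-check my understanding of the corrected condition, here is a minimal standalone sketch of the decision. The `SK_*` enumerators and `sketch_*` functions are hypothetical names that only mirror the pgbench ones; this is not the actual pgbench code, and it assumes that only serialization and deadlock errors are retryable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the error statuses discussed above. */
typedef enum
{
    SK_NO_ERROR,
    SK_META_COMMAND_ERROR,
    SK_SERIALIZATION_ERROR,
    SK_DEADLOCK_ERROR,
    SK_OTHER_SQL_ERROR
} SketchEStatus;

typedef enum
{
    SK_STATE_ERROR,             /* roll back and go on (retry or next tx) */
    SK_STATE_ABORTED            /* the client stops */
} SketchState;

/* Assumption: only serialization and deadlock errors can be retried. */
static bool
sketch_can_retry(SketchEStatus e)
{
    return e == SK_SERIALIZATION_ERROR || e == SK_DEADLOCK_ERROR;
}

/* Next state after a failed command, following the corrected condition. */
static SketchState
sketch_next_state(SketchEStatus e, bool continue_on_error)
{
    assert(e != SK_NO_ERROR);   /* only meaningful after a failure */

    if ((e == SK_OTHER_SQL_ERROR && continue_on_error) ||
        sketch_can_retry(e))
        return SK_STATE_ERROR;
    return SK_STATE_ABORTED;
}
```

With this shape, a \gset failure (a meta command error) still aborts the client even when --continue-on-error is given, while other SQL errors move to the error state only when the option is set.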


> (2)
> 
> +          "  --continue-on-error      continue processing transations after a trasaction fails\n"
> 
> "trasaction" is a typo and including "transaction" twice looks a bit redundant.
> Instead of using the word "transaction", how about:
> 
>  "--continue-on-error continue running after an SQL error" ?
> 
> This version is shorter, avoids repetition, and describes well the actual
> behavior when SQL statements fail.

Fixed it.

> (3)
> 
> -    * A failed transaction is defined as unsuccessfully retried transactions.
> +    * A failed transaction is defined as unsuccessfully retried transactions
> +    * unless continue-on-error option is specified.
>      * It can be one of two types:
>      *
>      * failed (the number of failed transactions) =
> @@ -411,6 +412,12 @@ typedef struct StatsData
>      *   'deadlock_failures' (they got a deadlock error and were not
>      *                        successfully retried).
>      *
> +    * When continue-on-error option is specified,
> +    * failed (the number of failed transactions) =
> +    *   'serialization_failures' + 'deadlock_failures' +
> +    *   'other_sql_failures' (they got a error when continue-on-error option
> +    *                         was specified).
> +    *
> 
> To explain explicitly that there are two definitions of failed transactions
> depending on the situation, how about:
> 
> """
>  A failed transaction is counted differently depending on whether
>  the --continue-on-error option is specified.
> 
>  Without --continue-on-error:
>  
>  failed (the number of failed transactions) =
>   'serialization_failures' (they got a serialization error and were not
>                             successfully retried) +
>   'deadlock_failures' (they got a deadlock error and were not
>                        successfully retried).
> 
>  When --continue-on-error is specified:
> 
>  failed (number of failed transactions) =
>    'serialization_failures' + 'deadlock_failures' +
>    'other_sql_failures'  (they got some other SQL error; the transaction was
>                           not retried and counted as failed due to
>                           --continue-on-error).
> """

Thank you for your suggestion. I modified it accordingly.


> (4)
> +   int64       other_sql_failures; /* number of failed transactions for
> +                                    * reasons other than
> +                                    * serialization/deadlock failure , which
> +                                    * is enabled if --continue-on-error is
> +                                    * used */
> 
> Is "counted" more appropriate than "enabled" here?

Fixed.


> 
> As for the documentation:
> (5)
>    The next line reports the number of failed transactions due to
> -  serialization or deadlock errors (see <xref linkend="failures-and-retries"/>
> -  for more information).
> +  serialization or deadlock errors by default (see
> +  <xref linkend="failures-and-retries"/> for more information).
> 
> Would it be more readable to simply say:
> "The next line reports the number of failed transactions (see ... for more information)",
> since the definition of "failed transaction" has become a bit messy?
> 

I changed it to the simpler wording.

> (6)
>     connection with the database server was lost or the end of script was reached
>     without completing the last transaction. In addition, if execution of an SQL
>     or meta command fails for reasons other than serialization or deadlock errors,
> -   the client is aborted. Otherwise, if an SQL command fails with serialization or
> -   deadlock errors, the client is not aborted. In such cases, the current
> -   transaction is rolled back, which also includes setting the client variables
> -   as they were before the run of this transaction (it is assumed that one
> -   transaction script contains only one transaction; see
> -   <xref linkend="transactions-and-scripts"/> for more information).
> +   the client is aborted by default. However, if the --continue-on-error option
> +   is specified, the client does not abort and proceeds to the next transaction
> +   regardless of the error. This case is reported as other failures in the output.
> +   Otherwise, if an SQL command fails with serialization or deadlock errors, the
> +   client is not aborted. In such cases, the current transaction is rolled back,
> +   which also includes setting the client variables as they were before the run
> +   of this transaction (it is assumed that one transaction script contains only
> +   one transaction; see <xref linkend="transactions-and-scripts"/> for more information).
> 
> To emphasize the default behavior, I wonder if it would be better to move
> "by default" to the beginning of the statement, like:
> 
>  "By default, if execution of an SQL or meta command fails for reasons other than
>  serialization or deadlock errors, the client is aborted."
> 
> How about quoting "other failures"? Like:
> 
>  "These cases are reported as "other failures" in the output."
> 
> Also, I feel the meaning of "Otherwise" has become somewhat unclear since the
> explanation of --continue-on-error was added between the sentences. So, how about
> clarifying that "the clients are not aborted due to serialization/deadlock even
> without --continue-on-error"? For example:
> 
>  "In contrast, if an SQL command fails with serialization or deadlock errors, the
>   client is not aborted even without <option>--continue-on-error</option>.
>   Instead, the current transaction is rolled back, which also includes setting
>   the client variables as they were before the run of this transaction
>   (it is assumed that one transaction script contains only
>    one transaction; see <xref linkend="transactions-and-scripts"/> for more information)."
> 

I've modified it according to your suggestion.

> (7)
>     The main report contains the number of failed transactions. If the
> -   <option>--max-tries</option> option is not equal to 1, the main report also
> +   <option>--max-tries</option> option is not equal to 1 and
> +   <option>--continue-on-error</option> is not specified, the main report also
>     contains statistics related to retries: the total number of retried
> 
> Is that true?
> The retry statistics would be included even without --continue-on-error.

That was wrong. I corrected it.


[1]
https://www.postgresql.org/message-id/20250705002239.27e6e5a4ba22c047ac2fa16a%40sraoss.co.jp

Regards,
Rintaro Ikeda

From caa1ede6a7b5ac3e19b73943a1a810bf98e32e21 Mon Sep 17 00:00:00 2001
From: Rintaro Ikeda <ikedarinta...@oss.nttdata.com>
Date: Wed, 9 Jul 2025 23:36:37 +0900
Subject: [PATCH v6 1/3] Add --continue-on-error option

When the option is set, a client rolls back a failed transaction and starts a
new one when the transaction fails for a reason other than a deadlock or
serialization failure.
---
 doc/src/sgml/ref/pgbench.sgml                | 71 +++++++++++++++-----
 src/bin/pgbench/pgbench.c                    | 55 +++++++++++++--
 src/bin/pgbench/t/001_pgbench_with_server.pl | 22 ++++++
 3 files changed, 124 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ab252d9fc74..15fcb45e223 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -76,9 +76,8 @@ tps = 896.967014 (without initial connection time)
   and number of transactions per client); these will be equal unless the run
   failed before completion or some SQL command(s) failed.  (In
   <option>-T</option> mode, only the actual number of transactions is printed.)
-  The next line reports the number of failed transactions due to
-  serialization or deadlock errors (see <xref linkend="failures-and-retries"/>
-  for more information).
+  The next line reports the number of failed transactions (see
+  <xref linkend="failures-and-retries"/> for more information).
   The last line reports the number of transactions per second.
  </para>
 
@@ -790,6 +789,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
          <listitem>
           <para>deadlock failures;</para>
          </listitem>
+         <listitem>
+          <para>other failures;</para>
+         </listitem>
         </itemizedlist>
         See <xref linkend="failures-and-retries"/> for more information.
        </para>
@@ -914,6 +916,26 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
       </listitem>
      </varlistentry>
 
+     <varlistentry id="pgbench-option-continue-on-error">
+      <term><option>--continue-on-error</option></term>
+      <listitem>
+       <para>
+        Allows clients to continue their run even if an SQL statement fails due to
+        errors other than serialization or deadlock. Unlike serialization and deadlock
+        failures, clients do not retry the same transaction but start a new transaction.
+        This option is useful when your custom script may raise errors for some
+        reason such as a unique constraint violation. Without this option, the client is
+        aborted after such errors.
+       </para>
+       <para>
+        Note that by default serialization and deadlock failures never cause the
+        client to be aborted, even after it has retried <option>--max-tries</option>
+        times, so they are not affected by this option.
+        See <xref linkend="failures-and-retries"/> for more information.
+       </para>
+      </listitem>
+     </varlistentry>
+
     </variablelist>
    </para>
 
@@ -2409,8 +2431,8 @@ END;
    will be reported as <literal>failed</literal>. If you use the
    <option>--failures-detailed</option> option, the
    <replaceable>time</replaceable> of the failed transaction will be reported as
-   <literal>serialization</literal> or
-   <literal>deadlock</literal> depending on the type of failure (see
+   <literal>serialization</literal>, <literal>deadlock</literal>, or
+   <literal>other</literal> depending on the type of failure (see
    <xref linkend="failures-and-retries"/> for more information).
   </para>
 
@@ -2638,6 +2660,16 @@ END;
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><replaceable>other_sql_failures</replaceable></term>
+     <listitem>
+      <para>
+       number of transactions that got a SQL error
+       (zero unless <option>--failures-detailed</option> is specified)
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </para>
 
@@ -2646,8 +2678,8 @@ END;
 <screen>
 <userinput>pgbench --aggregate-interval=10 --time=20 --client=10 --log --rate=1000 --latency-limit=10 --failures-detailed --max-tries=10 test</userinput>
 
-1650260552 5178 26171317 177284491527 1136 44462 2647617 7321113867 0 9866 64 7564 28340 4148 0
-1650260562 4808 25573984 220121792172 1171 62083 3037380 9666800914 0 9998 598 7392 26621 4527 0
+1650260552 5178 26171317 177284491527 1136 44462 2647617 7321113867 0 9866 64 7564 28340 4148 0 0
+1650260562 4808 25573984 220121792172 1171 62083 3037380 9666800914 0 9998 598 7392 26621 4527 0 0
 </screen>
   </para>
 
@@ -2839,9 +2871,11 @@ statement latencies in milliseconds, failures and retries:
          <option>--exit-on-abort</option> is specified. Otherwise in the worst
          case they only lead to the abortion of the failed client while other
          clients continue their run (but some client errors are handled without
-         an abortion of the client and reported separately, see below). Later in
-         this section it is assumed that the discussed errors are only the
-         direct client errors and they are not internal
+         an abortion of the client and reported separately, see below). When
+         <option>--continue-on-error</option> is specified, the client
+         continues to process new transactions even if it encounters an error.
+         Later in this section it is assumed that the discussed errors are only
+         the direct client errors and they are not internal
          <application>pgbench</application> errors.
        </para>
      </listitem>
@@ -2851,14 +2885,17 @@ statement latencies in milliseconds, failures and retries:
   <para>
    A client's run is aborted in case of a serious error; for example, the
    connection with the database server was lost or the end of script was reached
-   without completing the last transaction. In addition, if execution of an SQL
+   without completing the last transaction. By default, if execution of an SQL
    or meta command fails for reasons other than serialization or deadlock errors,
-   the client is aborted. Otherwise, if an SQL command fails with serialization or
-   deadlock errors, the client is not aborted. In such cases, the current
-   transaction is rolled back, which also includes setting the client variables
-   as they were before the run of this transaction (it is assumed that one
-   transaction script contains only one transaction; see
-   <xref linkend="transactions-and-scripts"/> for more information).
+   the client is aborted. However, if the --continue-on-error option is specified,
+   the client does not abort and proceeds to the next transaction regardless of
+   the error. These cases are reported as "other failures" in the output.
+   In contrast, if an SQL command fails with serialization or deadlock errors, the
+   client is not aborted even without <option>--continue-on-error</option>.
+   Instead, the current transaction is rolled back, which also includes setting
+   the client variables as they were before the run of this transaction
+   (it is assumed that one transaction script contains only one transaction;
+   see <xref linkend="transactions-and-scripts"/> for more information).
    Transactions with serialization or deadlock errors are repeated after
    rollbacks until they complete successfully or reach the maximum
    number of tries (specified by the <option>--max-tries</option> option) / the maximum
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 497a936c141..4b3ddb3146f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -402,15 +402,23 @@ typedef struct StatsData
         *   directly successful transactions (they were successfully completed on
         *                                     the first try).
         *
-        * A failed transaction is defined as unsuccessfully retried transactions.
-        * It can be one of two types:
+        * A failed transaction is counted differently depending on whether
+        * the --continue-on-error option is specified.
         *
+        * Without --continue-on-error:
         * failed (the number of failed transactions) =
         *   'serialization_failures' (they got a serialization error and were not
         *                             successfully retried) +
         *   'deadlock_failures' (they got a deadlock error and were not
         *                        successfully retried).
         *
+        * When --continue-on-error is specified:
+        *
+        * failed (number of failed transactions) =
+        *   'serialization_failures' + 'deadlock_failures' +
+        *   'other_sql_failures'  (they got some other SQL error; the transaction was
+        *                          not retried and counted as failed due to --continue-on-error).
+        *
         * If the transaction was retried after a serialization or a deadlock
         * error this does not guarantee that this retry was successful. Thus
         *
@@ -440,6 +448,11 @@ typedef struct StatsData
        int64           deadlock_failures;      /* number of transactions that were not
                                                                         * successfully retried after a deadlock
                                                                         * error */
+       int64           other_sql_failures; /* number of failed transactions for
+                                                                        * reasons other than
+                                                                        * serialization/deadlock failure, which
+                                                                        * is counted if --continue-on-error is
+                                                                        * specified */
        SimpleStats latency;
        SimpleStats lag;
 } StatsData;
@@ -770,6 +783,7 @@ static int64 total_weight = 0;
 static bool verbose_errors = false; /* print verbose messages of all errors */
 
 static bool exit_on_abort = false;     /* exit when any client is aborted */
+static bool continue_on_error = false; /* continue after errors */
 
 /* Builtin test scripts */
 typedef struct BuiltinScript
@@ -954,6 +968,7 @@ usage(void)
                   "  --log-prefix=PREFIX      prefix for transaction time log file\n"
                   "                           (default: \"pgbench_log\")\n"
                   "  --max-tries=NUM          max number of tries to run transaction (default: 1)\n"
+                  "  --continue-on-error      continue running after an SQL error\n"
                   "  --progress-timestamp     use Unix epoch timestamps for progress\n"
                   "  --random-seed=SEED       set random seed (\"time\", \"rand\", integer)\n"
                   "  --sampling-rate=NUM      fraction of transactions to log (e.g., 0.01 for 1%%)\n"
@@ -1467,6 +1482,7 @@ initStats(StatsData *sd, pg_time_usec_t start)
        sd->retried = 0;
        sd->serialization_failures = 0;
        sd->deadlock_failures = 0;
+       sd->other_sql_failures = 0;
        initSimpleStats(&sd->latency);
        initSimpleStats(&sd->lag);
 }
@@ -1516,6 +1532,9 @@ accumStats(StatsData *stats, bool skipped, double lat, double lag,
                case ESTATUS_DEADLOCK_ERROR:
                        stats->deadlock_failures++;
                        break;
+               case ESTATUS_OTHER_SQL_ERROR:
+                       stats->other_sql_failures++;
+                       break;
                default:
                        /* internal error which should never occur */
                        pg_fatal("unexpected error status: %d", estatus);
@@ -4007,7 +4026,8 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
                                        if (PQpipelineStatus(st->con) != PQ_PIPELINE_ON)
                                                st->state = CSTATE_END_COMMAND;
                                }
-                               else if (canRetryError(st->estatus))
+                               else if ((st->estatus == ESTATUS_OTHER_SQL_ERROR && continue_on_error) ||
+                                                canRetryError(st->estatus))
                                        st->state = CSTATE_ERROR;
                                else
                                        st->state = CSTATE_ABORTED;
@@ -4528,7 +4548,8 @@ static int64
 getFailures(const StatsData *stats)
 {
        return (stats->serialization_failures +
-                       stats->deadlock_failures);
+                       stats->deadlock_failures +
+                       stats->other_sql_failures);
 }
 
 /*
@@ -4548,6 +4569,8 @@ getResultString(bool skipped, EStatus estatus)
                                return "serialization";
                        case ESTATUS_DEADLOCK_ERROR:
                                return "deadlock";
+                       case ESTATUS_OTHER_SQL_ERROR:
+                               return "other";
                        default:
                                /* internal error which should never occur */
                                pg_fatal("unexpected error status: %d", estatus);
@@ -4603,6 +4626,7 @@ doLog(TState *thread, CState *st,
                        int64           skipped = 0;
                        int64           serialization_failures = 0;
                        int64           deadlock_failures = 0;
+                       int64           other_sql_failures = 0;
                        int64           retried = 0;
                        int64           retries = 0;
 
@@ -4643,10 +4667,12 @@ doLog(TState *thread, CState *st,
                        {
                                serialization_failures = agg->serialization_failures;
                                deadlock_failures = agg->deadlock_failures;
+                               other_sql_failures = agg->other_sql_failures;
                        }
-                       fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT,
+                       fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT " " INT64_FORMAT,
                                        serialization_failures,
-                                       deadlock_failures);
+                                       deadlock_failures,
+                                       other_sql_failures);
 
                        fputc('\n', logfile);
 
@@ -6285,6 +6311,7 @@ printProgressReport(TState *threads, int64 test_start, pg_time_usec_t now,
                cur.serialization_failures +=
                        threads[i].stats.serialization_failures;
                cur.deadlock_failures += threads[i].stats.deadlock_failures;
+               cur.other_sql_failures += threads[i].stats.other_sql_failures;
        }
 
        /* we count only actually executed transactions */
@@ -6427,7 +6454,8 @@ printResults(StatsData *total,
 
        /*
         * Remaining stats are nonsensical if we failed to execute any xacts due
-        * to others than serialization or deadlock errors
+        * to other than serialization or deadlock errors and --continue-on-error
+        * is not set.
         */
        if (total_cnt <= 0)
                return;
@@ -6443,6 +6471,9 @@ printResults(StatsData *total,
                printf("number of deadlock failures: " INT64_FORMAT " (%.3f%%)\n",
                           total->deadlock_failures,
                           100.0 * total->deadlock_failures / total_cnt);
+               printf("number of other failures: " INT64_FORMAT " (%.3f%%)\n",
+                          total->other_sql_failures,
+                          100.0 * total->other_sql_failures / total_cnt);
        }
 
        /* it can be non-zero only if max_tries is not equal to one */
@@ -6546,6 +6577,10 @@ printResults(StatsData *total,
                                                           sstats->deadlock_failures,
                                                           (100.0 * sstats->deadlock_failures /
                                                                script_total_cnt));
+                                               printf(" - number of other failures: " INT64_FORMAT " (%.3f%%)\n",
+                                                          sstats->other_sql_failures,
+                                                          (100.0 * sstats->other_sql_failures /
+                                                               script_total_cnt));
                                        }
 
                                        /*
@@ -6705,6 +6740,7 @@ main(int argc, char **argv)
                {"verbose-errors", no_argument, NULL, 15},
                {"exit-on-abort", no_argument, NULL, 16},
                {"debug", no_argument, NULL, 17},
+               {"continue-on-error", no_argument, NULL, 18},
                {NULL, 0, NULL, 0}
        };
 
@@ -7058,6 +7094,10 @@ main(int argc, char **argv)
                        case 17:                        /* debug */
                                pg_logging_increase_verbosity();
                                break;
+                       case 18:                        /* continue-on-error */
+                               benchmarking_option_set = true;
+                               continue_on_error = true;
+                               break;
                        default:
                                /* getopt_long already emitted a complaint */
                                pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7413,6 +7453,7 @@ main(int argc, char **argv)
                stats.retried += thread->stats.retried;
                stats.serialization_failures += thread->stats.serialization_failures;
                stats.deadlock_failures += thread->stats.deadlock_failures;
+               stats.other_sql_failures += thread->stats.other_sql_failures;
                latency_late += thread->latency_late;
                conn_total_duration += thread->conn_duration;
 
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 7dd78940300..8bb35dda5f7 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1813,6 +1813,28 @@ update counter set i = i+1 returning i \gset
 # Clean up
 $node->safe_psql('postgres', 'DROP TABLE counter;');
 
+# Test --continue-on-error
+$node->safe_psql('postgres',
+       'CREATE TABLE unique_table(i int unique);' . 'INSERT INTO unique_table VALUES (0);');
+
+$node->pgbench(
+       '-t 10 --continue-on-error --failures-detailed',
+       0,
+       [
+               qr{processed: 0/10\b},
+               qr{other failures: 10\b}
+       ],
+       [],
+       'test --continue-on-error',
+       {
+               '002_continue_on_error' => q{
+               insert into unique_table values (0);
+               }
+       });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE unique_table;');
+
 # done
 $node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
 $node->stop;
-- 
2.39.5 (Apple Git-154)

From c1074c2a076e879196e5c68bc641995bface8453 Mon Sep 17 00:00:00 2001
From: Rintaro Ikeda <ikedarinta...@oss.nttdata.com>
Date: Wed, 9 Jul 2025 23:50:36 +0900
Subject: [PATCH v6 2/3] Rename a confusing enumerator

Rename TSTATUS_OTHER_ERROR to TSTATUS_UNKNOWN_ERROR, since the old name may be
mistakenly assumed to be related to other_sql_errors.
---
 src/bin/pgbench/pgbench.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4b3ddb3146f..95a7083ede0 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -485,7 +485,7 @@ typedef enum TStatus
        TSTATUS_IDLE,
        TSTATUS_IN_BLOCK,
        TSTATUS_CONN_ERROR,
-       TSTATUS_OTHER_ERROR,
+       TSTATUS_UNKNOWN_ERROR,
 } TStatus;
 
 /* Various random sequences are initialized from this one. */
@@ -3577,12 +3577,12 @@ getTransactionStatus(PGconn *con)
                         * not. Internal error which should never occur.
                         */
                        pg_log_error("unexpected transaction status %d", tx_status);
-                       return TSTATUS_OTHER_ERROR;
+                       return TSTATUS_UNKNOWN_ERROR;
        }
 
        /* not reached */
        Assert(false);
-       return TSTATUS_OTHER_ERROR;
+       return TSTATUS_UNKNOWN_ERROR;
 }
 
 /*
-- 
2.39.5 (Apple Git-154)

From 6d916730e26384e7f3a559515bd16d0d9831064b Mon Sep 17 00:00:00 2001
From: Rintaro Ikeda <ikedarinta...@oss.nttdata.com>
Date: Wed, 9 Jul 2025 23:46:19 +0900
Subject: [PATCH v6 3/3] Suppress error messages unless client abort

Suppress error messages for individual failed SQL commands and report them only
when the client aborts.
---
 src/bin/pgbench/pgbench.c                    | 10 +++++++---
 src/bin/pgbench/t/001_pgbench_with_server.pl | 14 +++++++-------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 95a7083ede0..26995b93313 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3385,9 +3385,10 @@ readCommandResponse(CState *st, MetaCommand meta, char *varprefix)
 
                        default:
                                /* anything else is unexpected */
-                               pg_log_error("client %d script %d aborted in command %d query %d: %s",
-                                                        st->id, st->use_file, st->command, qrynum,
-                                                        PQerrorMessage(st->con));
+                               if (verbose_errors)
+                                       pg_log_error("client %d script %d aborted in command %d query %d: %s",
+                                                                st->id, st->use_file, st->command, qrynum,
+                                                                PQerrorMessage(st->con));
                                goto error;
                }
 
@@ -4030,7 +4031,10 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
                                                 canRetryError(st->estatus))
                                        st->state = CSTATE_ERROR;
                                else
+                               {
+                                       pg_log_error("client %d aborted while executing SQL commands", st->id);
                                        st->state = CSTATE_ABORTED;
+                               }
                                break;
 
                                /*
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 8bb35dda5f7..a38a1cf4ab7 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -301,7 +301,7 @@ $node->append_conf('postgresql.conf',
          . "log_parameter_max_length_on_error = 0");
 $node->reload;
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -328,7 +328,7 @@ $node->append_conf('postgresql.conf',
          . "log_parameter_max_length_on_error = 64");
 $node->reload;
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -342,7 +342,7 @@ SELECT 1 / (random() / 2)::int, :one::int, :two::int;
 }
        });
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -370,7 +370,7 @@ $node->append_conf('postgresql.conf',
          . "log_parameter_max_length_on_error = -1");
 $node->reload;
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -387,7 +387,7 @@ SELECT 1 / (random() / 2)::int, :one::int, :two::int;
 $node->append_conf('postgresql.conf', "log_min_duration_statement = 0");
 $node->reload;
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -410,7 +410,7 @@ $log = undef;
 
 # Check that bad parameters are reported during typinput phase of BIND
 $node->pgbench(
-       '-n -t1 -c1 -M prepared',
+       '-n -t1 -c1 -M prepared --verbose',
        2,
        [],
        [
@@ -1464,7 +1464,7 @@ for my $e (@errors)
        my $n = '001_pgbench_error_' . $name;
        $n =~ s/ /_/g;
        $node->pgbench(
-               '-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX'
+               '-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX --verbose'
                  . ' -Dmaxint=9223372036854775807 -Dminint=-9223372036854775808'
                  . ($no_prepare ? '' : ' -M prepared'),
                $status,
-- 
2.39.5 (Apple Git-154)
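
For reference, here is a minimal standalone sketch of how the "number of other
failures" summary line from patch 1/3 is computed and formatted. It is not part
of the patch: format_other_failures() is a hypothetical helper, PRId64 stands in
for pgbench's INT64_FORMAT, and the guard against an empty run (total_cnt == 0)
is my own assumption rather than something printResults() is shown doing.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in pgbench): write the "other failures" summary
 * line into buf, using the same text and "%.3f%%" precision as the printf()
 * added to printResults().  Returns what snprintf() returns, i.e. the length
 * of the formatted line.
 */
static int
format_other_failures(char *buf, size_t buflen,
                      int64_t other_sql_failures, int64_t total_cnt)
{
    double      pct;

    assert(buf != NULL);

    /* Assumption: report 0% rather than dividing by zero for an empty run. */
    if (total_cnt > 0)
        pct = 100.0 * (double) other_sql_failures / (double) total_cnt;
    else
        pct = 0.0;

    return snprintf(buf, buflen,
                    "number of other failures: %" PRId64 " (%.3f%%)\n",
                    other_sql_failures, pct);
}
```

With 10 failed transactions out of 10, as in the new TAP test, the line begins
with "number of other failures: 10", which is what the test's
qr{other failures: 10\b} pattern matches.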
