On 09-07-2018 16:05, Fabien COELHO wrote:
Hello Marina,

Hello, Fabien!

Here is a review for the last part of your v9 version.

Thank you very much for this!

Patch does not "git apply" (may anymore):
  error: patch failed: doc/src/sgml/ref/pgbench.sgml:513
  error: doc/src/sgml/ref/pgbench.sgml: patch does not apply

Sorry, I'll send a new version soon.

However, I could get it to apply with the "patch" command.

Then the patch compiles, and global & pgbench "make check" are ok.

:-)

Feature
=======

The patch adds the ability to restart transactions (i.e. the full script)
on some errors, which is a good thing as it allows exercising postgres
performance in more realistic scenarios.

* -d/--debug: I'm not in favor of requiring a mandatory text argument for this option. It is not practical: the user has to remember it, and it is a behavior change. I'm sceptical of the overall debug handling changes. Maybe we could have multiple -d options which lead to a higher debug level, but I'm not sure that this can be made to work for this case while staying compatible with the previous behavior.
Maybe you need a specific option for your purpose, eg "--debug-retry"?

As you wrote in [1], adding an additional option is also a bad idea:

I'm sceptical of the "--debug-fails" options. ISTM that --debug is
already there
and should just be reused.

Maybe it's better to use an optional argument for compatibility (--debug[=fails] or --debug[=NUM])? But if we use numbers, at the moment I can see only 2 levels, and there's no guarantee that they will not change..
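
For example, getopt_long supports optional arguments, so plain --debug could keep working (a rough, untested sketch; the DEBUG_* level names are made up):

  /* in the long options table: */
  {"debug", optional_argument, NULL, 'd'},

  /* in the option switch: */
  case 'd':
      if (optarg == NULL)
          debug_level = DEBUG_ALL;        /* plain --debug, as before */
      else if (strcmp(optarg, "fails") == 0)
          debug_level = DEBUG_FAILS;      /* --debug=fails */
      else
      {
          fprintf(stderr, "invalid argument for --debug: \"%s\"\n", optarg);
          exit(1);
      }
      break;

Note that with optional_argument the value must be attached as --debug=fails; a separate word after the option is not recognized.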

Code
====

* The implementation is less complex than the previous submission,
which is a good thing. I'm not sure that all the remaining complexity
is still fully needed.

* I'm reserved about the whole ereport thing, see comments in other
messages.

Thank you, I'll try to implement the error reporting in the way you suggested.

Levels ELEVEL_LOG_CLIENT_{FAIL,ABORTED} & LOG_MAIN look unclear to me.
In particular, the "CLIENT" part is not very useful. If the
distinction makes sense, I would have kept "LOG" for the initial one and
added other ones for ABORT and PGBENCH, maybe.

Ok!

* There are no comments about "retries" in StatData, CState and Command
structures.

* Also, for StatData, I would like to understand the logic between cnt,
skipped, retries, retried, errors, ... so clear information about the
expected invariants, if any, would be welcome. One has to go into the code
to understand how these fields relate to one another.

<...>

* How "errors" differs from "ecnt" is unclear to me.

Thank you, I'll fix this.

* commandFailed: I think that it should be kept much simpler. In
particular, having errors on errors does not help much: on
ELEVEL_FATAL, it ignores the actual reported error and generates
another error of the same level, so that the initial issue is hidden.
Even if these are can't-happen cases, hiding the origin when one occurs
is unhelpful. Just print it directly, and maybe abort if you think
that it is a can't-happen case.
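
E.g. something like (a sketch; "internal_error" is a made-up flag marking the can't-happen cases):

  /* print the original error directly instead of wrapping it */
  fprintf(stderr, "client %d aborted in command %d: %s\n",
          st->id, st->command, message);
  if (internal_error)
      exit(1);              /* a can't-happen case: abort directly */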

Oh, thanks, my mistake(

* copyRandomState: just use sizeof(RandomState) instead of making assumptions about the contents of the struct. Also, this function looks pretty useless,
why not just do a plain assignment?
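
I.e. (a sketch; the exact struct and field names in the patch may differ):

  /* instead of copying fields one by one in copyRandomState(): */
  st->retry_state.random_state = st->random_state;  /* plain struct assignment */
  /* or, if a separate helper is really wanted: */
  memcpy(dest, source, sizeof(RandomState));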

* copyVariables: lacks comments to explain that the destination is cleaned up and so on. The cleanup phase could probably be in a distinct function, so that the code would be clearer. Maybe the function's variable names are too long.

Thank you, I'll fix this.

  if (current_source->svalue)

in the context of a guard for a strdup, maybe:

  if (current_source->svalue != NULL)

I'm sorry, I'll fix this.

* I do not understand the comments on the CState enum: "First, remember the failure in CSTATE_FAILURE. Then process other commands of the failed transaction if any" Why would other commands be processed at all if the transaction is aborted?
For me any error must lead to the rollback and possible retry of the
transaction. This comment needs to be clarified. It should also say
that on FAILURE, it will go either to RETRY or ABORTED. See below my
comments about doCustom.

It is unclear to me why there could be several failures within a
transaction, as I would have thought that it would be aborted on the
first one.

* I do not understand the purpose of first_failure. The comment should explain why it would need to be remembered. From my point of view, I'm not fully
convinced that it should.

<...>

* executeCondition: this hides client automaton state changes which were
clearly visible beforehand in the switch, and the different handling of
if & elif is also hidden.

I'm against this unnecessary restructuring and against hiding such information: all state changes should be clearly visible in the state switch so that it is
easier to understand and follow.

I do not see why touching the conditional stack on internal errors
(evaluateExpr failure) brings anything: the whole transaction will be aborted
anyway.

* doCustom changes.

On CSTATE_START_COMMAND, it considers whether to retry at the end of the script.
For me, this cannot happen: if some command failed, then it should have
skipped directly to the RETRY state, so that you cannot get to the end
of the script with an error. Maybe you could assert that the state of the
previous command is NO_FAILURE, though.

On CSTATE_FAILURE, the next command is possibly started. Although there is some consistency with the previous point, I think that it totally breaks the state
automaton, where now a command can start while the whole transaction is
in a failing state anyway. There was no point in starting it in the first place.

So, for me, the FAILURE state should record/count the failure, then skip
to RETRY if a retry is decided, else proceed to ABORT. Nothing else.
This is much clearer that way.

Then RETRY should reinstate the global state and proceed to start the *first*
command again.
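
Roughly (a sketch of the intended transitions; "canRetry" is a made-up helper):

  case CSTATE_FAILURE:
      /* record/count the failure for the stats, then decide */
      st->state = canRetry(st) ? CSTATE_RETRY : CSTATE_ABORTED;
      break;

  case CSTATE_RETRY:
      /* reinstate the random state & variables, rewind to the first command */
      st->command = 0;
      st->state = CSTATE_START_COMMAND;
      break;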

<...>

It is unclear to me why backslash command errors are turned to FAILURE
instead of ABORTED: there is no way they are going to be retried, so
maybe they should/could skip directly to ABORTED?

Function executeCondition is a bad idea, as stated above.

So do you propose to execute the command "ROLLBACK" without calculating its latency etc. if we are in a failed transaction, and to clear the conditional stack after each failure?

Also just to be clear: do you want to have the state CSTATE_ABORTED for client abortion and another state for interrupting the current transaction?

The current RETRY state does memory allocations to generate a message,
with buffer allocation and so on. This looks like a costly and useless
operation. If the user required "retries", then this is normal behavior,
the retries are counted and will be printed out in the final report,
and there is no point in printing out every single one of them.
Maybe you want that for debugging, but then costly operations should be guarded.

I think we need these debugging messages because, for example, if you use the option --latency-limit, we will never know in advance whether a serialization/deadlock failure will be retried or not. They also help to understand which limit of retries was violated, or how close we were to these limits during the execution of a specific transaction. But I agree with you that they are costly and can be skipped if the failure type is never retried. Maybe it is better to split them into multiple error function calls?..
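
Or the message could be built only when it will actually be printed, e.g. (a rough sketch; PQExpBuffer comes from libpq, the other names are approximate):

  if (debug_level >= DEBUG_FAILS)
  {
      PQExpBufferData buf;

      initPQExpBuffer(&buf);
      printfPQExpBuffer(&buf,
                        "client %d repeats the failed transaction (try %d/%d)",
                        st->id, st->retries + 1, max_tries);
      fprintf(stderr, "%s\n", buf.data);
      termPQExpBuffer(&buf);
  }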

* reporting

The number of transactions above the latency limit report can be simplified. Remove the if and just use one printf with a %s for the optional comment.
I'm not sure this optional comment is useful there.
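
E.g. (a sketch; "comment" is a made-up placeholder for the optional part):

  printf("number of transactions above the %.1f ms latency limit: " INT64_FORMAT "%s\n",
         latency_limit / 1000.0, latency_late,
         comment != NULL ? comment : "");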

Oh, thanks, my mistake(

Before the patch, ISTM that all lines relied on one printf. You have
changed to a style where a collection of printfs is used to compose a
line. I'd suggest keeping to the previous one-printf-prints-one-line
style, where possible.

Ok!

You have added 20-column alignment prints. This looks like too much and
generates overly long lines. Probably 10 (billion) would be enough.

I have already asked you about this in [2]:
The variables for the numbers of failures and retries are of type int64
since the variable for the total number of transactions has the same
type. That's why such a large alignment (as I understand it now, 20
characters is enough). Do you prefer floating alignments, depending on the
maximum number of failures/retries for any command in any script?

Some people try to parse the output, so it should be deterministic. I'd always add the needed columns if appropriate (i.e. under retry), even if none
occurred.

Ok!

* processXactStats: An else is replaced by detailed stats, with the initial "no detailed stats" comment kept. The function is called in both the then
& else branches. The structure does not make sense anymore. I'm not sure
this change was needed.

* getLatencyUsed: declared "double" so "return 0.0".

* typo: ruin -> run; probably others, I did not check for them in detail.

Oh, thanks, my mistakes(

TAP Tests
=========

On my laptop, tests last 5.5 seconds before the patch, and about 13 seconds after. This is much too long. Pgbench TAP tests do not deserve to take over
twice as much time as before just because of this patch.

One reason which explains this large time is that there is a new script
with a newly created instance. I'd suggest appending the tests to the
existing 2 scripts, depending on whether they need a running instance
or not.

Ok! All new tests that do not need a running instance have already been added to the file 002_pgbench_no_server.pl.

Secondly, I think that the design of the tests is too heavy. For such
a feature, ISTM enough to check that it works, i.e. one test for
deadlocks (trigger one or a few deadlocks), idem for serialization
failures, maybe idem for other errors if any.

<...>

Trying a latency limit of 900 ms is a bad idea because it takes a lot of time. I did such tests before and they were removed by Tom Lane because of determinism
and time issues. I would comment this test out for now.

Ok! If it doesn't bother you, can you tell me more about the causes of these determinism issues?.. Tests for some other failures that cannot be retried are already added to 001_pgbench_with_server.pl.

The challenge is to do that reliably and efficiently, i.e. so that the test does
not rely on chance and is still quite efficient.

The trick you use is to run an interactive psql in parallel to pgbench so as to play with concurrent locks. That is interesting, but deserves more comments
and explanation, eg before the test functions.

Maybe this could be achieved within pgbench by using some wait stuff
in PL/pgSQL, so that concurrent clients can wait on one another based on
data in an unlogged table updated by a CALL within "embedded"
transactions? Not sure.

<...>

Anyway, TAP tests should be much lighter (in total time), and if
possible much simpler.

I'll try, thank you..

Otherwise, maybe a (simple) pgbench-side thread
barrier could help, but this would require more thinking.

Tests must pass if we use --disable-thread-safety..

Documentation
=============

Not looked at in much details for now. Just a few comments:

Having the "most important settings" on line 1-6 and 8 (i.e. skipping 7) looks silly. The important ones should simply be the first ones, and the 8th is not
that important, or it is in 7th position.

Ok!

I do not understand why there is so much text about the in-failed-sql-transaction stuff, while we are mainly interested in serialization & deadlock errors, and this only falls into some "other" category. There seem to be more details about
other errors than about deadlock & serialization errors.

The reporting should focus on what is of interest, either all errors, or some
detailed split of these errors.

<...>

* "errors_in_failed_tx" is some subcounter of "errors", for a special
case. Why it is there fails me [I finally understood, and I think it
should be removed, see end of review]. If we wanted to distinguish,
then we should distinguish homogeneously: maybe just count the
different error types, eg have things like "deadlock_errors",
"serializable_errors", "other_errors", "internal_pgbench_errors" which
would be orthogonal one to the other, and "errors" could be recomputed
from these.

Thank you, I agree with you. Unfortunately each new error type adds 1 or 2 new columns of maximum width 20 to the per-statement report (to report errors and possibly retries of this type in this statement), and we already have 2 new columns for all errors and retries. So I'm not sure that we need to add anything other than statistics about all the errors and all the retries in general.

The documentation should state clearly what
the counted errors are, and then what their effects on the reported stats are. The "Errors and Serialization/Deadlock Retries" section is a good start in that direction, but it does not talk about pgbench internal errors (eg "cos(true)").
I think it should be more explicit about errors.

Thank you, I'll try to improve it.

Option --max-tries default value should be spelled out in the doc.

If you mean that it is set to 1 if neither of the options --max-tries or --latency-limit is explicitly used, I'll fix this.

"Client's run is aborted", do you mean "Pgbench run is aborted"?

No, other clients continue their run as usual.

* FailureStatus states are not homogeneously named. I'd suggest using
*_FAILURE for all cases. The miscellaneous case should probably be the
last one. I do not understand the distinction between ANOTHER_FAILURE &
IN_FAILED_SQL_TRANSACTION. Why should it be needed? [again, see the end of
the review]

<...>

"If a failed transaction block does not terminate in the current script":
this just looks like a very bad idea, and explains my general ranting
above about this error condition. ISTM that the only reasonable option
is that a pgbench script should be inforced as a transaction, or a set of transactions, but cannot be a "piece" of transaction, i.e. pgbench script
with "BEGIN;" but without a corresponding "COMMIT" is a user error and
warrants an abort, so that there is no need to manage these "in aborted
transaction" errors every where and report about them and document them
extensively.

This means adding a check when a script finishes or starts that
PQtransactionStatus(const PGconn *conn) == PQTRANS_IDLE, and aborting with a fatal error if not. Then we can forget about this "in tx errors" counting,
reporting and so on, and just have to document the restriction.
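
Something like (a sketch):

  /* at the end of the script, the connection must be out of any transaction */
  if (PQtransactionStatus(st->con) != PQTRANS_IDLE)
  {
      fprintf(stderr,
              "client %d aborted: end of script reached inside a transaction block\n",
              st->id);
      st->state = CSTATE_ABORTED;
      break;
  }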

Ok!

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1801031720270.20034%40lancre
[2] https://www.postgresql.org/message-id/e4c5e8cefa4a8e88f1273b0f1ee29...@postgrespro.ru

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
