Re: pgbench bug candidate: negative "initial connection time"
At Thu, 17 Jun 2021 11:52:04 +0200 (CEST), Fabien COELHO wrote in

> > Ok. I gave up on changing the state in threadRun. Instead, I changed the
> > condition at the end of the bench, which enables reporting aborts due to
> > socket errors.
> >
> > +@@ -6480,7 +6490,7 @@ main(int argc, char **argv)
> > + #endif				/* ENABLE_THREAD_SAFETY */
> > +
> > + 	for (int j = 0; j < thread->nstate; j++)
> > +-		if (thread->state[j].state == CSTATE_ABORTED)
> > ++		if (thread->state[j].state != CSTATE_FINISHED)
> > + 			exit_code = 2;
> > +
> > + 	/* aggregate thread level stats */
> >
> > Does this make sense?
>
> Yes, definitely. I sought a simple way to enforce that all clients finish
> with the state aborted or finished, but I didn't find one. So +1 for the
> change. However, as a matter of style, if we touch the code maybe we want
> to enclose the if statement.

Doing this means we regard any state other than CSTATE_FINISHED as aborted.
So, the current goto's to done in threadRun are effectively aborting a part
or all of the clients running on the thread. So for example the following
place:

pgbench.c:6713
> /* must be something wrong */
> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> goto done;

Should say something like "thread %d aborted: %s() failed: ...".

I'm not sure what the consensus is here about whether an aborted client can
reconnect to the same server. This patch doesn't do that. However, I think
causing reconnection needs more work than is acceptable as a fix while in
beta.

+	/* as the bench is already running, we do not abort */
	pg_log_error("client %d aborted while establishing connection", st->id);
	st->state = CSTATE_ABORTED;

The comment looks strange in that it says "we don't abort" while setting the
state to CSTATE_ABORTED and then showing "client %d aborted".

	if ((con = doConnect()) == NULL)
+	{
+		pg_log_fatal("connection for initialization failed");
		exit(1);

doConnect() prints an error message given from libpq, so the additional
message is redundant.

	errno = THREAD_BARRIER_INIT(&barrier, nthreads);
	if (errno != 0)
+	{
		pg_log_fatal("could not initialize barrier: %m");
+		exit(1);

This is a run-time error. Maybe we should return 2 in that case.

===
	if (thread->logfile == NULL)
	{
		pg_log_fatal("could not open logfile \"%s\": %m", logpath);
-		goto done;
+		exit(1);

Maybe we should exit with 2 in this case. If we exit in this case, we might
also want to exit when fclose() fails. (Currently the error of fclose() is
ignored.)

===
+	/* coldly abort on connection failure */
+	pg_log_fatal("cannot create connection for thread %d client %d",
+				 thread->tid, i);
+	exit(1);

It seems to me that "thread %d client %d" (not the client id but the client
index within the thread) doesn't make sense to users. Even if we showed a
message like that, it should show only the global client id (cstate->id).
I think that we should return with 2 here, but we return with 1 in another
place for the same reason.

===
	/* must be something wrong */
-	pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
+	pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
	goto done;

Why doesn't a fatal error cause an immediate exit? (And if we change this to
fatal, we also need to change similar errors in the same function to fatal.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [PoC] Federated Authn/z with OAUTHBEARER
On 08/06/2021 19:37, Jacob Champion wrote:
> We've been working on ways to expand the list of third-party auth methods
> that Postgres provides. Some example use cases might be "I want to let
> anyone with a Google account read this table" or "let anyone who belongs
> to this GitHub organization connect as a superuser".

Cool!

> The iddawc dependency for client-side OAuth was extremely helpful to
> develop this proof of concept quickly, but I don't think it would be an
> appropriate component to build a real feature on. It's extremely
> heavyweight -- it incorporates a huge stack of dependencies, including a
> logging framework and a web server, to implement features we would
> probably never use -- and it's fairly difficult to debug in practice. If a
> device authorization flow were the only thing that libpq needed to support
> natively, I think we should just depend on a widely used HTTP client, like
> libcurl or neon, and implement the minimum spec directly against the
> existing test suite.

You could punt and let the application implement that stuff. I'm imagining
that the application code would look something like this:

conn = PQconnectStartParams(...);

for (;;)
{
    status = PQconnectPoll(conn)
    switch (status)
    {
        case CONNECTION_SASL_TOKEN_REQUIRED:
            /* open a browser for the user, get token */
            token = open_browser()
            PQauthResponse(token);
            break;
        ...
    }
}

It would be nice to have a simple default implementation, though, for psql
and all the other client applications that come with PostgreSQL itself.

> If you've read this far, thank you for your interest, and I hope you
> enjoy playing with it!

A few small things caught my eye in the backend oauth_exchange function:

+	/* Handle the client's initial message. */
+	p = strdup(input);

this strdup() should be pstrdup().

In the same function, there are a bunch of reports like this:

	ereport(ERROR,
+			(errcode(ERRCODE_PROTOCOL_VIOLATION),
+			 errmsg("malformed OAUTHBEARER message"),
+			 errdetail("Comma expected, but found character \"%s\".",
+					   sanitize_char(*p))));

I don't think the double quotes are needed here, because sanitize_char will
return quotes if it's a single character. So it would end up looking like
this: ... found character "'x'".

- Heikki
RE: locking [user] catalog tables vs 2pc vs logical rep
On Friday, June 18, 2021 11:41 AM osumi.takami...@fujitsu.com wrote:
> On Thursday, June 17, 2021 10:34 PM Simon Riggs > wrote: > > On Thu, Jun 17, 2021 at 12:57 PM Amit Kapila > > wrote: > > > On Thu, Jun 17, 2021 at 4:27 PM Amit Kapila > > > > > wrote: > > > > > > > > On Thu, Jun 17, 2021 at 8:41 AM osumi.takami...@fujitsu.com > > > > wrote: > > > > > > > > Pushed! > > > > > > > [Responding to Simon's comments] > > > > > > > If LOCK and TRUNCATE is advised against on all user catalog > > > > tables, why would CLUSTER only apply to pg_class? Surely its > > > > locking level is the > > same as LOCK? > > > > > > > > > > Cluster will also apply to all user catalog tables. I think we can > > > extend it slightly as we have mentioned for Lock. > > > > OK, good. > > > > > > The use of "[user]" isn't fully explained, so it might not be > > > clear that this applies to both Postgres catalog tables and any > > > user tables that > have been nominated as catalogs. Probably worth linking to the > "Capabilities" > > section to explain. > > > > > > > > > > Sounds reasonable.
> Simon, I appreciate your suggestions and yes, if the user catalog table is
> referenced by the output plugin, it can be another cause of the deadlock.
>
> I'm going to post the patch for those two changes, accordingly.

Hi, I've made the patch set to cover the discussion above for all supported
versions. Please have a look at those.

Best Regards,
	Takamichi Osumi

Attachments:
HEAD_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG10_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG11_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG12_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG13_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG96_extend_logical_decoding_caveats_in_sync_mode_v01.patch
Re: fdatasync performance problem with large number of DB files
On 2021/06/04 23:39, Justin Pryzby wrote: You said switching to SIGHUP "would have zero effect"; but, actually it allows an admin who's DB took a long time in recovery/startup to change the parameter without shutting down the service. This mitigates the downtime if it crashes again. I think that's at least 50% of how this feature might end up being used. Yes, it would have an effect when the server is automatically restarted after crash when restart_after_crash is enabled. At least for me +1 to your proposed change. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
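For illustration only, and assuming the parameter really is switched to SIGHUP
context as proposed (today recovery_init_sync_method is PGC_POSTMASTER, so
changing it requires a restart): after a painfully slow crash recovery, an
administrator could then switch the sync method without taking the service
down again, e.g.:

-- assumes the proposed PGC_SIGHUP change; with current releases this
-- setting only takes effect at server start
ALTER SYSTEM SET recovery_init_sync_method = 'syncfs';
SELECT pg_reload_conf();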
Re: disfavoring unparameterized nested loops
> The problem I have with this idea is that I really don't know how to
> properly calculate what the risk_factor should be set to. It seems easy
> at first to set it to something that has the planner avoid these silly
> 1-row estimate nested loop mistakes, but I think what we'd set the
> risk_factor to would become much more important when more and more Path
> types start using it. So if we did this and just guessed the risk_factor,
> that might be fine when only 1 of the paths being compared had a non-zero
> risk_factor, but as soon as both paths have one set, unless they're set
> to something sensible, then we just end up comparing garbage costs to
> garbage costs.

Risk factor is the inverse of confidence in an estimate: the lesser the
confidence, the higher the risk. If we associate a confidence with the
selectivity estimate, or compute a confidence interval for the estimate
instead of a single number, we can associate a risk factor with each
estimate. When we combine estimates to calculate new estimates, we also
combine their confidences/confidence intervals. If my memory serves well,
confidence intervals/confidences are calculated based on the sample size
and the method used for estimation, so we should be able to compute those
during ANALYZE.

I have not come across many papers which leverage this idea. Googling
"selectivity estimation confidence interval" does not yield many papers,
although I found [1] to be using a similar idea. So maybe there's no merit
in this idea, though theoretically it sounds fine to me.

[1] https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf

-- 
Best Wishes,
Ashutosh Bapat
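To sketch the idea very roughly (nothing like this exists in the planner
today; the structure and the combination rule below are purely hypothetical
and assume first-order error propagation for independent estimates):

#include <math.h>

/* hypothetical: a selectivity carrying its relative uncertainty */
typedef struct SelEstimate
{
	double		sel;		/* point estimate of selectivity */
	double		rel_err;	/* relative standard error, e.g. 0.2 = 20% */
} SelEstimate;

static SelEstimate
sel_and_independent(SelEstimate a, SelEstimate b)
{
	SelEstimate	r;

	r.sel = a.sel * b.sel;
	/* relative variances add for a product of independent estimates */
	r.rel_err = sqrt(a.rel_err * a.rel_err + b.rel_err * b.rel_err);
	return r;
}

A path's risk_factor could then be derived from the accumulated rel_err of
the row estimates feeding it, rather than being guessed per Path type.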
Re: row filtering for logical replication
On Wed, Jun 9, 2021 at 5:33 AM Peter Smith wrote: > > On Mon, May 10, 2021 at 11:42 PM Euler Taveira wrote: > > > > On Mon, May 10, 2021, at 5:19 AM, Peter Smith wrote: > > > > AFAIK this is the latest patch available, but FYI it no longer applies > > cleanly on HEAD. > > > > Peter, the last patch is broken since f3b141c4825. I'm still working on it > > for > > the next CF. I already addressed the points suggested by Amit in his last > > review; however, I'm still working on a cache for evaluating expression as > > suggested by Andres. I hope to post a new patch soon. > > Is there any ETA for your new patch? > > In the interim can you rebase the old patch just so it builds and I can try > it? > I have rebased the patch so that you can try it out. The main thing I have done is to remove changes in worker.c and created a specialized function to create estate for pgoutput.c as I don't think we need what is done in worker.c. Euler, do let me know if you are not happy with the change in pgoutput.c? -- With Regards, Amit Kapila. v15-0001-Row-filter-for-logical-replication.patch Description: Binary data
Re: Added schema level support for publication.
On Thu, Jun 17, 2021 at 12:41 AM vignesh C wrote:
> Thanks for reporting it, the attached patch is a rebased version of the
> patch with a few review comment fixes, I will reply with the comment
> fixes to the respective mails.

I've applied the patch; it applies cleanly, and I ran "make check" and the
tests run fine. Some comments for patch 1:

In the commit message, some grammar mistakes:

"Changes was done in pg_dump to handle pubtype updation in pg_publication
table while the database gets upgraded."
-- change to --
Changes were done in pg_dump to handle pubtype updation in pg_publication
table while the database gets upgraded.

===

Prototypes present in pg_publication.h was moved to publicationcmds.h so
that minimal datastructures ...
-- change to --
Prototypes present in pg_publication.h were moved to publicationcmds.h so
that minimal datastructures ...

In patch 1:

In getObjectDescription()

+		if (!nspname)
+		{
+			pfree(pubname);
+			pfree(nspname);
+			ReleaseSysCache(tup);

Why free nspname if it is NULL?
Same comment in getObjectIdentityParts()

In GetAllTablesPublicationRelations()

+	ScanKeyData key[2];
 	TableScanDesc scan;
 	HeapTuple	tuple;
 	List	   *result = NIL;
+	int			keycount = 0;

 	classRel = table_open(RelationRelationId, AccessShareLock);

-	ScanKeyInit(&key[0],
+	ScanKeyInit(&key[keycount++],

Here you have init key[1], but the code below in the same function inits
key[0]. I am not sure if this will affect that code below.

	if (pubviaroot)
	{
		ScanKeyInit(&key[0],
					Anum_pg_class_relkind,
					BTEqualStrategyNumber, F_CHAREQ,
					CharGetDatum(RELKIND_PARTITIONED_TABLE));

===

in UpdatePublicationTypeTupleValue():

+	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+							replaces);
+
+	/* Update the catalog. */
+	CatalogTupleUpdate(rel, &tup->t_self, tup);

Not sure if this tup needs to be freed or if the memory context will take
care of it.

===

regards,
Ajin Cherian
Fujitsu Australia
Supply restore_command to pg_rewind via CLI argument
Hi hackers! Starting from v13 pg_rewind can use restore_command if it lacks necessary WAL segments. And this is awesome for HA clusters with many nodes! Thanks to everyone who worked on the feature! Here's some feedback on how to make things even better. If we run 'pg_rewind --restore-target-wal' there must be restore_command in config of target installation. But if the config is not within $PGDATA\postgresql.conf pg_rewind cannot use it. If we run postmaster with `-c config_file=/etc/postgresql/10/data/postgresql.conf`, we simply cannot use the feature. We solved the problem by putting config into PGDATA only during pg_rewind, but this does not seem like a very robust solution. Maybe we could add "-C, --target-restore-command=COMMAND target WAL restore_command\n" as was proposed within earlier versions of the patch[0]? Or instruct pg_rewind to pass config to 'postgres -C restore_command' run? From my POV adding --target-restore-command is simplest way, I can extract corresponding portions from original patch. Thanks! Best regards, Andrey Borodin. [0] https://www.postgresql.org/message-id/flat/CAPpHfduUqKLr2CRpcpHcv1qjaz%2B-%2Bi9bOL2AOvdWSr954ti8Xw%40mail.gmail.com#1d4b372b5aa26f93af9ed1d5dd0693cd
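For illustration, if the -C/--target-restore-command option from the earlier
patch were resurrected, an invocation might look like the following. (This
option does not exist in any released pg_rewind; only --restore-target-wal
does, and the exact interplay between the two would be for the patch to
define.)

pg_rewind --target-pgdata=/var/lib/postgresql/13/main \
    --source-server='host=primary port=5432 dbname=postgres' \
    --target-restore-command='cp /mnt/wal_archive/%f "%p"'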
Re: Asymmetric partition-wise JOIN
Andrey Lepikhov писал 2021-05-27 07:27: Next version of the patch. For searching any problems I forced this patch during 'make check' tests. Some bugs were found and fixed. Hi. I've tested this patch and haven't found issues, but I have some comments. src/backend/optimizer/path/joinrels.c: 1554 1555 /* 1556 * Build RelOptInfo on JOIN of each partition of the outer relation and the inner 1557 * relation. Return List of such RelOptInfo's. Return NIL, if at least one of 1558 * these JOINs are impossible to build. 1559 */ 1560 static List * 1561 extract_asymmetric_partitionwise_subjoin(PlannerInfo *root, 1562 RelOptInfo *joinrel, 1563 AppendPath *append_path, 1564 RelOptInfo *inner_rel, 1565 JoinType jointype, 1566 JoinPathExtraData *extra) 1567 { 1568 List*result = NIL; 1569 ListCell*lc; 1570 1571 foreach (lc, append_path->subpaths) 1572 { 1573 Path*child_path = lfirst(lc); 1574 RelOptInfo *child_rel = child_path->parent; 1575 Relids child_join_relids; 1576 Relids parent_relids; 1577 RelOptInfo *child_join_rel; 1578 SpecialJoinInfo *child_sjinfo; 1579 List*child_restrictlist; Variable names - child_join_rel and child_join_relids seem to be inconsistent with rest of the file (I see child_joinrelids in try_partitionwise_join() and child_joinrel in try_partitionwise_join() and get_matching_part_pairs()). 1595 child_join_rel = build_child_join_rel(root, 1596 child_rel, 1597 inner_rel, 1598 joinrel, 1599 child_restrictlist, 1600 child_sjinfo, 1601 jointype); 1602 if (!child_join_rel) 1603 { 1604 /* 1605 * If can't build JOIN between inner relation and one of the outer 1606 * partitions - return immediately. 1607 */ 1608 return NIL; 1609 } When build_child_join_rel() can return NULL? If I read code correctly, joinrel is created in the begining of build_child_join_rel() with makeNode(), makeNode() wraps newNode() and newNode() uses MemoryContextAllocZero()/MemoryContextAllocZeroAligned(), which would error() on alloc() failure. 1637 1638 static bool 1639 is_asymmetric_join_capable(PlannerInfo *root, 1640RelOptInfo *outer_rel, 1641RelOptInfo *inner_rel, 1642JoinType jointype) 1643 { Function misses a comment. 1656 /* 1657 * Don't allow asymmetric JOIN of two append subplans. 1658 * In the case of a parameterized NL join, a reparameterization procedure will 1659 * lead to large memory allocations and a CPU consumption: 1660 * each reparameterize will induce subpath duplication, creating new 1661 * ParamPathInfo instance and increasing of ppilist up to number of partitions 1662 * in the inner. Also, if we have many partitions, each bitmapset 1663 * variable will large and many leaks of such variable (caused by relid 1664 * replacement) will highly increase memory consumption. 1665 * So, we deny such paths for now. 1666 */ Missing word: each bitmapset variable will large => each bitmapset variable will be large 1694 foreach (lc, outer_rel->pathlist) 1695 { 1696 AppendPath *append_path = lfirst(lc); 1697 1698 /* 1699 * MEMO: We assume this pathlist keeps at least one AppendPath that 1700 * represents partitioned table-scan, symmetric or as
Re: pgbench bug candidate: negative "initial connection time"
Hello,

> Doing this means we regard any state other than CSTATE_FINISHED as
> aborted. So, the current goto's to done in threadRun are effectively
> aborting a part or all of the clients running on the thread. So for
> example the following place:
>
> pgbench.c:6713
> /* must be something wrong */
> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> goto done;
>
> Should say something like "thread %d aborted: %s() failed: ...".

Yep, possibly.

> I'm not sure what the consensus is here about whether an aborted client
> can reconnect to the same server. This patch doesn't do that.

Non trivial future work.

> However, I think causing reconnection needs more work than is acceptable
> as a fix while in beta.

It is an entire project which requires some thinking about.

> +	/* as the bench is already running, we do not abort */
> 	pg_log_error("client %d aborted while establishing connection", st->id);
> 	st->state = CSTATE_ABORTED;
>
> The comment looks strange in that it says "we don't abort" while setting
> the state to CSTATE_ABORTED and then showing "client %d aborted".

Indeed. There is abort from the client, which just means that it stops
sending transactions, and abort for the process, which is basically
"exit(1)".

> 	if ((con = doConnect()) == NULL)
> +	{
> +		pg_log_fatal("connection for initialization failed");
> 		exit(1);
>
> doConnect() prints an error message given from libpq, so the additional
> message is redundant.

This is not the same for me: doConnect may fail but we may decide to retry
the connection later, or just one client may be disconnected while others
are going on, which is different from deciding to stop the whole program,
which deserves a message on its own.

> 	errno = THREAD_BARRIER_INIT(&barrier, nthreads);
> 	if (errno != 0)
> +	{
> 		pg_log_fatal("could not initialize barrier: %m");
> +		exit(1);
>
> This is a run-time error. Maybe we should return 2 in that case.

Hmmm. Yep.

> ===
> 	if (thread->logfile == NULL)
> 	{
> 		pg_log_fatal("could not open logfile \"%s\": %m", logpath);
> -		goto done;
> +		exit(1);
>
> Maybe we should exit with 2 in this case.

Yep.

> If we exit in this case, we might also want to exit when fclose() fails.
> (Currently the error of fclose() is ignored.)

Not sure. I'd leave it at that for now.

> ===
> +	/* coldly abort on connection failure */
> +	pg_log_fatal("cannot create connection for thread %d client %d",
> +				 thread->tid, i);
> +	exit(1);
>
> It seems to me that "thread %d client %d" (not the client id but the
> client index within the thread) doesn't make sense to users. Even if we
> showed a message like that, it should show only the global client id
> (cstate->id).

This is not obvious to me. I think that we should be homogeneous with what
is already done around.

> I think that we should return with 2 here but we return with 1 in another
> place for the same reason.

Possibly.

> 	/* must be something wrong */
> -	pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
> +	pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> 	goto done;
>
> Why doesn't a fatal error cause an immediate exit?

Good point. I do not know, but I would expect it to be the case, and AFAICR
it does not.

> (And if we change this to fatal, we also need to change similar errors in
> the same function to fatal.)

Possibly. I'll look into it over the week-end.

Thanks for the feedback!

-- 
Fabien.
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Thu, Jun 10, 2021 at 6:36 PM Bharath Rupireddy wrote: > > On Thu, Jun 10, 2021 at 9:17 AM Bharath Rupireddy > wrote: > > > I don't know the answer; my guess is that all this has become obsolete > > > and the whole Assert and the dodgy comment can just be deleted. > > > > Hm. I get it. Unfortunately the commit b1ff33f is missing information > > on what the coverity tool was complaining of and it has no related > > discussion at all. > > > > I agree to remove that assertion entirely. I will post a new patch set soon. > > PSA v5 patch set. PSA v6 patch set rebased onto the latest master. With Regards, Bharath Rupireddy. v6-0001-Refactor-parse_subscription_options.patch Description: Binary data v6-0002-Remove-similar-ereport-calls-in-parse_subscriptio.patch Description: Binary data
Re: Add version macro to libpq-fe.h
Tom Lane writes: > I think putting a version number as such in there is a truly > horrid idea. However, I could get behind adding a boolean flag > that says specifically whether the pipeline feature exists. > Then you'd do something like > > #ifdef LIBPQ_HAS_PIPELINING > > rather than embedding knowledge of exactly which release > added that. That would be even better, but I agree with what others have said: we would have to keep adding such feature test macros going forward. I think ideally you would want to have both since the version macro could still be helpful in dealing with "features" that you did not plan to add (aka bugs). > Comparing v13 and v14 libpq-fe.h, I see that there is a solution > available now: "#ifdef PQ_QUERY_PARAM_MAX_LIMIT". Hm, it must have been added recently since I don't see it in 14beta1. But thanks for the pointer, if nothing better comes up this will have to do.
Re: pgbench bug candidate: negative "initial connection time"
/* must be something wrong */ pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD); goto done; Should say such like "thread %d aborted: %s() failed: ...". After having a lookg, there are already plenty such cases. I'd say not to change anything for beta, and think of it for the next round. errno = THREAD_BARRIER_INIT(&barrier, nthreads); if (errno != 0) + { pg_log_fatal("could not initialize barrier: %m"); +exit(1); This is a run-time error. Maybe we should return 2 in that case. I think that you are right, but there are plenty such places where exit should be 2 instead of 1 if the doc is followed: """Errors during the run such as database errors or problems in the script will result in exit status 2.""" My beta take is to let these as they are, i.e. pretty inconsistent all over pgbench, and schedule a cleanup on the next round. === if (thread->logfile == NULL) { pg_log_fatal("could not open logfile \"%s\": %m", logpath); - goto done; + exit(1); Maybe we should exit with 2 this case. Yep. The bench is not even started, this is not really run time yet, 1 seems ok. The failure may be due to a typo in the path which comes from the user. If we exit this case, we might also want to exit when fclose() fails. (Currently the error of fclose() is ignored.) Not sure. I'd let it at that for now. I stand by this. +/* coldly abort on connection failure */ +pg_log_fatal("cannot create connection for thread %d client %d", + thread->tid, i); +exit(1); It seems to me that the "thread %d client %d(not clinent id but the client index within the thread)" doesn't make sense to users. Even if we showed a message like that, it should show only the global clinent id (cstate->id). This is not obvious to me. I think that we should be homogeneous with what is already done around. ok for only giving the global client id. I think that we should return with 2 here but we return with 1 in another place for the same reason.. Possibly. Again for this one, the bench has not really started, so 1 seems fine. /* must be something wrong */ -pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD); +pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD); goto done; Why doesn't a fatal error cause an immediate exit? Good point. I do not know, but I would expect it to be the case, and AFAICR it does not. (And if we change this to fatal, we also need to change similar errors in the same function to fatal.) Possibly. On second look, I think that error is fine, indeed we do not stop the process, so "fatal" it is not; Attached Yugo-san patch with some updates discussed in the previous mails, so as to move things along. -- Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d7479925cb..202507f5d5 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -3171,6 +3171,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg) if ((st->con = doConnect()) == NULL) { + /* as the bench is already running, we do not abort the process */ pg_log_error("client %d aborted while establishing connection", st->id); st->state = CSTATE_ABORTED; break; @@ -4405,7 +4406,10 @@ runInitSteps(const char *initialize_steps) initPQExpBuffer(&stats); if ((con = doConnect()) == NULL) + { + pg_log_fatal("connection for initialization failed"); exit(1); + } setup_cancel_handler(NULL); SetCancelConn(con); @@ -6332,7 +6336,10 @@ main(int argc, char **argv) /* opening connection... 
*/ con = doConnect(); if (con == NULL) + { + pg_log_fatal("setup connection failed"); exit(1); + } pg_log_debug("pghost: %s pgport: %s nclients: %d %s: %d dbName: %s", PQhost(con), PQport(con), nclients, @@ -6437,7 +6444,10 @@ main(int argc, char **argv) errno = THREAD_BARRIER_INIT(&barrier, nthreads); if (errno != 0) + { pg_log_fatal("could not initialize barrier: %m"); + exit(1); + } #ifdef ENABLE_THREAD_SAFETY /* start all threads but thread 0 which is executed directly later */ @@ -6480,7 +6490,7 @@ main(int argc, char **argv) #endif /* ENABLE_THREAD_SAFETY */ for (int j = 0; j < thread->nstate; j++) - if (thread->state[j].state == CSTATE_ABORTED) + if (thread->state[j].state != CSTATE_FINISHED) exit_code = 2; /* aggregate thread level stats */ @@ -6548,7 +6558,7 @@ threadRun(void *arg) if (thread->logfile == NULL) { pg_log_fatal("could not open logfile \"%s\": %m", logpath); - goto done; + exit(1); } } @@ -6572,16 +6582,10 @@ threadRun(void *arg) { if ((state[i].con = doConnect()) == NULL) { -/* - * On connection failure, we meet the barrier here in place of - * GO before proceeding to the "done" path which will cleanup, - * so as to avoid locking the process. - * - * It is unclear
Re: Add version macro to libpq-fe.h
Boris Kolpackov writes: > Tom Lane writes: >> I think putting a version number as such in there is a truly >> horrid idea. However, I could get behind adding a boolean flag >> that says specifically whether the pipeline feature exists. > That would be even better, but I agree with what others have > said: we would have to keep adding such feature test macros > going forward. Yes, and I think that is a superior solution. I think the argument that it's too much effort is basically nonsense. > I think ideally you would want to have both since the version > macro could still be helpful in dealing with "features" that you > did not plan to add (aka bugs). I really doubt that a version number appearing in libpq-fe.h would be helpful for deciding whether you need to work around a bug. The problem again is version skew: how well does the libpq.so you are running against today match up with the header you compiled against (possibly months ago, possibly on a different machine)? What you'd want for that sort of thing is a runtime test, i.e. consult PQlibVersion(). That point, along with the previously-discussed point about confusion between server and libpq versions, nicely illustrates another reason why I'm resistant to just adding a version number to libpq-fe.h. If we do that, application programmers will be presented with THREE different Postgres version numbers, and it seems inevitable that people will make mistakes and consult the wrong one for a particular purpose. I think we can at least reduce the confusion by handling the question of which-features-are-visible-in-the-include-file in a different style. regards, tom lane
Re: Centralizing protective copying of utility statements
Julien Rouhaud writes: > On Thu, Jun 17, 2021 at 01:03:29PM -0400, Tom Lane wrote: > + * readOnlyTree: treat pstmt's node tree as read-only > Maybe it's because I'm not a native english speaker, or because it's quite > late here, but I don't find "treat as read-only" really clear. I don't have a > concise better wording to suggest. Maybe "if true, pstmt's node tree must not be modified" ? >> Still thinking about which way to fix it in the back branches. > I'm +0.5 for a narrow fix, due to the possibility of unspotted similar problem > vs possibility of performance regression ratio. After sleeping on it another day, I'm inclined to think the same. The key attraction of a centralized fix is that it prevents the possibility of new bugs of the same ilk in newly-added features. Given how long these CREATE/ALTER DOMAIN bugs escaped detection, it's hard to have full confidence that there are no others in the back branches --- but they must be in really lightly-used features. regards, tom lane
Re: Add version macro to libpq-fe.h
On 2021-Jun-18, Boris Kolpackov wrote: > Tom Lane writes: > > > I think putting a version number as such in there is a truly > > horrid idea. However, I could get behind adding a boolean flag > > that says specifically whether the pipeline feature exists. > > Then you'd do something like > > > > #ifdef LIBPQ_HAS_PIPELINING > > > > rather than embedding knowledge of exactly which release > > added that. > > That would be even better, but I agree with what others have > said: we would have to keep adding such feature test macros > going forward. But we do not add that many significant features to libpq in the first place, so I'm not sure it would be too bad. As far as I am aware, this is the first time someone has requested a mechanism to detect feature presence specifically in libpq. To put a number to it, I counted the number of commits to exports.txt since Jan 2015 -- there are 17. But many of them are just intra-release fixups; the number of actual "features" is 11, an average of two per year. That seems small enough to me. So I'm +1 on adding this "feature macro". (The so-version major changed from 4 to 5 in commit 1e7bb2da573e, dated April 2006.) > I think ideally you would want to have both since the version > macro could still be helpful in dealing with "features" that you > did not plan to add (aka bugs). > > > > Comparing v13 and v14 libpq-fe.h, I see that there is a solution > > available now: "#ifdef PQ_QUERY_PARAM_MAX_LIMIT". > > Hm, it must have been added recently since I don't see it in 14beta1. > But thanks for the pointer, if nothing better comes up this will > have to do. Yeah, this one was added by commit cb92703384e2 on June 8th, three weeks after beta1. -- Álvaro Herrera Valdivia, Chile "Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)
Re: Centralizing protective copying of utility statements
On Fri, Jun 18, 2021 at 10:24:20AM -0400, Tom Lane wrote: > Julien Rouhaud writes: > > On Thu, Jun 17, 2021 at 01:03:29PM -0400, Tom Lane wrote: > > + * readOnlyTree: treat pstmt's node tree as read-only > > > Maybe it's because I'm not a native english speaker, or because it's quite > > late here, but I don't find "treat as read-only" really clear. I don't > > have a > > concise better wording to suggest. > > Maybe "if true, pstmt's node tree must not be modified" ? Thanks, I find it way better!
Re: Centralizing protective copying of utility statements
Julien Rouhaud writes: > On Fri, Jun 18, 2021 at 10:24:20AM -0400, Tom Lane wrote: >> Maybe "if true, pstmt's node tree must not be modified" ? > Thanks, I find it way better! OK, pushed that way, and with a couple other comment tweaks from an additional re-reading. regards, tom lane
Re: pgbench bug candidate: negative "initial connection time"
Hello Horiguchi-san, Fabien,

On Fri, 18 Jun 2021 15:58:48 +0200 (CEST)
Fabien COELHO wrote:

> >>> /* must be something wrong */
> >>> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >>> goto done;
> >>
> >> Should say something like "thread %d aborted: %s() failed: ...".
>
> After having a look, there are already plenty such cases. I'd say not to
> change anything for beta, and think of it for the next round.

Agreed. Basically, I think the existing message should be left as is.

> >> errno = THREAD_BARRIER_INIT(&barrier, nthreads);
> >> if (errno != 0)
> >> + {
> >> pg_log_fatal("could not initialize barrier: %m");
> >> + exit(1);
> >>
> >> This is a run-time error. Maybe we should return 2 in that case.
>
> I think that you are right, but there are plenty such places where exit
> should be 2 instead of 1 if the doc is followed:
>
> """Errors during the run such as database errors or problems in the script
> will result in exit status 2."""
>
> My beta take is to let these as they are, i.e. pretty inconsistent all
> over pgbench, and schedule a cleanup on the next round.

As with Fabien's comment below about thread->logfile,

> >> ===
> >> if (thread->logfile == NULL)
> >> {
> >> pg_log_fatal("could not open logfile \"%s\": %m", logpath);
> >> - goto done;
> >> + exit(1);
> >>
> >> Maybe we should exit with 2 in this case.
> >
> > Yep.
>
> The bench is not even started, this is not really run time yet, 1 seems
> ok. The failure may be due to a typo in the path which comes from the
> user.

the bench is not started at THREAD_BARRIER_INIT, so I think exit(1) is ok.

> >> /* must be something wrong */
> >> - pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
> >> + pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >> goto done;
> >>
> >> Why doesn't a fatal error cause an immediate exit?
> >
> > Good point. I do not know, but I would expect it to be the case, and AFAICR
> > it does not.
> >
> >> (And if we change this to fatal, we also need to change similar errors in
> >> the same function to fatal.)
> >
> > Possibly.
>
> On second look, I think that error is fine, indeed we do not stop the
> process, so "fatal" it is not;

I replaced this 'fatal' with 'error' because we are aborting the client
instead of calling exit(1). When pgbench was rewritten to use the common
logging API by commit 30a3e772b40, pg_log_fatal was somehow used, but I am
wondering whether it should have been pg_log_error.

> Attached Yugo-san patch with some updates discussed in the previous mails,
> so as to move things along.

Thank you for the update. I agree with this fix.

Regards,
Yugo Nagata

-- 
Yugo NAGATA
pg_stats and range statistics
Hi, Statistics for range types are not currently exposed in pg_stats view (i.e. STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and STATISTIC_KIND_BOUNDS_HISTOGRAM). Shouldn't they? If so, here is a patch for adding them. The following is a simple example of what it looks like: CREATE TABLE test(r int4range); INSERT INTO test SELECT int4range((random()*10)::integer,(10+random()*10)::integer) FROM generate_series(1,1); SET default_statistics_target = 10; ANALYZE test; SELECT range_length_histogram, range_length_empty_frac, range_bounds_histogram FROM pg_stats WHERE tablename = 'test' \gx -[ RECORD 1 ]---+-- range_length_histogram | {1,4,6,8,9,10,11,12,14,16,20} range_length_empty_frac | {0.003666} range_bounds_histogram | {"[0,10)","[1,11)","[2,12)","[3,13)","[4,14)","[5,15)","[6,16)","[7,17)","[8,18)","[9,19)","[10,20)"} Regards, Egor Rogov. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f517a7d4af..4d037b590e 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -12877,6 +12877,36 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx non-null elements. (Null for scalar types.) + + + + range_length_histogram anyarray + + + A histogram of lengths of non-empty, non-null ranges. + (Null for non-range types.) + + + + + + range_length_empty_frac float4 + + + A fraction of empty ranges. (Null for non-range types.) + + + + + + range_bounds_histogram anyarray + + + Histograms of lower and upper bounds of non-empty, non-null ranges, + combined into a single array of range values. + (Null for non-range types.) + + diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 999d984068..93fadfff15 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -243,7 +243,28 @@ CREATE VIEW pg_stats WITH (security_barrier) AS WHEN stakind3 = 5 THEN stanumbers3 WHEN stakind4 = 5 THEN stanumbers4 WHEN stakind5 = 5 THEN stanumbers5 -END AS elem_count_histogram +END AS elem_count_histogram, +CASE +WHEN stakind1 = 6 THEN stavalues1 +WHEN stakind2 = 6 THEN stavalues2 +WHEN stakind3 = 6 THEN stavalues3 +WHEN stakind4 = 6 THEN stavalues4 +WHEN stakind5 = 6 THEN stavalues5 +END AS range_length_histogram, +CASE +WHEN stakind1 = 6 THEN stanumbers1[1] +WHEN stakind2 = 6 THEN stanumbers2[1] +WHEN stakind3 = 6 THEN stanumbers3[1] +WHEN stakind4 = 6 THEN stanumbers4[1] +WHEN stakind5 = 6 THEN stanumbers5[1] +END AS range_length_empty_frac, +CASE +WHEN stakind1 = 7 THEN stavalues1 +WHEN stakind2 = 7 THEN stavalues2 +WHEN stakind3 = 7 THEN stavalues3 +WHEN stakind4 = 7 THEN stavalues4 +WHEN stakind5 = 7 THEN stavalues5 +END AS range_bounds_histogram FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid) JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum) LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index e5ab11275d..780aefdc26 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2409,7 +2409,31 @@ pg_stats| SELECT n.nspname AS schemaname, WHEN (s.stakind4 = 5) THEN s.stanumbers4 WHEN (s.stakind5 = 5) THEN s.stanumbers5 ELSE NULL::real[] -END AS elem_count_histogram +END AS elem_count_histogram, +CASE +WHEN (s.stakind1 = 6) THEN s.stavalues1 +WHEN (s.stakind2 = 6) THEN s.stavalues2 +WHEN (s.stakind3 = 6) THEN s.stavalues3 +WHEN (s.stakind4 = 6) THEN s.stavalues4 +WHEN (s.stakind5 = 6) THEN s.stavalues5 +ELSE NULL::anyarray +END AS range_length_histogram, 
+CASE +WHEN (s.stakind1 = 6) THEN s.stanumbers1[1] +WHEN (s.stakind2 = 6) THEN s.stanumbers2[1] +WHEN (s.stakind3 = 6) THEN s.stanumbers3[1] +WHEN (s.stakind4 = 6) THEN s.stanumbers4[1] +WHEN (s.stakind5 = 6) THEN s.stanumbers5[1] +ELSE NULL::real +END AS range_length_empty_frac, +CASE +WHEN (s.stakind1 = 7) THEN s.stavalues1 +WHEN (s.stakind2 = 7) THEN s.stavalues2 +WHEN (s.stakind3 = 7) THEN s.stavalues3 +WHEN (s.stakind4 = 7) THEN s.stavalues4 +WHEN (s.stakind5 = 7) THEN s.stavalues5 +
Re: disfavoring unparameterized nested loops
On Fri, Jun 18, 2021 at 6:20 AM Ashutosh Bapat wrote: > I have not come across many papers which leverage this idea. Googling > "selectivity estimation confidence interval", does not yield many > papers. Although I found [1] to be using a similar idea. So may be > there's not merit in this idea, thought theoretically it sounds fine > to me. > > > [1] https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf Well, that paper's title shows it's a bit too far forward for us, since we don't use samples during plan time (although that's a separate topic worth considering). From the references, however, this one gives some mathematical framing of the problem that lead to the thread subject, although I haven't read enough to see if we can get practical advice from it: Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. https://www.csd.uoc.gr/~hy460/pdf/p268-ioannidis.pdf -- John Naylor EDB: http://www.enterprisedb.com
Re: PoC: Using Count-Min Sketch for join cardinality estimation
On Wed, Jun 16, 2021 at 8:23 PM Tomas Vondra wrote: > Not really, but to be fair even for the histograms it's only really > supported by "seems to work in practice" :-( Hmm, we cite a theoretical result in analyze.c, but I don't know if there is something better out there: * The following choice of minrows is based on the paper * "Random sampling for histogram construction: how much is enough?" * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in What is more troubling to me is that we set the number of MCVs to the number of histograms. Since b5db1d93d2a6 we have a pretty good method of finding the MCVs that are justified. When that first went in, I experimented with removing the MCV limit and found it easy to create value distributions that lead to thousands of MCVs. I guess the best justification now for the limit is plan time, but if we have a sketch also, we can choose one or the other based on a plan-time speed vs accuracy tradeoff (another use for a planner effort guc). In that scenario, for tables with many MCVs we would only use them for restriction clauses. -- John Naylor EDB: http://www.enterprisedb.com
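For context, and not necessarily what the PoC patch does: the textbook
estimator that such a sketch enables takes two count-min sketches built with
the same hash functions over the two join keys, and estimates the equijoin
size as the minimum, across rows, of the per-row inner products. A minimal
self-contained sketch of that step:

#include <stdint.h>

#define CM_DEPTH	8
#define CM_WIDTH	1024

/* both sketches must use the same CM_DEPTH hash functions */
typedef struct CountMinSketch
{
	uint64_t	counters[CM_DEPTH][CM_WIDTH];
} CountMinSketch;

static uint64_t
cm_estimate_join_size(const CountMinSketch *s1, const CountMinSketch *s2)
{
	uint64_t	estimate = 0;

	for (int d = 0; d < CM_DEPTH; d++)
	{
		uint64_t	dot = 0;

		for (int w = 0; w < CM_WIDTH; w++)
			dot += s1->counters[d][w] * s2->counters[d][w];

		/* every row overestimates the true inner product; keep the minimum */
		if (d == 0 || dot < estimate)
			estimate = dot;
	}
	return estimate;
}

A plan-time effort setting could then pick between the MCV-based estimate
and this sketch-based one, trading accuracy for planning speed.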
Version reporting in pgbench
I see that commit 547f04e73 caused pgbench to start printing its version number. I think that's a good idea in general, but it appears to me that next to no thought went into the details (as perhaps evidenced by the fact that the commit message doesn't even mention it). I've got two beefs with how it was done: * The output only mentions pgbench's own version, which would be highly misleading if the server being used is of a different version. I should think that in most cases the server's version is more important than pgbench's. * We have a convention for how client programs should print their versions, and this ain't it. (Specifically, you should print the PG_VERSION string not make up your own.) What I think should have been done instead is to steal psql's battle-tested logic for printing its startup version banner, more or less as attached. One point here is that printing the server version requires access to a connection, which printResults() hasn't got because we already closed all the connections by that point. I solved that by printing the banner during the initial connection that gets the scale factor, does vacuuming, etc. If you're dead set on not printing the version till the end, that could be made to happen; but it's not clear to me that this way is any worse, and it's certainly easier. Thoughts? regards, tom lane diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d7479925cb..78b5ac612b 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -63,6 +63,7 @@ #include "common/username.h" #include "fe_utils/cancel.h" #include "fe_utils/conditional.h" +#include "fe_utils/string_utils.h" #include "getopt_long.h" #include "libpq-fe.h" #include "pgbench.h" @@ -5493,6 +5494,37 @@ printSimpleStats(const char *prefix, SimpleStats *ss) } } +/* print version banner */ +static void +printVersion(PGconn *con) +{ + int server_ver = PQserverVersion(con); + int client_ver = PG_VERSION_NUM; + + if (server_ver != client_ver) + { + const char *server_version; + char sverbuf[32]; + + /* Try to get full text form, might include "devel" etc */ + server_version = PQparameterStatus(con, "server_version"); + /* Otherwise fall back on server_ver */ + if (!server_version) + { + formatPGVersionNumber(server_ver, true, + sverbuf, sizeof(sverbuf)); + server_version = sverbuf; + } + + printf(_("%s (%s, server %s)\n"), + "pgbench", PG_VERSION, server_version); + } + /* For version match, only print pgbench version */ + else + printf("%s (%s)\n", "pgbench", PG_VERSION); + fflush(stdout); +} + /* print out results */ static void printResults(StatsData *total, @@ -5506,7 +5538,6 @@ printResults(StatsData *total, double bench_duration = PG_TIME_GET_DOUBLE(total_duration); double tps = ntx / bench_duration; - printf("pgbench (PostgreSQL) %d.%d\n", PG_VERSION_NUM / 1, PG_VERSION_NUM % 100); /* Report test parameters. */ printf("transaction type: %s\n", num_scripts == 1 ? sql_script[0].desc : "multiple scripts"); @@ -6386,6 +6417,9 @@ main(int argc, char **argv) exit(1); } + /* report pgbench and server versions */ + printVersion(con); + if (!is_no_vacuum) { fprintf(stderr, "starting vacuum...");
A few nuances about specifying the timeline with START_REPLICATION
A few questions about this comment in walsender.c, originating in commit
abfd192b1b5b:

/*
 * Found the requested timeline in the history. Check that
 * requested startpoint is on that timeline in our history.
 *
 * This is quite loose on purpose. We only check that we didn't
 * fork off the requested timeline before the switchpoint. We
 * don't check that we switched *to* it before the requested
 * starting point. This is because the client can legitimately
 * request to start replication from the beginning of the WAL
 * segment that contains switchpoint, but on the new timeline, so
 * that it doesn't end up with a partial segment. If you ask for
 * too old a starting point, you'll get an error later when we
 * fail to find the requested WAL segment in pg_wal.
 *
 * XXX: we could be more strict here and only allow a startpoint
 * that's older than the switchpoint, if it's still in the same
 * WAL segment.
 */

1. I think there's a typo: it should be "fork off the requested timeline
before the startpoint", right?

2. It seems to imply that requesting an old start point is wrong, but I
don't see why. As long as the WAL is there (or at least the slot
boundaries), what's the problem? Can we either just change the comment to
say that it's fine to start on an ancestor of the requested timeline (and
maybe update the docs, too)?

3. I noticed when looking at this that the terminology in the docs is a bit
inconsistent between START_REPLICATION and recovery_target_timeline.

  a. In recovery_target_timeline:
     i. a numeric value means "stop when this timeline forks"
     ii. "latest" means "follow forks along the newest timeline"
     iii. "current" is an alias for the backup's numerical timeline

  b. In the START_REPLICATION docs:
     i. "current" means "follow forks along the newest timeline"
     ii. a numeric value that is equal to the current timeline is the same
         as "current"
     iii. a numeric value that is less than the current timeline means
          "stop when this timeline forks"

On point #3, it looks like START_REPLICATION could be improved:

* Should we change the docs to say "latest" rather than "current"?
* Should we change the behavior so that specifying the current timeline as
  a number still means a historic timeline (e.g. it will stop replicating
  there and emit a tuple)?
* Should we add some keywords like "latest" or "current" to the
  START_REPLICATION command?

Regards,
	Jeff Davis
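For reference, the command under discussion is sent on a replication
connection and looks something like this; the given LSN may well lie on an
ancestor of the requested timeline, which is exactly what the "loose" check
above permits:

START_REPLICATION SLOT "standby_1" PHYSICAL 0/12000000 TIMELINE 5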
Re: Add version macro to libpq-fe.h
Alvaro Herrera writes: > So I'm +1 on adding this "feature macro". Concretely, how about the attached? (I also got rid of a recently-added extra comma. While the compilers we use might not warn about that, it seems unwise to assume that no user's compiler will.) I guess one unresolved question is whether we want to mention these in the SGML docs. I vote "no", because it'll raise the maintenance cost noticeably. But I can see an argument on the other side. regards, tom lane diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index ec378705ad..4677c51e1b 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -28,6 +28,13 @@ extern "C" */ #include "postgres_ext.h" +/* + * These symbols may be used in compile-time #ifdef tests for the availability + * of newer libpq features. + */ +#define LIBPQ_HAS_PIPELINING 1 +#define LIBPQ_HAS_TRACE_FLAGS 1 + /* * Option flags for PQcopyResult */ @@ -98,7 +105,7 @@ typedef enum PGRES_COPY_BOTH, /* Copy In/Out data transfer in progress */ PGRES_SINGLE_TUPLE, /* single tuple from larger resultset */ PGRES_PIPELINE_SYNC, /* pipeline synchronization point */ - PGRES_PIPELINE_ABORTED, /* Command didn't run because of an abort + PGRES_PIPELINE_ABORTED /* Command didn't run because of an abort * earlier in a pipeline */ } ExecStatusType;
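As a usage illustration (not part of the patch itself): client code could
then guard its pipeline support at compile time, and fall back gracefully
when built against an older libpq; a run-time PQlibVersion() check can be
layered on top if the shared library might be older than the header.

#include "libpq-fe.h"

/* Returns 0 on success, -1 on error. */
static int
send_batch(PGconn *conn)
{
#ifdef LIBPQ_HAS_PIPELINING
	if (PQenterPipelineMode(conn) != 1)
		return -1;
	/* ... queue queries with PQsendQueryParams(), call PQpipelineSync(),
	 * and drain the results with PQgetResult() ... */
	if (PQexitPipelineMode(conn) != 1)
		return -1;
	return 0;
#else
	/* built against an older libpq: fall back to synchronous execution */
	(void) conn;
	return 0;
#endif
}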
Re: Pipeline mode and PQpipelineSync()
On 2021-Jun-16, Boris Kolpackov wrote: > Specifically, the documentation[1] > makes it sound like the use of PQpipelineSync() is optional (34.5.1.1 > "Issuing Queries"): Hmm. My intention here was to indicate that you should have PQpipelineSync *somewhere*, but that the server was free to start executing some commands even before that, if the buffered commands happened to reach the server somehow -- but not necessarily that the results from those commands would reach the client immediately. I'll experiment a bit more to be sure that what I'm saying is correct. But if it is, then I think the documentation you quote is misleading: > "After entering pipeline mode, the application dispatches requests using > PQsendQuery, PQsendQueryParams, or its prepared-query sibling > PQsendQueryPrepared. These requests are queued on the client-side until > flushed to the server; this occurs when PQpipelineSync is used to establish a > synchronization point in the pipeline, or when PQflush is called. [...] > > The server executes statements, and returns results, in the order the client > sends them. The server will begin executing the commands in the pipeline > immediately, not waiting for the end of the pipeline. [...]" ... because it'll lead people to do what you've done, only to discover that it doesn't really work. I think I should rephrase this to say that PQpipelineSync() is needed where the user needs the server to start executing commands; and separately indicate that it is possible (but not promised) that the server would start executing commands ahead of time because $reasons. Do I have it right that other than this documentation problem, you've been able to use pipeline mode successfully? > So to me it looks like, contrary to the documentation, the server does > not start executing the statements immediately, instead waiting for the > synchronization point. Or am I missing something here? I don't think you are. > The above tests were performed using libpq from 14beta1 running against > PostgreSQL server version 9.5. If you would like to take a look at the > actual code, you can find it here[2] (the PIPELINE_SYNC macro controls > whether PQpipelineSync() is used). Thanks. > On a related note, I've been using libpq_pipeline.c[3] as a reference > and I believe it has a busy loop calling PQflush() repeatedly on line > 721 since once everything has been sent and we are waiting for the > result, select() will keep returning with an indication that the socket > is writable Oops, thanks, will look at fixing this too. > (you can find one way to fix this in [2]). > [2] > https://git.codesynthesis.com/cgit/odb/libodb-pgsql/tree/odb/pgsql/statement.cxx?h=bulk#n771 Neat, can do likewise I suppose. -- Álvaro Herrera39°49'30"S 73°17'W
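To make the ordering question concrete, here is a minimal sketch of the
intended client flow (error handling and non-blocking-mode concerns elided).
Without the PQpipelineSync() call, the queued queries may simply sit in
libpq's output buffer and no results arrive, which matches the behavior
Boris observed:

#include <stdio.h>
#include "libpq-fe.h"

/* Send two queries in one pipeline and collect their results. */
static void
run_pipeline(PGconn *conn)
{
	PGresult   *res;

	PQenterPipelineMode(conn);

	PQsendQueryParams(conn, "SELECT 1", 0, NULL, NULL, NULL, NULL, 0);
	PQsendQueryParams(conn, "SELECT 2", 0, NULL, NULL, NULL, NULL, 0);

	/* ends the batch and flushes it; the server now executes and replies */
	PQpipelineSync(conn);

	/* each queued query yields its result(s) followed by a NULL */
	for (int i = 0; i < 2; i++)
	{
		while ((res = PQgetResult(conn)) != NULL)
		{
			if (PQresultStatus(res) == PGRES_TUPLES_OK)
				printf("got: %s\n", PQgetvalue(res, 0, 0));
			PQclear(res);
		}
	}

	/* finally, the synchronization point itself produces a result */
	res = PQgetResult(conn);
	if (res == NULL || PQresultStatus(res) != PGRES_PIPELINE_SYNC)
		fprintf(stderr, "expected PGRES_PIPELINE_SYNC\n");
	PQclear(res);

	PQexitPipelineMode(conn);
}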
Re: Add version macro to libpq-fe.h
On 2021-Jun-18, Tom Lane wrote: > Alvaro Herrera writes: > > So I'm +1 on adding this "feature macro". > > Concretely, how about the attached? Seems OK to me. We can just accumulate any similar ones in the future nearby. > (I also got rid of a recently-added > extra comma. While the compilers we use might not warn about that, > it seems unwise to assume that no user's compiler will.) Oops. > I guess one unresolved question is whether we want to mention these in > the SGML docs. I vote "no", because it'll raise the maintenance cost > noticeably. But I can see an argument on the other side. Well, if we do want docs for these macros, then IMO it'd be okay to have them in libpq-fe.h itself rather than SGML. A one-line comment for each would suffice: +/* + * These symbols may be used in compile-time #ifdef tests for the availability + * of newer libpq features. + */ +/* Indicates presence of PQenterPipelineMode and friends */ +#define LIBPQ_HAS_PIPELINING 1 + +/* Indicates presence of PQsetTraceFlags; PQtrace changed output format */ +#define LIBPQ_HAS_TRACE_FLAGS 1 -- Álvaro Herrera39°49'30"S 73°17'W
Re: Add version macro to libpq-fe.h
Alvaro Herrera writes: > On 2021-Jun-18, Tom Lane wrote: >> I guess one unresolved question is whether we want to mention these in >> the SGML docs. I vote "no", because it'll raise the maintenance cost >> noticeably. But I can see an argument on the other side. > Well, if we do want docs for these macros, then IMO it'd be okay to have > them in libpq-fe.h itself rather than SGML. A one-line comment for each > would suffice: WFM. I'd sort of supposed that the symbol names were self-documenting, but you're right that a line or so of annotation improves things. regards, tom lane
Re: Version reporting in pgbench
Hello Tom, One point here is that printing the server version requires access to a connection, which printResults() hasn't got because we already closed all the connections by that point. I solved that by printing the banner during the initial connection that gets the scale factor, does vacuuming, etc. Ok. If you're dead set on not printing the version till the end, that could be made to happen; but it's not clear to me that this way is any worse, and it's certainly easier. pgbench (14beta1 dev 2021-06-12 08:10:44, server 13.3 (Ubuntu 13.3-1.pgdg20.04+1)) Why not move the printVersion call right after the connection is created, at line 6374? Otherwise it works for me. -- Fabien.
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
On 2021-Jun-03, Michael Paquier wrote: > Hi all, > > serinus has been complaining about the new gcd functions in 13~: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2021-06-03%2003%3A44%3A14 Hello, this problem is still happening; serinus' configure output says it's running a snapshot from 20210527, and Fabien mentioned downthread that his compiler stopped complaining on 2021-06-05. Andres, maybe the compiler in serinus is due for an update too? Thanks -- Álvaro Herrera39°49'30"S 73°17'W "Once again, thank you and all of the developers for your hard work on PostgreSQL. This is by far the most pleasant management experience of any database I've worked on." (Dan Harris) http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php
Re: Version reporting in pgbench
Fabien COELHO writes: > Why not move the printVersion call right after the connection is created, > at line 6374? I started with that, and one of the 001_pgbench_with_server.pl tests fell over --- it was expecting no stdout output before a "Perhaps you need to do initialization" failure. If you don't mind changing that, I agree that printing immediately after the connection is made is a bit less astonishing. regards, tom lane
Re: A few nuances about specifying the timeline with START_REPLICATION
On 18/06/2021 20:27, Jeff Davis wrote:
> A few questions about this comment in walsender.c, originating in commit
> abfd192b1b5b:
>
> /*
>  * Found the requested timeline in the history. Check that
>  * requested startpoint is on that timeline in our history.
>  *
>  * This is quite loose on purpose. We only check that we didn't
>  * fork off the requested timeline before the switchpoint. We
>  * don't check that we switched *to* it before the requested
>  * starting point. This is because the client can legitimately
>  * request to start replication from the beginning of the WAL
>  * segment that contains switchpoint, but on the new timeline, so
>  * that it doesn't end up with a partial segment. If you ask for
>  * too old a starting point, you'll get an error later when we
>  * fail to find the requested WAL segment in pg_wal.
>  *
>  * XXX: we could be more strict here and only allow a startpoint
>  * that's older than the switchpoint, if it's still in the same
>  * WAL segment.
>  */
>
> 1. I think there's a typo: it should be "fork off the requested timeline
> before the startpoint", right?

Yes, I think you're right.

> 2. It seems to imply that requesting an old start point is wrong, but I
> don't see why. As long as the WAL is there (or at least the slot
> boundaries), what's the problem? Can we either just change the comment to
> say that it's fine to start on an ancestor of the requested timeline (and
> maybe update the docs, too)?

Hmm, I'm not sure if the logic in WalSndSegmentOpen() would work if you did
that. For example, if you had the following WAL segments, because a timeline
switch happened somewhere in the middle of segment 13:

00040012
00040013
00050013
00050014

and you requested to start streaming from timeline 5, 0/1200, I think
WalSndSegmentOpen() would try to open file "00050012" and not find it. We
could teach it to look into the timeline history to find the correct file,
though.

Come to think of it, we could remove the START_REPLICATION TIMELINE option
altogether (or rather, make it optional for backwards compatibility). The
server doesn't need it for anything: it knows the timeline history, so the
LSN is enough to uniquely identify the starting point. If the client asks
for a historic timeline, the replication will stop when it reaches the end
of that timeline. In hindsight, I think it would make more sense to send a
message to the client to say that it's switching to a new timeline, and
continue streaming from the new timeline.

> 3. I noticed when looking at this that the terminology in the docs is a
> bit inconsistent between START_REPLICATION and recovery_target_timeline.
>
>   a. In recovery_target_timeline:
>      i. a numeric value means "stop when this timeline forks"
>      ii. "latest" means "follow forks along the newest timeline"
>      iii. "current" is an alias for the backup's numerical timeline
>
>   b. In the START_REPLICATION docs:
>      i. "current" means "follow forks along the newest timeline"
>      ii. a numeric value that is equal to the current timeline is the same
>          as "current"
>      iii. a numeric value that is less than the current timeline means
>           "stop when this timeline forks"
>
> On point #3, it looks like START_REPLICATION could be improved:
>
> * Should we change the docs to say "latest" rather than "current"?
> * Should we change the behavior so that specifying the current timeline
>   as a number still means a historic timeline (e.g. it will stop
>   replicating there and emit a tuple)?
> * Should we add some keywords like "latest" or "current" to the
>   START_REPLICATION command?

Hmm, the timeline in the START_REPLICATION command is not specifying a
recovery target timeline, so I don't think "latest" or "current" make much
sense there. Per above, it just tells the server which timeline the
requested starting point belongs to, so it's actually redundant.

- Heikki
Hmm, the timeline in the START_REPLICATION command is not specifying a recovery target timeline, so I don't think "latest" or "current" make much sense there. Per above, it just tells the server which timeline the requested starting point belongs to, so it's actually redundant. - Heikki
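A side note for readers who have not stared at pg_wal before: the reason WalSndSegmentOpen() would look for a nonexistent file in Heikki's example is that a WAL segment's file name encodes the timeline ID alongside the segment number. The standalone sketch below mimics that naming rule under the default 16 MB segment size; it is a simplification for illustration, not the server's actual XLogFileName macro.

#include <stdio.h>
#include <stdint.h>

/*
 * Sketch: build a WAL segment file name from a timeline ID and a byte
 * position, assuming the default 16 MB segment size.  The three 8-digit
 * hex fields are: timeline, "log" number, and segment within that log.
 */
#define WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)
#define SEGMENTS_PER_LOG (UINT64_C(0x100000000) / WAL_SEGMENT_SIZE)

static void
wal_segment_name(char *buf, size_t len, uint32_t tli, uint64_t lsn)
{
	uint64_t	segno = lsn / WAL_SEGMENT_SIZE;

	snprintf(buf, len, "%08X%08X%08X",
			 tli,
			 (uint32_t) (segno / SEGMENTS_PER_LOG),
			 (uint32_t) (segno % SEGMENTS_PER_LOG));
}

int
main(void)
{
	char		name[64];

	/* A position in the middle of segment 0x13 is named under timeline 4... */
	wal_segment_name(name, sizeof(name), 4, 0x13 * WAL_SEGMENT_SIZE + 100);
	printf("%s\n", name);

	/* ...but asking for timeline 5 at an earlier point yields a file that never existed. */
	wal_segment_name(name, sizeof(name), 5, 0x12 * WAL_SEGMENT_SIZE);
	printf("%s\n", name);
	return 0;
}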
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
Alvaro Herrera writes: > Hello, this problem is still happening; serinus' configure output says > it's running a snapshot from 20210527, and Fabien mentioned downthread > that his compiler stopped complaining on 2021-06-05. Andres, maybe the > compiler in serinus is due for an update too? Yeah, serinus is visibly still running one of the broken revisions: configure: using compiler=gcc (Debian 20210527-1) 12.0.0 20210527 (experimental) [master revision 262e75d22c3:7bb6b9b2f47:9d3a953ec4d2695e9a6bfa5f22655e2aea47a973] It'd sure be nice if seawasp stopped spamming the buildfarm failure log, too. That seems to be a different issue: Program terminated with signal SIGABRT, Aborted. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x7f3827947859 in __GI_abort () at abort.c:79 #2 0x7f3827947729 in __assert_fail_base (fmt=0x7f3827add588 "%s%s%s:%u: %s%sAssertion `%s' failed.\\n%n", assertion=0x7f381c3ce4c8 "S->getValue() && \\"Releasing SymbolStringPtr with zero ref count\\"", file=0x7f381c3ce478 "/home/fabien/llvm-src/llvm/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h", line=91, function=) at assert.c:92 #3 0x7f3827958f36 in __GI___assert_fail (assertion=0x7f381c3ce4c8 "S->getValue() && \\"Releasing SymbolStringPtr with zero ref count\\"", file=0x7f381c3ce478 "/home/fabien/llvm-src/llvm/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h", line=91, function=0x7f381c3ce570 "llvm::orc::SymbolStringPtr::~SymbolStringPtr()") at assert.c:101 #4 0x7f381c23c98d in llvm::orc::SymbolStringPtr::~SymbolStringPtr (this=0x277a8b0, __in_chrg=) at /home/fabien/llvm-src/llvm/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h:91 #5 0x7f381c24f879 in std::_Destroy (__pointer=0x277a8b0) at /home/fabien/gcc-10-bin/include/c++/10.3.1/bits/stl_construct.h:140 #6 0x7f381c24d10c in std::_Destroy_aux::__destroy (__first=0x277a8b0, __last=0x277a998) at /home/fabien/gcc-10-bin/include/c++/10.3.1/bits/stl_construct.h:152 #7 0x7f381c2488a6 in std::_Destroy (__first=0x277a8b0, __last=0x277a998) at /home/fabien/gcc-10-bin/include/c++/10.3.1/bits/stl_construct.h:185 #8 0x7f381c243c51 in std::_Destroy (__first=0x277a8b0, __last=0x277a998) at /home/fabien/gcc-10-bin/include/c++/10.3.1/bits/alloc_traits.h:738 #9 0x7f381c23f1c3 in std::vector >::~vector (this=0x7ffc73440a10, __in_chrg=) at /home/fabien/gcc-10-bin/include/c++/10.3.1/bits/stl_vector.h:680 #10 0x7f381c26112c in llvm::orc::JITDylib::removeTracker (this=0x18b4240, RT=...) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:1464 #11 0x7f381c264cb9 in operator() (__closure=0x7ffc73440d00) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:2054 #12 0x7f381c264d29 in llvm::orc::ExecutionSession::runSessionLocked >(struct {...} &&) (this=0x187d110, F=...) at /home/fabien/llvm-src/llvm/include/llvm/ExecutionEngine/Orc/Core.h:1291 #13 0x7f381c264ebc in llvm::orc::ExecutionSession::removeResourceTracker (this=0x187d110, RT=...) 
at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:2051 #14 0x7f381c25734c in llvm::orc::ResourceTracker::remove (this=0x1910c30) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:53 #15 0x7f381c25a9c1 in llvm::orc::JITDylib::clear (this=0x18b4240) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:622 #16 0x7f381c26305e in llvm::orc::ExecutionSession::endSession (this=0x187d110) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/Core.cpp:1777 #17 0x7f381c3373a3 in llvm::orc::LLJIT::~LLJIT (this=0x18a73b0, __in_chrg=) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp:1002 #18 0x7f381c38af48 in LLVMOrcDisposeLLJIT (J=0x18a73b0) at /home/fabien/llvm-src/llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp:561 #19 0x7f381c596612 in llvm_shutdown (code=, arg=140722242323824) at llvmjit.c:892 #20 0x007d4385 in proc_exit_prepare (code=code@entry=0) at ipc.c:209 #21 0x007d4288 in proc_exit (code=code@entry=0) at ipc.c:107 regards, tom lane
Re: Supply restore_command to pg_rewind via CLI argument
Hi, On Fri, Jun 18, 2021 at 5:42 PM Andrey Borodin wrote: > > If we run 'pg_rewind --restore-target-wal' there must be restore_command in > config of target installation. But if the config is not within > $PGDATA\postgresql.conf pg_rewind cannot use it. > If we run postmaster with `-c > config_file=/etc/postgresql/10/data/postgresql.conf`, we simply cannot use > the feature. We solved the problem by putting config into PGDATA only during > pg_rewind, but this does not seem like a very robust solution. > Yeah, Michael was against it, while we had no good arguments, so Alexander removed it, IIRC. This example sounds reasonable to me. I also recall some complaints from PostgresPro support folks, that it is sad to not have a cli option to pass restore_command. However, I just thought about another recent feature --- ensure clean shutdown, which is turned on by default. So you usually run Postgres with one config, but pg_rewind may start it with another, although in single-user mode. Is it fine for you? > > Maybe we could add "-C, --target-restore-command=COMMAND target WAL > restore_command\n" as was proposed within earlier versions of the patch[0]? > Or instruct pg_rewind to pass config to 'postgres -C restore_command' run? Hm, adding --target-restore-command is the simplest way, sure, but forwarding something like '-c config_file=...' to postgres sounds interesting too. Could it have any use case beside providing a restore_command? I cannot imagine anything right now, but if any exist, then it could be a more universal approach. > > From my POV adding --target-restore-command is simplest way, I can extract > corresponding portions from original patch. > I will have a look, maybe I even already have this patch separately. I remember that we were considering adding this option to PostgresPro, when we did a backport of this feature. -- Alexey Kondratov
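To make the "postgres -C" idea concrete: a tool like pg_rewind could ask the target cluster's postgres binary to report restore_command after parsing an explicitly named config file, instead of requiring the value on pg_rewind's own command line. The sketch below only illustrates the mechanism; the binary and config paths are placeholders, and whether this exact flag combination is acceptable is precisely what the thread is debating.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch: run `postgres -C restore_command` against a given data directory
 * and config file, and capture the single line it prints.  Minimal error
 * handling; the flag combination is an assumption, not settled behavior.
 */
static int
fetch_restore_command(const char *postgres_bin, const char *datadir,
					  const char *config_file,
					  char *result, size_t result_size)
{
	char		cmd[2048];
	FILE	   *fp;

	snprintf(cmd, sizeof(cmd),
			 "\"%s\" -D \"%s\" -c config_file=\"%s\" -C restore_command",
			 postgres_bin, datadir, config_file);

	fp = popen(cmd, "r");
	if (fp == NULL)
		return -1;

	if (fgets(result, (int) result_size, fp) == NULL)
		result[0] = '\0';
	result[strcspn(result, "\n")] = '\0';	/* strip trailing newline */

	return pclose(fp);
}

int
main(void)
{
	char		restore_command[1024];

	/* Paths below are illustrative placeholders. */
	if (fetch_restore_command("/usr/lib/postgresql/14/bin/postgres",
							  "/var/lib/postgresql/14/data",
							  "/etc/postgresql/14/data/postgresql.conf",
							  restore_command, sizeof(restore_command)) == 0)
		printf("restore_command = %s\n", restore_command);
	return 0;
}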
Re: Optionally automatically disable logical replication subscriptions on error
> On Jun 17, 2021, at 9:47 PM, Amit Kapila wrote:
>
> (a) The patch seem to be assuming that the error can happen only by the apply worker but I think the constraint violation can happen via one of the table sync workers as well

You are right. Peter mentioned the same thing, and it is clearly so. I am working to repair this fault in v2 of the patch.

> (b) What happens if the error happens when you are updating the error information in the catalog table.

I think that is an entirely different kind of error. The patch attempts to catch errors caused by the user, not by core functionality of the system failing. If there is a fault that prevents the catalogs from being updated, it is unclear what the patch can do about that.

> I think instead of seeing the actual apply time error, the user might see some other for which it won't be clear what is an appropriate action.

Good point. Before trying to do much of anything with the caught error, the v2 patch logs the error. If the subsequent efforts to disable the subscription fail, at least the logs should contain the initial failure message. The v1 patch emitted a log message much further down, and it was really just intended for debugging the patch itself, with many opportunities for something else to throw before the log is written.

> We are also discussing another action like skipping the apply of the transaction on an error [1]. I think it is better to evaluate both the proposals as one seems to be an extension of another.

Thanks for the link. I think they are two separate options. For some users and data patterns, subscriber-side skipping of specific problematic commits will be fine. For other usage patterns, skipping earlier commits will result in more and more data integrity problems (foreign key references, etc.) such that the failures will snowball, with skipping becoming the norm rather than the exception. Users with those usage patterns would likely prefer the subscription to automatically be disabled until manual intervention can clean up the problem.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PoC: Using Count-Min Sketch for join cardinality estimation
On 6/18/21 7:03 PM, John Naylor wrote:
> On Wed, Jun 16, 2021 at 8:23 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
> > Not really, but to be fair even for the histograms it's only really supported by "seems to work in practice" :-(
>
> Hmm, we cite a theoretical result in analyze.c, but I don't know if there is something better out there:
>
> * The following choice of minrows is based on the paper
> * "Random sampling for histogram construction: how much is enough?"
> * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in

True. I read that paper (long time ago), and it certainly gives some very interesting guidance and guarantees regarding relative error. And now that I look at it, the theorems 5 & 6, and the corollary 1 do provide a way to calculate the probability of a lower error (essentially vary the f, get the probability). I still think there's a lot of reliance on experience from practice, because even with such strong limits, delta=0.5 of a histogram with 100 buckets, representing 1e9 rows, is still plenty of space for errors.

> What is more troubling to me is that we set the number of MCVs to the number of histograms. Since b5db1d93d2a6 we have a pretty good method of finding the MCVs that are justified. When that first went in, I experimented with removing the MCV limit and found it easy to create value distributions that lead to thousands of MCVs. I guess the best justification now for the limit is plan time, but if we have a sketch also, we can choose one or the other based on a plan-time speed vs accuracy tradeoff (another use for a planner effort guc). In that scenario, for tables with many MCVs we would only use them for restriction clauses.

Sorry, I'm not sure what you mean by "we set the number of MCVs to the number of histograms" :-(

When you say "MCV limit" you mean that we limit the number of items to the statistics target, right? I agree plan time is one concern - but it's also about analyze, as we need a larger sample to build a larger MCV or histogram (as the paper you referenced shows).

I think the sketch is quite interesting for skewed data sets where the MCV can represent only a small fraction of the data, exactly because of the limit. For (close to) uniform data distributions we can just use ndistinct estimates to get estimates that are better than those from a sketch, I think. So I think we should try using MCV first, and then use sketches for the rest of the data (or possibly all data, if one side does not have MCV).

FWIW I think the sketch may be useful even for restriction clauses, which is what the paper calls "point queries". Again, maybe this should use the same correction depending on ndistinct estimate.

regards

-- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
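For anyone joining the thread without the background: a count-min sketch is a small fixed-size array of counters addressed by several independent hash functions, and the classic join-size estimate is the minimum, over hash rows, of the dot product of the two relations' counter rows. The toy below shows that mechanic only; the depth, width and hash function are arbitrary choices for illustration and say nothing about what the PoC patch actually uses.

#include <stdint.h>
#include <stdio.h>

#define CM_DEPTH 4				/* number of independent hash rows */
#define CM_WIDTH 256			/* counters per row */

typedef struct
{
	uint64_t	counter[CM_DEPTH][CM_WIDTH];
} cm_sketch;

/* Cheap per-row mixing hash; a real implementation would use better hashes. */
static uint32_t
cm_hash(uint32_t row, uint64_t value)
{
	uint64_t	h = value * UINT64_C(0x9E3779B97F4A7C15) +
		(row + 1) * UINT64_C(0xBF58476D1CE4E5B9);

	h ^= h >> 31;
	return (uint32_t) (h % CM_WIDTH);
}

/* Add one occurrence of a column value to the sketch. */
static void
cm_add(cm_sketch *s, uint64_t value)
{
	for (int row = 0; row < CM_DEPTH; row++)
		s->counter[row][cm_hash(row, value)]++;
}

/*
 * Join size estimate: each row's dot product overestimates the true join
 * size (hash collisions only add), so the minimum across rows is the
 * tightest available bound.
 */
static uint64_t
cm_estimate_join(const cm_sketch *a, const cm_sketch *b)
{
	uint64_t	best = UINT64_MAX;

	for (int row = 0; row < CM_DEPTH; row++)
	{
		uint64_t	dot = 0;

		for (int col = 0; col < CM_WIDTH; col++)
			dot += a->counter[row][col] * b->counter[row][col];
		if (dot < best)
			best = dot;
	}
	return best;
}

int
main(void)
{
	static cm_sketch a, b;

	for (uint64_t i = 0; i < 1000; i++)
		cm_add(&a, i % 10);		/* heavily skewed join column */
	for (uint64_t i = 0; i < 1000; i++)
		cm_add(&b, i % 100);

	/* The true join size here is 10,000; the estimate can only be >= that. */
	printf("estimated join size: %llu\n",
		   (unsigned long long) cm_estimate_join(&a, &b));
	return 0;
}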
Re: PoC: Using Count-Min Sketch for join cardinality estimation
On Fri, Jun 18, 2021 at 3:43 PM Tomas Vondra wrote: > Sorry, I'm not sure what you mean by "we set the number of MCVs to the > number of histograms" :-( > > When you say "MCV limit" you mean that we limit the number of items to > statistics target, right? I agree plan time is one concern - but it's > also about analyze, as we need larger sample to build a larger MCV or > histogram (as the paper you referenced shows). Ah, I didn't realize the theoretical limit applied to the MCVs too, but that makes sense since they're basically singleton histogram buckets. -- John Naylor EDB: http://www.enterprisedb.com
Re: A few nuances about specifying the timeline with START_REPLICATION
On Fri, 2021-06-18 at 21:48 +0300, Heikki Linnakangas wrote: > On 18/06/2021 20:27, Jeff Davis wrote: > We could teach it to look into the timeline history to find the > correct > file, though. That's how recovery_target_timeline behaves, and it would match my intuition better if START_REPLICATION behaved that way. > If the client asks for a historic timeline, the replication will > stop > when it reaches the end of that timeline. In hindsight, I think it > would > make more sense to send a message to the client to say that it's > switching to a new timeline, and continue streaming from the new > timeline. Why is it important for the standby to be told explicitly in the protocol about timeline switches? If it is important, why only for historical timelines? > Hmm, the timeline in the START_REPLICATION command is not specifying > a > recovery target timeline, so I don't think "latest" or "current" > make > much sense there. Per above, it just tells the server which timeline > the > requested starting point belongs to, so it's actually redundant. That's not very clear from the docs: "if TIMELINE option is specified, streaming starts on timeline tli...". Part of the confusion is that there's not a good distinction in terminology between: 1. a timeline ID, which is a specific segment of a timeline 2. a timeline made up of the given timeline ID and all its ancestors, terminating at the given ID 3. the timeline made up of the current ID, all ancestor IDs, and all descendent IDs that the current active primary switches to 4. the set of all timelines that contain a given ID It seems you are saying that replication only concerns itself with #3, which does not require a timeline ID at all. That seems basically correct for now, but since we already document the protocol to take a timeline, it makes sense to me to just have the primary serve it if possible. If we (continue to?) allow timelines for replication, it will start to treat the primary like an archive. That might not be quite what was intended, but could be powerful. You could imagine a special archive that implements the replication protocol, and have replicas directly off the archive, or maybe doing PITR off the archive. Regards, Jeff Davis
Re: PoC: Using Count-Min Sketch for join cardinality estimation
On 6/18/21 9:54 PM, John Naylor wrote:
> On Fri, Jun 18, 2021 at 3:43 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
> > Sorry, I'm not sure what you mean by "we set the number of MCVs to the number of histograms" :-(
> >
> > When you say "MCV limit" you mean that we limit the number of items to statistics target, right? I agree plan time is one concern - but it's also about analyze, as we need larger sample to build a larger MCV or histogram (as the paper you referenced shows).
>
> Ah, I didn't realize the theoretical limit applied to the MCVs too, but that makes sense since they're basically singleton histogram buckets.

Something like that, yes. Looking at MCV items as singleton histogram buckets is interesting, although I'm not sure that was the reasoning when calculating the MCV size. AFAIK it was kinda the other way around, i.e. the sample size is derived from the histogram paper, and when building the MCV we ask what's sufficiently different from the average frequency, based on the sample size etc.

regards

-- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg_stats and range statistics
On 6/18/21 6:22 PM, Egor Rogov wrote: Hi, Statistics for range types are not currently exposed in pg_stats view (i.e. STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and STATISTIC_KIND_BOUNDS_HISTOGRAM). Shouldn't they? If so, here is a patch for adding them. I think they should be exposed - I don't see why not to do that. I noticed this when working on the count-min sketch experiment too, so thanks for this patch. FWIW I've added the patch to the next CF: https://commitfest.postgresql.org/33/3184/ regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Race condition in InvalidateObsoleteReplicationSlots()
On 2021-Jun-11, Álvaro Herrera wrote: > I tried hard to make this stable, but it just isn't (it works fine one > thousand runs, then I grab some coffee and run it once more and that one > fails. Why? that's not clear to me). Attached is the last one I have, > in case somebody wants to make it better. Maybe there's some completely > different approach that works better, but I'm out of ideas for now. It occurred to me that this could be made better by sigstopping both walreceiver and walsender, then letting only the latter run; AFAICS this makes the test stable. I'll register this on the upcoming commitfest to let cfbot run it, and if it looks good there I'll get it pushed to master. If there's any problem I'll just remove it before beta2 is stamped. -- Álvaro Herrera Valdivia, Chile >From 8080ced6b3c807039ff9a66810fa661fae19b347 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Thu, 10 Jun 2021 16:44:03 -0400 Subject: [PATCH v3] Add test case for obsoleting slot with active walsender MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The code to signal a running walsender when its reserved WAL size grows too large is completely uncovered before this commit; this adds coverage for that case. This test involves sending SIGSTOP to walsender and walreceiver and running a checkpoint while advancing WAL, then sending SIGCONT. There's no precedent for this coding in Perl tests, and my reading of relevant manpages says it's likely to fail on Windows. Because of this, this test is always skipped on that platform. Author: Álvaro Herrera Discussion: https://postgr.es/m/202106102202.mjw4huiix7lo@alvherre.pgsql --- src/test/recovery/t/019_replslot_limit.pl | 82 ++- 1 file changed, 79 insertions(+), 3 deletions(-) diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7094aa0704..2e6c5a229e 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -11,7 +11,7 @@ use TestLib; use PostgresNode; use File::Path qw(rmtree); -use Test::More tests => 14; +use Test::More tests => $TestLib::windows_os ? 14 : 18; use Time::HiRes qw(usleep); $ENV{PGDATABASE} = 'postgres'; @@ -211,8 +211,8 @@ for (my $i = 0; $i < 1; $i++) } ok($failed, 'check that replication has been broken'); -$node_primary->stop('immediate'); -$node_standby->stop('immediate'); +$node_primary->stop; +$node_standby->stop; my $node_primary2 = get_new_node('primary2'); $node_primary2->init(allows_streaming => 1); @@ -253,6 +253,82 @@ my @result = timeout => '60')); is($result[1], 'finished', 'check if checkpoint command is not blocked'); +$node_primary2->stop; +$node_standby->stop; + +# The next test depends on Perl's `kill`, which apparently is not +# portable to Windows. (It would be nice to use Test::More's `subtest`, +# but that's not in the ancient version we require.) +if ($TestLib::windows_os) +{ + done_testing(); + exit; +} + +# Get a slot terminated while the walsender is active +# We do this by sending SIGSTOP to the walsender. Skip this on Windows. 
+my $node_primary3 = get_new_node('primary3'); +$node_primary3->init(allows_streaming => 1, extra => ['--wal-segsize=1']); +$node_primary3->append_conf( + 'postgresql.conf', qq( + min_wal_size = 2MB + max_wal_size = 2MB + log_checkpoints = yes + max_slot_wal_keep_size = 1MB + )); +$node_primary3->start; +$node_primary3->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('rep3')"); +# Take backup +$backup_name = 'my_backup'; +$node_primary3->backup($backup_name); +# Create standby +my $node_standby3 = get_new_node('standby_3'); +$node_standby3->init_from_backup($node_primary3, $backup_name, + has_streaming => 1); +$node_standby3->append_conf('postgresql.conf', "primary_slot_name = 'rep3'"); +$node_standby3->start; +$node_primary3->wait_for_catchup($node_standby3->name, 'replay'); +my $senderpid = $node_primary3->safe_psql('postgres', + "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walsender'"); +like($senderpid, qr/^[0-9]+$/, "have walsender pid $senderpid"); +my $receiverpid = $node_standby3->safe_psql('postgres', + "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walreceiver'"); +like($receiverpid, qr/^[0-9]+$/, "have walreceiver pid $receiverpid"); + +# freeze walsender and walreceiver. Slot will still be active, but walreceiver +# won't get anything anymore. +kill 'STOP', $senderpid, $receiverpid; +$logstart = get_log_size($node_primary3); +advance_wal($node_primary3, 4); +ok(find_in_log($node_primary3, "to release replication slot", $logstart), + "walreceiver termination logged"); + +# Now let the walsender continue; slot should be killed now. +# (Must not let walreceiver run yet; otherwise the standby could start another +# one before the slot can be killed) +kill 'CONT', $senderpid; +$node_primary3->poll_query_until('postgres', + "SELECT wal_status FROM pg_replica
Re: Race condition in InvalidateObsoleteReplicationSlots()
Apologies, I inadvertently sent the version before I added a maximum number of iterations in the final loop. -- Álvaro Herrera Valdivia, Chile "La fuerza no está en los medios físicos sino que reside en una voluntad indomable" (Gandhi) >From 1492e9468ecd86167a1253c4a2e9b31139835649 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Thu, 10 Jun 2021 16:44:03 -0400 Subject: [PATCH v4] Add test case for obsoleting slot with active walsender MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The code to signal a running walsender when its reserved WAL size grows too large is completely uncovered before this commit; this adds coverage for that case. This test involves sending SIGSTOP to walsender and walreceiver and running a checkpoint while advancing WAL, then sending SIGCONT. There's no precedent for this coding in Perl tests, and my reading of relevant manpages says it's likely to fail on Windows. Because of this, this test is always skipped on that platform. Author: Álvaro Herrera Discussion: https://postgr.es/m/202106102202.mjw4huiix7lo@alvherre.pgsql --- src/test/recovery/t/019_replslot_limit.pl | 84 ++- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl index 7094aa0704..52bc9f8cfd 100644 --- a/src/test/recovery/t/019_replslot_limit.pl +++ b/src/test/recovery/t/019_replslot_limit.pl @@ -11,7 +11,7 @@ use TestLib; use PostgresNode; use File::Path qw(rmtree); -use Test::More tests => 14; +use Test::More tests => $TestLib::windows_os ? 14 : 18; use Time::HiRes qw(usleep); $ENV{PGDATABASE} = 'postgres'; @@ -211,8 +211,8 @@ for (my $i = 0; $i < 1; $i++) } ok($failed, 'check that replication has been broken'); -$node_primary->stop('immediate'); -$node_standby->stop('immediate'); +$node_primary->stop; +$node_standby->stop; my $node_primary2 = get_new_node('primary2'); $node_primary2->init(allows_streaming => 1); @@ -253,6 +253,84 @@ my @result = timeout => '60')); is($result[1], 'finished', 'check if checkpoint command is not blocked'); +$node_primary2->stop; +$node_standby->stop; + +# The next test depends on Perl's `kill`, which apparently is not +# portable to Windows. (It would be nice to use Test::More's `subtest`, +# but that's not in the ancient version we require.) +if ($TestLib::windows_os) +{ + done_testing(); + exit; +} + +# Get a slot terminated while the walsender is active +# We do this by sending SIGSTOP to the walsender. Skip this on Windows. 
+my $node_primary3 = get_new_node('primary3'); +$node_primary3->init(allows_streaming => 1, extra => ['--wal-segsize=1']); +$node_primary3->append_conf( + 'postgresql.conf', qq( + min_wal_size = 2MB + max_wal_size = 2MB + log_checkpoints = yes + max_slot_wal_keep_size = 1MB + )); +$node_primary3->start; +$node_primary3->safe_psql('postgres', + "SELECT pg_create_physical_replication_slot('rep3')"); +# Take backup +$backup_name = 'my_backup'; +$node_primary3->backup($backup_name); +# Create standby +my $node_standby3 = get_new_node('standby_3'); +$node_standby3->init_from_backup($node_primary3, $backup_name, + has_streaming => 1); +$node_standby3->append_conf('postgresql.conf', "primary_slot_name = 'rep3'"); +$node_standby3->start; +$node_primary3->wait_for_catchup($node_standby3->name, 'replay'); +my $senderpid = $node_primary3->safe_psql('postgres', + "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walsender'"); +like($senderpid, qr/^[0-9]+$/, "have walsender pid $senderpid"); +my $receiverpid = $node_standby3->safe_psql('postgres', + "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walreceiver'"); +like($receiverpid, qr/^[0-9]+$/, "have walreceiver pid $receiverpid"); + +# freeze walsender and walreceiver. Slot will still be active, but walreceiver +# won't get anything anymore. +kill 'STOP', $senderpid, $receiverpid; +$logstart = get_log_size($node_primary3); +advance_wal($node_primary3, 4); +ok(find_in_log($node_primary3, "to release replication slot", $logstart), + "walreceiver termination logged"); + +# Now let the walsender continue; slot should be killed now. +# (Must not let walreceiver run yet; otherwise the standby could start another +# one before the slot can be killed) +kill 'CONT', $senderpid; +$node_primary3->poll_query_until('postgres', + "SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep3'", + "lost") + or die "timed out waiting for slot to be lost"; + +my $max_attempts = 180; +while ($max_attempts-- > 0) +{ + if (find_in_log($node_primary3, + 'invalidating slot "rep3" because its restart_lsn', $logstart)) + { + ok(1, "slot invalidation logged"); + last; + } + sleep 1; +} + +# Now let the walreceiver continue, so that the node can be stopped cleanly +kill 'CONT', $receiverpid; + +$node_primary3->stop; +$node_standby3->stop; + # # Advance WAL of $node by $n segments
Re: A few nuances about specifying the timeline with START_REPLICATION
On 18/06/2021 22:55, Jeff Davis wrote: On Fri, 2021-06-18 at 21:48 +0300, Heikki Linnakangas wrote: On 18/06/2021 20:27, Jeff Davis wrote: We could teach it to look into the timeline history to find the correct file, though. That's how recovery_target_timeline behaves, and it would match my intuition better if START_REPLICATION behaved that way. If the client asks for a historic timeline, the replication will stop when it reaches the end of that timeline. In hindsight, I think it would make more sense to send a message to the client to say that it's switching to a new timeline, and continue streaming from the new timeline. Why is it important for the standby to be told explicitly in the protocol about timeline switches? So that it knows to write the WAL to the correctly named WAL segment. You could do it differently, looking at the 'xlp_tli' field in the WAL page headers, or watching out for checkpoint records that change the timeline. But currently the standby (and pg_receivewal) depends on the protocol for that. If it is important, why only for historical timelines? Well, the latest timeline doesn't have any timeline switches, by definition. If you're connected to a standby server, IOW you're doing cascading replication, then the current timeline can become historic, if the standby follows a timeline switch. In that case, the replication is stopped when you reach the timeline switch, just like when you request a historic timeline. Hmm, the timeline in the START_REPLICATION command is not specifying a recovery target timeline, so I don't think "latest" or "current" make much sense there. Per above, it just tells the server which timeline the requested starting point belongs to, so it's actually redundant. That's not very clear from the docs: "if TIMELINE option is specified, streaming starts on timeline tli...". Part of the confusion is that there's not a good distinction in terminology between: 1. a timeline ID, which is a specific segment of a timeline 2. a timeline made up of the given timeline ID and all its ancestors, terminating at the given ID 3. the timeline made up of the current ID, all ancestor IDs, and all descendent IDs that the current active primary switches to 4. the set of all timelines that contain a given ID Agreed, that's a bit confusing. It seems you are saying that replication only concerns itself with #3, which does not require a timeline ID at all. That seems basically correct for now, but since we already document the protocol to take a timeline, it makes sense to me to just have the primary serve it if possible. If we (continue to?) allow timelines for replication, it will start to treat the primary like an archive. That might not be quite what was intended, but could be powerful. You could imagine a special archive that implements the replication protocol, and have replicas directly off the archive, or maybe doing PITR off the archive. True. - Heikki
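To make "the server knows the timeline history, so the LSN is enough" a bit more concrete, here is a minimal sketch of the lookup a server (or an archive speaking the replication protocol) could perform: walk the parsed timeline history, where each entry records the timeline ID and the LSN at which it ended, and return the timeline the requested start point falls on. The struct and array layout are simplifications for illustration, not the server's internal representation.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;

/* One parsed history entry: tli was in effect for WAL positions below 'end'. */
typedef struct
{
	uint32_t	tli;
	XLogRecPtr	end;			/* switchpoint; UINT64_MAX for the latest timeline */
} TimelineHistoryEntry;

/*
 * Sketch: find the timeline a WAL position belongs to, given a history
 * ordered from oldest to newest timeline.
 */
static uint32_t
timeline_of_point(const TimelineHistoryEntry *history, int nentries, XLogRecPtr ptr)
{
	for (int i = 0; i < nentries; i++)
	{
		if (ptr < history[i].end)
			return history[i].tli;
	}
	return history[nentries - 1].tli;	/* beyond the last switchpoint: latest timeline */
}

int
main(void)
{
	/* Made-up history: timeline 4 switched to 5 at 0/13000000. */
	TimelineHistoryEntry history[] = {
		{4, UINT64_C(0x13000000)},
		{5, UINT64_MAX},
	};

	printf("0/12000000 is on timeline %u\n",
		   timeline_of_point(history, 2, UINT64_C(0x12000000)));
	printf("0/14000000 is on timeline %u\n",
		   timeline_of_point(history, 2, UINT64_C(0x14000000)));
	return 0;
}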
Re: Version reporting in pgbench
Hello Tom, Why not move the printVersion call right after the connection is created, at line 6374? I started with that, and one of the 001_pgbench_with_server.pl tests fell over --- it was expecting no stdout output before a "Perhaps you need to do initialization" failure. If you don't mind changing that, Why would I mind? I agree that printing immediately after the connection is made is a bit less astonishing. Ok, so let's just update the test? Attached a proposal with the version moved. Note that if no connections are available, then you do not get the version, which may be a little bit strange. Attached v3 prints out the local version in that case. Not sure whether it is worth the effort. -- Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d7479925cb..e61055b6b7 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -63,6 +63,7 @@ #include "common/username.h" #include "fe_utils/cancel.h" #include "fe_utils/conditional.h" +#include "fe_utils/string_utils.h" #include "getopt_long.h" #include "libpq-fe.h" #include "pgbench.h" @@ -5493,6 +5494,37 @@ printSimpleStats(const char *prefix, SimpleStats *ss) } } +/* print version banner */ +static void +printVersion(PGconn *con) +{ + int server_ver = PQserverVersion(con); + int client_ver = PG_VERSION_NUM; + + if (server_ver != client_ver) + { + const char *server_version; + char sverbuf[32]; + + /* Try to get full text form, might include "devel" etc */ + server_version = PQparameterStatus(con, "server_version"); + /* Otherwise fall back on server_ver */ + if (!server_version) + { + formatPGVersionNumber(server_ver, true, + sverbuf, sizeof(sverbuf)); + server_version = sverbuf; + } + + printf(_("%s (%s, server %s)\n"), + "pgbench", PG_VERSION, server_version); + } + /* For version match, only print pgbench version */ + else + printf("%s (%s)\n", "pgbench", PG_VERSION); + fflush(stdout); +} + /* print out results */ static void printResults(StatsData *total, @@ -5506,7 +5538,6 @@ printResults(StatsData *total, double bench_duration = PG_TIME_GET_DOUBLE(total_duration); double tps = ntx / bench_duration; - printf("pgbench (PostgreSQL) %d.%d\n", PG_VERSION_NUM / 1, PG_VERSION_NUM % 100); /* Report test parameters. */ printf("transaction type: %s\n", num_scripts == 1 ? sql_script[0].desc : "multiple scripts"); @@ -6334,6 +6365,9 @@ main(int argc, char **argv) if (con == NULL) exit(1); + /* report pgbench and server versions */ + printVersion(con); + pg_log_debug("pghost: %s pgport: %s nclients: %d %s: %d dbName: %s", PQhost(con), PQport(con), nclients, duration <= 0 ? 
"nxacts" : "duration", diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl index 55b3c3f6fd..49fe48093c 100644 --- a/src/bin/pgbench/t/001_pgbench_with_server.pl +++ b/src/bin/pgbench/t/001_pgbench_with_server.pl @@ -100,7 +100,7 @@ pgbench( 'no such database'); pgbench( - '-S -t 1', 1, [qr{^$}], + '-S -t 1', 1, [qr{^pgbench \([^\n]+\)$}], [qr{Perhaps you need to do initialization}], 'run without init'); diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d7479925cb..20910e5ad1 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -63,6 +63,7 @@ #include "common/username.h" #include "fe_utils/cancel.h" #include "fe_utils/conditional.h" +#include "fe_utils/string_utils.h" #include "getopt_long.h" #include "libpq-fe.h" #include "pgbench.h" @@ -5493,6 +5494,37 @@ printSimpleStats(const char *prefix, SimpleStats *ss) } } +/* print version banner */ +static void +printVersion(PGconn *con) +{ + int server_ver = PQserverVersion(con); + int client_ver = PG_VERSION_NUM; + + if (server_ver != client_ver) + { + const char *server_version; + char sverbuf[32]; + + /* Try to get full text form, might include "devel" etc */ + server_version = PQparameterStatus(con, "server_version"); + /* Otherwise fall back on server_ver */ + if (!server_version) + { + formatPGVersionNumber(server_ver, true, + sverbuf, sizeof(sverbuf)); + server_version = sverbuf; + } + + printf(_("%s (%s, server %s)\n"), + "pgbench", PG_VERSION, server_version); + } + /* For version match, only print pgbench version */ + else + printf("%s (%s)\n", "pgbench", PG_VERSION); + fflush(stdout); +} + /* print out results */ static void printResults(StatsData *total, @@ -5506,7 +5538,6 @@ printResults(StatsData *total, double bench_duration = PG_TIME_GET_DOUBLE(total_duration); double tps = ntx / bench_duration; - printf("pgbench (PostgreSQL) %d.%d\n", PG_VERSION_NUM / 1, PG_VERSION_NUM % 100); /* Report test parameters. */ printf("transaction type: %s\n", num_scripts == 1 ? sql_script[0].desc : "multiple scripts"); @@ -6332,7 +6363,13 @@ main(int argc, char **argv) /* opening connection... */ con = doConnect(); if (con ==
Re: Version reporting in pgbench
Fabien COELHO writes: > Note that if no connections are available, then you do not get the > version, which may be a little bit strange. Attached v3 prints out the > local version in that case. Not sure whether it is worth the effort. I'm inclined to think that the purpose of that output is mostly to report the server version, so not printing it if we fail to connect isn't very surprising. Certainly that's how psql has acted for decades. regards, tom lane
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
> It'd sure be nice if seawasp stopped spamming the buildfarm failure log, too.

There was a silent API breakage (same API, different behavior, how nice…) in llvm main that Andres figured out, which will have to be fixed at some point, so this is a reminder that it is still a todo… Not sure when a fix is planned, though. I'm afraid portability may require that different code is executed depending on the llvm version. Or maybe we should wrestle a revert on the llvm side? Hmmm…

So I'm not very confident that the noise will go away quickly, sorry.

-- Fabien.
Re: Version reporting in pgbench
Note that if no connections are available, then you do not get the version, which may be a little bit strange. Attached v3 prints out the local version in that case. Not sure whether it is worth the effort. I'm inclined to think that the purpose of that output is mostly to report the server version, so not printing it if we fail to connect isn't very surprising. Certainly that's how psql has acted for decades. I'm fine with having a uniform behavior over pg commands. Thanks for the improvement! -- Fabien.
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
Fabien COELHO writes: >> It'd sure be nice if seawasp stopped spamming the buildfarm failure log, >> too. > There was a silent API breakage (same API, different behavior, how nice…) > in llvm main that Andres figured out, which will have to be fixed at some > point, so this is reminder that it is still a todo… If it were *our* todo, that would be one thing; but it isn't. > Not sure when a fix is > planned, though. I'm afraid portability may require that different code is > executed depending on llvm version. Or maybe we should wrestle a revert on > llvm side? Hmmm… > So I'm not very confident that the noise will go away quickly, sorry. Could you please just shut down the animal until that's dealt with? It's extremely unpleasant to have to root through a lot of useless failures to find the ones that might be of interest. Right now serinus and seawasp are degrading this report nearly to uselessness: https://buildfarm.postgresql.org/cgi-bin/show_failures.pl regards, tom lane
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
Hello Tom,

>> So I'm not very confident that the noise will go away quickly, sorry.
> Could you please just shut down the animal until that's dealt with?

Hmmm… Obviously I can. However, please note that the underlying logic of "a test is failing, let's just remove it" does not sound right to me at all :-( The test is failing because there is a problem, and shutting down the test to improve a report does not in any way help to fix it, it just helps to hide it.

> It's extremely unpleasant to have to root through a lot of useless failures

I do not understand how they are useless. Pg does not work properly with current LLVM, and keeps on not working. I think that this information is worthy, even if I do not like it and would certainly prefer a quick fix.

> to find the ones that might be of interest. Right now serinus and seawasp are degrading this report nearly to uselessness: https://buildfarm.postgresql.org/cgi-bin/show_failures.pl

IMHO, the report should be improved, not the test removed. If you insist I will shut down the animal, but I'd prefer not to. I think that the reminder has value, and just because some report is not designed to handle this nicely does not seem like a good reason to do that.

-- Fabien.
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
Fabien COELHO writes: >> Could you please just shut down the animal until that's dealt with? > The test is failing because there is a problem, and shuting down the test > to improve a report does not in any way help to fix it, it just helps to > hide it. Our buildfarm is run for the use of the Postgres project, not the LLVM project. I'm not really happy that it contains any experimental-compiler animals at all, but as long as they're unobtrusive I can stand it. serinus and seawasp are being the opposite of unobtrusive. If you don't want to shut it down entirely, maybe backing it off to run only once a week would be an acceptable compromise. Since you only update its compiler version once a week, I doubt we learn much from runs done more often than that anyway. regards, tom lane
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
Hello Tom,

>>> Could you please just shut down the animal until that's dealt with?
>> The test is failing because there is a problem, and shutting down the test to improve a report does not in any way help to fix it, it just helps to hide it.
> Our buildfarm is run for the use of the Postgres project, not the LLVM project.

The point of these animals is to have early warning of upcoming compiler changes. Given the release cycle of the project and the fact that a version is expected to work for 5 years, this is a clear benefit for postgres, IMO. When the compiler is broken, it is noisy, too bad. In this instance the compiler is not broken, but postgres is. If the consensus is that these animals are useless, I'll remove them, and be sad that the community is not able to see their value.

> I'm not really happy that it contains any experimental-compiler animals at all, but as long as they're unobtrusive I can stand it. serinus and seawasp are being the opposite of unobtrusive.

I think that the problem is the report, not the failing animal. In French we say "ce n’est pas en cassant le thermomètre qu’on fait tomber la fièvre" ("you don't bring the fever down by breaking the thermometer"), which is an equivalent of "don't shoot the messenger".

> If you don't want to shut it down entirely, maybe backing it off to run only once a week would be an acceptable compromise. Since you only update its compiler version once a week, I doubt we learn much from runs done more often than that anyway.

Hmmm… I can slow it down. We will wait one week to learn that the problems have been fixed, wow.

-- Fabien.
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
On Sat, Jun 19, 2021 at 9:46 AM Tom Lane wrote: > Fabien COELHO writes: > >> It'd sure be nice if seawasp stopped spamming the buildfarm failure log, > >> too. > > > There was a silent API breakage (same API, different behavior, how nice…) > > in llvm main that Andres figured out, which will have to be fixed at some > > point, so this is reminder that it is still a todo… > > If it were *our* todo, that would be one thing; but it isn't. Over on the other thread[1] we learned that this is an API change affecting reference counting semantics[2], so unless there is some discussion somewhere about reverting the LLVM change that I'm unaware of, I'm guessing we're going to need to change our code sooner or later. I have a bleeding edge LLVM on my dev machine, and I'm willing to try to reproduce the crash and write the trivial patch (that is, figure out the right preprocessor incantation to detect the version or feature, and bump the reference count as appropriate), if Andres and/or Fabien aren't already on the case. [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGLEy8mgtN7BNp0ooFAjUedDTJj5dME7NxLU-m91b85siA%40mail.gmail.com [2] https://github.com/llvm/llvm-project/commit/c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671
Re: pgbench logging broken by time logic changes
On Fri, Jun 18, 2021 at 12:31 PM Michael Paquier wrote: > On Fri, Jun 18, 2021 at 12:49:42AM +1200, Thomas Munro wrote: > > Yeah I've been catching up with these threads. > > Thomas, do you want me to look more at this issue? I don't feel > comfortable with the idea of doing anything if you are planning to > look at this thread and you are the owner here, so that should be your > call. > > From what I can see, we have the same area getting patched with > patches across two threads, so it seems better to give up the other > thread and just focus on the discussion here, where v7 has been sent: > https://www.postgresql.org/message-id/20210617175542.ad6b9b82926d8469e8520...@sraoss.co.jp > https://www.postgresql.org/message-id/CAF7igB1r6wRfSCEAB-iZBKxkowWY6%2BdFF2jObSdd9%2BiVK%2BvHJg%40mail.gmail.com Thanks for looking so far. It's the weekend here and I need to unplug, but I'll test these changes and if all looks good push on Monday.
Re: Speed up pg_checksums in cases where checksum already set
On Fri, Jun 18, 2021 at 1:57 AM Michael Paquier wrote:
> This doc addition is a bit confusing, as it could mean that each file has just one single checksum. We could be more precise, say: "When enabling checksums, each relation file block with a changed checksum is rewritten in place."

Agreed, I like that wording. New patch attached.

> Should we also mention that the sync happens even if no blocks are rewritten based on the reasoning of upthread (aka we'd better do the final flush as an interrupted pg_checksums may let a portion of the files as not flushed)?

I don't know that we need to bother: the default is already to sync and one has to go out of one's way using the -N argument to NOT sync, so I think it's a pretty safe assumption for everyone (except those who read the first version of my patch!) that syncing always happens.

Cheers, Greg

[Attachment: 003.pg_checksums.optimize.writes.and.always.sync.patch]
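The sentence being added to the docs — only blocks whose stored checksum differs are rewritten — corresponds to a simple compare-before-write loop over each relation file. The sketch below shows that shape only; the checksum routine, the header offset and the file path are stand-ins for illustration, not pg_checksums' real code (which uses pg_checksum_page() on the actual page header).

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192

/* Stand-in checksum: skip the 2 bytes where the checksum itself is stored. */
static uint16_t
page_checksum(const char *page, uint32_t blkno)
{
	uint32_t	sum = blkno;

	for (int i = 0; i < BLCKSZ; i++)
	{
		if (i == 8 || i == 9)	/* assumed location of the stored checksum */
			continue;
		sum = sum * 31 + (uint8_t) page[i];
	}
	return (uint16_t) (sum ^ (sum >> 16));
}

/*
 * Sketch of the optimization under discussion: recompute each block's
 * checksum and rewrite the block in place only when the stored value
 * differs, so already-correct blocks cost a read but never a write.
 */
static long
enable_checksums_in_file(FILE *f)
{
	char		page[BLCKSZ];
	long		rewritten = 0;
	uint32_t	blkno = 0;

	while (fread(page, BLCKSZ, 1, f) == 1)
	{
		uint16_t	stored;
		uint16_t	computed = page_checksum(page, blkno);

		memcpy(&stored, page + 8, sizeof(stored));
		if (stored != computed)
		{
			memcpy(page + 8, &computed, sizeof(computed));
			fseek(f, -(long) BLCKSZ, SEEK_CUR);
			fwrite(page, BLCKSZ, 1, f);
			fseek(f, 0, SEEK_CUR);	/* reposition before the next read */
			rewritten++;
		}
		blkno++;
	}
	return rewritten;
}

int
main(void)
{
	FILE	   *f = fopen("/tmp/relation_segment", "r+b");	/* placeholder path */

	if (f != NULL)
	{
		printf("rewrote %ld blocks\n", enable_checksums_in_file(f));
		fclose(f);
	}
	return 0;
}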
Re: Optionally automatically disable logical replication subscriptions on error
> On Jun 17, 2021, at 11:34 PM, Peter Smith wrote: > > I tried your patch. Thanks for the quick and thorough review! > (2) New column "disabled_by_error". > > I wondered if there was actually any need for this column. Isn't the > same information conveyed by just having "subenabled" = false, at same > time as as non-empty "suberrmsg"? This would remove any confusion for > having 2 booleans which both indicate disabled. Yeah, I wondered about that before posting v1. I removed the disabled_by_error field for v2. > (3) New columns "disabled_by_error", "disabled_on_error". > > All other columns of the pg_subscription have a "sub" prefix. I don't feel strongly about this. How about "subdisableonerr"? I used that in v2. > I did not find any code using that newly added member "errhint". Thanks for catching that. I had tried to remove all references to "errhint" before posting v1. The original idea was that both the message and hint of the error would be kept, but in testing I found the hint field was typically empty, so I removed it. Sorry that I left one mention of it lying around. > (5) dump.c I didn't bother getting pg_dump working before posting v1, and I still have not done so, as I mainly want to solicit feedback on whether the basic direction I am going will work for the community. > (6) Patch only handles errors only from the Apply worker. > > Tablesync can give similar errors (e.g. PK violation during DATASYNC > phase) which will trigger re-launch forever regardless of the setting > of "disabled_on_error". > (confirmed by observations below) Yes, this is a good point, and also mentioned by Amit. I have fixed it in v2 and adjusted the regression test to trigger an automatic disabling for initial table sync as well as for change replication. > 2021-06-18 15:12:45.905 AEST [25904] LOG: edata is true for > subscription 'tap_sub': message = "duplicate key value violates unique > constraint "test_tab_pkey"", hint = "" You didn't call this out, but FYI, I don't intend to leave this particular log message in the patch. It was for development only. I have removed it for v2 and have added a different log message much sooner after catching the error, to avoid squashing the error in case some other action fails. The regression test shows this, if you open tmp_check/log/022_disable_on_error_subscriber.log: 2021-06-18 16:25:20.138 PDT [56926] LOG: logical replication subscription "s1" will be disabled due to error: duplicate key value violates unique constraint "s1_tbl_unique" 2021-06-18 16:25:20.139 PDT [56926] ERROR: duplicate key value violates unique constraint "s1_tbl_unique" 2021-06-18 16:25:20.139 PDT [56926] DETAIL: Key (i)=(1) already exists. 2021-06-18 16:25:20.139 PDT [56926] CONTEXT: COPY tbl, line 2 The first line logs the error prior to attempting to disable the subscription, and the next three lines are due to rethrowing the error after committing the successful disabling of the subscription. If the attempt to disable the subscription itself throws, these additional three lines won't show up, but the first one should. Amit mentioned this upthread. Do you think this will be ok, or would you like to also have a suberrdetail field so that the detail doesn't get lost? I haven't added such an extra field, and am inclined to think it would be excessive, but maybe others feel differently? > == > > Test: Cause a PK violation in the Tablesync copy (DATASYNC) phase. > (when disable_on_error = true) > Observation: This patch changes nothing for this case. 
> The Tablesync re-launches in a forever loop the same as current functionality.

In v2, tablesync copy errors should also be caught. The test has been extended to cover this also.

[Attachment: v2-0001-Optionally-disabling-subscriptions-on-error.patch]

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
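The ordering Mark describes — report the apply error first, then record it and disable the subscription in a fresh transaction, then re-throw so the worker still exits with the original failure — would look roughly like the fragment below inside the apply worker. This is a hedged sketch, not the patch: apply_pending_changes() and DisableSubscriptionOnError() are hypothetical stand-ins for the worker's real apply path and for the patch's catalog update.

#include "postgres.h"

#include "access/xact.h"
#include "utils/elog.h"
#include "utils/memutils.h"

/* Hypothetical stand-ins; not real PostgreSQL functions or globals. */
extern void apply_pending_changes(void);
extern void DisableSubscriptionOnError(Oid subid, const char *message);
extern Oid	MySubscriptionOid;

static void
apply_with_disable_on_error(void)
{
	PG_TRY();
	{
		apply_pending_changes();
	}
	PG_CATCH();
	{
		ErrorData  *edata;

		/* Report the original error first, in case anything below fails. */
		EmitErrorReport();

		/* Copy the error out of the error context so it survives cleanup. */
		MemoryContextSwitchTo(TopMemoryContext);
		edata = CopyErrorData();
		FlushErrorState();

		/* Clean up the failed transaction before touching the catalogs. */
		AbortOutOfAnyTransaction();

		/* Record the message and disable the subscription in its own xact. */
		StartTransactionCommand();
		DisableSubscriptionOnError(MySubscriptionOid, edata->message);
		CommitTransactionCommand();

		/* Re-throw so the worker still terminates with the original error. */
		ReThrowError(edata);
	}
	PG_END_TRY();
}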
Re: pgbench logging broken by time logic changes
On Sat, Jun 19, 2021 at 11:59:16AM +1200, Thomas Munro wrote: > Thanks for looking so far. It's the weekend here and I need to > unplug, but I'll test these changes and if all looks good push on > Monday. Thanks for the update. Have a good weekend. -- Michael
Re: Teaching users how they can get the most out of HOT in Postgres 14
On Thu, Jun 17, 2021 at 7:26 PM Peter Geoghegan wrote: > Thanks for the review! > > Attached is v3, which has all the changes that you suggested (plus the > doc stuff from Justin). Just pushed a version of that with much improved documentation. Thanks again -- Peter Geoghegan
Re: PG 14 release notes, first draft
On Mon, Jun 14, 2021 at 1:11 PM Bruce Momjian wrote: > FYI, the most recent PG 14 relnote doc build is at: > > https://momjian.us/pgsql_docs/release-14.html I just pushed a commit that makes the existing vacuum_index_cleanup reloption and INDEX_CLEANUP VACUUM parameter support disabling the "Allow vacuum to skip index vacuuming when the number of removable index entries is insignificant" behavior. This should be mentioned in the release notes. -- Peter Geoghegan
Re: Centralizing protective copying of utility statements
On Fri, Jun 18, 2021 at 11:24:00AM -0400, Tom Lane wrote: > Julien Rouhaud writes: > > On Fri, Jun 18, 2021 at 10:24:20AM -0400, Tom Lane wrote: > >> Maybe "if true, pstmt's node tree must not be modified" ? > > > Thanks, I find it way better! > > OK, pushed that way, and with a couple other comment tweaks from > an additional re-reading. Thanks! For the record I already pushed the required compatibility changes for hypopg extension.
Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)
>>> There was a silent API breakage (same API, different behavior, how nice…) in llvm main that Andres figured out, which will have to be fixed at some point, so this is a reminder that it is still a todo…
>> If it were *our* todo, that would be one thing; but it isn't.
> Over on the other thread[1] we learned that this is an API change affecting reference counting semantics[2], so unless there is some discussion somewhere about reverting the LLVM change that I'm unaware of, I'm guessing we're going to need to change our code sooner or later.

Indeed, I'm afraid the solution will have to be on the pg side.

> I have a bleeding edge LLVM on my dev machine, and I'm willing to try to reproduce the crash and write the trivial patch (that is, figure out the right preprocessor incantation to detect the version or feature, and bump the reference count as appropriate), if Andres and/or Fabien aren't already on the case.

I'm not on the case, I'm only the one running the farm animal which barks too annoyingly for Tom.

-- Fabien.
Re: seawasp failing, maybe in glibc allocator
On Sat, May 22, 2021 at 12:25 PM Andres Freund wrote: > On 2021-05-21 15:57:01 -0700, Andres Freund wrote: > > I found the LLVM commit to blame (c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671). > > Contacting the author and reading the change to see if I can spit the > > issue myself. > > Hrmpf. It's a silent API breakage. The author intended to email us about > it, but apparently forgot. One now needs to increment a string-pool > refcount. The reason that didn't trigger a reliable crash is that > there's a path where the refcount of string-pool entries aren't asserted > to be above before decrementing the refcount... And that there > practically never are references to the pool entries after use. > > Continuing to discusss whether there's a better way to deal with this. Any news? FWIW this change appears to fix the problem for my system (LLVM 13 build from a couple of days ago). No more weird results, valgrind errors gone. I ran the leak checker to see if I now had the opposite problem, and although there are various leaks reported, I didn't see obvious intern pool related stacks. diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c index 71029a39a9..7b09e520f5 100644 --- a/src/backend/jit/llvm/llvmjit.c +++ b/src/backend/jit/llvm/llvmjit.c @@ -1116,6 +1116,11 @@ llvm_resolve_symbols(LLVMOrcDefinitionGeneratorRef GeneratorObj, void *Ctx, if (error != LLVMErrorSuccess) LLVMOrcDisposeMaterializationUnit(mu); +#if LLVM_VERSION_MAJOR > 12 + for (int i = 0; i < LookupSetSize; i++) + LLVMOrcRetainSymbolStringPoolEntry(symbols[i].Name); +#endif + pfree(symbols); return error;