Re: Disallow cancellation of waiting for synchronous replication
> On 11 Jan 2020, at 7:34, Bruce Momjian wrote:
>
> Actually, it might be worse than that.  In my reading of
> RecordTransactionCommit(), we do this:
>
>	write to WAL
>	flush WAL (durable)
>	make visible to other backends
>	replicate
>	communicate to the client
>
> I think this means we make the transaction commit visible to all
> backends _before_ we replicate it, and potentially wait until we get a
> replication reply to return SUCCESS to the client.

No.  Data is not visible to other backends while we await sync rep.

It's easy to check.  In one psql session, start waiting for sync rep:

postgres=# \d+ x
                                    Table "public.x"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
 key    | integer |           | not null |         | plain    |              |
 data   | text    |           |          |         | extended |              |
Indexes:
    "x_pkey" PRIMARY KEY, btree (key)
Access method: heap

postgres=# alter system set synchronous_standby_names to 'nonexistent';
ALTER SYSTEM
postgres=# select pg_reload_conf();
2020-01-12 16:09:58.167 +05 [45677] LOG:  received SIGHUP, reloading configuration files
 pg_reload_conf
----------------
 t
(1 row)

postgres=# insert into x values (7, '7');

In another session, try to see the inserted (already locally committed) data:

postgres=# select * from x where key = 7;
 key | data
-----+------
(0 rows)

Try to insert the same data, and the backend will hang on locks:

postgres=# insert into x values (7,'7') on conflict do nothing;

    ProcessQuery (in postgres) + 189  [0x1014b05bd]
      standard_ExecutorRun (in postgres) + 301  [0x101339fcd]
        ExecModifyTable (in postgres) + 1106  [0x101362b62]
          ExecInsert (in postgres) + 494  [0x10136344e]
            ExecCheckIndexConstraints (in postgres) + 570  [0x10133910a]
              check_exclusion_or_unique_constraint (in postgres) + 977  [0x101338db1]
                XactLockTableWait (in postgres) + 176  [0x101492770]
                  LockAcquireExtended (in postgres) + 1274  [0x101493aaa]

Thanks!

Best regards, Andrey Borodin.
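For reference, a paraphrased sketch of the commit ordering that produces this behavior (reconstructed from memory of the PG 12 era sources, arguments omitted; this is not the actual code):

/*
 * Paraphrase, not actual source:
 *
 *   CommitTransaction()
 *       RecordTransactionCommit()
 *           XactLogCommitRecord()     -- write the commit record to WAL
 *           XLogFlush()               -- make it locally durable
 *           SyncRepWaitForLSN()       -- block here awaiting the standby
 *       ProcArrayEndTransaction()     -- only now visible to other backends
 *
 * The sync rep wait happens before the ProcArray update, so other
 * backends still consider the transaction in progress.  That is why the
 * second session above sees no row, yet blocks in XactLockTableWait()
 * when it tries to insert a conflicting key.
 */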
Re: [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDE
On Sat, Jan 11, 2020 at 10:34:20AM +0900, Michael Paquier wrote:
> Thanks for the lookup.  I'll look at that again in a couple of days
> and hopefully wrap it by then.

And done.
--
Michael
Re: our checks for read-only queries are not great
On Fri, 2020-01-10 at 09:29 -0500, Tom Lane wrote:
> > ALTER SYSTEM is read only in my mind.
>
> I'm still having trouble with this conclusion.  I think it can only
> be justified by a very narrow reading of "reflected in pg_dump"
> that relies on the specific factorization we've chosen for upgrade
> operations, ie that postgresql.conf mods have to be carried across
> by hand.  But that's mostly historical baggage, rather than a sane
> basis for defining "read only".  If somebody comes up with a patch
> that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> will you then decide that ALTER SYSTEM SET is no longer read-only?

I think that having ALTER SYSTEM commands in pg_dumpall output would be
a problem.  It would cause all kinds of problems whenever parameters
change.  Thinking of the transition "checkpoint_segments" ->
"max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

> Or, perhaps, reject such a patch on the grounds that it breaks this
> arbitrary definition of read-only-ness?

I agree with Robert that such a patch should be rejected on other grounds.

Concerning the topic of the thread, I personally have come to think
that changing GUCs is *not* writing to the database.  But that is based
on the fact that you can change GUCs on streaming replication standbys,
and it may be surprising to a newcomer.

Perhaps it would be good to consider this question:
Do we call something "read-only" if it changes nothing, or do we call it
"read-only" if it is allowed on a streaming replication standby?
The first would be more correct, but the second may be more convenient.

Yours,
Laurenz Albe
Re: our checks for read-only queries are not great
Laurenz Albe writes:
> Perhaps it would be good to consider this question:
> Do we call something "read-only" if it changes nothing, or do we call it
> "read-only" if it is allowed on a streaming replication standby?
> The first would be more correct, but the second may be more convenient.

Yeah, this is really the larger point at stake.  I'm not sure that
"read-only" and "allowed on standby" should be identical, nor even that
one should be an exact subset of the other.  They're certainly
by-and-large the same sets of operations, but there might be exceptions
that belong to only one set.  "read-only" is driven by (some reading of)
the SQL standard, while "allowed on standby" is driven by implementation
limitations, so I think it'd be dangerous to commit ourselves to those
being identical.

regards, tom lane
Re: [Proposal] Add accumulated statistics for wait event
The following review has been posted through the commitfest application:

make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            tested, passed

I like this patch, because I used similar functionality some years ago
very successfully.  The implementation is fairly simple, and the results
should be valid given the method used.

The potential problem is the performance impact.  Very early tests show
an impact of about 3% in the worst case, but I'll try to repeat these
tests.

There is some trailing whitespace and there are some useless tabs.

The new status of this patch is: Waiting on Author
Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)
On Fri, Jan 10, 2020 at 10:51:50PM -0500, Tom Lane wrote:
! Here's a draft patch that cleans up all the logic errors I could find.

Okiee, thank You!  Let's see (was a bit busy yesterday trying to upgrade
pgadmin3 - difficult matter); now let's sort this out:

With the first patch applied (as from Friday - applied only on the
client side), the application did appear to work well.

But then, when engaging bandwidth-limiting to some modem-speed, it did
not work: psql would receive all (or most of) the data from a SELECT,
but then somehow not recognize the end of it and sit there and wait for
whatever:

> flowmdev=> select * from flows;
> ^CCancel request sent
> ^CCancel request sent

Now with the new patches 0001+0003 applied, on both server & client,
all now running 12.1 release, on a first run I did not perceive a
malfunction, bandwidth-limited or not.

I'll leave them applied, but this here will not experience serious
loads; you'll need somebody else to test for that...

rgds,
PMc
Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)
Peter writes:
> With the first patch applied (as from Friday - applied only on the
> client side), the application did appear to work well.
> But then, when engaging bandwidth-limiting to some modem-speed, it did
> not work: psql would receive all (or most of) the data from a SELECT,
> but then somehow not recognize the end of it and sit there and wait for
> whatever:

Yeah, that's just the behavior I'd expect (and was able to reproduce
here) because of the additional design problem.

> Now with the new patches 0001+0003 applied, on both server & client,
> all now running 12.1 release, on a first run I did not perceive
> a malfunction, bandwidth limited or not.
> I'll leave them applied, but this here will not experience serious
> loads; You'll need somebody else to test for that...

Cool, let us know if you do see any problems.

regards, tom lane
Re: Consolidate 'unique array values' logic into a reusable function?
On Wed, Jan 08, 2020 at 02:49:48PM +1300, Thomas Munro wrote:
> On Sun, Dec 29, 2019 at 8:02 PM Noah Misch wrote:
> > ==00:00:00:28.322 1527557== Source and destination overlap in memcpy(0x1000104, 0x1000104, 4)
> > ==00:00:00:28.322 1527557==    at 0x4C2E74D: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1035)
> > ==00:00:00:28.322 1527557==    by 0xA9A57B: qunique (qunique.h:34)
> > ==00:00:00:28.322 1527557==    by 0xA9A843: InitCatalogCache (syscache.c:1056)
> > ==00:00:00:28.322 1527557==    by 0xAB6B18: InitPostgres (postinit.c:682)
> > ==00:00:00:28.322 1527557==    by 0x91F98E: PostgresMain (postgres.c:3909)
> > ==00:00:00:28.322 1527557==    by 0x872DE9: BackendRun (postmaster.c:4498)
> > ==00:00:00:28.322 1527557==    by 0x8725B3: BackendStartup (postmaster.c:4189)
> > ==00:00:00:28.322 1527557==    by 0x86E7F4: ServerLoop (postmaster.c:1727)
> > ==00:00:00:28.322 1527557==    by 0x86E0AA: PostmasterMain (postmaster.c:1400)
> > ==00:00:00:28.322 1527557==    by 0x77CB56: main (main.c:210)
> > ==00:00:00:28.322 1527557==
> > {
> >    <insert_a_suppression_name_here>
> >    Memcheck:Overlap
> >    fun:memcpy@@GLIBC_2.14
> >    fun:qunique
> >    fun:InitCatalogCache
> >    fun:InitPostgres
> >    fun:PostgresMain
> >    fun:BackendRun
> >    fun:BackendStartup
> >    fun:ServerLoop
> >    fun:PostmasterMain
> >    fun:main
> > }
> >
> > This is like the problem fixed in 9a9473f; the precedent from there
> > would be to test src != dst before calling memcpy(), e.g. as attached.
> > I suppose the alternative would be to add a suppression like the one
> > 9a9473f removed.
>
> Thanks for fixing that.

Glad to help.

> > I do wonder why the Valgrind buildfarm animals haven't noticed.
>
> Optimisation levels?  For example, I see that skink is using -Og, at
> which level my local GCC inlines qunique() and memcpy() so that in the
> case you quoted there's a MOV instruction and valgrind has nothing to
> complain about.

That explains it.  I would have been using -O0.  (It would be a nice
bonus to have both an -O0 Valgrind buildfarm animal and an optimized
Valgrind animal, since neither catches everything.)
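For readers following along, a minimal standalone sketch of the guard
pattern described above (hypothetical code, not the actual qunique.h or
the attached patch):

#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the actual qunique.h: deduplicate a sorted
 * int array in place.  The src != dst test mirrors the fix discussed
 * above -- memcpy() with identical source and destination pointers is
 * undefined behavior, which Valgrind reports as "Source and destination
 * overlap in memcpy".
 */
static size_t
unique_sorted_ints(int *elements, size_t nelems)
{
	size_t		i,
				j = 0;

	if (nelems == 0)
		return 0;

	for (i = 1; i < nelems; i++)
	{
		if (elements[i] != elements[j])
		{
			j++;
			if (&elements[j] != &elements[i])	/* skip the self-copy */
				memcpy(&elements[j], &elements[i], sizeof(int));
		}
	}
	return j + 1;
}

When the array has no duplicates, j + 1 always equals i, so every copy
would be a self-copy; the guard skips exactly the case Valgrind flagged.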
Re: refactoring - standardize integer parsing in front-end utilities
Joe Nelson wrote:
> > I see this patch has been moved to the next commitfest.  What's the next
> > step, does it need another review?

Peter Eisentraut wrote:
> I think you need to promote your patch better.

Thanks for taking the time to revive this thread.  Quick sales pitch for
the patch:

* Standardizes bounds checking and error message format in the utilities
  pg_standby, pg_basebackup, pg_receivewal, pg_recvlogical, pg_ctl,
  pg_dump, pg_restore, pg_upgrade, pgbench, reindexdb, and vacuumdb
* Removes undefined behavior caused by atoi(), as described in the C99
  standard, section 7.20.1
* Unifies integer parsing between the front- and back-end using functions
  provided in https://commitfest.postgresql.org/25/2272/

In reality I doubt my patch is fixing any pressing problem; it's just a
small refactor.

> The thread subject and the commit fest entry title are somewhat
> nonsensical and no longer match what the patch actually does.

I thought changing the subject line might be discouraged, but since you
suggest doing it, I just did.  Updated the title of the commitfest entry
https://commitfest.postgresql.org/26/2197/ as well.

> The patch contains no commit message

Does this list not accept plain patches, compatible with git-apply?
(Maybe your point is that I should make it as easy for committers as
possible, and asking them to invent a commit message is just extra
work.)

> and no documentation or test changes

The interfaces of the utilities remain the same.  Same flags.  The only
change would be the error messages produced for incorrect values.  The
tests I ran passed successfully, but perhaps there were others I didn't
try running and should have.

> Moreover, a lot of this error message tweaking is opinion-based, so
> it's more burdensome to do an objective review.  This patch is
> competing for attention against more than 200 other patches that have
> more going for them in these aspects.

True.  I think the code looks nicer and the errors are more informative,
but I'll leave it in the community's hands to determine if this is
something they want.

Once again, I appreciate your taking the time to help me with this
process.
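As a concrete illustration of the atoi() point, here is a hedged sketch
of strtol()-based parsing with explicit range and syntax checking.  This
is not the patch's actual helper (whose name and signature may differ),
just the general shape of the approach:

#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Illustrative sketch only, not the actual helper from the patch or
 * from commitfest entry 2272.  atoi() has undefined behavior when the
 * value is unrepresentable and cannot report errors at all; strtol()
 * lets us detect range overflow, empty input, and trailing garbage.
 */
static bool
parse_int32(const char *str, int *result)
{
	char	   *end;
	long		val;

	errno = 0;
	val = strtol(str, &end, 10);

	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return false;			/* out of range for int */
	if (end == str || *end != '\0')
		return false;			/* no digits, or trailing garbage */

	*result = (int) val;
	return true;
}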
Question regarding heap_multi_insert documentation
While reading the code for heapam.c:heap_multi_insert I happened upon
this comment, which I'm either too thick for, or it lacks a word or two:

 * ..
 * A check here does not definitively prevent a serialization anomaly;
 * that check MUST be done at least past the point of acquiring an
 * exclusive buffer content lock on every buffer that will be affected,
 * and MAY be done after all inserts are reflected in the buffers and
 * those locks are released; otherwise there race condition.  Since
 * multiple buffers can be locked and unlocked in the loop below, and it
 * would not be feasible to identify and lock all of those buffers before
 * the loop, we must do a final check at the end.
 * ..

The part I don't understand is "otherwise there race condition"; it
doesn't sound complete to me as a non-native english speaker.  Should
that really be "otherwise there *is a (potential)* race condition" or
something similar?

cheers ./daniel
Re: Question regarding heap_multi_insert documentation
Daniel Gustafsson writes:
> The part I don't understand is "otherwise there race condition", it doesn't
> sound complete to me as a non-native english speaker.  Should that really be
> "otherwise there *is a (potential)* race condition" or something similar?

I agree, it's missing "is a".

regards, tom lane
Re: Question regarding heap_multi_insert documentation
> On 13 Jan 2020, at 00:25, Tom Lane wrote:
>
> Daniel Gustafsson writes:
>> The part I don't understand is "otherwise there race condition", it doesn't
>> sound complete to me as a non-native english speaker.  Should that really be
>> "otherwise there *is a (potential)* race condition" or something similar?
>
> I agree, it's missing "is a".

Thanks for clarifying.  PFA tiny patch for this.

cheers ./daniel

multiinsert_comment.diff
Re: benchmarking Flex practices
John Naylor writes:
>> I no longer use state variables to track scanner state, and in fact I
>> removed the existing "state_before" variable in ECPG.  Instead, I used
>> the Flex builtins yy_push_state(), yy_pop_state(), and yy_top_state().
>> These have been a feature for a long time, it seems, so I think we're
>> okay as far as portability.  I think it's cleaner this way, and
>> possibly faster.

Hmm ... after a bit of research I agree that these functions are not a
portability hazard.  They are present at least as far back as flex
2.5.33, which is as old as we've got in the buildfarm.

However, I'm less excited about them from a performance standpoint.  The
BEGIN() macro expands to (ordinarily)

	yyg->yy_start = integer-constant

which is surely pretty cheap.  However, yy_push_state is substantially
more expensive than that, not least because the first invocation in a
parse cycle will involve a malloc() or palloc().  Likewise yy_pop_state
is multiple times more expensive than plain BEGIN().

Now, I agree that this is negligible for ECPG's usage, so if
pushing/popping state is helpful there, let's go for it.  But I am not
convinced it's negligible for the backend, and I also don't see that we
actually need to track any nested scanner states there.  So I'd rather
stick to using BEGIN in the backend.  Not sure about psql.

BTW, while looking through the latest patch it struck me that "UCONST"
is an underspecified and potentially confusing name.  It doesn't indicate
what kind of constant we're talking about; for instance, a C programmer
could be forgiven for thinking it means something like "123U".  What do
you think of "USCONST", following UIDENT's lead of prefixing U onto
whatever the underlying token type is?

regards, tom lane
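To make the cost comparison concrete, here is a paraphrase (simplified,
from memory; not the exact code flex emits) of what a flex-generated
reentrant scanner does for each operation:

/*
 * Paraphrased from flex-generated reentrant scanner code, simplified.
 * BEGIN(s) is a single assignment:
 *
 *     yyg->yy_start = 1 + 2 * (s);
 *
 * whereas yy_push_state() must maintain a growable stack, so the first
 * push in a scanner's lifetime allocates:
 *
 *     if (yyg->yy_start_stack_ptr >= yyg->yy_start_stack_depth)
 *         ... grow yyg->yy_start_stack via the scanner's allocator ...
 *     yyg->yy_start_stack[yyg->yy_start_stack_ptr++] = YY_START;
 *     BEGIN(new_state);
 *
 * and yy_pop_state() has to validate and decrement the stack pointer
 * before it can BEGIN() the saved state.
 */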
Re: Using multiple extended statistics for estimates
Hi,

I've pushed these two patches after some minor improvements, mostly to
comments.  I ended up not using the extra tests, as it wasn't clear to
me it's worth the extra duration.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: vacuum verbose detail logs are unclear; log at *start* of each stage; show allvisible/frozen/hintbits
On Sun, Dec 29, 2019 at 01:15:24PM -0500, Jeff Janes wrote:
> On Fri, Dec 20, 2019 at 12:11 PM Justin Pryzby wrote:
> > This is a usability complaint.  If one knows enough about vacuum and/or
> > logging, I'm sure there's no issue.
> > | 11 DEBUG:  "t": found 999 removable, 999 nonremovable row versions in 9 out of 9 pages
>
> I agree the mixture of pre-action and after-action reporting is rather
> confusing sometimes.  I'm more concerned about what the user sees in their
> terminal, though, rather than the server's log file.

Sorry, I ran vacuum (not verbose) with client_min_messages=debug, which
was confusing.

> Also, the above quoted line is confusing.  It makes it sound like it found
> removable items, but didn't actually remove them.  I think that that is
> taking grammatical parallelism too far.  How about something like:
>
> DEBUG: "t": removed 999 row versions, found 999 nonremovable row versions in 9 out of 9 pages

Since da4ed8bf, lazy_vacuum_heap() actually says: "removed %d [row
versions] in %d pages".  Strangely, the "found .. removable, ..
nonremovable" in lazy_scan_heap() is also from da4ed8bf.  Should we
change them to match?

> Also, I'd appreciate a report on how many hint-bits were set
> and how many pages were marked all-visible and/or frozen.

Possibly should fork this part to a different thread, but.. hint bits
are being set by heap_prune_chain():

|#0  HeapTupleSatisfiesVacuum (htup=htup@entry=0x7fffabf0, OldestXmin=OldestXmin@entry=536, buffer=buffer@entry=167) at heapam_visibility.c:1245
|#1  0x7fb6eb3eb848 in heap_prune_chain (prstate=0x7fffabfccf30, OldestXmin=536, rootoffnum=1, buffer=167, relation=0x7fb6eb1e6858) at pruneheap.c:488
|#2  heap_page_prune (relation=relation@entry=0x7fb6eb1e6858, buffer=buffer@entry=167, OldestXmin=536, report_stats=report_stats@entry=false, latestRemovedXid=latestRemovedXid@entry=0x7fb6ed84a13c) at pruneheap.c:223
|#3  0x7fb6eb3f02a2 in lazy_scan_heap (aggressive=false, nindexes=0, Irel=0x0, vacrelstats=0x7fb6ed84a0c0, params=0x7fffabfcdfd0, onerel=0x7fb6eb1e6858) at vacuumlazy.c:970
|#4  heap_vacuum_rel (onerel=0x7fb6eb1e6858, params=0x7fffabfcdfd0, bstrategy=<optimized out>) at vacuumlazy.c:302

In the attached, I moved heap_page_prune to avoid a second loop over
items.  Then, initdb crashed until I avoided calling
heap_prepare_freeze_tuple() for HEAPTUPLE_DEAD.  I'm not sure whether
that's OK, or maybe it's exposing an issue.  I'm also not sure if
t_infomask != oldt_infomask is the right test.

One of my usability complaints was that the DETAIL includes newlines,
which makes it not apparent that it's detail, or that it's associated
with the preceding INFO.  Should those all be separate DETAIL messages?
(Currently, only the first errdetail is used, but maybe they should be
catted together usefully.)  Should errdetail do something with newlines,
like change them to \n\t for output to the client (but not the logfile)?
Should vacuum itself do something (but probably no change to logfiles)?

I remembered that log_statement_stats looks like this:

2020-01-01 11:28:33.758 CST [3916] LOG:  EXECUTOR STATISTICS
2020-01-01 11:28:33.758 CST [3916] DETAIL:  ! system usage stats:
	!	0.050185 s user, 0.000217 s system, 0.050555 s elapsed
	!	[2.292346 s user, 0.215656 s system total]
	[...]

It calls errdetail_internal("%s", str.data), same as vacuum, but the
multi-line detail messages are written like this:

|appendStringInfo(&str, "!\t...")
|...
|ereport(LOG,
|		(errmsg_internal("%s", title),
|		 errdetail_internal("%s", str.data)));

Since they can run multiple times, including rusage, and there's not
currently any message shown before their action, I propose that
lazy_vacuum_index/heap should write VACUUM VERBOSE logs at DEBUG level.
Or otherwise show a log before starting each action, at least those for
which it logs completion.

I'm not sure why this one doesn't use ngettext()?  Missed at a8d585c0?

|appendStringInfo(&buf, _("There were %.0f unused item identifiers.\n"),

Or why this one uses _/gettext()?  (580ddcec suggests that I'm missing
something?)

|appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));

Anyway, now it looks like this:

postgres=# VACUUM VERBOSE t;
INFO:  vacuuming "pg_temp_3.t"
INFO:  "t": removed 1998 row versions in 5 pages
INFO:  "t": removed 1998, found 999 nonremovable row versions in 9 out of 9 pages
DETAIL:  ! 0 dead row versions cannot be removed yet, oldest xmin: 4505
! There were 0 unused item identifiers.
! Skipped 0 pages due to buffer pins, 0 frozen pages.
! 0 pages are entirely empty.
! Marked 9 pages all visible, 4 pages frozen.
! Wrote 1998 hint bits.
! CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Thanks for your input.

Justin

From d27f2cb5262ec3d3511de44ec7c15d1b1235e5ee Mon Sep 17 00:00:00 2001
From: Justin Pryzby
Date: Wed, 27 Nov 2019 20:07:10 -0600
Subject: [PATCH v1 1/6] Rename buf to avoid sh
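A cleaned-up, hedged sketch of that log_statement_stats reporting
pattern (illustrative names and values; not the actual postgres.c code):

#include "postgres.h"
#include "lib/stringinfo.h"

/*
 * Minimal sketch of the pattern quoted above: accumulate the multi-line
 * detail in a StringInfo, then emit it through a single
 * errdetail_internal(), so all "!"-prefixed lines travel as one DETAIL
 * attached to one LOG message.
 */
static void
report_usage_stats(const char *title, double user_s, double system_s)
{
	StringInfoData str;

	initStringInfo(&str);
	appendStringInfoString(&str, "! system usage stats:\n");
	appendStringInfo(&str, "!\t%.6f s user, %.6f s system\n",
					 user_s, system_s);
	ereport(LOG,
			(errmsg_internal("%s", title),
			 errdetail_internal("%s", str.data)));
	pfree(str.data);
}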
Re: bitmaps and correlation
On Mon, Jan 06, 2020 at 11:26:06PM -0600, Justin Pryzby wrote:
> As Jeff has pointed out, high correlation has two effects in cost_index():
> 1) the number of pages read will be less;
> 2) the pages will be read more sequentially;
>
> cost_index reuses the pages_fetched variable, so (1) isn't particularly clear,

I tried to make this more clear in 0001.

> +	cost_per_page_corr = spc_random_page_cost -
> +		(spc_random_page_cost - spc_seq_page_cost)
> +		* (1-correlation*correlation);

And fixed a bug: this should be c*c, not 1-c*c.

From d0819177ef1c6f86a588e3d2700ecff638f83b4a Mon Sep 17 00:00:00 2001
From: Justin Pryzby
Date: Wed, 8 Jan 2020 19:23:51 -0600
Subject: [PATCH v4 1/2] Make more clear the computation of min/max IO..

..and specifically the double use and effect of correlation.
Avoid re-use of the "pages_fetched" variable
---
 src/backend/optimizer/path/costsize.c | 47 +++
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033..bdc23a0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -491,12 +491,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 				csquared;
 	double		spc_seq_page_cost,
 				spc_random_page_cost;
-	Cost		min_IO_cost,
+	double		min_pages_fetched,	/* The min and max page count based on index correlation */
+				max_pages_fetched;
+	Cost		min_IO_cost,		/* The min and max cost based on index correlation */
 				max_IO_cost;
 	QualCost	qpqual_cost;
 	Cost		cpu_per_tuple;
 	double		tuples_fetched;
-	double		pages_fetched;
 	double		rand_heap_pages;
 	double		index_pages;
@@ -579,7 +580,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 	 * (just after a CLUSTER, for example), the number of pages fetched should
 	 * be exactly selectivity * table_size.  What's more, all but the first
 	 * will be sequential fetches, not the random fetches that occur in the
-	 * uncorrelated case.  So if the number of pages is more than 1, we
+	 * uncorrelated case (the index is expected to read fewer pages, *and* each
+	 * page read is cheaper).  So if the number of pages is more than 1, we
 	 * ought to charge
 	 *		spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
 	 * For partially-correlated indexes, we ought to charge somewhere between
@@ -604,17 +606,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 		 * pro-rate the costs for one scan.  In this case we assume all the
 		 * fetches are random accesses.
 		 */
-		pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
+		max_pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
 											baserel->pages,
 											(double) index->pages,
 											root);
 
 		if (indexonly)
-			pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+			max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
 
-		rand_heap_pages = pages_fetched;
+		rand_heap_pages = max_pages_fetched;
 
-		max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+		max_IO_cost = (max_pages_fetched * spc_random_page_cost) / loop_count;
 
 		/*
 		 * In the perfectly correlated case, the number of pages touched by
@@ -626,17 +628,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 		 * where such a plan is actually interesting, only one page would get
 		 * fetched per scan anyway, so it shouldn't matter much.)
 		 */
-		pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+		min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
 
-		pages_fetched = index_pages_fetched(pages_fetched * loop_count,
+		min_pages_fetched = index_pages_fetched(min_pages_fetched * loop_count,
 											baserel->pages,
 											(double) index->pages,
 											root);
 
 		if (indexonly)
-			pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+			min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
 
-		min_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+		min_IO_cost = (min_pages_fetched * spc_random_page_cost) / loop_count;
 	}
 	else
 	{
@@ -644,30 +646,31 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 		 * Normal case: apply the Mackert and Lohman formula, and then
 		 * interpolate between that and the correlation-derived result.
 		 */
-		pages_fetched = index_pages_fetched(tuples_fetched,
+
+		/* For the perfectly uncorrelated case (csquared=0) */
+		max_pages_fetched = index_pages_fetched(tuples_fetched,
 											baserel->pages,
 											(double) index->pages,
 											root);
 
 		if (indexonly)
-			pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+			max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
 
-		rand_heap_pages = pages_fetched;
+
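For context, a small worked example of the interpolation cost_index
performs between those two bounds, weighting by correlation squared (the
formula matches costsize.c; the numbers below are made up purely for
illustration):

#include <stdio.h>

/*
 * Worked example of interpolating between the correlation-derived
 * bounds discussed above:
 *     io_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost)
 * All values here are invented for illustration.
 */
int
main(void)
{
	double		correlation = 0.9;
	double		csquared = correlation * correlation;	/* 0.81 */
	double		min_IO_cost = 10.0;		/* perfectly correlated: few, cheap pages */
	double		max_IO_cost = 400.0;	/* uncorrelated: many random pages */
	double		io_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);

	printf("interpolated I/O cost: %g\n", io_cost);	/* 84.1 */
	return 0;
}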
Re: Questions/Observations related to Gist vacuum
On Thu, Jan 9, 2020 at 4:41 PM Mahendra Singh Thalor wrote:
>
> On Mon, 9 Dec 2019 at 14:37, Amit Kapila wrote:
> >
> > On Mon, Dec 9, 2019 at 2:27 PM Amit Kapila wrote:
> > >
> > > I have modified the patch for the above points and additionally ran
> > > pgindent.  Let me know what you think about the attached patch?
> > >
> >
> > A new version with a slightly modified commit message.
>
> I reviewed the v4 patch and below is one review comment:
>
> + * These are used to memorize all internal and empty leaf pages. They are
> + * used for deleting all the empty pages.
>  */
> After the dot, there should be 2 spaces.  Earlier, there were 2 spaces.
>
> Other than that, the patch looks fine to me.

Thanks for the comment.  Amit has already taken care of this before
pushing it.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Block level parallel vacuum
On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada wrote:
>
> On Sat, 11 Jan 2020 at 13:18, Amit Kapila wrote:
> >
> > On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada wrote:
> > >
> > > On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor wrote:
> > > >
> > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov wrote:
> > > > >
> > > > > Hi
> > > > > Thank you for the update!  I looked again.
> > > > >
> > > > > (vacuum_indexes_leader)
> > > > > +		/* Skip the indexes that can be processed by parallel workers */
> > > > > +		if (!skip_index)
> > > > > +			continue;
> > > > >
> > > > > Does the variable name skip_index not confuse here?  Maybe rename to
> > > > > something like can_parallel?
> > > >
> > > > I also agree with your point.
> > >
> > > I don't think the change is a good idea.
> > >
> > > -		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
> > > -								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
> > > +		bool		can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
> > > +										skip_parallel_vacuum_index(Irel[i],
> > > +																   lps->lvshared));
> > >
> > > The above condition is true when the index can *not* do parallel index
> > > vacuum.  How about changing it to skipped_index and changing the comment
> > > to something like "We are interested in only index skipped parallel
> > > vacuum"?
> > >
> >
> > Hmm, I find the current code and comment better than what you or
> > Sergei are proposing.  I am not sure what is the point of confusion in
> > the current code?
>
> Yeah, the current code is also good.  I just thought they were concerned
> that the variable name skip_index might be confusing because we skip
> if skip_index is NOT true.
>

Okay, would it be better if we get rid of this variable and have code
like below?

/* Skip the indexes that can be processed by parallel workers */
if (!(get_indstats(lps->lvshared, i) == NULL ||
	  skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
	continue;
...

> > > > >
> > > > > Another question about behavior on temporary tables.  Use case: the
> > > > > user commands just "vacuum;" to vacuum the entire database (and has
> > > > > enough maintenance workers).  Vacuum starts fine in parallel, but on
> > > > > the first temporary table we hit:
> > > > >
> > > > > +	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
> > > > > +	{
> > > > > +		ereport(WARNING,
> > > > > +				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
> > > > > +						RelationGetRelationName(onerel))));
> > > > > +		params->nworkers = -1;
> > > > > +	}
> > > > >
> > > > > And therefore we turn off the parallel vacuum for the remaining
> > > > > tables...  Can we improve this case?
> > > >
> > > > Good point.
> > > > Yes, we should improve this.  I tried to fix this.
> > >
> > > +1
> > >
> >
> > Yeah, we can improve the situation here.  I think we don't need to
> > change the value of params->nworkers in the first place if we allow
> > lazy_scan_heap to take care of this.  Also, I think we shouldn't
> > display a warning unless the user has explicitly asked for the parallel
> > option.  See the fix in the attached patch.
>
> Agreed.  But with the updated patch the PARALLEL option without the
> parallel degree doesn't display a warning because params->nworkers = 0
> in that case.  So how about restoring params->nworkers at the end of
> vacuum_rel()?
>

I had also thought on those lines, but I was not entirely sure about
this resetting of workers.  Today, again thinking about it, it seems
the idea Mahendra is suggesting, that is giving an error if the
parallel degree is not specified, seems reasonable to me.  This means
Vacuum (parallel), Vacuum (parallel) <table_name>, etc. will give an
error "parallel degree must be specified".  This idea has merit, as now
we are supporting a parallel vacuum by default, so a 'parallel' option
without a parallel degree doesn't have any meaning.  If we do that,
then we don't need to do anything additional about the handling of
temp tables (other than what the patch is already doing) as well.  What
do you think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Block level parallel vacuum
On Mon, Jan 13, 2020 at 9:20 AM Amit Kapila wrote:
>
> On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada wrote:
> >
> > On Sat, 11 Jan 2020 at 13:18, Amit Kapila wrote:
> > >
> > > On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada wrote:
> > > >
> > > > On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor wrote:
> > > > >
> > > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov wrote:
> > > > > >
> > > > > > Hi
> > > > > > Thank you for the update!  I looked again.
> > > > > >
> > > > > > (vacuum_indexes_leader)
> > > > > > +		/* Skip the indexes that can be processed by parallel workers */
> > > > > > +		if (!skip_index)
> > > > > > +			continue;
> > > > > >
> > > > > > Does the variable name skip_index not confuse here?  Maybe rename to
> > > > > > something like can_parallel?
> > > > >
> > > > > I also agree with your point.
> > > >
> > > > I don't think the change is a good idea.
> > > >
> > > > -		bool		skip_index = (get_indstats(lps->lvshared, i) == NULL ||
> > > > -								  skip_parallel_vacuum_index(Irel[i], lps->lvshared));
> > > > +		bool		can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
> > > > +										skip_parallel_vacuum_index(Irel[i],
> > > > +																   lps->lvshared));
> > > >
> > > > The above condition is true when the index can *not* do parallel index
> > > > vacuum.  How about changing it to skipped_index and changing the comment
> > > > to something like "We are interested in only index skipped parallel
> > > > vacuum"?
> > > >
> > >
> > > Hmm, I find the current code and comment better than what you or
> > > Sergei are proposing.  I am not sure what is the point of confusion in
> > > the current code?
> >
> > Yeah, the current code is also good.  I just thought they were concerned
> > that the variable name skip_index might be confusing because we skip
> > if skip_index is NOT true.
> >
>
> Okay, would it be better if we get rid of this variable and have code
> like below?
>
> /* Skip the indexes that can be processed by parallel workers */
> if (!(get_indstats(lps->lvshared, i) == NULL ||
>       skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
>     continue;
> ...
>
> > > > > >
> > > > > > Another question about behavior on temporary tables.  Use case: the
> > > > > > user commands just "vacuum;" to vacuum the entire database (and has
> > > > > > enough maintenance workers).  Vacuum starts fine in parallel, but on
> > > > > > the first temporary table we hit:
> > > > > >
> > > > > > +	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
> > > > > > +	{
> > > > > > +		ereport(WARNING,
> > > > > > +				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
> > > > > > +						RelationGetRelationName(onerel))));
> > > > > > +		params->nworkers = -1;
> > > > > > +	}
> > > > > >
> > > > > > And therefore we turn off the parallel vacuum for the remaining
> > > > > > tables...  Can we improve this case?
> > > > >
> > > > > Good point.
> > > > > Yes, we should improve this.  I tried to fix this.
> > > >
> > > > +1
> > > >
> > >
> > > Yeah, we can improve the situation here.  I think we don't need to
> > > change the value of params->nworkers in the first place if we allow
> > > lazy_scan_heap to take care of this.  Also, I think we shouldn't
> > > display a warning unless the user has explicitly asked for the parallel
> > > option.  See the fix in the attached patch.
> >
> > Agreed.  But with the updated patch the PARALLEL option without the
> > parallel degree doesn't display a warning because params->nworkers = 0
> > in that case.  So how about restoring params->nworkers at the end of
> > vacuum_rel()?
> >
>
> I had also thought on those lines, but I was not entirely sure about
> this resetting of workers.  Today, again thinking about it, it seems
> the idea Mahendra is suggesting, that is giving an error if the
> parallel degree is not specified, seems reasonable to me.  This means
> Vacuum (parallel), Vacuum (parallel) <table_name>, etc. will give an
> error "parallel degree must be specified".  This idea has merit, as now
> we are supporting a parallel vacuum by default, so a 'parallel' option
> without a parallel degree doesn't have any meaning.  If we do that,
> then we don't need to do anything additional about the handling of
> temp tables (other than what the patch is already doing) as well.  What
> do you think?
>

This idea makes sense to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Comment fix in session.h
This diff fixes what I consider a typo.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

diff --git a/src/include/access/session.h b/src/include/access/session.h
index ac552105b9..4c1f6ffd40 100644
--- a/src/include/access/session.h
+++ b/src/include/access/session.h
@@ -19,7 +19,7 @@ struct SharedRecordTypmodRegistry;
 
 /*
  * A struct encapsulating some elements of a user's session.  For now this
- * manages state that applies to parallel query, but it principle it could
+ * manages state that applies to parallel query, but in principle it could
  * include other things that are currently global variables.
  */
 typedef struct Session
How to make a OpExpr check compatible among different versions
During some of my work on logical rewrite, I want to check whether an
expr is a given Expr, so the simplest way is:

if (expr->opno == 418 &&
    nodeTag(linitial(expr->args)) == T_xxx &&
    nodeTag(lsecond(expr->args)) == T_ )
{
  ..
}

If we write code like the above, we may have issues if the OID changes
in a future version.  So what would be your suggestion?

Thanks
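One possible direction, as a sketch under assumptions (the helper name
is made up, and it presumes the operator of interest can be identified
by name and argument types): use the generated constants from
pg_operator_d.h for built-in operators, or resolve the OID by name at
runtime, rather than hard-coding a bare number like 418.

#include "postgres.h"
#include "catalog/namespace.h"
#include "catalog/pg_operator_d.h"	/* generated macros, e.g. Int4EqualOperator */
#include "catalog/pg_type_d.h"
#include "nodes/pg_list.h"
#include "nodes/primnodes.h"
#include "nodes/value.h"

/*
 * Hypothetical helper, for illustration only.  The generated macro is
 * stable for built-in operators because it tracks pg_operator.dat; the
 * name-based lookup also works for operators whose OIDs are assigned at
 * creation time and so can differ across installations and versions.
 */
static bool
is_int4_equality(OpExpr *expr)
{
	/* Built-in case: compare against the generated constant. */
	if (expr->opno == Int4EqualOperator)
		return true;

	/* General case: resolve the operator by name and argument types. */
	return expr->opno == OpernameGetOprid(list_make1(makeString("=")),
										  INT4OID, INT4OID);
}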