Re: [HACKERS] kqueue
Hi Thomas, On 28/09/2018 00:55, Thomas Munro wrote: > I would like to commit this patch for PostgreSQL 12, based on this > report. We know it helps performance on macOS developer machines and > big FreeBSD servers, and it is the right kernel interface for the job > on principle. Matteo Beccati reported a 5-10% performance drop on a > low-end Celeron NetBSD box which we have no explanation for, and we > have no reports from server-class machines on that OS -- so perhaps we > (or the NetBSD port?) should consider building with WAIT_USE_POLL on > NetBSD until someone can figure out what needs to be fixed there > (possibly on the NetBSD side)? Thanks for keeping me in the loop. Out of curiosity (and time permitting) I'll try to spin up a NetBSD 8 VM and run some benchmarks, but I guess we should leave it up to the pkgsrc people to eventually change the build flags. Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
On 28 September 2018 at 15:12, Michael Paquier wrote:
> On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote:
>> I don't agree that we can skip explaining why one of the optimisations
>> can't be applied just because we've explained why a similar
>> optimisation cannot be applied somewhere close by. I think that the
>> WAL/FSM optimisation can fairly easily be improved on and probably
>> fixed in PG12 as we can just lazily determine per-partition if it can
>> be applied to that partition or not.
>
> Have you guys looked at what the following patch does for partitions and
> how it interacts with it?
> https://commitfest.postgresql.org/19/528/

I've glanced at it. I don't think we're taking anything in the wrong direction. The patch looks like it would need to be rebased if this gets in first.

> The proposed patch is missing the point that documentation also mentions
> the optimizations for COPY with wal_level = minimal:
>
> COPY is fastest when used within the same
> transaction as an earlier CREATE TABLE or
> TRUNCATE command. In such cases no WAL
> needs to be written, because in case of an error, the files
> containing the newly loaded data will be removed anyway.
> However, this consideration only applies when
> wal_level is minimal as all commands
> must write WAL otherwise.

I've edited that in the attached patch. Also reworded a comment that Amit mentioned and made a small change to the COPY FREEZE docs to mention no support for partitioned tables.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

fix_incorrect_setting_of_hi_options_for_partitioned_tables_v2.patch Description: Binary data
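To illustrate the optimization the quoted documentation describes, here is a minimal sketch (hypothetical table and file names, assuming wal_level = minimal). The second block shows the case the thread is about: a partitioned target, for which the patch disables the optimization because rows are routed to partitions that were not themselves created or truncated in this transaction.

-- COPY in the same transaction as the CREATE TABLE / TRUNCATE of its
-- target can skip WAL for the loaded data under wal_level = minimal.
BEGIN;
TRUNCATE plain_table;
COPY plain_table FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;

-- With a partitioned target, the per-partition heaps were not created or
-- truncated in this transaction, so the WAL/FSM-skip optimization is not
-- applied and normal WAL logging is used.
BEGIN;
TRUNCATE parted_table;
COPY parted_table FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;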
Re: Collation versioning
Re: Thomas Munro 2018-09-27 > > > 4. After creating a new database, update that row as appropriate in > > > the new database (!). Or find some other way to write a new table out > > > and switch it around, or something like that. > > > > I've been hatching this exact scheme since the very beginning, even > > thinking about using the background session functionality to do this. > > It would solve a lot of problems, but there is the question of exactly > > how to do that "(!)" part. Making (!) work would also allow reassigning the "public" schema to the database owner. That would fix that gross security gap that is left with the default search_path, while still keeping usability. It would make a whole lot of sense to work on making this feasible. Christoph
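For reference, a minimal sketch (hypothetical role name) of what "reassigning the public schema to the database owner" means concretely; the point of making "(!)" work is that CREATE DATABASE could do this automatically instead of a DBA doing it by hand in every new database:

-- Manual equivalent today, run per database after creation.
ALTER SCHEMA public OWNER TO db_owner;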
Re: Libpq support to connect to standby server as priority
On Thu, Jul 19, 2018 at 10:59 PM Haribabu Kommi wrote:
>
> On Wed, Jul 18, 2018 at 10:53 PM Robert Haas wrote:
>
>> On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe wrote:
>> > What about keeping the first successful connection open and storing it in a
>> > variable if we are in "prefer-read" mode.
>> > If we get the read-only connection we desire, close that cached connection,
>> > otherwise use it.
>>
>> I like this idea. If I recall correctly, the logic in this area is
>> getting pretty complex, so we might need to refactor it for better
>> readability and maintainability.
>>
>
> OK. I will work on the code refactoring first and then provide the
> prefer-read option on top of it.

Commits d1c6a14bacf and 5ca00774194 have refactored the logic of handling the different connection states. Attached is a rebased patch after further refactoring the new option code for easier maintenance.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Allow-taget-session-attrs-to-accept-prefer-read-opti_v3.patch Description: Binary data
Re: transction_timestamp() inside of procedures
On 26/09/2018 23:48, Peter Eisentraut wrote:
> On 26/09/2018 17:54, Alvaro Herrera wrote:
>> What could be the use for the transaction timestamp? I think one of the
>> most important uses (at least in pg_stat_activity) is to verify that
>> transactions are not taking excessively long time to complete; that's
>> known to cause all sorts of trouble in Postgres, and probably other
>> DBMSs too. If we don't accurately measure what it really is, and
>> instead keep the compatibility behavior, we risk panicking people
>> because they think some transaction has been running for a long time
>> when in reality it's just a very long procedure which commits frequently
>> enough not to be a problem.
>
> That's certainly a good argument. Note that if we implemented that the
> transaction timestamp is advanced inside procedures, that would also
> mean that the transaction timestamp as observed in pg_stat_activity
> would move during VACUUM, for example. That might or might not be
> desirable.

Attached is a rough implementation. I'd be mildly in favor of doing this, but we have mentioned tradeoffs in this thread.

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From 2a8c214479adb4b98c8b6c95875d8bebd88cb940 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Fri, 28 Sep 2018 09:33:24 +0200
Subject: [PATCH] Advance transaction timestamp in intra-procedure transactions

---
 src/backend/access/transam/xact.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9aa63c8792..245735420c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1884,14 +1884,19 @@ StartTransaction(void)
 	TRACE_POSTGRESQL_TRANSACTION_START(vxid.localTransactionId);
 
 	/*
-	 * set transaction_timestamp() (a/k/a now()).  We want this to be the same
-	 * as the first command's statement_timestamp(), so don't do a fresh
-	 * GetCurrentTimestamp() call (which'd be expensive anyway).  Also, mark
-	 * xactStopTimestamp as unset.
-	 */
-	xactStartTimestamp = stmtStartTimestamp;
-	xactStopTimestamp = 0;
+	 * set transaction_timestamp() (a/k/a now()).  Normally, we want this to
+	 * be the same as the first command's statement_timestamp(), so don't do a
+	 * fresh GetCurrentTimestamp() call (which'd be expensive anyway).  But
+	 * for transactions started inside statements (e.g., procedure calls), we
+	 * want to advance the timestamp.
+	 */
+	if (xactStartTimestamp < stmtStartTimestamp)
+		xactStartTimestamp = stmtStartTimestamp;
+	else
+		xactStartTimestamp = GetCurrentTimestamp();
 	pgstat_report_xact_timestamp(xactStartTimestamp);
+	/* Mark xactStopTimestamp as unset. */
+	xactStopTimestamp = 0;
 
 	/*
 	 * initialize current transaction state fields
-- 
2.19.0
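To make the user-visible question concrete, here is a minimal sketch (hypothetical table and procedure names, assuming the procedure is called outside an explicit transaction block so that COMMIT is allowed):

CREATE TABLE ts_log (xact_ts timestamptz, stmt_ts timestamptz);

CREATE PROCEDURE log_timestamps() LANGUAGE plpgsql AS $$
BEGIN
    FOR i IN 1..3 LOOP
        PERFORM pg_sleep(1);
        INSERT INTO ts_log VALUES (transaction_timestamp(), statement_timestamp());
        COMMIT;  -- starts a new transaction inside the procedure
    END LOOP;
END;
$$;

CALL log_timestamps();

-- Without the patch, xact_ts is the same in all three rows (the time the
-- CALL's transaction started).  With the patch, xact_ts advances for each
-- intra-procedure transaction and can be newer than the CALL's
-- statement_timestamp().
SELECT * FROM ts_log;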
Re: executor relation handling
On 28 September 2018 at 20:00, Amit Langote wrote: > I've made minor tweaks, which you can find in > the attached updated patches (a .diff file containing changes from v6 to > v7 is also attached). Thanks for looking over the changes. I've looked at the v6 to v7 diff and it seems all good, apart from: + * The following asserts that the necessary lock on the relation I think we maybe should switch the word "assert" for "verifies". The Assert is just checking we didn't get a NoLock and I don't think you're using "assert" meaning the Assert() macro, so likely should be changed to avoid confusion. Apart from that, I see nothing wrong with the patches, so I think we should get someone else to look. I'm marking it as ready for committer. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Removing LEFT JOINs in more cases
On 18 September 2018 at 20:02, Antonin Houska wrote: > Following are the findings from my review. Thanks for reviewing this. Time is short in this commitfest to make any changes, so I'm going to return this with feedback with the intention of addressing the items from your review for the next 'fest. Cheers -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
RE: [HACKERS] Cached plans and statement generalization
On Tuesday, August 7, 2018 at 0:36 AM, Konstantin Knizhnik wrote:
> I have registered the patch for next commitfest.
> For some reason it doesn't find the latest autoprepare-10.patch and still
> refers to autoprepare-6.patch as the latest attachment.

I'm sorry for the late reply. I'm currently reviewing autoprepare. I could not make it in time for the commitfest in September, but I will continue to review for the next commitfest.

The performance tests show good results. The results are shown below.

pgbench -s 100 -c 8 -t 1 -S (average of 3 trials)
- all autoprepare statements use the same memory context: 18052.22706 TPS
- each autoprepare statement uses a separate memory context: 18607.95889 TPS
- calculate memory usage (autoprepare_memory_limit): 19171.60457 TPS

From the above results, I think that adding/changing functions will not affect performance. I am currently checking the behavior when autoprepare_memory_limit is specified.
Re: executor relation handling
On 28 September 2018 at 20:28, Amit Langote wrote: > On 2018/09/28 17:21, David Rowley wrote: >> I think we maybe should switch the word "assert" for "verifies". The >> Assert is just checking we didn't get a NoLock and I don't think >> you're using "assert" meaning the Assert() marco, so likely should be >> changed to avoid confusion. > > Okay, I've revised the text in the attached updated patch. Meh, I just noticed that the WARNING text claims "InitPlan" is the function name. I think it's best to get rid of that. It's pretty much redundant anyway if you do: \set VERBOSITY verbose -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres, fsync, and OSs (specifically linux)
On Thu, Aug 30, 2018 at 2:44 PM Craig Ringer wrote: > On 15 August 2018 at 07:32, Thomas Munro > wrote: >> I will soon post some more fix-up patches that add EXEC_BACKEND >> support, Windows support, and a counting scheme to fix the timing >> issue that I mentioned in my first review. I will probably squash it >> all down to a tidy patch-set after that. I went down a bit of a rabbit hole with the Windows support for Andres's patch set. I have something that works as far as I can tell, but my Windows environment consists of throwing things at Appveyor and seeing what sticks, so I'm hoping that someone with a real Windows system and knowledge will be able to comment. New patches in this WIP patch set: 0012: Fix for EXEC_BACKEND. 0013: Windows. This involved teaching latch.c to deal with Windows asynchronous IO events, since you can't wait for pipe readiness via WSAEventSelect. Pipes and sockets exist in different dimensions on Windows, and there are no "Unix" domain sockets (well, there are but they aren't usable yet[1]). An alternative would be to use TCP sockets for this, and then the code would look more like the Unix code, but that seems a bit strange. Note that the Windows version doesn't actually hand off file handles like the Unix code (it could fairly easily, but there is no reason to think that would actually be useful on that platform). I may be way off here... The 0013 patch also fixes a mistake in the 0010 patch: it is not appropriate to call CFI() while waiting to notify the checkpointer of a dirty segment, because then ^C could cause the following checkpoint not to flush dirty data. SendFsyncRequest() is essentially blocking, except that it uses non-blocking IO so that it multiplex postmaster death detection. 0014: Fix the ordering race condition mentioned upthread[2]. All files are assigned an increasing sequence number after [re]opening (ie before their first write), so that the checkpointer process can track the fd that must have the oldest Linux f_wb_err that could be relevant for writes done by PostgreSQL. The other patches in this tarball are all as posted already, but are now rebased and assembled in one place. Also pushed to https://github.com/macdice/postgres/tree/fsyncgate . Thoughts? [1] https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/ [2] https://www.postgresql.org/message-id/CAEepm%3D04ZCG_8N3m61kXZP-7Ecr02HUNNG-QsAhwyFLim4su2g%40mail.gmail.com -- Thomas Munro http://www.enterprisedb.com fsyncgate-v3.tgz Description: GNU Zip compressed data
Re: Postgres, fsync, and OSs (specifically linux)
On Fri, Sep 28, 2018 at 9:37 PM Thomas Munro wrote: > The 0013 patch also fixes a mistake in the 0010 patch: it is not > appropriate to call CFI() while waiting to notify the checkpointer of > a dirty segment, because then ^C could cause the following checkpoint > not to flush dirty data. (Though of course it wouldn't actually do that due to an LWLock being held, but still, I removed the CFI because it was at best misleading). -- Thomas Munro http://www.enterprisedb.com
Re: Slotification of partition tuple conversion
On Wed, 26 Sep 2018 at 03:33, Andres Freund wrote: > > Hi Amit, > > Could you rebase this patch, it doesn't apply anymore. Thanks for letting me know. Both my patch and Amit Langote's patch are rebased and attached ... -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company v2-rebased-0001-Allocate-dedicated-slots-of-partition-tuple-conve.patch Description: Binary data v4_rebased-0002-Slotify-partition-tuple-conversion.patch Description: Binary data
Re: Slotification of partition tuple conversion
On 2018/09/28 19:06, Amit Khandekar wrote: > On Wed, 26 Sep 2018 at 03:33, Andres Freund wrote: >> >> Hi Amit, >> >> Could you rebase this patch, it doesn't apply anymore. > > Thanks for informing. Attached are both mine and Amit Langote's patch > rebased and attached ... Thanks Amit for also taking care of the other one. I don't have time today, but will try to take a look on Monday. Regards, Amit
Re: transction_timestamp() inside of procedures
On Wed, Sep 26, 2018 at 09:23:58PM +0200, Daniel Verite wrote: > Tom Lane wrote: > > > I agree that it would be surprising for transaction timestamp to be newer > > than statement timestamp. > > To me it's more surprising to start a new transaction and having > transaction_timestamp() still pointing at the start of a previous > transaction. > This feels like a side-effect of being spawned by a procedure, > and an exception to what transaction_timestamp() normally means > or meant until now. > > OTOH transaction_timestamp() being possibly newer than > statement_timestamp() seems like a normal consequence of > transactions being created intra-statement. Yes, that is a good point. My thought has always been that statements are inside of transactions, but the opposite is now possible. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: Postgres 11 release notes
On Fri, Sep 28, 2018 at 10:21:16AM +0900, Amit Langote wrote: > On 2018/09/27 23:24, Alvaro Herrera wrote: > > On 2018-Sep-27, Amit Langote wrote: > > > >> Sorry I couldn't reply sooner, but the following of your proposed text > >> needs to be updated a bit: > >> > >> + > >> + > >> + Having a "default" partition for storing data that does not > >> match a > >> + partition key > >> + > >> + > >> > >> I think "does not match a partition key" is not accurate. Description of > >> default partitions further below in the release notes says this: > >> > >> "The default partition can store rows that don't match any of the other > >> defined partitions, and is searched accordingly." > >> > >> So, we could perhaps write it as: > >> > >> Having a "default" partition for storing data that does not match any of > >> the remaining partitions > > > > Yeah, I agree that "a partition key" is not the right term to use there > > (and that term is used in the press release text also). However I don't > > think "remaining" is the right word there either, because it sounds as > > if you're removing something. > > > > For the Spanish translation of the press release, we ended up using the > > equivalent of "for the data that does not match any other partition". > > Yeah, "any other partition" is what the existing description uses too, so: > > Having a "default" partition for storing data that does not match any > other partition Uh, what text are you talking about? I see this text in the release notes since May: The default partition can store rows that don't match any of the other defined partitions, and is searched accordingly. The press release? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
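For reference, the behavior that the release-note sentence is trying to describe, as a minimal sketch with hypothetical table names:

CREATE TABLE measurements (city text, logdate date) PARTITION BY LIST (city);
CREATE TABLE measurements_paris PARTITION OF measurements FOR VALUES IN ('Paris');
CREATE TABLE measurements_tokyo PARTITION OF measurements FOR VALUES IN ('Tokyo');
CREATE TABLE measurements_other PARTITION OF measurements DEFAULT;

-- A row that does not match any other partition is routed to the default
-- partition, and queries scan the default partition accordingly.
INSERT INTO measurements VALUES ('Berlin', '2018-09-28');
SELECT tableoid::regclass, * FROM measurements;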
Re: Problem while setting the fpw with SIGHUP
On Fri, Sep 28, 2018 at 4:23 AM Michael Paquier wrote: > > On Thu, Sep 27, 2018 at 07:38:31PM +0530, Amit Kapila wrote: > > Okay, I am planning to commit the attached patch tomorrow unless you > > or anybody else has any objections to it. > > None from here. Thanks for taking care of it. > Thanks, pushed! I have backpatched till 9.5 as this has been introduced by the commit 2c03216d83 which added the XLOG machinery initialization in RecoveryInProgress code path. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Build src/port files as a library with -fPIC, and use that in li
Re: Tom Lane 2018-09-27
> Build src/port files as a library with -fPIC, and use that in libpq.

This made the "pqsignal" symbol disappear from libpq5.so:

13:27:55 dpkg-gensymbols: error: some symbols or patterns disappeared in the symbols file: see diff output below
13:27:55 dpkg-gensymbols: warning: debian/libpq5/DEBIAN/symbols doesn't match completely debian/libpq5.symbols
13:27:55 --- debian/libpq5.symbols (libpq5_12~~devel~20180928.1058-1~226.git92a0342.pgdg+1_amd64)
13:27:55 +++ dpkg-gensymbolsoXZn54 2018-09-28 11:27:55.499157237 +
13:27:55 @@ -168,7 +168,7 @@
13:27:55  pg_valid_server_encoding@Base 0
13:27:55  pg_valid_server_encoding_id@Base 8.3~beta1-2~
13:27:55  pgresStatus@Base 0
13:27:55 - pqsignal@Base 0
13:27:55 +#MISSING: 12~~devel~20180928.1058-1~226.git92a0342.pgdg+1# pqsignal@Base 0
13:27:55  printfPQExpBuffer@Base 0
13:27:55  resetPQExpBuffer@Base 0
13:27:55  termPQExpBuffer@Base 0

Is this a problem for libpq5 users?

Christoph
Re: clarify documentation of BGW_NEVER_RESTART ?
On Wed, Sep 26, 2018 at 3:17 AM Chapman Flack wrote: > > I did not notice until today that there is some ambiguity in > this paragraph: > > bgw_restart_time is the interval, in seconds, that postgres should > wait before restarting the process, in case it crashes. It can be > any positive value, or BGW_NEVER_RESTART, indicating not to restart > the process in case of a crash. > > I had been reading "in case _it_ crashes" and "in case of _a_ crash" > as "in case _the background worker_ crashes", so I assumed with > BGW_NEVER_RESTART I was saying I don't want my worker restarted if > _it_ flakes out while PG is otherwise operating normally. > > But I was surprised when the unrelated crash of a different, normal > backend left my background worker killed and never restarted. I had > always regarded the fatal-error kick-out-all-backends-and-recover > handling as essentially equivalent to a PG restart, so I had expected > it to start the bgworker over just as a real restart would. > > But sure enough, ResetBackgroundWorkerCrashTimes() gets called in > that case, and treats every worker with BGW_NEVER_RESTART as gone > and forgotten. So it seems the "it" in "it crashes" can be "the > background worker" or "postgres itself" or "any shmem-connected > backend". > I think that kind of wording might suit for BGW_NEVER_RESTART value, but for any positive value, the current wording appears fine to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] kqueue
On Fri, Sep 28, 2018 at 11:09 AM Andres Freund wrote: > On 2018-09-28 10:55:13 +1200, Thomas Munro wrote: > > Matteo Beccati reported a 5-10% performance drop on a > > low-end Celeron NetBSD box which we have no explanation for, and we > > have no reports from server-class machines on that OS -- so perhaps we > > (or the NetBSD port?) should consider building with WAIT_USE_POLL on > > NetBSD until someone can figure out what needs to be fixed there > > (possibly on the NetBSD side)? > > Yea, I'm not too worried about that. It'd be great to test that, but > otherwise I'm also ok to just plonk that into the template. Thanks for the review! Ok, if we don't get a better idea I'll put this in src/template/netbsd: CPPFLAGS="$CPPFLAGS -DWAIT_USE_POLL" > > @@ -576,6 +592,10 @@ CreateWaitEventSet(MemoryContext context, int nevents) > > if (fcntl(set->epoll_fd, F_SETFD, FD_CLOEXEC) == -1) > > elog(ERROR, "fcntl(F_SETFD) failed on epoll descriptor: %m"); > > #endif /* > > EPOLL_CLOEXEC */ > > +#elif defined(WAIT_USE_KQUEUE) > > + set->kqueue_fd = kqueue(); > > + if (set->kqueue_fd < 0) > > + elog(ERROR, "kqueue failed: %m"); > > #elif defined(WAIT_USE_WIN32) > > Is this automatically opened with some FD_CLOEXEC equivalent? No. Hmm, I thought it wasn't necessary because kqueue descriptors are not inherited and backends don't execve() directly without forking, but I guess it can't hurt to add a fcntl() call. Done. > > + *((WaitEvent **)(&k_ev->udata)) = event; > > I'm mildly inclined to hide that behind a macro, so the other places > have a reference, via the macro definition, to this too. Done. > > + if (rc < 0 && event->events == WL_POSTMASTER_DEATH && errno == ESRCH) > > + { > > + /* > > + * The postmaster is already dead. Defer reporting this to > > the caller > > + * until wait time, for compatibility with the other > > implementations. > > + * To do that we will now add the regular alive pipe. > > + */ > > + WaitEventAdjustKqueueAdd(&k_ev[0], EVFILT_READ, EV_ADD, > > event); > > + rc = kevent(set->kqueue_fd, &k_ev[0], count, NULL, 0, NULL); > > + } > > That's, ... not particulary pretty. Kinda wonder if we shouldn't instead > just add a 'pending_events' field, that we can check at wait time. Done. > > +/* Define to 1 if you have the `kqueue' function. */ > > +#undef HAVE_KQUEUE > > + > Should adjust pg_config.win32.h too. Done. -- Thomas Munro http://www.enterprisedb.com 0001-Add-kqueue-2-support-for-WaitEventSet-v11.patch Description: Binary data
Re: Progress reporting for pg_verify_checksums
Hi, On Wed, Sep 26, 2018 at 10:46:06AM +0200, Fabien COELHO wrote: > The xml documentation should be updated! (It is kind of hard to notice what > is not there:-) > > See "doc/src/sgml/ref/pg_verify_checksums.sgml". Right, I've added a paragraph. > >>The time() granularity to the second makes the result awkward on small > >>tests: > >> > >> 8/1554552 kB (0%, 8 kB/s) > >> 1040856/1554552 kB (66%, 1040856 kB/s) > >> 1554552/1554552 kB (100%, 1554552 kB/s) > >> > >>I'd suggest to reuse the "portability/instr_time.h" stuff which allows a > >>much better precision. > > I still think it would make sense to use that instead of low-precision > time(). As the use-case of this is not for small tests, I don't think it is useful to make the code more complicated for this. > >>The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying > >>1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are > >>about storage which is usually measured in powers of 1,000, so I'd suggest > >>to use thousands. > >> > >>Another reserve I have is that on any hardware it is likely that there will > >>be a lot of kilos, so maybe megas (MB) should be used instead. > > > >The display is exactly the same as in pg_basebackup (except the > >bandwith is shown as well), so right now I think it is more useful to be > >consistent here. > > Hmmm... units are units, and the display is wrong. The fact that other > commands are wrong as well does not look like a good argument to persist in > an error. I've had a look around, and "kB" seems to be a common unit for 1024 bytes, e.g. in pg_test_fsync etc., unless I am mistaken? > >So if we change that, pg_basebackup should be changed as well I think. > > I'm okay with fixing pg_basebackup as well! I'm unsure about the best place > to put such a function, though. Probably under "src/common/" where there is > already some front-end specific code ("fe_memutils.c"). > > >Maybe the code could be factored out to some common file in the future. > > I would be okay with a progress printing function shared between some > commands. It just needs some place. pg_rewind also has a --rewind option. I guess you mean pg_rewind also has a --progress option. I do agree it makes sense to refactor that, but I don't think this should be part of this patch. > >> + memset(&act, '\0', sizeof(act)); > >> > >>pg uses 0 instead of '\0' everywhere else. > > > >Ok. > > Not '0', plain 0 (the integer of value zero). Oops, thanks for spotting that. I've attached v4 of the patch. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz diff --git a/doc/src/sgml/ref/pg_verify_checksums.sgml b/doc/src/sgml/ref/pg_verify_checksums.sgml index 905b8f1222..5550b7b2bf 100644 --- a/doc/src/sgml/ref/pg_verify_checksums.sgml +++ b/doc/src/sgml/ref/pg_verify_checksums.sgml @@ -80,6 +80,19 @@ PostgreSQL documentation + -P + --progress + + +Enable progress reporting. Turning this on will deliver an approximate +progress report during the checksum verification. Sending the +SIGUSR1 signal will toggle progress reporting +on or off during the verification run. 
[Remainder of the attached v4 patch -- the changes to src/bin/pg_verify_checksums/pg_verify_checksums.c adding the -P/--progress option, the progress-tracking counters, and the SIGUSR1 toggle handler -- is truncated in the archived copy.]
Re: pgsql: Build src/port files as a library with -fPIC, and use that in li
Christoph Berg writes: > Re: Tom Lane 2018-09-27 >> Build src/port files as a library with -fPIC, and use that in libpq. > This made the "pqsignal" symbol disappear from libpq5.so: Oh, interesting. I'd seen an actual error on prairiedog, but apparently some other linkers just silently omit the export, if the symbol is in a .a file rather than .o. > Is this is a problem for libpq5 users? I proposed in https://www.postgresql.org/message-id/19581.1538077...@sss.pgh.pa.us that we should remove pqsignal, as well as pg_utf_mblen pg_encoding_to_char pg_char_to_encoding pg_valid_server_encoding pg_valid_server_encoding_id from libpq's exports, on the grounds that (a) nobody should be using those (they're undocumented as exports), and (b) anybody who is using them should get them from libpgport/libpgcommon instead. Thoughts? regards, tom lane
Re: pgsql: Build src/port files as a library with -fPIC, and use that in li
Re: Tom Lane 2018-09-28 <19404.1538140...@sss.pgh.pa.us> > > Is this is a problem for libpq5 users? > > I proposed in > https://www.postgresql.org/message-id/19581.1538077...@sss.pgh.pa.us > > that we should remove pqsignal, as well as > pg_utf_mblen > pg_encoding_to_char > pg_char_to_encoding > pg_valid_server_encoding > pg_valid_server_encoding_id > > from libpq's exports, on the grounds that (a) nobody should be using > those (they're undocumented as exports), and (b) anybody who is using > them should get them from libpgport/libpgcommon instead. Thoughts? I'm fine with that if we say (a) should be true, and even if it is not, (b) is an easy workaround. Bumping the libpq SONAME just because of this seems excessive. On the Debian side, I can simply remove the symbol from the tracking file and the buildsystem will be happy again. Christoph
Re: pgsql: Build src/port files as a library with -fPIC, and use that in li
Christoph Berg writes: > Re: Tom Lane 2018-09-28 <19404.1538140...@sss.pgh.pa.us> >> I proposed in >> https://www.postgresql.org/message-id/19581.1538077...@sss.pgh.pa.us >> >> that we should remove pqsignal, as well as >> pg_utf_mblen >> pg_encoding_to_char >> pg_char_to_encoding >> pg_valid_server_encoding >> pg_valid_server_encoding_id >> >> from libpq's exports, on the grounds that (a) nobody should be using >> those (they're undocumented as exports), and (b) anybody who is using >> them should get them from libpgport/libpgcommon instead. Thoughts? > I'm fine with that if we say (a) should be true, and even if it is > not, (b) is an easy workaround. Bumping the libpq SONAME just because > of this seems excessive. Yeah, if anyone insists that this requires a soname bump, I'd probably look for another solution. Making the makefiles a bit cleaner is not worth the churn that would cause, IMO. The place where we'd probably end up if anyone's unhappy about this is that we'd still be symlinking the three files pqsignal.c, wchar.c, and encnames.c into libpq. That's not very desirable, but at least it'd be a fixed list rather than something we're continually having to change. regards, tom lane
Re: executor relation handling
Hi, On 9/28/18 4:58 AM, Amit Langote wrote: Okay, I've revised the text in the attached updated patch. Meh, I just noticed that the WARNING text claims "InitPlan" is the function name. I think it's best to get rid of that. It's pretty much redundant anyway if you do: \set VERBOSITY verbose Oops, good catch that one. Removed "InitPlan: " from the message in the attached. I have looked at the patch (v9), and have no further comments. I can confirm a speedup in the SELECT FOR SHARE case. Thanks for working on this ! Best regards, Jesper
Re: buildfarm and git pull
On 09/27/2018 11:32 AM, Alexander Kuzmenkov wrote:

Hi Andrew,

I have a question about how buildfarm works with git, could you please help? We use buildfarm locally at PGPro to test our branches, and it breaks when I rebase and force push to the repository. To get the remote changes, buildfarm does 'git checkout .' followed by 'git pull', and the latter breaks when the remote branch was rebased. I was wondering if the buildfarm really has to do 'git pull'? Pull is supposed to be used to integrate local changes with remote ones, but buildfarm doesn't have any local changes, does it? It just has to check out the remote branch as-is. To do that, when the state of the working directory is not known, I'd do the following commands:

git fetch                          # get the remote changes
git checkout -f <remote>/<branch>  # check out the needed remote branch; on conflict, use the remote files
git reset --hard                   # revert all modifications in tracked files
git clean -xfd                     # recursively delete all unversioned and ignored files

Do you think this approach is correct or am I missing something?

Possibly. It seems a little violent. We don't do rebase + forced push in Postgres - it's something of a no-no in public repositories according to my understanding. Send me a patch and I'll take a look at it.

cheers

andrew

-- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Let's stop with the retail rebuilds of src/port/ files already
On 09/27/2018 03:48 PM, Tom Lane wrote: I wrote: Here's a partial patch for that: it adds the third build variant to src/port/ and teaches libpq to use it. We'd want to likewise modify src/common/ and fix up other callers such as ecpg, but this seems to be enough to test whether the idea works or not. ... What I think would make sense is to push this and see what the buildfarm thinks of it. If there are unfixable problems then we won't have wasted time fleshing out the concept. Otherwise, I'll do the remaining pieces. Well, the buildfarm did turn up one problem: on really old macOS (cf prairiedog) the libpq link step fails with ld: symbols names listed in -exported_symbols_list: exports.list not in linked objects _pqsignal Apparently, with that linker, the exported symbols list is resolved against just what is found in the listed *.o files, not anything pulled in from a library file. Now, the question that raises in my mind is why is libpq.so exporting pqsignal() at all? Probably there was once a reason for it, but nowadays I would think that any client program using pqsignal() should get it from -lpgport, not from libpq. Having more than one place to get it from seems more likely to create issues than solve them. And we certainly do not document it as a function available from libpq. So my recommendation is to remove pqsignal from libpq's exports.txt. I've verified that prairiedog builds happily with that change, confirming my expectation that all consumers of the symbol can get it from someplace else. Now, if we go forward with that solution, there will be issues with some other things that libpq exports without having defined itself: src/backend/utils/mb/wchar.c: pg_utf_mblen src/backend/utils/mb/encnames.c: pg_encoding_to_char pg_char_to_encoding pg_valid_server_encoding pg_valid_server_encoding_id Now, I'd already had my eye on those two files, because after applying a similar fix for src/common/, those two files would be the only ones that libpq needs to symlink from somewhere else. What I was thinking of proposing was to move those two files out of the backend and into src/common/, thereby normalizing their status as modules available in both frontend and backend, and removing the need for a special build rule for them in libpq. (initdb could be simplified too.) Per this discovery, we'd need to also remove these symbols from libpq's exports list, meaning that clients *must* get them from -lpgcommon not from libpq. There's a small chance that this'd break third-party clients that are using these symbols out of libpq. We've never documented them as being available, but somebody might be using them anyway. If that does happen, it could be repaired by linking against -lpgcommon along with libpq, but it'd possibly still make people unhappy. Seems a small enough price to pay. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
On 09/28/2018 06:02 AM, Amit Langote wrote: > On 2018/09/28 12:12, Michael Paquier wrote: >> On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote: >>> I don't agree that we can skip explaining why one of the optimisations >>> can't be applied just because we've explained why a similar >>> optimisation cannot be applied somewhere close by. I think that the >>> WAL/FSM optimisation can fairly easily be improved on and probably >>> fixed in PG12 as we can just lazily determine per-partition if it can >>> be applied to that partition or not. >> >> Have you guys looked at what the following patch does for partitions and >> how it interacts with it? >> https://commitfest.postgresql.org/19/528/ > > Just looked at that patch and noticed that the following hunk won't cope > if COPY's target table is partitioned: > > diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c > index 7674369613..7b9a7af2d2 100644 > --- a/src/backend/commands/copy.c > +++ b/src/backend/commands/copy.c > @@ -2416,10 +2416,8 @@ CopyFrom(CopyState cstate) > { > hi_options |= HEAP_INSERT_SKIP_FSM; > > - if (!XLogIsNeeded() && > - cstate->rel->trigdesc == NULL && > - RelationGetNumberOfBlocks(cstate->rel) == 0) > - hi_options |= HEAP_INSERT_SKIP_WAL; > + if (!XLogIsNeeded() && RelationGetNumberOfBlocks(cstate->rel) > == 0) > + hi_options |= HEAP_INSERT_SKIP_WAL; > } > > /* > > RelationGetNumberOfBlocks would blow up if passed a partitioned table to it. > > Applying David's patch will take care of that though. > >> The proposed patch is missing the point that documentation also mentions >> the optimizations for COPY with wal_level = minimal: >> >> COPY is fastest when used within the same >> transaction as an earlier CREATE TABLE or >> TRUNCATE command. In such cases no WAL >> needs to be written, because in case of an error, the files >> containing the newly loaded data will be removed anyway. >> However, this consideration only applies when >> is minimal as all >> commands >> must write WAL otherwise. >> > > I might be wrong but I'm not sure if we should mention here that this > optimization is not applied to partitioned tables due to what appears to > be a implementation-level limitation? > Aren't most limitations implementation-level? ;-) IMHO it's entirely appropriate to mention this in user-level docs. Based on the current wording, it's quite natural to assume the optimization applies to both partitioned and non-partitioned tables. And in fact it does not. It will still work for individual partitions, though. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
On 19/09/2018 20:23, Peter Geoghegan wrote: > Attached is v5, So. I don't know much about the btree code, so don't believe anything I say. I was very interested in the bloat test case that you posted on 2018-07-09 and I tried to understand it more. The current method for inserting a duplicate value into a btree is going to the leftmost point for that value and then move right until we find some space or we get "tired" of searching, in which case just make some space right there. The problem is that it's tricky to decide when to stop searching, and there are scenarios when we stop too soon and repeatedly miss all the good free space to the right, leading to bloat even though the index is perhaps quite empty. I tried playing with the getting-tired factor (it could plausibly be a reloption), but that wasn't very successful. You can use that to postpone the bloat, but you won't stop it, and performance becomes terrible. You propose to address this by appending the tid to the index key, so each key, even if its "payload" is a duplicate value, is unique and has a unique place, so we never have to do this "tiresome" search. This makes a lot of sense, and the results in the bloat test you posted are impressive and reproducible. I tried a silly alternative approach by placing a new duplicate key in a random location. This should be equivalent since tids are effectively random. I didn't quite get this to fully work yet, but at least it doesn't blow up, and it gets the same regression test ordering differences for pg_depend scans that you are trying to paper over. ;-) As far as the code is concerned, I agree with Andrey Lepikhov that one more abstraction layer that somehow combines the scankey and the tid or some combination like that would be useful, instead of passing the tid as a separate argument everywhere. I think it might help this patch move along if it were split up a bit, for example 1) suffix truncation, 2) tid stuff, 3) new split strategies. That way, it would also be easier to test out each piece separately. For example, how much space does suffix truncation save in what scenario, are there any performance regressions, etc. In the last few versions, the patches have still been growing significantly in size and functionality, and most of the supposed benefits are not readily visible in tests. And of course we need to think about how to handle upgrades, but you have already started a separate discussion about that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
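For readers who want to experiment, a minimal sketch (hypothetical names, not the exact 2018-07-09 test case) of the kind of duplicate-heavy workload where the "getting tired" heuristic can fail to reuse freed space in a low-cardinality secondary index:

CREATE TABLE dup_test (id serial PRIMARY KEY, val integer);
CREATE INDEX dup_test_val_idx ON dup_test (val);

-- Many duplicates per key value in the secondary index.
INSERT INTO dup_test (val) SELECT g % 10 FROM generate_series(1, 1000000) g;
DELETE FROM dup_test WHERE id % 2 = 0;
VACUUM dup_test;
INSERT INTO dup_test (val) SELECT g % 10 FROM generate_series(1, 500000) g;

-- Compare before and after to see how much of the freed space was reused.
SELECT pg_size_pretty(pg_relation_size('dup_test_val_idx'));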
Re: On-disk compatibility for nbtree-unique-key enhancement
On 21/09/2018 01:18, Peter Geoghegan wrote: > * This means that there is a compatibility issue for anyone that is > already right on the threshold -- we *really* don't want to see a > REINDEX fail, but that seems like a possibility that we need to talk > about now. When would the REINDEX need to happen? Will the code still be able to read and write v3 btrees? Could there perhaps be an amcheck or pageinspect feature that tells you ahead of time if there are too large items in an old index? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: C99 compliance for src/port/snprintf.c
Hi, On 2018-08-25 13:08:18 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2018-08-16 11:41:30 -0400, Tom Lane wrote: > >> Andres Freund writes: > >>> While I'd personally have no problem kicking gcc 3.4 to the curb, I'm > >>> still confused what causes this error mode. Kinda looks like > >>> out-of-sync headers with gcc or something. > > >> Yeah, this is *absolutely* unsurprising for a non-native gcc installation > >> on an old platform. > > > Sure, but that still requires the headers to behave differently between > > C89 and C99 mode, as this worked before. But it turns out there's two > > different math.h implementation headers, depending on c99 being enabled > > (math_c99.h being the troublesome). If I understand correctly the > > problem is more that the system library headers are *newer* (and assume > > a sun studio emulating/copying quite a bit of gcc) than the gcc that's > > being used, and therefore gcc fails. > > I have some more info on this issue, based on having successfully > updated "gaur" using gcc 3.4.6 (which I picked because it was the last > of the 3.x release series). It seems very unlikely that there's much > difference between 3.4.3 and 3.4.6 as far as external features go. > What I find in the 3.4.6 documentation is > > -- Built-in Function: double __builtin_inf (void) > Similar to `__builtin_huge_val', except a warning is generated if > the target floating-point format does not support infinities. > This function is suitable for implementing the ISO C99 macro > `INFINITY'. > > Note that the function is called "__builtin_inf", whereas what we see > protosciurus choking on is "__builtin_infinity". So I don't think this > is a version skew issue at all. I think that the system headers are > written for the Solaris cc, and its name for the equivalent function is > __builtin_infinity, whereas what gcc wants is __builtin_inf. Likewise, > the failures we see for __builtin_isinf and __builtin_isnan are because > Solaris cc provides those but gcc does not. > > If we wanted to keep protosciurus going without a compiler update, my > thought would be to modify gcc's copy of math_c99.h to correct the > function name underlying INFINITY, and change the definitions of isinf() > and isnan() back to whatever was being used pre-C99. > > It's possible that newer gcc releases have been tweaked so that they > make appropriate corrections in this header file automatically, but > that's not a sure thing. I've pinged Dave about this, and he said: On 2018-09-26 17:04:29 -0400, Dave Page wrote: > Unfortunately, i think that whole machine is basically EOL now. It's > already skipping OpenSSL for some builds, as being stuck on a very old > version of the buildfarm client, as some of the modules used in newer > versions just won't compile or work. I don't have any support contract, or > access to newer versions of SunStudio, and the guys that used to package > GCC for Solaris now charge to download their packages. > > I could potentially build my own version of GCC, but I question whether > it's really worth it, given the other problems. He's now disabled building master on protosciurus and casteroides. We still have damselfly and rover_firefly so I don't feel too bad about that. I've pinged their owners to ask whether they could set up a sun studio (or however that's called in their solaris descendants) version. Greetings, Andres Freund
Re: master, static inline and #ifndef FRONTEND
Hi, On 2018-09-10 09:50:15 -0700, Andres Freund wrote: > On 2018-09-10 12:39:16 -0400, Tom Lane wrote: > > Andres Freund writes: > > > In a recent commit [1] I added a static inline which castoroides > > > dislikes: ... > > > It's obviously trivial to fix this case with by adding an #ifndef > > > FRONTEND. But as castoroides appears to be the only animal showing the > > > problem - after subtracting the animals dead due to the C99 requirement > > > - I wonder if we shouldn't just require "proper" static inline > > > support. castoroides runs a 13yo OS, and newer compilers that do not > > > have the problem are readily available. > > > > Given that we currently have *no* working Solaris buildfarm members, > > I'm not prepared to tolerate "I can't be bothered to fix this", > > which is what your argument boils down to. > > Uh, this seems unnecessarily dismissive. I wrote the above email minutes > after noticing the issue ( which in turn was shortly after the issue > occured), asking for feedback. Hardly "I can't be bothered to fix it" > territory. But adding a lot of #ifndef FRONTENDs isn't entirely free > either... Fwi, I've pinged Dave about upgrading the compiler on that machine, and he wrote: On 2018-09-26 17:04:29 -0400, Dave Page wrote: > Unfortunately, i think that whole machine is basically EOL now. It's > already skipping OpenSSL for some builds, as being stuck on a very old > version of the buildfarm client, as some of the modules used in newer > versions just won't compile or work. I don't have any support contract, or > access to newer versions of SunStudio, and the guys that used to package > GCC for Solaris now charge to download their packages. he has subsequently disabled building master on protosciurus and casteroides. Greetings, Andres Freund
Re: executor relation handling
On 2018-Sep-28, Amit Langote wrote: > On 2018/09/28 17:48, David Rowley wrote: > > Meh, I just noticed that the WARNING text claims "InitPlan" is the > > function name. I think it's best to get rid of that. It's pretty much > > redundant anyway if you do: \set VERBOSITY verbose > > Oops, good catch that one. Removed "InitPlan: " from the message in the > attached. Were there two cases of that? Because one still remains. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: SQL/JSON: documentation
On 28/06/2018 01:36, Nikita Glukhov wrote: > Attached patch with draft of SQL/JSON documentation written by > Liudmila Mantrova, Oleg Bartunov and me. > > Also it can be found in our sqljson repository on sqljson_doc branch: > https://github.com/postgrespro/sqljson/tree/sqljson_doc > > We continue to work on it. Some structural comments: - I don't think this should be moved to a separate file. Yes, func.sgml is pretty big, but if we're going to split it up, we should do it in a systematic way, not just one section. - The refentries are not a bad idea, but again, if we just used them for this one section, the navigation will behave weirdly. So I'd do it without them, just using normal subsections. - Stick to one-space indentation in XML. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
On Fri, Sep 28, 2018 at 7:50 AM Peter Eisentraut wrote: > So. I don't know much about the btree code, so don't believe anything I > say. I think that showing up and reviewing this patch makes you somewhat of an expert, by default. There just isn't enough expertise in this area. > I was very interested in the bloat test case that you posted on > 2018-07-09 and I tried to understand it more. Up until recently, I thought that I would justify the patch primarily as a project to make B-Trees less bloated when there are many duplicates, with maybe as many as a dozen or more secondary benefits. That's what I thought it would say in the release notes, even though the patch was always a broader strategic thing. Now I think that the TPC-C multiple insert point bloat issue might be the primary headline benefit, though. I hate to add more complexity to get it to work well, but just look at how much smaller the indexes are following an initial bulk load (bulk insertions) using my working copy of the patch: Master customer_pkey: 75 MB district_pkey: 40 kB idx_customer_name: 107 MB item_pkey: 2216 kB new_order_pkey: 22 MB oorder_o_w_id_o_d_id_o_c_id_o_id_key: 60 MB oorder_pkey: 78 MB order_line_pkey: 774 MB stock_pkey: 181 MB warehouse_pkey: 24 kB Patch customer_pkey: 50 MB district_pkey: 40 kB idx_customer_name: 105 MB item_pkey: 2216 kB new_order_pkey: 12 MB oorder_o_w_id_o_d_id_o_c_id_o_id_key: 61 MB oorder_pkey: 42 MB order_line_pkey: 429 MB stock_pkey: 111 MB warehouse_pkey: 24 kB All of the indexes used by oltpbench to do TPC-C are listed, so you're seeing the full picture for TPC-C bulk loading here (actually, there is another index that has an identical definition to oorder_o_w_id_o_d_id_o_c_id_o_id_key for some reason, which is omitted as redundant). As you can see, all the largest indexes are *significantly* smaller, with the exception of oorder_o_w_id_o_d_id_o_c_id_o_id_key. You won't be able to see this improvement until I post the next version, though, since this is a brand new development. Note that VACUUM hasn't been run at all, and doesn't need to be run, as there are no dead tuples. Note also that this has *nothing* to do with getting tired -- almost all of these indexes are unique indexes. Note that I'm also testing TPC-E and TPC-H in a very similar way, which have both been improved noticeably, but to a degree that's much less compelling than what we see with TPC-C. They have "getting tired" cases that benefit quite a bit, but those are the minority. Have you ever used HammerDB? I got this data from oltpbench, but I think that HammerDB might be the way to go with TPC-C testing Postgres. > You propose to address this by appending the tid to the index key, so > each key, even if its "payload" is a duplicate value, is unique and has > a unique place, so we never have to do this "tiresome" search.This > makes a lot of sense, and the results in the bloat test you posted are > impressive and reproducible. Thanks. > I tried a silly alternative approach by placing a new duplicate key in a > random location. This should be equivalent since tids are effectively > random. You're never going to get any other approach to work remotely as well, because while the TIDs may seem to be random in some sense, they have various properties that are very useful from a high level, data life cycle point of view. For insertions of duplicates, heap TID has temporal locality -- you are only going to dirty one or two leaf pages, rather than potentially dirtying dozens or hundreds. 
Furthermore, heap TID is generally strongly correlated with primary key values, so VACUUM can be much much more effective at killing duplicates in low cardinality secondary indexes when there are DELETEs with a range predicate on the primary key. This is a lot more realistic than the 2018-07-09 test case, but it still could make as big of a difference. > I didn't quite get this to fully work yet, but at least it > doesn't blow up, and it gets the same regression test ordering > differences for pg_depend scans that you are trying to paper over. ;-) FWIW, I actually just added to the papering over, rather than creating a new problem. There are plenty of instances of "\set VERBOSITY terse" in the regression tests already, for the same reason. If you run the regression tests with ignore_system_indexes=on, there are very similar failures [1]. > As far as the code is concerned, I agree with Andrey Lepikhov that one > more abstraction layer that somehow combines the scankey and the tid or > some combination like that would be useful, instead of passing the tid > as a separate argument everywhere. I've already drafted this in my working copy. It is a clear improvement. You can expect it in the next version. > I think it might help this patch move along if it were split up a bit, > for example 1) suffix truncation, 2) tid stuff, 3) new split strategies. > That way, it would also be easier to test out each piece separately. > For example,
Re: Adding a note to protocol.sgml regarding CopyData
On 25/09/2018 13:55, Amit Kapila wrote: > On Tue, Sep 25, 2018 at 4:51 AM Bradley DeJong wrote: >> >> On 2018-09-22, Amit Kapila wrote ... >> > ... duplicate the same information in different words at three >> different places ... >> >> I count 7 different places. In the protocol docs, there is the old >> mention in the "Summary of Changes since Protocol 2.0" section >> > > Below text is present in the section quoted by you: > The special \. last line is not > needed anymore, and is not sent during COPY OUT. > (It is still recognized as a terminator during COPY > IN, but its use is deprecated and will eventually be > removed.) > > This email started with the need to mention this in protocol.sgml and > it is already present although at a different place than the place at > which it is proposed. Personally, I don't see the need to add it to > more places. Does anybody else have any opinion on this matter? Yeah, I don't see why we need to document it three times in the same chapter. Also, that chapter is specifically about version 3.0 of the protocol, so documenting version 2.0 is out of scope. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Use durable_unlink for .ready and .done files for WAL segment removal
Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > While reviewing the archiving code, I have bumped into the fact that > XLogArchiveCleanup() thinks that it is safe to do only a plain unlink() > for .ready and .done files when removing a past segment. I don't think > that it is a smart move, as on a subsequent crash we may still see > those, but the related segment would have gone away. This is not really > a problem for .done files, but it could confuse the archiver to see some > .ready files about things that have already gone away. Is there an issue with making the archiver able to understand that situation instead of being confused by it..? Seems like that'd probably be a good thing to do regardless of this, but that would then remove the need for this kind of change.. Thanks! Stephen signature.asc Description: PGP signature
Pithy patch for more detailed error reporting when effective_io_concurrency is set to nonzero on platforms lacking posix_fadvise()
Per Tom's suggestion on bug #15396, here's a patch to have platforms such as OSX give a more descriptive message when rejecting a nonzero value for effective_io_concurrency. I had to adjust the GUC's wiring in the #ifndef case so that check_effective_io_concurrency() would be called when a nonzero value is supplied instead of just short-circuiting in parse_and_validate_value() when outside of [conf->min, conf->max]. James no_fadvise_better_warning_bug_15396.patch Description: Binary data - James Robinson ja...@jlr-photo.com http://jlr-photo.com/
Odd 9.4, 9.3 buildfarm failure on s390x
Hi, It seems Mark started a new buildfarm animal on s390x. It shows a pretty odd failure on 9.3 and 9.4, but *not* on newer animals: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lumpsucker&dt=2018-09-26%2020%3A30%3A58 == pgsql.build/src/test/regress/regression.diffs === *** /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/expected/uuid.out Mon Sep 24 17:49:30 2018 --- /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/results/uuid.out Wed Sep 26 16:31:31 2018 *** *** 64,72 SELECT guid_field FROM guid1 ORDER BY guid_field DESC; guid_field -- - 3f3e3c3b-3a30-3938-3736-353433a2313e - ---- ---- (3 rows) -- = operator test --- 64,72 SELECT guid_field FROM guid1 ORDER BY guid_field DESC; guid_field -- ---- + ---- + 3f3e3c3b-3a30-3938-3736-353433a2313e (3 rows) -- = operator test == Mark, is there anything odd for specific branches? I don't see anything immediately suspicious in the relevant comparator code... Greetings, Andres Freund
Re: Let's stop with the retail rebuilds of src/port/ files already
I wrote: > Now, if we go forward with that solution, there will be issues with > some other things that libpq exports without having defined itself: > src/backend/utils/mb/wchar.c: > pg_utf_mblen > src/backend/utils/mb/encnames.c: > pg_encoding_to_char > pg_char_to_encoding > pg_valid_server_encoding > pg_valid_server_encoding_id > What I was thinking of proposing was to move those two files out of the > backend and into src/common/, thereby normalizing their status as > modules available in both frontend and backend, and removing the need > for a special build rule for them in libpq. (initdb could be simplified > too.) Per this discovery, we'd need to also remove these symbols from > libpq's exports list, meaning that clients *must* get them from -lpgcommon > not from libpq. After further study I've concluded that moving those two files would be more neatnik-ism than is justified. While it'd get rid of the symlink-a-source-file technique in libpq, there'd still be other occurrences of that in our tree, so the actual cleanup benefit seems pretty limited. And while I'm prepared to believe that nobody outside PG uses pqsignal() or should do so, it's a little harder to make that case for the encnames.c functions; so the risk of causing problems seems noticeably greater. Accordingly, I cleaned up the usage of the existing src/common/ files but didn't move anything around. I plan to stop here unless the buildfarm shows more issues. regards, tom lane
Re: doc - improve description of default privileges
Some thoughts: We should keep the GRANT reference page about GRANT. There is a section about Privileges in the Data Definition chapter, which we could use to expand on general concepts. The ALTER DEFAULT PRIVILEGES reference page would be another place this could be put. The Owner column is redundant, because it's always all applicable privileges. (Having this column would give the impression that it's not always all privileges, so it would be confusing.) Privileges should be listed using their full name (e.g., "SELECT"), not their internal abbreviation letter. The psql commands seem out of place here. If you want to learn about how to use psql, you can go to the psql documentation. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER TABLE on system catalogs
On 2018-Aug-21, Andres Freund wrote: > On 2018-08-21 17:04:41 +0200, Peter Eisentraut wrote: > > > That doesn't solve the original problem, which is being able to set > > reloptions on pg_attribute, because pg_attribute doesn't have a toast > > table but would need one according to existing rules. > > I still think it's wrong to work around this than to actually fix the > issue with pg_attribute not having a toast table. FWIW I'm still bothered by the inability to move pg_largeobject to a different tablespace, per https://postgr.es/m/20160502163033.GA15384@alvherre.pgsql While that needs even more work (preservability across pg_dump for one), this item here would be one thing to fix. Also, I don't quite understand what's so horrible about setting autovacuum options for system catalogs, including those that don't have toast tables. That seems a pretty general feature that needs fixing, too. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: automatic restore point
On 06/09/2018 02:16, Yotsunaga, Naoki wrote: > -Original Message- > From: Yotsunaga, Naoki [mailto:yotsunaga.na...@jp.fujitsu.com] > Sent: Tuesday, June 26, 2018 10:18 AM > To: Postgres hackers > Subject: automatic restore point > >> Hi, I attached a patch to output the LSN before execution to the server log >> >when executing a specific command and accidentally erasing data. > > Since there was an error in the attached patch, I attached the modified patch. I think this should be done using event triggers. Right now, you just have it hardcoded to TRUNCATE and DROP TABLE, which seems somewhat arbitrary. With event triggers, you have the full flexibility to do what you want. You can pick which commands to apply it to, you can log the LSN, you can create restore points, etc. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER TABLE on system catalogs
On 2018-09-28 16:06:30 -0300, Alvaro Herrera wrote: > On 2018-Aug-21, Andres Freund wrote: > > > On 2018-08-21 17:04:41 +0200, Peter Eisentraut wrote: > > > > > That doesn't solve the original problem, which is being able to set > > > reloptions on pg_attribute, because pg_attribute doesn't have a toast > > > table but would need one according to existing rules. > > > > I still think it's wrong to work around this than to actually fix the > > issue with pg_attribute not having a toast table. > > FWIW I'm still bothered by the inability to move pg_largeobject to a > different tablespace, per > https://postgr.es/m/20160502163033.GA15384@alvherre.pgsql > While that needs even more work (preservability across pg_dump for one), > this item here would be one thing to fix. > > Also, I don't quite understand what's so horrible about setting > autovacuum options for system catalogs, including those that don't have > toast tables. That seems a pretty general feature that needs fixing, > too. I'm not sure what that has to do with my point? What I'm saying is that we shouldn't have some weird "should have a toast table but doesn't" exception, not that we shouldn't allow any sort of DDL on catalogs. Greetings, Andres Freund
Re: doc - improve description of default privileges
On 2018-Sep-28, Peter Eisentraut wrote: > The psql commands seem out of place here. If you want to learn about > how to use psql, you can go to the psql documentation. There is a legitimate point in doing this, though, since the GRANT page already explains how psql displays privileges. Maybe the right solution is to move all that stuff to the psql documentation, and alter the GRANT page to list privileges in terms of their names rather than the way psql displays them. (And of course add cross-links, so that it all makes sense.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER TABLE on system catalogs
On 2018-Sep-28, Andres Freund wrote: > On 2018-09-28 16:06:30 -0300, Alvaro Herrera wrote: > > On 2018-Aug-21, Andres Freund wrote: > > > > > I still think it's wrong to work around this than to actually fix the > > > issue with pg_attribute not having a toast table. > > > > FWIW I'm still bothered by the inability to move pg_largeobject to a > > different tablespace, per > > https://postgr.es/m/20160502163033.GA15384@alvherre.pgsql > > While that needs even more work (preservability across pg_dump for one), > > this item here would be one thing to fix. > > > > Also, I don't quite understand what's so horrible about setting > > autovacuum options for system catalogs, including those that don't have > > toast tables. That seems a pretty general feature that needs fixing, > > too. > > I'm not sure what that has to do with my point? What I'm saying is that > we shouldn't have some weird "should have a toast table but doesn't" > exception, not that we shouldn't allow any sort of DDL on catalogs. Well, the subtext in your argument seemed to be "let's just add a toast table to pg_attribute and then we don't need any of this". I'm just countering that if we don't have toast tables for some catalogs, it's because that's something we've already beaten to death; so rather than continue to beat it, we should discuss alternative ways to attack the resulting side-effects. I mean, surely adding a toast table to pg_largeobject would be completely silly. Yet if we leave this code unchanged, trying to move it to a different tablespace would "blow up" in a way. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Odd 9.4, 9.3 buildfarm failure on s390x
Andres Freund writes: > It seems Mark started a new buildfarm animal on s390x. It shows a pretty > odd failure on 9.3 and 9.4, but *not* on newer animals: No, lumpsucker is showing the same failure on 9.5 as well. I suspect that the reason 9.6 and up are OK is that 9.6 is where we introduced the abbreviated-sort-key machinery. IOW, the problem exists in the old-style UUID sort comparator but not the new one. Which is pretty darn odd, because the old-style comparator is just memcmp(). How could that be broken without causing lots more issues? regards, tom lane
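For context, the old-style code in uuid.c is, as far as I recall (a sketch only, not verified against the failing branches), just a thin wrapper around memcmp(), with the raw int result handed straight back to the caller:

static int
uuid_internal_cmp(const pg_uuid_t *arg1, const pg_uuid_t *arg2)
{
	/* plain byte-wise comparison of the 16-byte UUID payload */
	return memcmp(arg1->data, arg2->data, UUID_LEN);
}

Datum
uuid_cmp(PG_FUNCTION_ARGS)
{
	pg_uuid_t  *arg1 = PG_GETARG_UUID_P(0);
	pg_uuid_t  *arg2 = PG_GETARG_UUID_P(1);

	/* whatever memcmp() returned becomes the SQL-visible comparison result */
	PG_RETURN_INT32(uuid_internal_cmp(arg1, arg2));
}

If that is accurate, the only platform- or compiler-dependent ingredient is the value memcmp() itself returns.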
Re: On-disk compatibility for nbtree-unique-key enhancement
On Fri, Sep 28, 2018 at 8:00 AM Peter Eisentraut wrote: > On 21/09/2018 01:18, Peter Geoghegan wrote: > > * This means that there is a compatibility issue for anyone that is > > already right on the threshold -- we *really* don't want to see a > > REINDEX fail, but that seems like a possibility that we need to talk > > about now. > > When would the REINDEX need to happen? Will the code still be able to > read and write v3 btrees? The patch doesn't do that at the moment, because I've been busy refining it, and because there are a couple of outstanding questions about how to go about it -- the questions I pose on this thread. I accept that it's absolutely essential that nbtree be able to read both v2 and v3 indexes as part of any new v4. Without a measurable performance penalty. That's the standard that this project should be held to. A REINDEX will never *need* to happen. v2 and v3 indexes will gradually go extinct, without many users actually noticing. The on-disk representation of my patch leaves several free status bits in INDEX_ALT_TID_MASK tuples free (3 total will remain, since I'm now using 1 of the 4 for BT_HEAP_TID_ATTR), so it should be easier to add various further enhancements to a v5 or v6 of nbtree. This is similar to how changes to GIN were managed in the past (it may be interesting to look at a GIN leaf page with pg_hexedit, since it'll show you the gory details in a relatively accessible way). I can imagine a INDEX_ALT_TID_MASK bit being used for tuples that point to the heap -- not just for pivot tuples. I have an eye on things like duplicate lists on the leaf level, which would probably work like a GIN posting list. > Could there perhaps be an amcheck or > pageinspect feature that tells you ahead of time if there are too large > items in an old index? That would be easy, but it might not be any better than just having REINDEX or CREATE INDEX [CONCURRENTLY] throw an error. They're already pretty fast. I could easily raise a WARNING when amcheck is run against an index of a version before v4, that has an index tuple that's too big to make it under the lower limit. Actually, I could even write an SQL query that had pageinspect notice affected tuples, without changing any C code. Bear in mind that TOAST compression accidentally plays a big role here. It makes it very unlikely that indexes in the field are right at the old 2712 byte threshold, without even 8 bytes of wiggle room, because it's impossible to predict how well the pglz compression will work with that kind of precision. Several highly improbable things need to happen at the same time before REINDEX can break. I cannot see how any app could have evolved to depend on having 2712 bytes, without even a single MAXALIGN() quantum to spare. I wrote a stress test around the new "1/3 of a page" restriction. It involved a large text attribute with PLAIN storage, since I couldn't sensibly test the restriction while using pglz compression in the index. When all of your tuples are 2704 bytes, you end up with a ridiculously tall B-Tree, that performs horribly. I think that I saw that it had 11 levels with the test case. The tallest B-Tree that you'll ever see in the wild is probably one that's 5 levels deep, which is very tall indeed. Because of the logarithmic nature of how a new level is added to a B-Tree, 11 levels is just ludicrous. (Granted, you only have to have one tuple that's precisely 2712 bytes in length for REINDEX to break.) -- Peter Geoghegan
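For anyone wondering where the 2712 byte figure comes from, it falls out of the existing BTMaxItemSize() macro for the default 8 kB block size. The arithmetic below is a back-of-the-envelope sketch using the header sizes as found in the current tree (8-byte MAXALIGN assumed); it is meant only to illustrate the limit, not to restate the patch.

/* as defined in access/nbtree.h (quoted from memory) */
#define BTMaxItemSize(page) \
	MAXALIGN_DOWN((PageGetPageSize(page) - \
				   MAXALIGN(SizeOfPageHeaderData + 3 * sizeof(ItemIdData)) - \
				   MAXALIGN(sizeof(BTPageOpaqueData))) / 3)

/*
 * With BLCKSZ = 8192:
 *   page header + 3 line pointers: MAXALIGN(24 + 3 * 4) = 40
 *   btree special space:           MAXALIGN(16)         = 16
 *   (8192 - 40 - 16) / 3                                = 2712
 */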
Re: Pithy patch for more detailed error reporting when effective_io_concurrency is set to nonzero on platforms lacking posix_fadvise()
James Robinson writes: > Per Tom's suggestion on bug #15396, here's a patch to have platforms such as > OSX give a more descriptive message when rejecting a nonzero value for > effective_io_concurrency. Pushed with minor editorialization. regards, tom lane
Re: Continue work on changes to recovery.conf API
On 29/08/2018 17:43, Sergei Kornilov wrote: > Current patch moves recovery.conf settings into GUC system: > - if startup process found recovery.conf - it throw error OK > - recovery mode is turned on if new special file recovery.signal found OK > - standby_mode setting was removed. Standby mode can be enabled if startup > found new special file standby.signal OK > - if present both standby.signal and recovery.signal - we use standby mode > (this design from previous thread) Didn't find the discussion on that and don't understand the reasons, but seems OK and easy to change if we don't like it. > - recovery parameters recovery_target_inclusive, recovery_target_timeline, > recovery_target_action are new GUC without logic changes OK > - recovery_target (immediate), recovery_target_name, recovery_target_time, > recovery_target_xid, recovery_target_lsn are replaced to recovery_target_type > and recovery_target_value (was discussed and changed in previous thread) I think this was the major point of contention. I reread the old thread, and it's still not clear why we need to change this. _type and _value look like an EAV system to me. GUC variables should be verifiable independent of another variable. The other idea of using only one variable seems better, but using two variables seems like a poor compromise between one and five. I propose to move this patch forward, keep the settings as they are. It can be another patch to rename or reshuffle them. > - restore_command, archive_cleanup_command, recovery_end_command, > primary_conninfo, primary_slot_name and recovery_min_apply_delay are just new > GUC OK > - trigger_file was renamed to promote_signal_file for clarify (my change, in > prev thread was removed) OK to add the "promote" prefix, but I'd prefer to keep the "trigger" name. There is a lot of discussion and knowledge around that. Seems unnecessary to change. > - all new GUC are PGC_POSTMASTER (discussed in prev thread) OK > - pg_basebackup with -R (--write-recovery-conf) option will create > standby.signal file and append primary_conninfo and primary_slot_name (if was > used) to postgresql.auto.conf instead of writing new recovery.conf I wonder if that would cause any problems. The settings in postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you couldn't use ALTER SYSTEM to put them there. Maybe it's OK. > - appropriate changes in tests and docs looks good -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: SQL/JSON: documentation
On 09/28/2018 01:29 PM, Peter Eisentraut wrote: > On 28/06/2018 01:36, Nikita Glukhov wrote: >> Attached patch with draft of SQL/JSON documentation written by Liudmila Mantrova, Oleg Bartunov and me. Also it can be found in our sqljson repository on sqljson_doc branch: https://github.com/postgrespro/sqljson/tree/sqljson_doc We continue to work on it. > Some structural comments: > - I don't think this should be moved to a separate file. Yes, func.sgml is pretty big, but if we're going to split it up, we should do it in a systematic way, not just one section. I'm in favor of doing that. It's rather a monster. I agree it should not be done piecemeal. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Continue work on changes to recovery.conf API
Peter Eisentraut writes: > I think this was the major point of contention. I reread the old > thread, and it's still not clear why we need to change this. _type and > _value look like an EAV system to me. GUC variables should be > verifiable independent of another variable. No, they MUST be independently verifiable. The interactions between the check_xxx functions in this patch are utterly unsafe. We've learned that lesson before. > I propose to move this patch forward, keep the settings as they are. It > can be another patch to rename or reshuffle them. Please do not commit this in this state. > I wonder if that would cause any problems. The settings in > postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you > couldn't use ALTER SYSTEM to put them there. Maybe it's OK. Actually, that works fine. You do have to restart the postmaster to make them take effect, but it's the same as if you edited the main config file by hand. regards, tom lane
Re: Continue work on changes to recovery.conf API
Hi, On 2018-09-28 16:36:35 -0400, Tom Lane wrote: > Peter Eisentraut writes: > > I think this was the major point of contention. I reread the old > > thread, and it's still not clear why we need to change this. _type and > > _value look like an EAV system to me. GUC variables should be > > verifiable independent of another variable. > > No, they MUST be independently verifiable. The interactions between > the check_xxx functions in this patch are utterly unsafe. We've > learned that lesson before. I'm not sure those concerns apply quite the same way here - we can move the interdependent verification to the point where the values are first used rather than relying on guc.c infrastructure. We already have plenty of checks that are interdependent in that way, without it causing many problems. UI-wise that's not too bad, if they're things that cannot be changed arbitrarily at runtime. Greetings, Andres Freund
Re: SQL/JSON: documentation
Andrew Dunstan writes: > On 09/28/2018 01:29 PM, Peter Eisentraut wrote: >> - I don't think this should be moved to a separate file. Yes, func.sgml >> is pretty big, but if we're going to split it up, we should do it in a >> systematic way, not just one section. > I'm in favor of doing that. It's rather a monster. > I agree it should not be done piecemeal. Maybe split it into one file per existing section? Although TBH, I am not convinced that the benefits of doing that will exceed the back-patching pain we'll incur. regards, tom lane
Re: Continue work on changes to recovery.conf API
Andres Freund writes: > On 2018-09-28 16:36:35 -0400, Tom Lane wrote: >> No, they MUST be independently verifiable. The interactions between >> the check_xxx functions in this patch are utterly unsafe. We've >> learned that lesson before. > I'm not sure those concerns apply quite the same way here - we can move > the interdependent verification to the the point where they're used > first rather than relying on guc.c infrastructure. And, if they're bad, what happens? Recovery fails? I don't think it's a great idea to lose out on whatever error checking the existing GUC infrastructure can provide, just so as to use a GUC design that's not very nice in the first place. regards, tom lane
Re: ALTER TABLE on system catalogs
On 21/08/2018 17:24, Andres Freund wrote: >> Attached is a patch that instead moves those special cases into >> needs_toast_table(), which was one of the options suggested by Andres. >> There were already similar checks there, so I guess this makes a bit of >> sense. > The big difference is that that then only takes effect when called with > check=true. The callers without it, importantly NewHeapCreateToastTable > & NewRelationCreateToastTable, then won't run into this check. But still > into the error (see below). I don't follow. The call to needs_toast_table() is not conditional on the check argument. The check argument only checks that the correct lock level is passed in. >> @@ -145,21 +146,6 @@ create_toast_table(Relation rel, Oid toastOid, Oid >> toastIndexOid, >> ObjectAddress baseobject, >> toastobject; >> >> -/* >> - * Toast table is shared if and only if its parent is. >> - * >> - * We cannot allow toasting a shared relation after initdb (because >> - * there's no way to mark it toasted in other databases' pg_class). >> - */ >> -shared_relation = rel->rd_rel->relisshared; >> -if (shared_relation && !IsBootstrapProcessingMode()) >> -ereport(ERROR, >> - >> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), >> - errmsg("shared tables cannot be toasted after >> initdb"))); > This error check imo shouldn't be removed, but moved down. We could keep it, but it would probably be dead code since needs_toast_table() would return false for all shared tables anyway. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Odd 9.4, 9.3 buildfarm failure on s390x
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > Mark, is there anything odd for specific branches? No... I don't have anything in the config that would be applied to specific branches... Regards, Mark -- Mark Wong http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Odd 9.4, 9.3 buildfarm failure on s390x
Hi, On 2018-09-28 15:22:23 -0700, Mark Wong wrote: > On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > > Mark, is there anything odd for specific branches? > > No... I don't have anything in the config that would be applied to > specific branches... Could you perhaps do some manual debugging on that machine? Maybe starting with manually running something like: SELECT uuid_cmp('----'::uuid, '----'::uuid); SELECT uuid_cmp('----'::uuid, '----'::uuid); SELECT uuid_cmp('----'::uuid, '1113----'::uuid); SELECT uuid_cmp('----'::uuid, '1110----'::uuid); on both master and one of the failing branches? Greetings, Andres Freund
Re: Use durable_unlink for .ready and .done files for WAL segment removal
On Fri, Sep 28, 2018 at 02:36:19PM -0400, Stephen Frost wrote: > Is there an issue with making the archiver able to understand that > situation instead of being confused by it..? Seems like that'd probably > be a good thing to do regardless of this, but that would then remove the > need for this kind of change.. I thought about that a bit, and a lot can also be done within the archive_command itself regarding that, so I am not sure there is an argument for making pgarch.c more complicated than it needs to be. Now it is true that for most users, having a .ready file but no segment would most likely lead to a failure. I suspect that a large user base is still just using plain cp in archive_command, which would cause the archiver to get stuck. So we could actually just tweak pgarch_readyXlog to check whether the segment fetched actually exists (see the bottom of that function). If it doesn't, the archiver removes the .ready file and retries fetching a new segment. -- Michael signature.asc Description: PGP signature
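A minimal sketch of what that tweak around pgarch_readyXlog() could look like (the helper name and placement are hypothetical, and the call to durable_unlink() simply follows the proposal upthread; this only illustrates dropping an orphaned .ready file before the segment name is handed to archive_command):

/*
 * Hypothetical helper for pgarch.c: before archiving a segment found via
 * its .ready file, check that the segment itself still exists.  If it is
 * gone, remove the stale .ready file so the archiver can move on.
 */
static bool
ready_file_has_segment(const char *xlog)
{
	char		pathname[MAXPGPATH];
	struct stat stat_buf;

	snprintf(pathname, MAXPGPATH, XLOGDIR "/%s", xlog);
	if (stat(pathname, &stat_buf) == 0)
		return true;			/* segment exists, go ahead and archive it */

	/* segment has gone away: clean up its .ready file and skip it */
	StatusFilePath(pathname, xlog, ".ready");
	durable_unlink(pathname, WARNING);
	return false;
}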
Re: [PATCH] Include application_name in "connection authorized" log message
Greetings, * Stephen Frost (sfr...@snowman.net) wrote: > * Stephen Frost (sfr...@snowman.net) wrote: > > I still don't see that as a reason for tools to be suseptible to serious > > issues if a funky user gets created and I'd be surprised if there > > weren't other ways to get funky characters into the log file, but that's > > all ultimately an independent issue from this. I'll add the comments as > > discussed and discourage using the clean ascii function, but otherwise > > keep things as-is in that regard. > > Updated patch with lots of comments added around pg_clean_ascii() about > why it exists, why we do the cleaning, and warnings to discourage people > from using it without good cause. I've also done some additional > testing with it and it seems to be working well. I'll poke at it a bit > more tomorrow but if there aren't any other concerns, I'll commit it > towards the end of the day. Pushed, thanks all for the help and discussion, and thanks to Don for his patch! Thanks! Stephen signature.asc Description: PGP signature
Re: Use durable_unlink for .ready and .done files for WAL segment removal
Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > On Fri, Sep 28, 2018 at 02:36:19PM -0400, Stephen Frost wrote: > > Is there an issue with making the archiver able to understand that > > situation instead of being confused by it..? Seems like that'd probably > > be a good thing to do regardless of this, but that would then remove the > > need for this kind of change.. > > I thought about that a bit, and there is as well a lot which can be done > within the archive_command itself regarding that, so I am not sure that > there is the argument to make pgarch.c more complicated than it should. > Now it is true that for most users having a .ready file but no segment > would most likely lead in a failure. I suspect that a large user base > is still just using plain cp in archive_command, which would cause the > archiver to be stuck. So we could actually just tweak pgarch_readyXlog > to check if the segment fetched actually exists (see bottom of the > so-said function). If it doesn't, then the archiver removes the .ready > file and retries fetching a new segment. Yes, checking if the WAL file exists before calling archive_command on it is what I was thinking we'd do here, and if it doesn't, then just remove the .ready file. An alternative would be to go through the .ready files on crash-recovery and remove any .ready files that don't have corresponding WAL files, or if we felt that it was necessary, we could do that on every restart but do we really think we'd need to do that..? Thanks! Stephen signature.asc Description: PGP signature
Re: [PATCH] Include application_name in "connection authorized" log message
Thanks for all the guidance! Don. On Fri, Sep 28, 2018, 18:12 Stephen Frost wrote: > Greetings, > > * Stephen Frost (sfr...@snowman.net) wrote: > > * Stephen Frost (sfr...@snowman.net) wrote: > > > I still don't see that as a reason for tools to be suseptible to > serious > > > issues if a funky user gets created and I'd be surprised if there > > > weren't other ways to get funky characters into the log file, but > that's > > > all ultimately an independent issue from this. I'll add the comments > as > > > discussed and discourage using the clean ascii function, but otherwise > > > keep things as-is in that regard. > > > > Updated patch with lots of comments added around pg_clean_ascii() about > > why it exists, why we do the cleaning, and warnings to discourage people > > from using it without good cause. I've also done some additional > > testing with it and it seems to be working well. I'll poke at it a bit > > more tomorrow but if there aren't any other concerns, I'll commit it > > towards the end of the day. > > Pushed, thanks all for the help and discussion, and thanks to Don for > his patch! > > Thanks! > > Stephen >
Re: Odd 9.4, 9.3 buildfarm failure on s390x
Hi Andres, On Fri, Sep 28, 2018 at 03:41:27PM -0700, Andres Freund wrote: > On 2018-09-28 15:22:23 -0700, Mark Wong wrote: > > On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > > > Mark, is there anything odd for specific branches? > > > > No... I don't have anything in the config that would be applied to > > specific branches... > > Could you perhaps do some manual debugging on that machine? > > Maybe starting with manually running something like: > > SELECT uuid_cmp('----'::uuid, > '----'::uuid); > SELECT uuid_cmp('----'::uuid, > '----'::uuid); > SELECT uuid_cmp('----'::uuid, > '1113----'::uuid); > SELECT uuid_cmp('----'::uuid, > '1110----'::uuid); > > on both master and one of the failing branches? I've attached the output for head and the 9.4 stable branch. It appears they are returning the same results. I built them both by: CC=/usr/bin/clang ./configure --enable-cassert --enable-debug \ --enable-nls --with-perl --with-python --with-tcl \ --with-tclconfig=/usr/lib64 --with-gssapi --with-openssl \ --with-ldap --with-libxml --with-libxslt What should I try next? Regards, Mark -- Mark Wonghttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services psql (12devel) Type "help" for help. postgres=# SELECT uuid_cmp('----'::uuid, '----'::uuid); uuid_cmp - -2147483648 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '----'::uuid); uuid_cmp -- 0 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '1113----'::uuid); uuid_cmp - -2147483648 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '1110----'::uuid); uuid_cmp -- 1 (1 row) psql (9.4.19) Type "help" for help. postgres=# SELECT uuid_cmp('----'::uuid, '----'::uuid); uuid_cmp - -2147483648 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '----'::uuid); uuid_cmp -- 0 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '1113----'::uuid); uuid_cmp - -2147483648 (1 row) postgres=# SELECT uuid_cmp('----'::uuid, '1110----'::uuid); uuid_cmp -- 1 (1 row)
Oid returned from AddSubscriptionRelState/UpdateSubscriptionRelState
Hi, How come those two functions return OIDs, even though, as far as I understand, the underlying relation is BKI_WITHOUT_OIDS? I assume that's just an oversight? Greetings, Andres Freund
Re: [HACKERS] Something for the TODO list: deprecating abstime and friends
Andres Freund writes: > Here's a refreshed version of this patch. First patch removes > contrib/spi/timetravel, second patch removes abstime, reltime, tinterval > together with timeofday(). I'd kind of like to keep timeofday(); it's the only simple way to get a time display that includes "native" timezone info. For instance, I get regression=# select timeofday(); timeofday - Sat Sep 29 00:37:35.490977 2018 EDT (1 row) I think every other option would show me "-04" not "EDT". +1 for removing the rest of that, though. Unless somebody is motivated to recast contrib/spi/timetravel with timestamptz as the datetime type? regards, tom lane
RE: auto_explain: Include JIT output if applicable
Hi, I tried this feature. I think the 'if (es->costs)' check in auto_explain.c will always be true, because nothing changes it after 'es->costs = true' is set in NewExplainState() a few lines earlier. So I attached a patch that deletes this if statement. Regards, Noriyoshi Shinoda From: Lukas Fittl [mailto:lu...@fittl.com] Sent: Tuesday, September 25, 2018 6:38 AM To: Andres Freund Cc: Pg Hackers Subject: Re: auto_explain: Include JIT output if applicable On Mon, Sep 24, 2018 at 1:48 PM, Andres Freund wrote: Thanks for noticing - pushed! Thanks! Best, Lukas -- Lukas Fittl auto_explain.diff Description: auto_explain.diff
Re: auto_explain: Include JIT output if applicable
Hi, On 2018-09-29 05:04:25 +, Shinoda, Noriyoshi (PN Japan GCS Delivery) wrote: > I tried this feature. > I think that 'if (es->costs)' of the source code auto_explain.c will always > be ‘true’. > > Because it is not changed after 'es-> cost = true' in NewExplainState () > function several rows ago. > So I attached a patch to delete this if statement. I think it's better to stay closer to what explain.c itself is doing - it's not like that if statement costs us anything really... - Andres
RE: auto_explain: Include JIT output if applicable
Hi, > I think it's better to stay closer to what explain.c itself is doing - it's > not like that if statement costs us anything really... Oh, I understood. Thank you. -Original Message- From: Andres Freund [mailto:and...@anarazel.de] Sent: Saturday, September 29, 2018 2:11 PM To: Shinoda, Noriyoshi (PN Japan GCS Delivery) Cc: Lukas Fittl ; Pg Hackers Subject: Re: auto_explain: Include JIT output if applicable Hi, On 2018-09-29 05:04:25 +, Shinoda, Noriyoshi (PN Japan GCS Delivery) wrote: > I tried this feature. > I think that 'if (es->costs)' of the source code auto_explain.c will always > be ‘true’. > > Because it is not changed after 'es-> cost = true' in NewExplainState () > function several rows ago. > So I attached a patch to delete this if statement. I think it's better to stay closer to what explain.c itself is doing - it's not like that if statement costs us anything really... - Andres
Re: Odd 9.4, 9.3 buildfarm failure on s390x
> "Mark" == Mark Wong writes: Mark> What should I try next? What is the size of a C "int" on this platform? -- Andrew (irc:RhodiumToad)
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
28.09.2018 23:08, Peter Geoghegan wrote: > On Fri, Sep 28, 2018 at 7:50 AM Peter Eisentraut wrote: >> I think it might help this patch move along if it were split up a bit, for example 1) suffix truncation, 2) tid stuff, 3) new split strategies. That way, it would also be easier to test out each piece separately. For example, how much space does suffix truncation save in what scenario, are there any performance regressions, etc. > I'll do my best. I don't think I can sensibly split out suffix truncation from the TID stuff -- those seem truly inseparable, since my mental model for suffix truncation breaks without fully unique keys. I can break out all the cleverness around choosing a split point into its own patch, though -- _bt_findsplitloc() has only been changed to give weight to several factors that become important. It's the "brain" of the optimization, where 90% of the complexity actually lives. Removing the _bt_findsplitloc() changes will make the performance of the other stuff pretty poor, and in particular will totally remove the benefit for cases that "become tired" on the master branch. That could be slightly interesting, I suppose. I am reviewing this patch too, and I share Peter Eisentraut's opinion that it should be split into a hierarchy of two or three patches: a "functional" part (the TID stuff) and an "optimizational" part (suffix truncation & splitting). My reasons are that it would simplify code review, investigation and benchmarking. Right now the benchmarking picture is not clear: possible performance degradation from TID ordering interferes with the positive effects of the optimizations in a non-trivial way. -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company