Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 09:37 PM, Andres Freund wrote: > > > On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos > wrote: > >> I honestly do not expect that keeping around the failed pages will >> be an acceptable change for most kernels, and as such the >> recommendation >> will probably be to coord

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote: > On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote: > > You could make the argument that it's OK to forget if the entire file > > system goes away. But actually, why is that ok? > > I was going to say that it'd be okay to clear error f

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote: > Isn't the expectation that when a fsync call fails, the next one will > retry writing the pages in the hope that it succeeds? Some people expect that, I personally don't think it's a useful expectation. We should just deal with this by cras

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Peter Eisentraut
On 4/5/18 02:04, Pavel Raiskup wrote: > Hello, for the support of multiple versions of PostgreSQL RPM packages on > one system, we are thinking about having only one libpq.so.5 > (libecpg.so.6, libpgtype.so.3 respectively) supported and about building > (linking) all the PostgreSQL package versions

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Stephen Frost
Greetings, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 4/5/18 02:04, Pavel Raiskup wrote: > > Hello, for the support of multiple versions of PostgreSQL RPM packages on > > one system, we are thinking about having only one libpq.so.5 > > (libecpg.so.6, libpgtype.so.3 respectiv

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: > > Hi, > > On 2018-04-09 15:02:11 -0400, Robert Haas wrote: >> I think the simplest technological solution to this problem is to >> rewrite the entire backend and all supporting processes to use >> O_DIRECT everywhere. To maintain adequate p

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 10:04 PM, Andres Freund wrote: > Hi, > > On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote: >> Isn't the expectation that when a fsync call fails, the next one will >> retry writing the pages in the hope that it succeeds? > > Some people expect that, I personally don't think it's a

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 13:25:54 -0700, Mark Dilger wrote: > I was reading this thread up until now as meaning that the standby could > receive corrupt WAL data and become corrupted. I don't see that as a real problem here. For one the problematic scenarios shouldn't readily apply, for another WAL is c

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: > Maybe. I'd certainly prefer automated recovery from temporary I/O > issues (like full disk on thin-provisioning) without the database > crashing and restarting. But I'm not sure it's worth the effort. Oh, I agree on that one. But that's m

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 10:25 PM, Mark Dilger wrote: > >> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: >> >> Hi, >> >> On 2018-04-09 15:02:11 -0400, Robert Haas wrote: >>> I think the simplest technological solution to this problem is to >>> rewrite the entire backend and all supporting processes to

Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

2018-04-09 Thread Tom Lane
I wrote: > Michael Paquier writes: >> That takes care of the problem from the root of the directory, but when >> doing the same from src/bin/ then the same issue shows up even if >> src/Makefile is patched to handle install targets. > Hm. Not sure how far we want to go in that direction. It's n

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 1:43 PM, Tomas Vondra wrote: > > > > On 04/09/2018 10:25 PM, Mark Dilger wrote: >> >>> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: >>> >>> Hi, >>> >>> On 2018-04-09 15:02:11 -0400, Robert Haas wrote: I think the simplest technological solution to this proble

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Tom Lane
Stephen Frost writes: > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: >> On 4/5/18 02:04, Pavel Raiskup wrote: >>> As a followup thought; there are probably two major obstacles ATM >>> - the DSOs' symbols are not yet versioned, and >>> - the build-system doesn't seem to know how to

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: > I can also imagine a master and standby that are similarly provisioned, > and thus hit an out of disk error at around the same time, resulting in > corruption on both, even if not the same corruption. I think it's a grave mistake conflating E

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 09:36:40PM +0200, Magnus Hagander wrote: > Applied, and pushed this way. OK, thanks for the commit. -- Michael

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 11:08 PM, Andres Freund wrote: > Hi, > > On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: >> I can also imagine a master and standby that are similarly provisioned, >> and thus hit an out of disk error at around the same time, resulting in >> corruption on both, even if not the sam

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 2:25 PM, Tomas Vondra wrote: > > > > On 04/09/2018 11:08 PM, Andres Freund wrote: >> Hi, >> >> On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: >>> I can also imagine a master and standby that are similarly provisioned, >>> and thus hit an out of disk error at around the

Re: [HACKERS] Optional message to user when terminating/cancelling backend

2018-04-09 Thread Daniel Gustafsson
> On 09 Apr 2018, at 02:47, Michael Paquier wrote: > > On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote: >> Yep, I completely agree. Attached are patches with the quotes removed and >> rebased since Oids were taken etc. > > I still find this idea interesting for plugin authors.

Re: [HACKERS] Optional message to user when terminating/cancelling backend

2018-04-09 Thread Andres Freund
Hi, On 2017-06-20 13:01:35 -0700, Andres Freund wrote: > For extensions it'd also be useful if it'd be possible to overwrite the > error code. E.g. for citus there's a distributed deadlock detector, > running out of process because there's no way to interrupt lock waits > locally, and we've to do

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos wrote: > On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote: >> Well, there seem to be kernels that seem to do exactly that already. At >> least that's how I understand what this thread says about FreeBSD and >> Illumos, for example. So

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro wrote: > I wonder if anyone can tell us what Windows, AIX and HPUX do here. I created a wiki page to track what we know (or think we know) about fsync() on various operating systems: https://wiki.postgresql.org/wiki/Fsync_Errors If anyone has more

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andreas Karlsson
On 04/09/2018 02:16 PM, Craig Ringer wrote: I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. "Tell us about anything that happens under pgdata/" or an inotify-style p

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
Hi, On 2018-04-05 12:20:38 -0700, Andres Freund wrote: > > While it's not POSIX, at least some platforms are capable of delivering > > a separate signal on parent process death. Perhaps using that where > > available would be enough of an answer. > > Yea, that'd work on linux. Which is probably

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Alvaro Herrera
Andres Freund wrote: > Another approach, that's simpler to implement, is to simply have a > second selfpipe, just for WL_POSTMASTER_DEATH. Would it work to use this second pipe, to which each child writes a byte that postmaster never reads, and then rely on SIGPIPE when postmaster dies? Then we

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera wrote: >Andres Freund wrote: > >> Another approach, that's simpler to implement, is to simply have a >> second selfpipe, just for WL_POSTMASTER_DEATH. > >Would it work to use this second pipe, to which each child writes a >byte >that postmaster nev

Re: Boolean partitions syntax

2018-04-09 Thread Kyotaro HORIGUCHI
Hello, I returned to this. I'd like to insist on proposing to use an existing parser element. At Mon, 9 Apr 2018 10:11:08 -0400, "Jonathan S. Katz" wrote in <27021281-2ed7-4cde-9d82-366af10b3...@excoventures.com> > > On Apr 9, 2018, at 10:06 AM, Tom Lane wrote: > > It's premature to discuss whet

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund wrote: > I coincidentally got pinged about our current approach causing > performance problems on FreeBSD and started writing a patch. The > problem there appears to be that constantly attaching events to the read > pipe end, from multiple processes

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:36:19 PM PDT, Thomas Munro wrote: >On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund >wrote: >> I coincidentally got pinged about our current approach causing >> performance problems on FreeBSD and started writing a patch. The >> problem there appears to be that constantly at

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 03:59, Andres Freund wrote: > On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote: >> On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote: >> > You could make the argument that it's OK to forget if the entire file >> > system goes away. But actually, why is that ok? >> >

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer wrote: > On 10 April 2018 at 03:59, Andres Freund wrote: >> I don't think that's as hard as some people argued in this thread. We >> could very well open a pipe in postmaster with the write end open in >> each subprocess, and the read end open only i

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 04:25, Mark Dilger wrote: > I was reading this thread up until now as meaning that the standby could > receive corrupt WAL data and become corrupted. Yes, it can, but not directly through the first error. What can happen is that we think a block got written when it didn't.
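The hazard described here (believing a block was written when it was not) can be illustrated with a minimal sketch. It is not PostgreSQL code: `write_durably`, `FsyncFailed`, and `failing_fsync` are hypothetical names made up for the illustration, and the premise that a later successful fsync() proves nothing about earlier data is taken from this thread's discussion of some kernels, not from a specification.

```python
import errno
import os
import tempfile


class FsyncFailed(Exception):
    """fsync() reported an error; the file's on-disk state is unknown."""


def write_durably(path, data, fsync=os.fsync):
    """Write data to path and fsync it once; never retry a failed fsync.

    `fsync` is injectable only so that a failure can be simulated.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                # push Python's buffer to the kernel
        try:
            fsync(f.fileno())    # ask the kernel to reach stable storage
        except OSError as exc:
            # Assumption from the thread: a later fsync() that succeeds
            # would not prove these bytes were written, so the failure is
            # surfaced to the caller instead of being retried.
            raise FsyncFailed(f"fsync failed for {path}") from exc


def failing_fsync(fd):
    """Stand-in for a kernel that reports EIO on fsync (simulation only)."""
    raise OSError(errno.EIO, os.strerror(errno.EIO))
```

A caller would treat `FsyncFailed` as "the data may be lost" and fall back to whatever recovery it trusts, rather than calling fsync again and reading a later success as confirmation.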

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Alvaro Herrera
Andres Freund wrote: > > On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera > wrote: > >Would it work to use this second pipe, to which each child writes a > >byte that postmaster never reads, and then rely on SIGPIPE when > >postmaster dies? Then we never need to do a syscall. > > I'm not follo
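The kernel behaviour that the quoted pipe proposal relies on can be sketched in a few lines. This is only an illustration under stated assumptions, not the proposed patch: the function name is hypothetical, the read end is closed in the same process to stand in for a postmaster that has exited, and Python ignores SIGPIPE by default, so the condition shows up as `BrokenPipeError` (EPIPE) rather than as a signal.

```python
import os


def write_end_sees_closed_reader():
    """Return True if writing to a pipe with no reader fails with EPIPE."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)            # simulate the only reader going away
    try:
        os.write(write_fd, b"x")
    except BrokenPipeError:      # EPIPE, since SIGPIPE is ignored in Python
        return True
    finally:
        os.close(write_fd)
    return False
```

Note that the error is only observed when the writer actually performs a write, which is the part of the proposal the rest of the exchange goes on to question.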

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 04:37, Andres Freund wrote: > Hi, > > On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: >> Maybe. I'd certainly prefer automated recovery from temporary I/O >> issues (like full disk on thin-provisioning) without the database >> crashing and restarting. But I'm not sure it's

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:57:23 PM PDT, Alvaro Herrera wrote: >Andres Freund wrote: >> >> On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera > wrote: > >> >Would it work to use this second pipe, to which each child writes a >> >byte that postmaster never reads, and then rely on SIGPIPE when >> >postmaste

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On April 9, 2018 6:59:03 PM PDT, Craig Ringer wrote: >On 10 April 2018 at 04:37, Andres Freund wrote: >> Hi, >> >> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: >>> Maybe. I'd certainly prefer automated recovery from temporary I/O >>> issues (like full disk on thin-provisioning) without

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 08:41, Andreas Karlsson wrote: > On 04/09/2018 02:16 PM, Craig Ringer wrote: >> >> I'd like a middle ground where the kernel lets us register our interest >> and tells us if it lost something, without us having to keep eight million >> FDs open for some long period. "Tell us ab

Re: WIP: Covering + unique indexes.

2018-04-09 Thread Peter Geoghegan
On Sun, Apr 8, 2018 at 11:19 PM, Teodor Sigaev wrote: > Thank you, pushed. I noticed a few more issues following another pass-through of the patch: * There is no pfree() within _bt_buildadd() for truncated tuples, even though that's a context where it's clearly not okay. * It might be a good id

Re: ON CONFLICT DO UPDATE for partitioned tables

2018-04-09 Thread Amit Langote
On 2018/03/27 13:27, Amit Langote wrote: > On 2018/03/26 23:20, Alvaro Herrera wrote: >> The one thing I wasn't terribly in love with is the four calls to >> map_partition_varattnos(), creating the attribute map four times ... but >> we already have it in the TupleConversionMap, no? Looks like we

Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 04:46:34PM -0400, Tom Lane wrote: > After further contemplation I decided that that was, in fact, the only > reasonable way to improve matters. If we have multiple subdirectories > independently firing the "make generated-headers" action, then we have > parallel make hazard

Re: Warnings and uninitialized variables in TAP tests

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 09:46:29PM +0200, Magnus Hagander wrote: > Applied, thanks. Thanks for the commit. -- Michael

Re: [sqlsmith] Failed assertion on pfree() via perform_pruning_combine_step

2018-04-09 Thread Amit Langote
On 2018/04/09 22:59, Alvaro Herrera wrote: > Hello, > > Amit Langote wrote: > >> I have reproduced this and found that the problem is that >> perform_pruning_combine_step forgets to *copy* the bitmapset of the first >> step in the handling of a COMBINE_INTERSECT step. > > Pushed, thanks Amit an

Gotchas about pg_verify_checksums

2018-04-09 Thread Michael Paquier
Hi all, I have not been giving much attention to the thread about enabling checksums online, which has resulted in the revert of the feature, but there is still pg_verify_checksums around. So I looked at it a bit. I have a couple of questions/gotchas about it: 1) The documentation states that th

Re: pruning disabled for array, enum, record, range type partition keys

2018-04-09 Thread Amit Langote
Thanks for the comment. On 2018/04/09 23:22, Tom Lane wrote: > Amit Langote writes: >> I noticed that the newly added pruning does not work if the partition key >> is of one of the types that have a corresponding pseudo-type. > > While I don't recall the details due to acute caffeine shortage, >

Re: [HACKERS] path toward faster partition pruning

2018-04-09 Thread Ashutosh Bapat
On Mon, Apr 9, 2018 at 8:56 PM, Robert Haas wrote: > On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane wrote: >> David Rowley writes: >>> Sounds like you're saying that if we have too many alternative files >>> then there's a chance that one could pass by luck. >> >> Yeah, exactly: it passed, but did it

Re: lazy detoasting

2018-04-09 Thread Andrew Gierth
> "Chapman" == Chapman Flack writes: Chapman> AFAICS, that is *all* that comment block has to say about why Chapman> there's an active snapshot stack. I believe you are saying it Chapman> has another important function, namely that its top element is Chapman> what tells the executor what

Re: [sqlsmith] Failed assertion on pfree() via perform_pruning_combine_step

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 10:59:48AM -0300, Alvaro Herrera wrote: > Amit Langote wrote: >> I have reproduced this and found that the problem is that >> perform_pruning_combine_step forgets to *copy* the bitmapset of the first >> step in the handling of a COMBINE_INTERSECT step. > > Pushed, thanks A

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote: > Another consequence of this behavior is that initdb -S is never reliable, > so pg_rewind's use of it doesn't actually fix the problem it was > intended to solve. It also means that initdb itself isn't crash-safe, > since the data file cha

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Andrey Borodin
> On 9 Apr 2018, at 23:04, Heikki Linnakangas wrote: > > On 09/04/18 18:21, Andrey Borodin wrote: >>> On 9 Apr 2018, at 19:50, Teodor Sigaev >>> wrote: 3. Why do we *not* lock the entry leaf page, if there is no match? We still need a lock to remember that we probed for tha

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 13:04, Michael Paquier wrote: > On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote: >> Another consequence of this behavior is that initdb -S is never reliable, >> so pg_rewind's use of it doesn't actually fix the problem it was >> intended to solve. It also means that i

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Michael Paquier
On Tue, Apr 10, 2018 at 01:37:19PM +0800, Craig Ringer wrote: > On 10 April 2018 at 13:04, Michael Paquier wrote: >> And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb >> -S or fsync_pgdata would enter in those waters. > > ... but *only if they hit an I/O error* or they're o

Re: [sqlsmith] Failed assertion on pfree() via perform_pruning_combine_step

2018-04-09 Thread Amit Langote
On 2018/04/10 13:55, Michael Paquier wrote: > On Mon, Apr 09, 2018 at 10:59:48AM -0300, Alvaro Herrera wrote: >> Amit Langote wrote: >>> I have reproduced this and found that the problem is that >>> perform_pruning_combine_step forgets to *copy* the bitmapset of the first >>> step in the handling o

Re: lazy detoasting

2018-04-09 Thread Chapman Flack
On 04/10/18 00:30, Andrew Gierth wrote: > That's not precisely true - ultimately, the routines that do actual > scans take the snapshot to use as a parameter, and the executor mostly > references the snapshot from the EState; but a bunch of places do > require that ActiveSnapshot be set to the cur
