On 04/09/2018 09:37 PM, Andres Freund wrote:
>
>
> On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos
> wrote:
>
>> I honestly do not expect that keeping around the failed pages will
>> be an acceptable change for most kernels, and as such the
>> recommendation
>> will probably be to coord
On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:
> On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:
> > You could make the argument that it's OK to forget if the entire file
> > system goes away. But actually, why is that ok?
>
> I was going to say that it'd be okay to clear error f
Hi,
On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:
> Isn't the expectation that when a fsync call fails, the next one will
> retry writing the pages in the hope that it succeeds?
Some people expect that, I personally don't think it's a useful
expectation.
We should just deal with this by cras
On 4/5/18 02:04, Pavel Raiskup wrote:
> Hello, for the support of multiple versions of PostgreSQL RPM packages on
> one system, we are thinking about having only one libpq.so.5
> (libecpg.so.6, libpgtype.so.3 respectively) supported and about building
> (linking) all the PostgreSQL package versions
Greetings,
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 4/5/18 02:04, Pavel Raiskup wrote:
> > Hello, for the support of multiple versions of PostgreSQL RPM packages on
> > one system, we are thinking about having only one libpq.so.5
> > (libecpg.so.6, libpgtype.so.3 respectiv
> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:
>
> Hi,
>
> On 2018-04-09 15:02:11 -0400, Robert Haas wrote:
>> I think the simplest technological solution to this problem is to
>> rewrite the entire backend and all supporting processes to use
>> O_DIRECT everywhere. To maintain adequate p
On 04/09/2018 10:04 PM, Andres Freund wrote:
> Hi,
>
> On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:
>> Isn't the expectation that when a fsync call fails, the next one will
>> retry writing the pages in the hope that it succeeds?
>
> Some people expect that, I personally don't think it's a
Hi,
On 2018-04-09 13:25:54 -0700, Mark Dilger wrote:
> I was reading this thread up until now as meaning that the standby could
> receive corrupt WAL data and become corrupted.
I don't see that as a real problem here. For one the problematic
scenarios shouldn't readily apply, for another WAL is c
Hi,
On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:
> Maybe. I'd certainly prefer automated recovery from temporary I/O
> issues (like full disk on thin-provisioning) without the database
> crashing and restarting. But I'm not sure it's worth the effort.
Oh, I agree on that one. But that's m
On 04/09/2018 10:25 PM, Mark Dilger wrote:
>
>> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:
>>
>> Hi,
>>
>> On 2018-04-09 15:02:11 -0400, Robert Haas wrote:
>>> I think the simplest technological solution to this problem is to
>>> rewrite the entire backend and all supporting processes to
I wrote:
> Michael Paquier writes:
>> That takes care of the problem from the root of the directory, but when
>> doing the same from src/bin/ then the same issue shows up even if
>> src/Makefile is patched to handle install targets.
> Hm. Not sure how far we want to go in that direction. It's n
> On Apr 9, 2018, at 1:43 PM, Tomas Vondra wrote:
>
>
>
> On 04/09/2018 10:25 PM, Mark Dilger wrote:
>>
>>> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:
>>>
>>> Hi,
>>>
>>> On 2018-04-09 15:02:11 -0400, Robert Haas wrote:
>>>> I think the simplest technological solution to this proble
Stephen Frost writes:
> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
>> On 4/5/18 02:04, Pavel Raiskup wrote:
>>> As a followup thought; there are probably two major obstacles ATM
>>> - the DSOs' symbols are not yet versioned, and
>>> - the build-system doesn't seem to know how to
Hi,
On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:
> I can also imagine a master and standby that are similarly provisioned,
> and thus hit an out of disk error at around the same time, resulting in
> corruption on both, even if not the same corruption.
I think it's a grave mistake conflating E
On Mon, Apr 09, 2018 at 09:36:40PM +0200, Magnus Hagander wrote:
> Applied, and pushed this way.
OK, thanks for the commit.
--
Michael
On 04/09/2018 11:08 PM, Andres Freund wrote:
> Hi,
>
> On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:
>> I can also imagine a master and standby that are similarly provisioned,
>> and thus hit an out of disk error at around the same time, resulting in
>> corruption on both, even if not the sam
> On Apr 9, 2018, at 2:25 PM, Tomas Vondra wrote:
>
>
>
> On 04/09/2018 11:08 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:
>>> I can also imagine a master and standby that are similarly provisioned,
>>> and thus hit an out of disk error at around the
> On 09 Apr 2018, at 02:47, Michael Paquier wrote:
>
> On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote:
>> Yep, I completely agree. Attached are patches with the quotes removed and
>> rebased since Oids were taken etc.
>
> I still find this idea interesting for plugin authors.
Hi,
On 2017-06-20 13:01:35 -0700, Andres Freund wrote:
> For extensions it'd also be useful if it'd be possible to overwrite the
> error code. E.g. for citus there's a distributed deadlock detector,
> running out of process because there's no way to interrupt lock waits
> locally, and we've to do
On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos wrote:
> On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:
>> Well, there seem to be kernels that seem to do exactly that already. At
>> least that's how I understand what this thread says about FreeBSD and
>> Illumos, for example. So
On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro
wrote:
> I wonder if anyone can tell us what Windows, AIX and HPUX do here.
I created a wiki page to track what we know (or think we know) about
fsync() on various operating systems:
https://wiki.postgresql.org/wiki/Fsync_Errors
If anyone has more
On 04/09/2018 02:16 PM, Craig Ringer wrote:
> I'd like a middle ground where the kernel lets us register our interest
> and tells us if it lost something, without us having to keep eight
> million FDs open for some long period. "Tell us about anything that
> happens under pgdata/" or an inotify-style p
Hi,
On 2018-04-05 12:20:38 -0700, Andres Freund wrote:
> > While it's not POSIX, at least some platforms are capable of delivering
> > a separate signal on parent process death. Perhaps using that where
> > available would be enough of an answer.
>
> Yea, that'd work on linux. Which is probably
Andres Freund wrote:
> Another approach, that's simpler to implement, is to simply have a
> second selfpipe, just for WL_POSTMASTER_DEATH.
Would it work to use this second pipe, to which each child writes a byte
that postmaster never reads, and then rely on SIGPIPE when postmaster
dies? Then we
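
A minimal sketch of the general pipe mechanism under discussion, in Python rather than PostgreSQL's C and not taken from any patch in this thread. It assumes the parent holds the write end of a pipe and never writes to it, so the read end only becomes readable (with EOF) once every copy of the write end is closed. The name parent_is_alive is hypothetical, the demo runs in a single process with os.close() standing in for the parent exiting, and select() on pipes is assumed to be available (Unix-like systems).

```python
import os
import select


def parent_is_alive(read_fd, timeout=0.0):
    """Return False once every write end of the pipe has been closed."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        # Nothing to read: some process still holds the write end open.
        return True
    # A readable pipe that yields zero bytes is at EOF: all writers are gone.
    return os.read(read_fd, 1) != b""


# Single-process demonstration; no real child process is involved.
read_fd, write_fd = os.pipe()
alive_before = parent_is_alive(read_fd)
os.close(write_fd)  # stands in for the parent process exiting
alive_after = parent_is_alive(read_fd)
os.close(read_fd)
```

Note that each check here still costs a system call, which is the overhead the messages above are trying to reduce.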
On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera wrote:
>Andres Freund wrote:
>
>> Another approach, that's simpler to implement, is to simply have a
>> second selfpipe, just for WL_POSTMASTER_DEATH.
>
>Would it work to use this second pipe, to which each child writes a
>byte
>that postmaster nev
Hello, I returned to this.
I'd like to insist on proposing to use the existing parser element.
At Mon, 9 Apr 2018 10:11:08 -0400, "Jonathan S. Katz"
wrote in
<27021281-2ed7-4cde-9d82-366af10b3...@excoventures.com>
> > On Apr 9, 2018, at 10:06 AM, Tom Lane wrote:
> > It's premature to discuss whet
On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund wrote:
> I coincidentally got pinged about our current approach causing
> performance problems on FreeBSD and started writing a patch. The
> problem there appears to be that constantly attaching events to the read
> pipe end, from multiple processes
On April 9, 2018 6:36:19 PM PDT, Thomas Munro
wrote:
>On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund
>wrote:
>> I coincidentally got pinged about our current approach causing
>> performance problems on FreeBSD and started writing a patch. The
>> problem there appears to be that constantly at
On 10 April 2018 at 03:59, Andres Freund wrote:
> On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:
>> On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:
>> > You could make the argument that it's OK to forget if the entire file
>> > system goes away. But actually, why is that ok?
>>
>
On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer wrote:
> On 10 April 2018 at 03:59, Andres Freund wrote:
>> I don't think that's as hard as some people argued in this thread. We
>> could very well open a pipe in postmaster with the write end open in
>> each subprocess, and the read end open only i
On 10 April 2018 at 04:25, Mark Dilger wrote:
> I was reading this thread up until now as meaning that the standby could
> receive corrupt WAL data and become corrupted.
Yes, it can, but not directly through the first error.
What can happen is that we think a block got written when it didn't.
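
A minimal sketch of the "do not retry" position argued elsewhere in this thread, in Python rather than PostgreSQL's C. The worry is that a retry after a failed fsync() can report success even though the earlier pages never reached disk, so the sketch treats the first failure as final. The names write_durably, DurabilityError and failing_fsync are hypothetical, and the fsync parameter exists only so that a failure can be simulated.

```python
import errno
import os
import tempfile


class DurabilityError(RuntimeError):
    """fsync() failed; the on-disk state of the file is unknown."""


def write_durably(path, data, fsync=os.fsync):
    """Write data to path and fsync it, treating any fsync failure as final."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            view = view[os.write(fd, view):]
        try:
            fsync(fd)
        except OSError as exc:
            # Do not call fsync again: a later success would not prove that
            # the pages from this attempt ever reached stable storage.
            raise DurabilityError(f"fsync failed on {path}") from exc
    finally:
        os.close(fd)


def failing_fsync(fd):
    """Stand-in for a device that reports an I/O error at fsync time."""
    raise OSError(errno.EIO, os.strerror(errno.EIO))


demo_path = os.path.join(tempfile.mkdtemp(), "segment")
write_durably(demo_path, b"page contents")
try:
    write_durably(demo_path, b"new contents", fsync=failing_fsync)
    outcome = "reported success"
except DurabilityError:
    outcome = "refused to report success"
```

What the caller does after DurabilityError (crash and recover from the WAL, as suggested in this thread, or something else) is outside the scope of the sketch.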
Andres Freund wrote:
>
> On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera
> wrote:
> >Would it work to use this second pipe, to which each child writes a
> >byte that postmaster never reads, and then rely on SIGPIPE when
> >postmaster dies? Then we never need to do a syscall.
>
> I'm not follo
On 10 April 2018 at 04:37, Andres Freund wrote:
> Hi,
>
> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:
>> Maybe. I'd certainly prefer automated recovery from temporary I/O
>> issues (like full disk on thin-provisioning) without the database
>> crashing and restarting. But I'm not sure it's
On April 9, 2018 6:57:23 PM PDT, Alvaro Herrera wrote:
>Andres Freund wrote:
>>
>> On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera
> wrote:
>
>> >Would it work to use this second pipe, to which each child writes a
>> >byte that postmaster never reads, and then rely on SIGPIPE when
>> >postmaste
On April 9, 2018 6:59:03 PM PDT, Craig Ringer wrote:
>On 10 April 2018 at 04:37, Andres Freund wrote:
>> Hi,
>>
>> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:
>>> Maybe. I'd certainly prefer automated recovery from temporary I/O
>>> issues (like full disk on thin-provisioning) without
On 10 April 2018 at 08:41, Andreas Karlsson wrote:
> On 04/09/2018 02:16 PM, Craig Ringer wrote:
>>
>> I'd like a middle ground where the kernel lets us register our interest
>> and tells us if it lost something, without us having to keep eight million
>> FDs open for some long period. "Tell us ab
On Sun, Apr 8, 2018 at 11:19 PM, Teodor Sigaev wrote:
> Thank you, pushed.
I noticed a few more issues following another pass-through of the patch:
* There is no pfree() within _bt_buildadd() for truncated tuples, even
though that's a context where it's clearly not okay.
* It might be a good id
On 2018/03/27 13:27, Amit Langote wrote:
> On 2018/03/26 23:20, Alvaro Herrera wrote:
>> The one thing I wasn't terribly in love with is the four calls to
>> map_partition_varattnos(), creating the attribute map four times ... but
>> we already have it in the TupleConversionMap, no? Looks like we
On Mon, Apr 09, 2018 at 04:46:34PM -0400, Tom Lane wrote:
> After further contemplation I decided that that was, in fact, the only
> reasonable way to improve matters. If we have multiple subdirectories
> independently firing the "make generated-headers" action, then we have
> parallel make hazard
On Mon, Apr 09, 2018 at 09:46:29PM +0200, Magnus Hagander wrote:
> Applied, thanks.
Thanks for the commit.
--
Michael
On 2018/04/09 22:59, Alvaro Herrera wrote:
> Hello,
>
> Amit Langote wrote:
>
>> I have reproduced this and found that the problem is that
>> perform_pruning_combine_step forgets to *copy* the bitmapset of the first
>> step in the handling of a COMBINE_INTERSECT step.
>
> Pushed, thanks Amit an
Hi all,
I have not been giving much attention to the thread about enabling
checksums online, which has resulted in the revert of the feature, but
there is still pg_verify_checksums around. So I looked at it a bit.
I have a couple of questions/gotchas about it:
1) The documentation states that th
Thanks for the comment.
On 2018/04/09 23:22, Tom Lane wrote:
> Amit Langote writes:
>> I noticed that the newly added pruning does not work if the partition key
>> is of one of the types that have a corresponding pseudo-type.
>
> While I don't recall the details due to acute caffeine shortage,
>
On Mon, Apr 9, 2018 at 8:56 PM, Robert Haas wrote:
> On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane wrote:
>> David Rowley writes:
>>> Sounds like you're saying that if we have too many alternative files
>>> then there's a chance that one could pass by luck.
>>
>> Yeah, exactly: it passed, but did it
> "Chapman" == Chapman Flack writes:
Chapman> AFAICS, that is *all* that comment block has to say about why
Chapman> there's an active snapshot stack. I believe you are saying it
Chapman> has another important function, namely that its top element is
Chapman> what tells the executor what
On Mon, Apr 09, 2018 at 10:59:48AM -0300, Alvaro Herrera wrote:
> Amit Langote wrote:
>> I have reproduced this and found that the problem is that
>> perform_pruning_combine_step forgets to *copy* the bitmapset of the first
>> step in the handling of a COMBINE_INTERSECT step.
>
> Pushed, thanks A
On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:
> Another consequence of this behavior is that initdb -S is never reliable,
> so pg_rewind's use of it doesn't actually fix the problem it was
> intended to solve. It also means that initdb itself isn't crash-safe,
> since the data file cha
> On 9 Apr 2018, at 23:04, Heikki Linnakangas wrote:
>
> On 09/04/18 18:21, Andrey Borodin wrote:
>>> On 9 Apr 2018, at 19:50, Teodor Sigaev
>>> wrote:
>>>> 3. Why do we *not* lock the entry leaf page, if there is no
>>>> match? We still need a lock to remember that we probed for tha
On 10 April 2018 at 13:04, Michael Paquier wrote:
> On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:
>> Another consequence of this behavior is that initdb -S is never reliable,
>> so pg_rewind's use of it doesn't actually fix the problem it was
>> intended to solve. It also means that i
On Tue, Apr 10, 2018 at 01:37:19PM +0800, Craig Ringer wrote:
> On 10 April 2018 at 13:04, Michael Paquier wrote:
>> And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb
>> -S or fsync_pgdata would enter in those waters.
>
> ... but *only if they hit an I/O error* or they're o
On 2018/04/10 13:55, Michael Paquier wrote:
> On Mon, Apr 09, 2018 at 10:59:48AM -0300, Alvaro Herrera wrote:
>> Amit Langote wrote:
>>> I have reproduced this and found that the problem is that
>>> perform_pruning_combine_step forgets to *copy* the bitmapset of the first
>>> step in the handling o
On 04/10/18 00:30, Andrew Gierth wrote:
> That's not precisely true - ultimately, the routines that do actual
> scans take the snapshot to use as a parameter, and the executor mostly
> references the snapshot from the EState; but a bunch of places do
> require that ActiveSnapshot be set to the cur