> On 2 Sep 2021, at 13:18, Daniel Gustafsson wrote:
>
>> On 9 Jul 2021, at 22:00, Ibrar Ahmed wrote:
>
>> I am changing the status to "Waiting on Author" based on the latest comments
>> of @David Steele
>> and secondly the patch does not apply cleanly.
>
> This patch hasn’t moved since marke
> On 9 Jul 2021, at 22:00, Ibrar Ahmed wrote:
> I am changing the status to "Waiting on Author" based on the latest comments
> of @David Steele
> and secondly the patch does not apply cleanly.
This patch hasn’t moved since marked as WoA in the last CF and still doesn’t
apply, unless there is a
On Tue, Mar 9, 2021 at 10:43 PM David Steele wrote:
> On 11/30/20 6:38 PM, David Steele wrote:
> > On 11/30/20 9:27 AM, Stephen Frost wrote:
> >> * Michael Paquier (mich...@paquier.xyz) wrote:
> >>> On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:
> * Magnus Hagander (mag...@ha
On 11/30/20 6:38 PM, David Steele wrote:
On 11/30/20 9:27 AM, Stephen Frost wrote:
* Michael Paquier (mich...@paquier.xyz) wrote:
On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:
* Magnus Hagander (mag...@hagander.net) wrote:
On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier
wrot
On 11/30/20 9:27 AM, Stephen Frost wrote:
Greetings,
* Michael Paquier (mich...@paquier.xyz) wrote:
On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:
* Magnus Hagander (mag...@hagander.net) wrote:
On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier wrote:
But here the checksum is b
Greetings,
* Michael Paquier (mich...@paquier.xyz) wrote:
> On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier
> >> wrote:
> >>> But here the checksum is broken, so while the offse
On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier wrote:
>>> But here the checksum is broken, so while the offset is something we
>>> can rely on how do you make sure that the LSN is fi
Greetings,
* Magnus Hagander (mag...@hagander.net) wrote:
> On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier wrote:
> > On Tue, Nov 24, 2020 at 12:38:30PM -0500, David Steele wrote:
> > > We are not just looking at one LSN value. Here are the steps we are
> > > proposing (I'll skip checks for zero
On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier wrote:
>
> On Tue, Nov 24, 2020 at 12:38:30PM -0500, David Steele wrote:
> > We are not just looking at one LSN value. Here are the steps we are
> > proposing (I'll skip checks for zero pages here):
> >
> > 1) Test the page checksum. If it passes the
On Tue, Nov 24, 2020 at 12:38:30PM -0500, David Steele wrote:
> We are not just looking at one LSN value. Here are the steps we are
> proposing (I'll skip checks for zero pages here):
>
> 1) Test the page checksum. If it passes the page is OK.
> 2) If the checksum does not pass then record the pag
Hi Michael,
On 11/23/20 8:10 PM, Michael Paquier wrote:
On Mon, Nov 23, 2020 at 10:35:54AM -0500, Stephen Frost wrote:
Also- what is the point of reading the page from shared buffers
anyway..? All we need to do is prove that the page will be rewritten
during WAL replay. If we can prove that,
Greetings,
On Mon, Nov 23, 2020 at 20:28 Michael Paquier wrote:
> On Mon, Nov 23, 2020 at 05:28:52PM -0500, Stephen Frost wrote:
> > * Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
> >> Yes and this is a tricky part. Until you have explained it in your latest
> >> message, I wasn
On Mon, Nov 23, 2020 at 05:28:52PM -0500, Stephen Frost wrote:
> * Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
>> Yes and this is a tricky part. Until you have explained it in your latest
>> message, I wasn't sure how we can distinguish a concurrent update from a page
>> header corruptio
On Mon, Nov 23, 2020 at 10:35:54AM -0500, Stephen Frost wrote:
> * Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
>> It seems reasonable to me to rely on checksums only.
>>
>> As for retry, I think that API for concurrent I/O will be complicated.
>> Instead, we can introduce a functio
Greetings,
* Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
> On 23.11.2020 18:35, Stephen Frost wrote:
> >* Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
> >>On 21.11.2020 04:30, Michael Paquier wrote:
> >>>The only method I can think of as being really
> >>>reliable is ba
On 23.11.2020 18:35, Stephen Frost wrote:
Greetings,
* Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
On 21.11.2020 04:30, Michael Paquier wrote:
The only method I can think of as being really
reliable is based on two facts:
- Do a check only on pd_checksums, as that validates the fu
Greetings,
* Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:
> On 21.11.2020 04:30, Michael Paquier wrote:
> >The only method I can think of as being really
> >reliable is based on two facts:
> >- Do a check only on pd_checksums, as that validates the full contents
> >of the page.
> >- Wh
Greetings,
* Michael Paquier (mich...@paquier.xyz) wrote:
> On Fri, Nov 20, 2020 at 11:08:27AM -0500, Stephen Frost wrote:
> > David Steele (da...@pgmasters.net) wrote:
> >> Our current plan for pgBackRest:
> >>
> >> 1) Remove the LSN check as you have done in your patch and when rechecking
> >>
On 21.11.2020 04:30, Michael Paquier wrote:
The only method I can think of as being really
reliable is based on two facts:
- Do a check only on pd_checksums, as that validates the full contents
of the page.
- When doing a retry, make sure that there is no concurrent I/O
activity in the shared buffer
On Fri, Nov 20, 2020 at 11:08:27AM -0500, Stephen Frost wrote:
> David Steele (da...@pgmasters.net) wrote:
>> Our current plan for pgBackRest:
>>
>> 1) Remove the LSN check as you have done in your patch and when rechecking
>> see if the page has become valid *or* the LSN is ascending.
>> 2) Check
Greetings,
* David Steele (da...@pgmasters.net) wrote:
> On 11/20/20 2:28 AM, Michael Paquier wrote:
> >On Mon, Nov 16, 2020 at 11:41:51AM +0100, Magnus Hagander wrote:
> >>I was referring to the latest patch on the thread. But as I said, I have
> >>not read up on all the different issues raised i
Hi Michael,
On 11/20/20 2:28 AM, Michael Paquier wrote:
On Mon, Nov 16, 2020 at 11:41:51AM +0100, Magnus Hagander wrote:
I was referring to the latest patch on the thread. But as I said, I have
not read up on all the different issues raised in the thread, so take it
with a big grain of salt.
A
On Mon, Nov 16, 2020 at 11:41:51AM +0100, Magnus Hagander wrote:
> I was referring to the latest patch on the thread. But as I said, I have
> not read up on all the different issues raised in the thread, so take it
> with a big grain of salt.
>
> And I would also echo the previous comment that thi
On Mon, Nov 16, 2020 at 1:23 AM Michael Paquier wrote:
> On Sun, Nov 15, 2020 at 04:37:36PM +0100, Magnus Hagander wrote:
> > On Tue, Nov 10, 2020 at 5:44 AM Michael Paquier wrote:
> >> On Thu, Nov 05, 2020 at 10:57:16AM +0900, Michael Paquier wrote:
> >>> I was referring to the patch I sent o
On Sun, Nov 15, 2020 at 04:37:36PM +0100, Magnus Hagander wrote:
> On Tue, Nov 10, 2020 at 5:44 AM Michael Paquier wrote:
>> On Thu, Nov 05, 2020 at 10:57:16AM +0900, Michael Paquier wrote:
>>> I was referring to the patch I sent on this thread that fixes the
>>> detection of a corruption for the
On Tue, Nov 10, 2020 at 5:44 AM Michael Paquier wrote:
> On Thu, Nov 05, 2020 at 10:57:16AM +0900, Michael Paquier wrote:
> > I was referring to the patch I sent on this thread that fixes the
> > detection of a corruption for the zero-only case and where pd_lsn
> > and/or pd_upper are trashed by
On Thu, Nov 05, 2020 at 10:57:16AM +0900, Michael Paquier wrote:
> I was referring to the patch I sent on this thread that fixes the
> detection of a corruption for the zero-only case and where pd_lsn
> and/or pd_upper are trashed by a corruption of the page header. Both
> cases allow a base backu
On Wed, Nov 04, 2020 at 05:41:39PM +0100, Michael Banck wrote:
> On Wednesday, 04.11.2020 at 17:48 +0900, Michael Paquier wrote:
>> So, I have done much more testing of this patch using an instance with
>> a small shared buffer pool and pgbench running in parallel for having
>> a large eviction r
Hi,
On Wednesday, 04.11.2020 at 17:48 +0900, Michael Paquier wrote:
> On Fri, Oct 30, 2020 at 11:30:28AM +0900, Michael Paquier wrote:
> > Playing with dd and generating random pages, this detects random
> > corruptions, making use of a wait/retry loop if a failure is detected.
> > As mentioned
On Fri, Oct 30, 2020 at 11:30:28AM +0900, Michael Paquier wrote:
> Playing with dd and generating random pages, this detects random
> corruptions, making use of a wait/retry loop if a failure is detected.
> As mentioned upthread, this is a double-edged sword, increasing the
> number of retries redu
On Thu, Oct 22, 2020 at 10:41:53AM +0900, Michael Paquier wrote:
> We cannot trust the fields of the page header because these may
> have been messed up with some random corruption, so what really
> matters is that the checksums don't match, and that we can just rely
> on that. The zero-onl
On Wed, Oct 21, 2020 at 07:10:34PM +0900, Michael Paquier wrote:
> My guess is that we should be able to make use of that for base
> backups as well, but I also think that I'd rather let v13 go with more
> retries without depending on a new API layer, removing of the LSN
> check altogether. Thinki
On Wed, Oct 21, 2020 at 12:00:23PM +0200, Michael Banck wrote:
> The check was ported (or the concept of it adapted) from pgBackRest if I
> remember correctly.
Okay, I did not know that.
>> As things stand, I'd like to think that it would be much more useful
>> to remove this check and to have on
Hi,
On Tuesday, 20.10.2020 at 18:11 +0900, Michael Paquier wrote:
> On Mon, Apr 06, 2020 at 04:45:44PM -0400, Tom Lane wrote:
> > Actually, after thinking about that a bit more: why is there an LSN-based
> > special condition at all? It seems like it'd be far more useful to
> > checksum every
On Mon, Apr 06, 2020 at 04:45:44PM -0400, Tom Lane wrote:
> Actually, after thinking about that a bit more: why is there an LSN-based
> special condition at all? It seems like it'd be far more useful to
> checksum everything, and on failure try to re-read and re-verify the page
> once or twice, so
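For readers skimming the archive, the re-read-and-re-verify idea quoted above can be sketched roughly as follows. This is only an illustration in Python, not code from any patch on this thread: `make_page`, `page_ok`, and `verify_with_retry` are hypothetical names, the layout (a CRC32 stored in the last four bytes of the page) is an assumption made for the example, and PostgreSQL's real page checksum uses a different algorithm and is stored in the page header.

```python
import zlib
from typing import Callable

PAGE_SIZE = 8192  # PostgreSQL's default block size


def make_page(payload: bytes) -> bytes:
    """Build a toy page: zero-padded payload followed by a 4-byte CRC32.

    Assumption for illustration only; this is not PostgreSQL's page format.
    """
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return body + zlib.crc32(body).to_bytes(4, "little")


def page_ok(page: bytes) -> bool:
    """Return True if the stored CRC32 matches the page body."""
    stored = int.from_bytes(page[-4:], "little")
    return zlib.crc32(page[:-4]) == stored


def verify_with_retry(read_page: Callable[[], bytes], retries: int = 2) -> bool:
    """Checksum the page; on failure, re-read and re-verify up to `retries` times.

    A transient mismatch (for example a read that raced with a concurrent
    write) can pass on a later attempt, while a page that fails every
    attempt is reported as bad.
    """
    for _ in range(retries + 1):
        if page_ok(read_page()):
            return True
    return False
```

With a reader that returns a damaged copy first and an intact copy second, `verify_with_retry` reports success; with a reader that always returns the damaged copy, it reports failure after exhausting the retries. How many retries are enough, and whether to wait between them, is exactly the trade-off debated in the rest of the thread.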
> On 5 Jul 2020, at 13:52, Daniel Gustafsson wrote:
>
>> On 6 Apr 2020, at 23:15, Michael Banck wrote:
>
>> Probably we need to take a step back;
>
> This patch has been Waiting on Author since the last commitfest (and no longer
> applies as well), and by the sounds of the thread there are som
> On 6 Apr 2020, at 23:15, Michael Banck wrote:
> Probably we need to take a step back;
This patch has been Waiting on Author since the last commitfest (and no longer
applies as well), and by the sounds of the thread there are some open issues
with it. Should it be Returned with Feedback to be
Hi,
On Monday, 06.04.2020 at 16:45 -0400, Tom Lane wrote:
> I wrote:
> > Another thing that's bothering me is that the patch compares page LSN
> > against GetInsertRecPtr(); but that function says
> > ...
> > I'm not convinced that an approximation is good enough here. It seems
> > like a page
I wrote:
> Another thing that's bothering me is that the patch compares page LSN
> against GetInsertRecPtr(); but that function says
> ...
> I'm not convinced that an approximation is good enough here. It seems
> like a page that's just now been updated could have an LSN beyond the
> current XLOG
Michael Banck writes:
> [ 0001-Fix-checksum-verification-in-base-backups-for-random_V3.patch ]
I noticed that the cfbot wasn't testing this because of a minor merge
conflict. I rebased it over that, and also readjusted things a little bit
to avoid unnecessarily reindenting existing code, in hope
Hi,
thanks for reviewing this patch!
On Thursday, 27.02.2020 at 10:57 +, Asif Rehman wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world: tested, passed
> Implements feature: tested, passed
> Spec compliant: tested,
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
The patch applies cleanly and works as expected. Just a few minor
Hi,
On 2019-03-30 12:56:21 +0100, Magnus Hagander wrote:
> > ISTM that the fact that we had to teach it about different segment files
> > for checksum verification by splitting up the filename at "." implies
> > that it is not the correct level of abstraction (but maybe it could get
> > schooled s
On Fri, Mar 29, 2019 at 10:08 PM Michael Banck
wrote:
> Hi,
>
> On Friday, 29.03.2019 at 16:52 +0100, Magnus Hagander wrote:
> > On Fri, Mar 29, 2019 at 4:30 PM Stephen Frost wrote:
> > > * Magnus Hagander (mag...@hagander.net) wrote:
> > > > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra <
Hi,
On Friday, 29.03.2019 at 16:52 +0100, Magnus Hagander wrote:
> On Fri, Mar 29, 2019 at 4:30 PM Stephen Frost wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> > > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra
> > >
> > > wrote:
> > > > On Thu, Mar 28, 2019 at 01:11:40PM -0700,
On Fri, Mar 29, 2019 at 4:30 PM Stephen Frost wrote:
> Greetings,
>
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra <
> tomas.von...@2ndquadrant.com>
> > wrote:
> >
> > > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> > > >Hi,
>
Hi,
On 2019-03-29 11:38:02 -0400, Stephen Frost wrote:
> The server-side function would essentially lock the page against i/o,
> re-read it off disk into an independent location, unlock the page, then
> calculate the checksum and report back?
Right. I think there's a few minor variations of how t
Greetings,
* Andres Freund (and...@anarazel.de) wrote:
> On 2019-03-29 11:30:15 -0400, Stephen Frost wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> > > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra
> > >
> > > wrote:
> > > > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wr
Hi,
On 2019-03-29 11:30:15 -0400, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra
> > wrote:
> > > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> > > >Hi,
> > > >
> > > >On 2019-03-28 21:09:22 +0100, Michael
Greetings,
* Magnus Hagander (mag...@hagander.net) wrote:
> On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra
> wrote:
>
> > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> > >Hi,
> > >
> > >On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
> > >> I agree that the current patch mig
On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra
wrote:
> On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> >Hi,
> >
> >On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
> >> I agree that the current patch might have some corner-cases where it
> >> does not guarantee 100% accuracy in
On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
Hi,
On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
I agree that the current patch might have some corner-cases where it
does not guarantee 100% accuracy in online mode, but I hope the current
version at least has no more false n
Hi,
On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
> I agree that the current patch might have some corner-cases where it
> does not guarantee 100% accuracy in online mode, but I hope the current
> version at least has no more false negatives.
False positives are *bad*. We shouldn't integrate
Hi,
On Thursday, 28.03.2019 at 18:19 +0100, Tomas Vondra wrote:
> On Thu, Mar 28, 2019 at 05:08:33PM +0100, Michael Banck wrote:
> > I also fixed the two issues Andres reported, namely a zeroed-out
> > pageheader and a random LSN. The first is caught by checking for an all-
> > zero-page in t
On Thu, Mar 28, 2019 at 05:08:33PM +0100, Michael Banck wrote:
Hi,
I have rebased this patch now.
I also fixed the two issues Andres reported, namely a zeroed-out
pageheader and a random LSN. The first is caught by checking for an all-
zero-page in the way PageIsVerified() does. The second is c
Hi,
I have rebased this patch now.
I also fixed the two issues Andres reported, namely a zeroed-out
pageheader and a random LSN. The first is caught by checking for an all-
zero-page in the way PageIsVerified() does. The second is caught by
comparing the upper 32 bits of the LSN as well and deman
Hi,
On Tuesday, 26.03.2019 at 10:30 -0700, Andres Freund wrote:
> On 2019-03-26 18:22:55 +0100, Michael Banck wrote:
> > On Tuesday, 19.03.2019 at 13:00 -0700, Andres Freund wrote:
> > > CREATE TABLE corruptme AS SELECT g.i::text AS data FROM
> > > generate_series(1, 100) g(i);
> > >
On 2019-03-26 18:22:55 +0100, Michael Banck wrote:
> Hi,
>
> On Tuesday, 19.03.2019 at 13:00 -0700, Andres Freund wrote:
> > CREATE TABLE corruptme AS SELECT g.i::text AS data FROM generate_series(1,
> > 100) g(i);
> > SELECT pg_relation_size('corruptme');
> > postgres[22890][1]=# SELECT
Hi,
On Tuesday, 19.03.2019 at 13:00 -0700, Andres Freund wrote:
> CREATE TABLE corruptme AS SELECT g.i::text AS data FROM generate_series(1,
> 100) g(i);
> SELECT pg_relation_size('corruptme');
> postgres[22890][1]=# SELECT current_setting('data_directory') || '/' ||
> pg_relation_filepa
On Tue, Mar 19, 2019 at 02:44:52PM -0700, Andres Freund wrote:
> That's *PRECISELY* my point. I think it's a bad idea to do online
> checksumming from outside the backend. It needs to be inside the
> backend, and if there's any verification failures on a block, it needs
> to acquire the IO lock on
Hi,
On 2019-03-19 22:39:16 +0100, Michael Banck wrote:
> On Tuesday, 19.03.2019 at 13:00 -0700, Andres Freund wrote:
> > a) checks that the page is all zeroes if PageIsNew() (like
> >PageIsVerified() does for the backend). That avoids missing cases
> >where corruption just zeroed out t
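The all-zeroes check described in point a) can be illustrated with a small Python sketch. It is a simplified model and not the backend's code: `page_is_new` and `new_page_is_valid` are hypothetical helper names, and the sketch assumes the default 8192-byte page, a little-endian machine, and a 16-bit `pd_upper` field at byte offset 14 of the page header.

```python
import struct

PAGE_SIZE = 8192        # default block size (assumption for this sketch)
PD_UPPER_OFFSET = 14    # assumed byte offset of the 16-bit pd_upper field


def page_is_new(page: bytes) -> bool:
    """Model of the PageIsNew() test: a page counts as new if pd_upper is 0."""
    (pd_upper,) = struct.unpack_from("<H", page, PD_UPPER_OFFSET)
    return pd_upper == 0


def new_page_is_valid(page: bytes) -> bool:
    """A page that looks new is only acceptable if every byte is zero.

    If corruption zeroed the header but left data behind, the page still
    looks new, and only this extra scan exposes the damage.
    """
    return page_is_new(page) and not any(page)
```

A page with a zeroed header but leftover bytes further in passes `page_is_new` and fails `new_page_is_valid`, which is the case the quoted message says would otherwise be missed.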
Hi,
On Tuesday, 19.03.2019 at 13:00 -0700, Andres Freund wrote:
> On 2019-03-20 03:27:55 +0800, Stephen Frost wrote:
> > On Tue, Mar 19, 2019 at 23:59 Andres Freund wrote:
> > > On 2019-03-19 16:52:08 +0100, Michael Banck wrote:
> > > > On Tuesday, 19.03.2019 at 11:22 -0400, Robert
On Tue, Mar 19, 2019 at 4:49 PM Andres Freund wrote:
> To demonstrate that I ran a loop that verified that a) a normal backend
> query using the table detects the corruption b) pg_basebackup doesn't.
>
> i=0;
> while true; do
> i=$(($i+1));
> echo attempt $i;
> dd if=/dev/urandom of=/sr
On 2019-03-19 13:00:50 -0700, Andres Freund wrote:
> As it stands, the logic seems to give more false confidence than
> anything else.
To demonstrate that I ran a loop that verified that a) a normal backend
query using the table detects the corruption b) pg_basebackup doesn't.
i=0;
while true; do
Hi,
On 2019-03-20 03:27:55 +0800, Stephen Frost wrote:
> On Tue, Mar 19, 2019 at 23:59 Andres Freund wrote:
> > On 2019-03-19 16:52:08 +0100, Michael Banck wrote:
> > > On Tuesday, 19.03.2019 at 11:22 -0400, Robert Haas wrote:
> > > > It's torn pages that I am concerned about - the server is
Greetings,
On Tue, Mar 19, 2019 at 23:59 Andres Freund wrote:
> Hi,
>
> On 2019-03-19 16:52:08 +0100, Michael Banck wrote:
> > On Tuesday, 19.03.2019 at 11:22 -0400, Robert Haas wrote:
> > > It's torn pages that I am concerned about - the server is writing and
> > > we are reading, and we ge
Hi,
On 2019-03-19 16:52:08 +0100, Michael Banck wrote:
> On Tuesday, 19.03.2019 at 11:22 -0400, Robert Haas wrote:
> > It's torn pages that I am concerned about - the server is writing and
> > we are reading, and we get a mix of old and new content. We have been
> > quite diligent about prote
Hi,
On Tuesday, 19.03.2019 at 11:22 -0400, Robert Haas wrote:
> It's torn pages that I am concerned about - the server is writing and
> we are reading, and we get a mix of old and new content. We have been
> quite diligent about protecting ourselves from such risks elsewhere,
> and checksum v
On Mon, Mar 18, 2019 at 2:38 AM Stephen Frost wrote:
> Sure the backend has those facilities since it needs to, but these
> frontend tools *don't* need that to *never* have any false positives, so
> why are we complicating things by saying that this frontend tool and the
> backend have to coordina
Greetings,
On Tue, Mar 19, 2019 at 04:15 Michael Banck
wrote:
> On Monday, 18.03.2019 at 16:11 +0800, Stephen Frost wrote:
> > On Mon, Mar 18, 2019 at 15:52 Michael Banck wrote:
> > > On Monday, 18.03.2019 at 03:34 -0400, Stephen Frost wrote:
> > > > Thanks for that. Reading through th
Hi,
On Monday, 18.03.2019 at 16:11 +0800, Stephen Frost wrote:
> On Mon, Mar 18, 2019 at 15:52 Michael Banck wrote:
> > On Monday, 18.03.2019 at 03:34 -0400, Stephen Frost wrote:
> > > Thanks for that. Reading through the code though, I don't entirely
> > > understand why we're making thin
On Mon, Mar 18, 2019 at 2:06 AM Michael Paquier wrote:
> The mentions on this thread that the server has all the facility in
> place to properly lock a buffer and make sure that a partial read
> *never* happens and that we *never* have any kind of false positives,
> directly preventing the set of
Greetings,
On Mon, Mar 18, 2019 at 15:52 Michael Banck
wrote:
> Hi.
>
> On Monday, 18.03.2019 at 03:34 -0400, Stephen Frost wrote:
> > * Michael Banck (michael.ba...@credativ.de) wrote:
> > > On Monday, 18.03.2019 at 02:38 -0400, Stephen Frost wrote:
> > > > * Michael Paquier (mich...@paqu
Hi.
On Monday, 18.03.2019 at 03:34 -0400, Stephen Frost wrote:
> * Michael Banck (michael.ba...@credativ.de) wrote:
> > On Monday, 18.03.2019 at 02:38 -0400, Stephen Frost wrote:
> > > * Michael Paquier (mich...@paquier.xyz) wrote:
> > > > On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Fr
Hi,
On Monday, 18.03.2019 at 08:18 +0100, Michael Banck wrote:
> I have now rebased that patch on top of the pg_verify_checksums ->
> pg_checksums renaming, see attached.
Sorry, I had missed some hunks in the TAP tests, fixed-up patch
attached.
Michael
--
Michael Banck
Projektleiter / Seni
Greetings,
* Michael Paquier (mich...@paquier.xyz) wrote:
> On Mon, Mar 18, 2019 at 02:38:10AM -0400, Stephen Frost wrote:
> > Uh, we are, of course, going to have partial reads- we just need to
> > handle them appropriately, and that's not hard to do in a way that we
> > never have false positive
Greetings,
* Michael Banck (michael.ba...@credativ.de) wrote:
> On Monday, 18.03.2019 at 02:38 -0400, Stephen Frost wrote:
> > * Michael Paquier (mich...@paquier.xyz) wrote:
> > > On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> > > > To be clear, I agree completely that we don'
On Mon, Mar 18, 2019 at 02:38:10AM -0400, Stephen Frost wrote:
> Uh, we are, of course, going to have partial reads- we just need to
> handle them appropriately, and that's not hard to do in a way that we
> never have false positives.
Ere, my apologies here. I meant the read of a torn page, not a
Hi,
On Monday, 18.03.2019 at 02:38 -0400, Stephen Frost wrote:
> * Michael Paquier (mich...@paquier.xyz) wrote:
> > On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> > > To be clear, I agree completely that we don't want to be reporting false
> > > positives or "this might mean c
Greetings,
* Michael Paquier (mich...@paquier.xyz) wrote:
> On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> > To be clear, I agree completely that we don't want to be reporting false
> > positives or "this might mean corruption!" to users running the tool,
> > but I haven't seen a
On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> To be clear, I agree completely that we don't want to be reporting false
> positives or "this might mean corruption!" to users running the tool,
> but I haven't seen a good explanation of why this needs to involve the
> server to avo
Greetings,
* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> If we want to run it from the server itself, then I guess a background
> worker would be a better solution. Incidentally, that's something I've
> been toying with some time ago, see [1].
So, I'm a big fan of this idea of having a b
Greetings,
* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> On 3/2/19 12:03 AM, Robert Haas wrote:
> > On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
> > wrote:
> >> I have added a retry for this as well now, without a pg_sleep() as well.
> >> This catches around 80% of the half-reads, but
On Fri, Mar 8, 2019 at 6:50 PM Tomas Vondra
wrote:
>
> On 3/8/19 4:19 PM, Julien Rouhaud wrote:
> > On Thu, Mar 7, 2019 at 7:00 PM Andres Freund wrote:
> >>
> >> On 2019-03-07 12:53:30 +0100, Tomas Vondra wrote:
> >>>
> >>> But then again, we could just
> >>> hack a special version of ReadBuffer_
On 3/8/19 4:19 PM, Julien Rouhaud wrote:
> On Thu, Mar 7, 2019 at 7:00 PM Andres Freund wrote:
>>
>> On 2019-03-07 12:53:30 +0100, Tomas Vondra wrote:
>>>
>>> But then again, we could just
>>> hack a special version of ReadBuffer_common() which would just
>>
>>> (a) check if a page is in shared bu
On Thu, Mar 7, 2019 at 7:00 PM Andres Freund wrote:
>
> On 2019-03-07 12:53:30 +0100, Tomas Vondra wrote:
> >
> > But then again, we could just
> > hack a special version of ReadBuffer_common() which would just
>
> > (a) check if a page is in shared buffers, and if it is then consider the
> > chec
Hi,
On Sunday, 03.03.2019 at 11:51 +0100, Michael Banck wrote:
> On Saturday, 02.03.2019 at 11:08 -0500, Stephen Frost wrote:
> > I'm not necessarily against skipping to the next file, to be clear,
> > but I think I'd be happier if we kept reading the file until we
> > actually get EOF.
>
>
Hi,
On 2019-03-07 12:53:30 +0100, Tomas Vondra wrote:
> On 3/6/19 6:42 PM, Andres Freund wrote:
> >
> > ...
> >
> > To me the right way seems to be to IO lock the page via PG after such a
> > failure, and then retry. Which should be relatively easily doable for
> > the basebackup case, but obvious
On 3/6/19 6:42 PM, Andres Freund wrote:
>
...
>
To me the right way seems to be to IO lock the page via PG after such a
failure, and then retry. Which should be relatively easily doable for
the basebackup case, but obviously harder for the pg_verify_checksums
case.
Actually, what do you mean
On Wed, Mar 06, 2019 at 08:53:57PM +0100, Tomas Vondra wrote:
> Not sure. AFAICS that would require a single transaction, and if we
> happen to add some sort of throttling (which is a feature request I'd
> expect pretty soon to make it usable on live clusters) that might be
> quite long-running.
On 3/6/19 8:41 PM, Andres Freund wrote:
> Hi,
>
> On 2019-03-06 20:37:39 +0100, Tomas Vondra wrote:
>> Not sure how to integrate it into the CLI tool, though. Perhaps it
>> could require connection info so that it can execute a function, when
>> executed in online mode?
>
> To me the right fix
Hi,
On 2019-03-06 20:37:39 +0100, Tomas Vondra wrote:
> Not sure how to integrate it into the CLI tool, though. Perhaps it
> could require connection info so that it can execute a function, when
> executed in online mode?
To me the right fix would be to simply have this run as part of the
clus
On 3/6/19 6:42 PM, Andres Freund wrote:
> On 2019-03-06 12:33:49 -0500, Robert Haas wrote:
>> On Sat, Mar 2, 2019 at 5:45 AM Michael Banck
>> wrote:
>>> On Friday, 01.03.2019 at 18:03 -0500, Robert Haas wrote:
On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
wrote:
> I have ad
On 3/6/19 6:26 PM, Robert Haas wrote:
> On Sat, Mar 2, 2019 at 4:38 PM Tomas Vondra
> wrote:
>> FWIW I don't think this qualifies as torn page - i.e. it's not a full
>> read with a mix of old and new data. This is a partial write, most likely
>> because we read the blocks one by one, and when we hit
On 2019-03-06 12:33:49 -0500, Robert Haas wrote:
> On Sat, Mar 2, 2019 at 5:45 AM Michael Banck
> wrote:
> > On Friday, 01.03.2019 at 18:03 -0500, Robert Haas wrote:
> > > On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
> > > wrote:
> > > > I have added a retry for this as well now, without a
On Sat, Mar 2, 2019 at 5:45 AM Michael Banck wrote:
> On Friday, 01.03.2019 at 18:03 -0500, Robert Haas wrote:
> > On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
> > wrote:
> > > I have added a retry for this as well now, without a pg_sleep() as well.
> > > This catches around 80% of the half
On Sat, Mar 2, 2019 at 4:38 PM Tomas Vondra
wrote:
> FWIW I don't think this qualifies as torn page - i.e. it's not a full
> read with a mix of old and new data. This is a partial write, most likely
> because we read the blocks one by one, and when we hit the last page
> while the table is being ext
Greetings,
On Tue, Mar 5, 2019 at 18:36 Michael Paquier wrote:
> On Tue, Mar 05, 2019 at 02:08:03PM +0100, Tomas Vondra wrote:
> > Based on quickly skimming that thread the main issue seems to be
> > deciding which files in the data directory are expected to have
> > checksums. Which is a valid
On Tue, Mar 05, 2019 at 02:08:03PM +0100, Tomas Vondra wrote:
> Based on quickly skimming that thread the main issue seems to be
> deciding which files in the data directory are expected to have
> checksums. Which is a valid issue, of course, but I was expecting
> something about partial read/write
On 3/5/19 4:12 AM, Michael Paquier wrote:
> On Mon, Mar 04, 2019 at 03:08:09PM +0100, Tomas Vondra wrote:
>> I still don't understand what issue you see in how basebackup verifies
>> checksums. Can you point me to the explanation you've sent after 11 was
>> released?
>
> The history is mostly on t