On Sat, May 14, 2022 at 3:33 AM Robert Haas wrote:
> This seems fine, but I think you should add a non-trivial comment about it.
Thanks for looking. Done, and pushed. Let's see if 180s per query is enough...
On Thu, May 12, 2022 at 10:20 PM Thomas Munro wrote:
> As for skink failing, the timeout was hard coded 300s for the whole
> test, but apparently that wasn't enough under valgrind. Let's use the
> standard PostgreSQL::Test::Utils::timeout_default (180s usually), but
> reset it for each query we s
On Thu, May 12, 2022 at 4:57 PM Thomas Munro wrote:
> On Thu, May 12, 2022 at 3:13 PM Thomas Munro wrote:
> > error running SQL: 'psql:<stdin>:1: ERROR: source database
> > "conflict_db_template" is being accessed by other users
> > DETAIL: There is 1 other session using the database.'
>
> Oh, for thi
On Thu, May 12, 2022 at 3:13 PM Thomas Munro wrote:
> Chipmunk, another little early model Raspberry Pi:
>
> error running SQL: 'psql:<stdin>:1: ERROR: source database
> "conflict_db_template" is being accessed by other users
> DETAIL: There is 1 other session using the database.'
Oh, for this one I t
On Sat, May 7, 2022 at 9:37 PM Thomas Munro wrote:
> So far "grison" failed. I think it's probably just that the test
> forgot to wait for replay of CREATE EXTENSION before using pg_prewarm
> on the standby, hence "ERROR: function pg_prewarm(oid) does not exist
> at character 12". I'll wait for
On Tue, May 10, 2022 at 1:07 AM Robert Haas wrote:
> On Sun, May 8, 2022 at 7:30 PM Thomas Munro wrote:
> > LOG: still waiting for pid 1651417 to accept ProcSignalBarrier
> > STATEMENT: alter database mydb set tablespace ts1;
> This is a very good idea.
OK, I pushed this, after making the ere
On Sun, May 8, 2022 at 7:30 PM Thomas Munro wrote:
> Simple idea: how about logging the PID of processes that block
> progress for too long? In the attached, I arbitrarily picked 5
> seconds as the wait time between LOG messages. Also, DEBUG1 messages
> let you see the processing speed on eg bui
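The idea above (periodically logging the PID that has not yet accepted the barrier) can be sketched roughly as below. This is a standalone illustration, not the patch itself: wait_for_barrier_logging(), countdown_done() and the callback type are hypothetical names, and the poll and log intervals are parameters rather than the 5 seconds mentioned in the message. Only the wording of the LOG line is taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback: returns true once the blocker has absorbed the barrier. */
typedef bool (*barrier_done_fn) (void *arg);

/*
 * Poll until done(arg) returns true, emitting a LOG line naming the blocking
 * PID each time log_interval_ms of waiting has accumulated.
 * Returns the number of LOG lines written.
 */
static int
wait_for_barrier_logging(barrier_done_fn done, void *arg, int blocker_pid,
                         long poll_ms, long log_interval_ms, FILE *log)
{
    long        since_log_ms = 0;
    int         nlogged = 0;
    struct timespec ts;

    assert(poll_ms > 0 && log_interval_ms > 0);
    ts.tv_sec = poll_ms / 1000;
    ts.tv_nsec = (poll_ms % 1000) * 1000000L;

    while (!done(arg))
    {
        nanosleep(&ts, NULL);
        since_log_ms += poll_ms;
        if (since_log_ms >= log_interval_ms)
        {
            fprintf(log,
                    "LOG: still waiting for pid %d to accept ProcSignalBarrier\n",
                    blocker_pid);
            since_log_ms = 0;
            nlogged++;
        }
    }
    return nlogged;
}

/* Example predicate (assumption): reports "done" after *arg further polls. */
static bool
countdown_done(void *arg)
{
    int        *left = (int *) arg;

    if (*left <= 0)
        return true;
    (*left)--;
    return false;
}
```

For example, calling it with countdown_done, a counter of 5, a 1 ms poll and a 2 ms log interval writes the LOG line twice before returning.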
On Sat, May 7, 2022 at 4:52 PM Thomas Munro wrote:
> I think we'll probably also want to invent a way
> to report which backend is holding up progress, since without that
> it's practically impossible for an end user to understand why their
> command is hanging.
Simple idea: how about logging the
On Sat, May 7, 2022 at 4:52 PM Thomas Munro wrote:
> Done. Time to watch the build farm.
So far "grison" failed. I think it's probably just that the test
forgot to wait for replay of CREATE EXTENSION before using pg_prewarm
on the standby, hence "ERROR: function pg_prewarm(oid) does not exist
On Wed, May 4, 2022 at 2:23 PM Thomas Munro wrote:
> Assuming no
> objections or CI failures show up, I'll consider pushing the first two
> patches tomorrow.
Done. Time to watch the build farm.
It's possible that these changes will produce some blowback, now that
we're using PROCSIGNAL_BARRIER_
On Wed, May 4, 2022 at 8:53 AM Thomas Munro wrote:
> Got some off-list clues: that's just distracting Perl cleanup noise
> after something else went wrong (thanks Robert), and now I'm testing a
> theory from Andres that we're missing a barrier on the redo side when
> replaying XLOG_DBASE_CREATE_FI
On Wed, May 4, 2022 at 7:44 AM Thomas Munro wrote:
> It passes sometimes and fails sometimes. Here's the weird failure I
> need to debug:
>
> https://api.cirrus-ci.com/v1/artifact/task/6033765456674816/log/src/test/recovery/tmp_check/log/regress_log_032_relfilenode_reuse
>
> Right at the end, it
On Wed, May 4, 2022 at 6:36 AM Robert Haas wrote:
> On Fri, Apr 22, 2022 at 3:38 AM Thomas Munro wrote:
> > So, to summarise the new patch that I'm attaching to this email as 0001:
>
> This all makes sense to me, and I didn't see anything obviously wrong
> looking through the patch, either.
Than
On Fri, Apr 22, 2022 at 3:38 AM Thomas Munro wrote:
> So, to summarise the new patch that I'm attaching to this email as 0001:
This all makes sense to me, and I didn't see anything obviously wrong
looking through the patch, either.
> However it seems that I have something wrong, because CI is fa
On Wed, Apr 6, 2022 at 5:07 AM Robert Haas wrote:
> On Mon, Apr 4, 2022 at 10:20 PM Thomas Munro wrote:
> > > The checkpointer never takes heavyweight locks, so the opportunity
> > > you're describing can't arise.
> >
> > Hmm, oh, you probably meant the buffer interlocking
> > in SyncOneBuffer(
On Mon, Apr 4, 2022 at 10:20 PM Thomas Munro wrote:
> > The checkpointer never takes heavyweight locks, so the opportunity
> > you're describing can't arise.
>
> Hmm, oh, you probably meant the buffer interlocking
> in SyncOneBuffer(). It's true that my most recent patch throws away
> more requ
On Tue, Apr 5, 2022 at 10:24 AM Thomas Munro wrote:
> On Tue, Apr 5, 2022 at 2:18 AM Robert Haas wrote:
> > I'm not sure that it really matters, but with the idea that I proposed
> > it's possible to "save" a pending writeback, if we notice that we're
> > accessing the relation with a proper lock
On Tue, Apr 5, 2022 at 2:18 AM Robert Haas wrote:
> On Fri, Apr 1, 2022 at 5:03 PM Thomas Munro wrote:
> > Another idea would be to call a new function DropPendingWritebacks(),
> > and also tell all the SMgrRelation objects to close all their internal
> > state (ie the fds + per-segment objects)
On Fri, Apr 1, 2022 at 5:03 PM Thomas Munro wrote:
> Another idea would be to call a new function DropPendingWritebacks(),
> and also tell all the SMgrRelation objects to close all their internal
> state (ie the fds + per-segment objects) but not free the main
> SMgrRelationData object, and for go
On Sat, Apr 2, 2022 at 10:03 AM Thomas Munro wrote:
> Another idea would be to call a new function DropPendingWritebacks(),
> and also tell all the SMgrRelation objects to close all their internal
> state (ie the fds + per-segment objects) but not free the main
> SMgrRelationData object, and for g
On Sat, Apr 2, 2022 at 2:52 AM Robert Haas wrote:
> On Fri, Apr 1, 2022 at 2:04 AM Thomas Munro wrote:
> > The v1-0003 patch introduced smgropen_cond() to avoid the problem of
> > IssuePendingWritebacks(), which does desynchronised smgropen() calls
> > and could open files after the barrier but j
On Fri, Apr 1, 2022 at 2:04 AM Thomas Munro wrote:
> The v1-0003 patch introduced smgropen_cond() to avoid the problem of
> IssuePendingWritebacks(), which does desynchronised smgropen() calls
> and could open files after the barrier but just before they are
> unlinked. Makes sense, but...
>
> 1.
Some thoughts:
The v1-0003 patch introduced smgropen_cond() to avoid the problem of
IssuePendingWritebacks(), which does desynchronised smgropen() calls
and could open files after the barrier but just before they are
unlinked. Makes sense, but...
1. For that to actually work, we'd better call s
On Thu, Mar 3, 2022 at 1:28 PM Andres Freund wrote:
> > I can't remember that verify() is the one that accesses conflict.db large
> > while cause_eviction() is the one that accesses postgres.replace_sb for more
> > than like 15 seconds.
>
> For more than 15 seconds? The whole test runs in a few sec
Hi,
On 2022-03-03 13:11:17 -0500, Robert Haas wrote:
> On Wed, Mar 2, 2022 at 3:00 PM Andres Freund wrote:
> > On 2022-03-02 14:52:01 -0500, Robert Haas wrote:
> > > - I am having some trouble understanding clearly what 0001 is doing.
> > > I'll try to study it further.
> >
> > It tests for the v
On Wed, Mar 2, 2022 at 3:00 PM Andres Freund wrote:
> On 2022-03-02 14:52:01 -0500, Robert Haas wrote:
> > - I am having some trouble understanding clearly what 0001 is doing.
> > I'll try to study it further.
>
> It tests for the various scenarios I could think of that could lead to FD
> reuse, t
On Wed, Mar 2, 2022 at 3:00 PM Andres Freund wrote:
> What I am stuck on is what we can do for the released branches. Data
> corruption after two consecutive ALTER DATABASE SET TABLESPACEs seems like
> something we need to address.
I think we should consider back-porting the ProcSignalBarrier stu
Hi,
On 2022-03-02 14:52:01 -0500, Robert Haas wrote:
> - I am having some trouble understanding clearly what 0001 is doing.
> I'll try to study it further.
It tests for the various scenarios I could think of that could lead to FD
reuse, to state the obvious ;). Anything particularly unclear?
>
On Tue, Feb 22, 2022 at 4:40 AM Andres Freund wrote:
> On 2022-02-22 01:11:21 -0800, Andres Freund wrote:
> > I've started to work on a few debugging aids to find problems like
> > these. Attached are two WIP patches:
>
> Forgot to attach. Also importantly includes a tap test for several of these
>
Hi,
On 2022-02-22 01:11:21 -0800, Andres Freund wrote:
> I've started to work on a few debugging aids to find problems like
> these. Attached are two WIP patches:
Forgot to attach. Also importantly includes a tap test for several of these
issues
Greetings,
Andres Freund
From 0bc64874f8e5faae9a3
Hi,
On 2022-02-10 14:26:59 -0800, Andres Freund wrote:
> On 2022-02-11 09:10:38 +1300, Thomas Munro wrote:
> > It seems like I should go ahead and do that today, and we can study
> > further uses for PROCSIGNAL_BARRIER_SMGRRELEASE in follow-on work?
>
> Yes.
I wrote a test to show the problem. W
Hi,
On 2022-02-11 09:10:38 +1300, Thomas Munro wrote:
> I was about to commit that, because the original Windows problem it
> solved is showing up occasionally in CI failures (that is, it already
> solves a live problem, albeit a different and non-data-corrupting
> one):
+1
> It seems like I sho
Hi,
On 2022-02-10 13:49:50 -0500, Robert Haas wrote:
> I agree. While I feel sort of bad about missing this issue in review,
> I also feel like it's pretty surprising that there isn't something
> plugging this hole already. It feels unexpected that our FD management
> layer might hand you an FD th
On Thu, Feb 10, 2022 at 3:11 PM Thomas Munro wrote:
> On Fri, Feb 11, 2022 at 7:50 AM Robert Haas wrote:
> > The main question in my mind is who is going to actually make that
> > happen. It was your idea (I think), Thomas coded it, and my commit
> > made it a live problem. So who's going to get
On Fri, Feb 11, 2022 at 7:50 AM Robert Haas wrote:
> The main question in my mind is who is going to actually make that
> happen. It was your idea (I think), Thomas coded it, and my commit
> made it a live problem. So who's going to get something committed
> here?
I was about to commit that, beca
On Wed, Feb 9, 2022 at 5:00 PM Andres Freund wrote:
> The problem starts with
>
> commit aa01051418f10afbdfa781b8dc109615ca785ff9
> Author: Robert Haas
> Date: 2022-01-24 14:23:15 -0500
>
> pg_upgrade: Preserve database OIDs.
Well, that's sad.
> I think the most realistic way to address t
On Wed, Feb 09, 2022 at 02:00:04PM -0800, Andres Freund wrote:
> On Linux we can do so by a) checking if readlink(/proc/self/fd/$fd) points to
> a filename ending in " (deleted)", b) doing fstat(fd) and checking if st_nlink
> == 0.
You could also stat() the file in proc/self/fd/N and compare st_in
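The two Linux checks quoted above (a " (deleted)" suffix on the /proc/self/fd link target, and st_nlink == 0 from fstat()) might look roughly like this. It is a minimal sketch, not code from any patch in this thread: the function names are made up for illustration, the readlink variant is Linux-specific, and a file whose real name happens to end in " (deleted)" would fool it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Check (b): the open file has no remaining directory entries. */
static bool
fd_is_unlinked_fstat(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (fstat(fd, &st) != 0)
        return false;           /* cannot tell; treat as not unlinked */
    return st.st_nlink == 0;
}

/* Check (a): /proc/self/fd/N points to a name ending in " (deleted)". */
static bool
fd_is_unlinked_readlink(int fd)
{
    static const char suffix[] = " (deleted)";
    const size_t suffix_len = sizeof(suffix) - 1;
    char        linkpath[64];
    char        target[4096];
    ssize_t     len;

    assert(fd >= 0);
    snprintf(linkpath, sizeof(linkpath), "/proc/self/fd/%d", fd);
    len = readlink(linkpath, target, sizeof(target) - 1);
    if (len < 0)
        return false;           /* no /proc, or fd not open */
    target[len] = '\0';         /* readlink() does not NUL-terminate */
    if ((size_t) len < suffix_len)
        return false;
    return strcmp(target + len - suffix_len, suffix) == 0;
}
```

A caller would open a file, unlink it while keeping the descriptor, and then expect both functions to return true on Linux; the fstat() variant should behave the same on other POSIX systems.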
Hi,
I was working on rebasing the AIO branch. Tests started to fail after, but it
turns out that the problem exists independent of AIO.
The problem starts with
commit aa01051418f10afbdfa781b8dc109615ca785ff9
Author: Robert Haas
Date: 2022-01-24 14:23:15 -0500
pg_upgrade: Preserve databas