On Sun, Jan 16, 2022 at 01:02:41PM -0800, Noah Misch wrote:
> My next steps:
>
> - Report a Debian bug for the sparc64+ext4 zeros problem.
Reported to Debian, then upstream:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006157
https://marc.info/?t=16453926991
Last week, someone confirme
On Mon, Jan 24, 2022 at 12:02:43AM -0800, Noah Misch wrote:
> For 003_cic_2pc.pl, I'm
> fine using $TODO so we continue to run all test commands and quietly log their
> results. For 027_stream_regress.pl, which would need deep changes to use
> $TODO, it works to use any of todo_skip, skip, or skip
On Sun, Jan 23, 2022 at 06:34:32PM -0800, Andres Freund wrote:
> On 2022-01-23 18:10:07 -0800, Noah Misch wrote:
> > On Sun, Jan 23, 2022 at 05:40:54PM -0800, Andres Freund wrote:
> > > Test::More's description: "If it's something the programmer hasn't done yet,
> > > use TODO. This is for a
On 2022-01-23 21:25:04 -0500, Tom Lane wrote:
> Michael Paquier writes:
> > On Sun, Jan 23, 2022 at 06:10:07PM -0800, Noah Misch wrote:
> >> Could do that. Every run that doesn't get the flaky failure will print a
> >> message like "TODO passed: 3-5", though the test file could mitigate that
>
Hi,
On 2022-01-23 18:10:07 -0800, Noah Misch wrote:
> On Sun, Jan 23, 2022 at 05:40:54PM -0800, Andres Freund wrote:
> > Test::More's description: "If it's something the programmer hasn't done yet,
> > use TODO. This is for any code you haven't written yet, or bugs you have yet
> > to fix, but wan
Michael Paquier writes:
> On Sun, Jan 23, 2022 at 06:10:07PM -0800, Noah Misch wrote:
>> Could do that. Every run that doesn't get the flaky failure will print a
>> message like "TODO passed: 3-5", though the test file could mitigate that by
>> declaring the TODO only on configurations where we
On Sun, Jan 23, 2022 at 06:10:07PM -0800, Noah Misch wrote:
> Could do that. Every run that doesn't get the flaky failure will print a
> message like "TODO passed: 3-5", though the test file could mitigate that by
> declaring the TODO only on configurations where we expect a failure. The
> 027_s
On Sun, Jan 23, 2022 at 05:40:54PM -0800, Andres Freund wrote:
> On 2022-01-23 17:17:59 -0800, Noah Misch wrote:
> > On Sun, Jan 23, 2022 at 05:03:04PM -0800, Andres Freund wrote:
> > > On January 23, 2022 3:29:27 PM PST
> > > >(a) Modify the tests so the affected animals can skip affected tests by
Hi,
On 2022-01-23 17:17:59 -0800, Noah Misch wrote:
> On Sun, Jan 23, 2022 at 05:03:04PM -0800, Andres Freund wrote:
> > On January 23, 2022 3:29:27 PM PST
> > >(a) Modify the tests so the affected animals can skip affected tests by
> > >setting an environment variable, named PG_TEST_HAS_WAL_READ_
On Sun, Jan 23, 2022 at 05:03:04PM -0800, Andres Freund wrote:
> On January 23, 2022 3:29:27 PM PST
> >(a) Modify the tests so the affected animals can skip affected tests by
> >setting an environment variable, named PG_TEST_HAS_WAL_READ_BUG or similar.
>
> Why not just detect the problem in the t
On January 23, 2022 3:29:27 PM PST
>(a) Modify the tests so the affected animals can skip affected tests by
>setting an environment variable, named PG_TEST_HAS_WAL_READ_BUG or similar.
Why not just detect the problem in the tap test and skip, rather than requiring
multiple buildfarm configs to
Noah Misch writes:
> On Mon, Jan 24, 2022 at 12:49:16PM +1300, Thomas Munro wrote:
>> Trying out a new idea: what if we could tell the buildfarm website
>> that a certain test is currently expected to fail for reasons we can't
>> fix yet (configuration change needed but owner not responding, or
>>
On Mon, Jan 24, 2022 at 12:49:16PM +1300, Thomas Munro wrote:
> On Mon, Jan 24, 2022 at 12:29 PM Noah Misch wrote:
> > On Mon, Jan 24, 2022 at 09:42:13AM +1300, Thomas Munro wrote:
> > > I'm less
> > > sure it makes sense to do anything to support the presumed bogus
> > > zeroes bug for (probably)
On Mon, Jan 24, 2022 at 12:29 PM Noah Misch wrote:
> On Mon, Jan 24, 2022 at 09:42:13AM +1300, Thomas Munro wrote:
> > I'm less
> > sure it makes sense to do anything to support the presumed bogus
> > zeroes bug for (probably) no real users, especially before we've even
> > reported it and heard s
On Mon, Jan 24, 2022 at 09:42:13AM +1300, Thomas Munro wrote:
> I'm less
> sure it makes sense to do anything to support the presumed bogus
> zeroes bug for (probably) no real users, especially before we've even
> reported it and heard some analysis, for example acceptance that it's
> broken and co
Hi,
On 2022-01-24 09:42:13 +1300, Thomas Munro wrote:
> On Sun, Jan 23, 2022 at 7:52 AM Noah Misch wrote:
> > Future work can benchmark the new behavior and, if it performs well, make
> > it unconditional in v15+. I would expect performance to be unchanged or
> > slightly better, because the new
On Sun, Jan 23, 2022 at 7:52 AM Noah Misch wrote:
> Attached. With this, kittiwake has survived 8.5hr of 003_cic_2pc.pl. Without
> the patch, it failed many times, always within 1.3hr. For easier review, this
> patch uses the new behavior on all platforms. Before commit and back-patch, I
> pla
On Sun, Jan 16, 2022 at 01:02:41PM -0800, Noah Misch wrote:
> My next steps:
>
> - Report a Debian bug for the sparc64+ext4 zeros problem.
(Not done yet.)
> - Try to falsify the idea that "write only the not-already-written portion of
> a WAL block" is an effective workaround. Specifically, m
On Fri, Jan 21, 2022 at 08:34:22AM +1300, Thomas Munro wrote:
> On Mon, Jan 17, 2022 at 10:02 AM Noah Misch wrote:
> > - Report a Debian bug for the sparc64+ext4 zeros problem.
>
> I suspect that 027_stream_regress.pl hits this kernel bug with high
> probability[1]. I wonder if the owner of kitt
On Mon, Jan 17, 2022 at 10:02 AM Noah Misch wrote:
> - Report a Debian bug for the sparc64+ext4 zeros problem.
I suspect that 027_stream_regress.pl hits this kernel bug with high
probability[1]. I wonder if the owner of kittiwake and tadarida would
consider setting up an xfs file system? Or alt
Cancel that kernel upgrade idea. I no longer expect it to help...
On Sun, Jan 16, 2022 at 10:19:30PM +1300, Thomas Munro wrote:
> On Sun, Jan 16, 2022 at 8:12 PM Noah Misch wrote:
> > For specifics of the kernel bug, see the attached test program. In brief, the
> > bug arises if one proces
On Sun, Jan 16, 2022 at 8:12 PM Noah Misch wrote:
> For specifics of the kernel bug, see the attached test program. In brief, the
> bug arises if one process is write()ing or pwrite()ing a file at about the
> same time that another process is read()ing or pread()ing the same. POSIX
> says the re
On Fri, Nov 19, 2021 at 09:18:23PM -0800, Noah Misch wrote:
> On Wed, Nov 17, 2021 at 11:05:06PM -0800, Noah Misch wrote:
> > On Wed, Nov 17, 2021 at 05:47:10PM -0500, Tom Lane wrote:
> > > Noah Misch writes:
> > > > Each of the three failures happened on a sparc64 Debian+gcc machine. I
> > > >
I wrote:
> snapper just exhibited the same failure, too:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2021-11-18%2016%3A09%3A49
I grepped the buildfarm logs for all recent (last 3 months) occurrences of
'could not read two-phase state'. Here's the results:
sysname |
Noah Misch writes:
> Tom Turelinckx, are you able to provide remote access to kittiwake or
> tadarida? I'd use it to attempt the above things. All else being equal,
> kittiwake is more relevant since it's still supported upstream.
snapper just exhibited the same failure, too:
https://buildfarm
On Wed, Nov 17, 2021 at 11:05:06PM -0800, Noah Misch wrote:
> On Wed, Nov 17, 2021 at 05:47:10PM -0500, Tom Lane wrote:
> > Noah Misch writes:
> > > Each of the three failures happened on a sparc64 Debian+gcc machine. I had
> > > tried ~8000 iterations on thorntail, another sparc64 Debian+
Andrey Borodin writes:
> Let's add more tests that check survival of 2PC through crash recovery? We
> currently do only one restart. Maybe it is worth doing 4 or 8?
That seems a little premature when we can't explain the failure
we have. Also, buildfarm cycles aren't free.
regar
> On 18 Nov 2021, at 12:05, Noah Misch wrote:
>
> What else might help?
Let's add more tests that check survival of 2PC through crash recovery? We
currently do only one restart. Maybe it is worth doing 4 or 8?
Best regards, Andrey Borodin.
On Wed, Nov 17, 2021 at 05:47:10PM -0500, Tom Lane wrote:
> Noah Misch writes:
> > Each of the three failures happened on a sparc64 Debian+gcc machine. I had
> > tried ~8000 iterations on thorntail, another sparc64 Debian+gcc animal,
> > without reproducing this.
> # 'pgbench:
Noah Misch writes:
> Tom Lane reported another instance today:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tadarida&dt=2021-11-11%2013%3A29%3A58
> Each of the three failures happened on a sparc64 Debian+gcc machine. I had
> tried ~8000 iterations on thorntail, another sparc64 Debia
On Mon, Nov 08, 2021 at 01:42:46PM +0900, Michael Paquier wrote:
> On Sat, Nov 06, 2021 at 06:31:57PM -0700, Noah Misch wrote:
> > On Sun, Oct 24, 2021 at 04:35:02PM -0700, Noah Misch wrote:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kittiwake&dt=2021-10-24%2012%3A01%3A10
> > > g
On Mon, Nov 08, 2021 at 01:42:46PM +0900, Michael Paquier wrote:
> Indeed. Looking closer, I think that we'd better improve
> DecodingContextFindStartpoint(),
> pg_logical_replication_slot_advance(), XLogSendLogical() as well as
> pg_logical_slot_get_changes_guts() to follow a format closer to wha
> On 7 Nov 2021, at 06:31, Noah Misch wrote:
>
> As a first step, let's report the actual XLogReadRecord() error message.
> Attached. All the other sites that expect no error already do this.
BTW some time ago I've spotted a good number of related unreported errors [0].
[0]
https://
On Sat, Nov 06, 2021 at 06:31:57PM -0700, Noah Misch wrote:
> As a first step, let's report the actual XLogReadRecord() error message.
> Attached.
Good catch! This looks good.
> All the other sites that expect no error already do this.
Indeed. Looking closer, I think that we'd better improve
D