On 10/15/21 10:46 AM, Andrew Dunstan wrote:
> On 10/14/21 5:52 PM, Tom Lane wrote:
>> Andrew Dunstan writes:
>>> Yes, that's been puzzling me too. I've just been staring at it again and
>>> nothing jumps out. But maybe we can investigate that offline if this
>>> test is deemed not worth keeping.
On 10/14/21 5:52 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> Yes, that's been puzzling me too. I've just been staring at it again and
>> nothing jumps out. But maybe we can investigate that offline if this
>> test is deemed not worth keeping.
> As Mark says, it'd be interesting to know wheth
Andrew Dunstan writes:
> Yes, that's been puzzling me too. I've just been staring at it again and
> nothing jumps out. But maybe we can investigate that offline if this
> test is deemed not worth keeping.
As Mark says, it'd be interesting to know whether the use of
background_psql is related, bec
On Thu, Oct 14, 2021 at 2:31 PM Mark Dilger
wrote:
> I was just waiting a couple minutes to see if Andrew wanted to jump in.
> Having heard nothing, I guess you can revert it.
Okay. Pushed a commit removing the test case just now.
Thanks
--
Peter Geoghegan
On 10/14/21 5:09 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> On 10/14/21 12:15 AM, Tom Lane wrote:
>>> Looks like a transient/phase of the moon issue to me.
>> Bowerbird is having similar issues, so I don't think this is just a
>> transient.
> Yeah, I noticed that too today, and poked around
Mark Dilger writes:
>> On Oct 14, 2021, at 2:13 PM, Tom Lane wrote:
>> (b) Wouldn't finish()ing that connection cause the temp tables to be
>> dropped, negating the entire point of the test?
> The finish() would have to be the last line of the test.
> ...
> I'm curious if the test is indicating
> On Oct 14, 2021, at 2:28 PM, Peter Geoghegan wrote:
>
> I'm not sure what that means for the buildfarm. Are you suggesting
> that we leave things as-is pending an investigation on affected BF
> animals, or something else?
I was just waiting a couple minutes to see if Andrew wanted to jump i
On Thu, Oct 14, 2021 at 2:24 PM Mark Dilger
wrote:
> None of the "pride of ownership" type, but I would like to see something more
> about the limitations of background_psql().
I'm not sure what that means for the buildfarm. Are you suggesting
that we leave things as-is pending an investigation
> On Oct 14, 2021, at 2:21 PM, Peter Geoghegan wrote:
>
> I agree. I can go remove the whole file now, and will.
>
> Mark: Any objections?
None of the "pride of ownership" type, but I would like to see something more
about the limitations of background_psql(). It's the closest thing we hav
On Thu, Oct 14, 2021 at 2:13 PM Tom Lane wrote:
> TBH, I seriously doubt this test case is worth expending buildfarm
> cycles on forevermore. I'm more than a bit tempted to just drop
> it, rather than also expending developer time figuring out why it's
> not as portable as it looks.
I agree. I c
> On Oct 14, 2021, at 2:13 PM, Tom Lane wrote:
>
> (a) Isn't that just holding open one connection, not the whole instance?
Yes.
> (b) Wouldn't finish()ing that connection cause the temp tables to be
> dropped, negating the entire point of the test?
The finish() would have to be the last li
Mark Dilger writes:
> The pg_amcheck patch Peter committed for me adds a new test,
> src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode
> objects (a primary and a standby) and uses PostgresNode::background_psql().
> It doesn't bother to "finish" the returned harness, which
Andrew Dunstan writes:
> On 10/14/21 12:15 AM, Tom Lane wrote:
>> Looks like a transient/phase of the moon issue to me.
> Bowerbird is having similar issues, so I don't think this is just a
> transient.
Yeah, I noticed that too today, and poked around a bit. But I don't
see what this test is do
> On Oct 14, 2021, at 1:50 PM, Andrew Dunstan wrote:
>
> Bowerbird is having similar issues, so I don't think this is just a
> transient.
The pg_amcheck patch Peter committed for me adds a new test,
src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode objects
(a primary a
On 10/14/21 12:15 AM, Tom Lane wrote:
> Peter Geoghegan writes:
>> Any idea what the problems on drongo are?
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19
> It says
>
> # pg_ctl start failed; logfile:
> 2021-10-14 02:10:33.996 UTC [491848:1] LOG:
On Wed, Oct 13, 2021 at 9:15 PM Tom Lane wrote:
> Looks like a transient/phase of the moon issue to me.
Yeah, I noticed that drongo is prone to them, though only after I hit send.
Thanks
--
Peter Geoghegan
Peter Geoghegan writes:
> Any idea what the problems on drongo are?
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19
It says
# pg_ctl start failed; logfile:
2021-10-14 02:10:33.996 UTC [491848:1] LOG: starting PostgreSQL 14.0, compiled
by Visual C++
On Wed, Oct 13, 2021 at 2:09 PM Peter Geoghegan wrote:
> I just pushed v4, with the additional minor pg_amcheck documentation
> updates we talked about. No other changes.
Any idea what the problems on drongo are?
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3
On Mon, Oct 11, 2021 at 7:22 PM Mark Dilger
wrote:
> I agree that it is unlikely to make much difference in practice.
> I don't feel strongly about this. If you'd like me to remove those checks, I
> can do so. These are just my thoughts on the subject.
Okay. I don't feel strongly about it eit
> On Oct 11, 2021, at 5:37 PM, Peter Geoghegan wrote:
>
> Cool. I pushed just the amcheck changes a moment ago. I attach the
> remaining changes from your v3, with a new draft commit message (no
> real changes). I didn't push the rest (what remains in the attached
> revision) just yet because
On Mon, Oct 11, 2021 at 2:41 PM Mark Dilger
wrote:
> > Overall, your approach looks good to me. Will Robert take care of
> > committing this, or should I?
>
> I'd appreciate if you could fix up the in the docs and do the
> commit.
Cool. I pushed just the amcheck changes a moment ago. I attach t
> On Oct 11, 2021, at 2:33 PM, Peter Geoghegan wrote:
>
> On Mon, Oct 11, 2021 at 1:20 PM Mark Dilger
> wrote:
>> Ok, I went with this suggestion, and also your earlier suggestion to have a
>> in the pg_amcheck docs about using --parent-check and/or
>> --rootdescend against servers in reco
On Mon, Oct 11, 2021 at 1:20 PM Mark Dilger
wrote:
> Ok, I went with this suggestion, and also your earlier suggestion to have a
> in the pg_amcheck docs about using --parent-check and/or
> --rootdescend against servers in recovery.
My concern with --parent-check (and with --rootdescend) had l
> On Oct 11, 2021, at 12:25 PM, Mark Dilger
> wrote:
>
> Your proposal sounds good. Let me try it and get back to you shortly.
Ok, I went with this suggestion, and also your earlier suggestion to have a
in the pg_amcheck docs about using --parent-check and/or
--rootdescend against servers
> On Oct 11, 2021, at 11:53 AM, Peter Geoghegan wrote:
>
> On Mon, Oct 11, 2021 at 11:40 AM Peter Geoghegan wrote:
>> A NOTICE message is supposed to be surfaced to clients (but not stored
>> in the server log), pretty much by definition.
>>
>> It's not unreasonable to argue that I was mista
On Mon, Oct 11, 2021 at 11:40 AM Peter Geoghegan wrote:
> A NOTICE message is supposed to be surfaced to clients (but not stored
> in the server log), pretty much by definition.
>
> It's not unreasonable to argue that I was mistaken to ever think that
> about this particular message. In fact, I su
On Mon, Oct 11, 2021 at 11:37 AM Mark Dilger
wrote:
> The documentation for contrib/amcheck has a paragraph but not a warning box.
> Should that be changed also?
Maybe. I think that the pg_amcheck situation is a lot worse, because
users could easily interpret --parent-check as an additive thing
On Mon, Oct 11, 2021 at 11:26 AM Mark Dilger
wrote:
> That's fine by me, but I was under the impression that people wanted the
> extraneous noise removed.
A NOTICE message is supposed to be surfaced to clients (but not stored
in the server log), pretty much by definition.
It's not unreasonable
> On Oct 11, 2021, at 11:33 AM, Peter Geoghegan wrote:
>
> I definitely think that it warrants a warning box. This is a huge
> practical difference.
>
> Note that I'm talking about a standard thing, which there are
> certainly a dozen or more examples of in the docs already. Just grep
> for "
On Mon, Oct 11, 2021 at 11:29 AM Mark Dilger
wrote:
> The recently submitted patch already contains a short paragraph for each of
> these, but not a warning box. Should I reformat those as warning boxes? I
> don't know the current thinking on the appropriateness of that documentation
> style.
> On Oct 11, 2021, at 11:26 AM, Peter Geoghegan wrote:
>
> We should have a warning box about this in the pg_amcheck docs. Users
> should think carefully about ever using --parent-check, since it alone
> totally changes the locking requirements (actually --rootdescend will
> do that too, but o
On Mon, Oct 11, 2021 at 11:12 AM Peter Geoghegan wrote:
> Sure, the user might not be happy with --parent-check throwing an
> error on a replica. But in practice most users won't want to do that
> anyway. Even on a primary it's usually not possible as a practical
> matter, because the locking impl
> On Oct 11, 2021, at 11:12 AM, Peter Geoghegan wrote:
>
> What's the problem with just having pg_amcheck pass through the notice
> to the user, without it affecting anything else? Why should a simple
> notice message need to affect its return code, or anything else?
That's fine by me, but I
On Mon, Oct 11, 2021 at 10:46 AM Mark Dilger
wrote:
> > On Oct 11, 2021, at 10:10 AM, Peter Geoghegan wrote:
> > This mostly looks good to me. Just one thing occurs to me: I suspect
> > that we don't need to call pg_is_in_recovery() from SQL at all. What's
> > wrong with just letting verify_heapa
> On Oct 11, 2021, at 10:10 AM, Peter Geoghegan wrote:
>
> This mostly looks good to me. Just one thing occurs to me: I suspect
> that we don't need to call pg_is_in_recovery() from SQL at all. What's
> wrong with just letting verify_heapam() (the C function from amcheck
> proper) show those n
On Mon, Oct 11, 2021 at 9:53 AM Mark Dilger
wrote:
> > Right. I never meant anything like making a would-be
> > bt_index_parent_check() call into a bt_index_check() call, just
> > because of the state of the system (e.g., it's in recovery). That
> > seems awful, in fact.
>
> Please find attached t
> On Oct 6, 2021, at 4:12 PM, Peter Geoghegan wrote:
>
>>
>> Ok, excellent, that was probably the only thing that had me really hung up.
>> I thought you were still asking for pg_amcheck to filter out the
>> --parent-check option when in recovery, but if you're not asking for that,
>> then
On Wed, Oct 6, 2021 at 2:28 PM Pavel Borisov wrote:
>> It is the most consistent with the general design of the system, for
>> reasons that are pretty deeply baked into the system. I'm reminded of
>> the fact that REINDEX CONCURRENTLY's completion became blocked due to
>> similar trepidations. Und
On Wed, Oct 6, 2021 at 3:47 PM Mark Dilger wrote:
> I totally agree that the amcheck functions cannot prove the absence of
> corruption.
>
> I prefer not to even use language about proving the presence of corruption
> when discussing pg_amcheck.
I agree that it doesn't usually help. But sometim
> On Oct 6, 2021, at 3:20 PM, Peter Geoghegan wrote:
>
>> I think the disagreements are about something else.
>
> Informally speaking, you could say that pg_amcheck and amcheck verify
> relations. More formally speaking, both amcheck (whether called by
> pg_amcheck or some other thing) can on
On Wed, Oct 6, 2021 at 2:45 PM Mark Dilger wrote:
> I think the disagreements are about something else.
Informally speaking, you could say that pg_amcheck and amcheck verify
relations. More formally speaking, both amcheck (whether called by
pg_amcheck or some other thing) can only prove the prese
> On Oct 6, 2021, at 2:45 PM, Mark Dilger wrote:
>
> and db3 is in recovery.
> they're scattered across different databases, some in recovery, some not.
What I mean here is that, since pg_amcheck might run for many hours, and
database may start in recovery but then exit recovery, or may b
> On Oct 6, 2021, at 1:49 PM, Peter Geoghegan wrote:
>
>> The analogy here is: are we trying to verify that the relations are
>> valid? Or are we just trying to verify that they are as valid as we
>> can expect them to be?
>
> I think that we do the latter (or something much closer to the lat
>
>
> It is the most consistent with the general design of the system, for
> reasons that are pretty deeply baked into the system. I'm reminded of
> the fact that REINDEX CONCURRENTLY's completion became blocked due to
> similar trepidations. Understandably so.
I may mistake, but I recall the fac
On Wed, Oct 6, 2021 at 1:15 PM Robert Haas wrote:
> > Where I might go further than you or Mark (not sure) is on this: I
> > also think that it's the caller's job to not call the functions with
> > temp relations, or (in the case of the index verification stuff) with
> > !indisready or !indisvalid
On Wed, Oct 6, 2021 at 3:56 PM Peter Geoghegan wrote:
> I agree, with the stipulation that the caller (in this case
> pg_amcheck) is required to know certain basic things about the
> relation in order to get useful behavior. For example, if you use
> bt_index_check() with a GIN index, you're going
On Wed, Oct 6, 2021 at 12:36 PM Mark Dilger
wrote:
> The user may not know that the system has changed.
>
> For example, if I see errors in the logs suggesting corruption in a relation
> named "mark" and run pg_amcheck --relation=mark, I expect that to check the
> relation. If that relation is
On Wed, Oct 6, 2021 at 12:33 PM Robert Haas wrote:
> To me, it doesn't matter which specific option we're talking about. If
> I tell pg_amcheck to pass a certain flag to the underlying functions,
> then it should do that. If the behavior needs to be changed, it should
> be changed in those underly
> On Oct 6, 2021, at 12:28 PM, Peter Geoghegan wrote:
>
> I think that what I've said boils down to this:
>
> * pg_amcheck shouldn't attempt to verify temp relations, on the
> grounds that this is fundamentally not useful, and not something that
> could ever be sensibly interpreted as "just d
On Wed, Oct 6, 2021 at 2:56 PM Peter Geoghegan wrote:
> --heapallindexed doesn't complicate things for us at all. It changes
> nothing about the locking considerations. It's just an additive thing,
> some extra checks with the same basic underlying requirements. Maybe
> you meant to say --parent-c
On Wed, Oct 6, 2021 at 11:55 AM Peter Geoghegan wrote:
> I am pretty sure that I agree with you about all these details. We
> need to tease them apart some more.
I think that what I've said boils down to this:
* pg_amcheck shouldn't attempt to verify temp relations, on the
grounds that this is f
On Wed, Oct 6, 2021 at 11:32 AM Robert Haas wrote:
> All of the decisions we're talking about here really have to do with
> determining the user's intent. I think that if the user says
> pg_amcheck --all, there's a good argument that they don't want us to
> check unlogged relations on a standby wh
On Wed, Oct 6, 2021 at 1:57 PM Mark Dilger wrote:
> > To me #2 sounds like a tautology. It could almost be phrased as
> > "pg_amcheck does not check the things that it cannot check".
>
> I totally disagree. It is uncomfortable to me that `pg_amcheck
> --parent-check` will now silently not perfor
On Wed, Oct 6, 2021 at 10:57 AM Mark Dilger
wrote:
> > Clearly pg_amcheck never checked all relations, because it never
> > checked indexes that are not B-Tree indexes. I'm pretty sure that I
> > can poke big holes in almost any positivist statement like that with
> > little effort.
>
> There is a
> On Oct 6, 2021, at 10:39 AM, Peter Geoghegan wrote:
>
>> The differences in the upcoming version are
>>
>> 1) --all no longer means "all relations" but rather "all checkable relations"
>
> Clearly pg_amcheck never checked all relations, because it never
> checked indexes that are not B-Tre
On Wed, Oct 6, 2021 at 10:19 AM Mark Dilger
wrote:
> > A return value of 0 cannot be said to indicate that the database is
> > not corrupt;
>
> Nor can a non-zero value be said to indicate that the database is corrupt.
I never said otherwise. I think it's perfectly fine that there are
multiple no
> On Oct 6, 2021, at 10:16 AM, Peter Geoghegan wrote:
>
> A return value of 0 cannot be said to indicate that the database is
> not corrupt;
Nor can a non-zero value be said to indicate that the database is corrupt.
These invocations will still return a non-zero exit status:
pg_amch
On Wed, Oct 6, 2021 at 9:25 AM Mark Dilger wrote:
> Thanks for reviewing!
>
> I expect to post a new version shortly.
Not sure how much it matters, but I have some thoughts on the return
value of pg_amcheck. (I'm mostly going into this now because it seems
related to how we discuss these issues g
> On Oct 6, 2021, at 8:14 AM, Pavel Borisov wrote:
>
> We've looked through the initial patch and the exclusion of temporary tables
> from pg_amcheck seems the right thing. Also it is not the matter anyone
> disagrees here, and we propose to commit it alone.
Thanks for reviewing!
I expect
Hi, hackers!
We've looked through the initial patch and the exclusion of temporary
tables from pg_amcheck seems the right thing. Also it is not the matter
anyone disagrees here, and we propose to commit it alone.
Supplementary things/features might be left for further discussion but
refusing to ch
On Tue, Oct 5, 2021 at 11:00 PM Alexander Lakhin wrote:
> I think that ideally pg_amcheck should not fail on a live database, that
> does not contain corrupted data, and should not affect the database
> usage by other users (as it's "only a check").
I agree that that's ideal. As you said, one or
Hello Mark, Peter, Robert,
05.10.2021 20:22, Peter Geoghegan пишет:
> On Tue, Oct 5, 2021 at 10:03 AM Mark Dilger
> wrote:
>> I took no offense. Actually, I owe you a thank-you for having put so much
>> effort into debating the behavior with me. I think the patch to fix bug
>> #17212 will be b
On Tue, Oct 5, 2021 at 10:03 AM Mark Dilger
wrote:
> I took no offense. Actually, I owe you a thank-you for having put so much
> effort into debating the behavior with me. I think the patch to fix bug
> #17212 will be better for it.
Glad that you think so. And, thanks for working on the issue
> On Oct 5, 2021, at 9:58 AM, Peter Geoghegan wrote:
>
> I apologize to Mark.
I took no offense. Actually, I owe you a thank-you for having put so much
effort into debating the behavior with me. I think the patch to fix bug #17212
will be better for it.
(And thanks to Robert for the conc
On Tue, Oct 5, 2021 at 9:41 AM Robert Haas wrote:
> On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan wrote:
> > All of the underlying errors are cases that were clearly intended to
> > catch user error -- every single one. But apparently pg_amcheck is
> > incapable of error, by definition. Like HAL
On Mon, Oct 4, 2021 at 7:10 PM Peter Geoghegan wrote:
> All of the underlying errors are cases that were clearly intended to
> catch user error -- every single one. But apparently pg_amcheck is
> incapable of error, by definition. Like HAL 9000.
After some thought, I agree with the idea that pg_a
On Mon, Oct 4, 2021 at 8:19 PM Mark Dilger wrote:
> I am changing pg_amcheck to filter out indexes as you say. Since the btree
> check should no longer error in these cases, the issue of pg_amcheck exit(2)
> sorts itself out without further code changes.
Cool.
> I am changing verify_heapam to
> On Oct 4, 2021, at 6:19 PM, Peter Geoghegan wrote:
>
> I don't see what the point of this example is.
It doesn't matter.
I am changing pg_amcheck to filter out indexes as you say. Since the btree
check should no longer error in these cases, the issue of pg_amcheck exit(2)
sorts itself o
On Mon, Oct 4, 2021 at 5:32 PM Mark Dilger wrote:
> Sorry, I realized after hitting that you might take it that way, but I
> mean the logs generally, not just postgres logs. If somebody runs "reindex
> concurrently" on all tables at midnight every night, and they see one morning
> in the (non
> On Oct 4, 2021, at 4:45 PM, Peter Geoghegan wrote:
>
> I'm guessing that you meant REINDEX CONCURRENTLY.
Yes.
> Since you're talking about the case where it has an error
Sorry, I realized after hitting that you might take it that way, but I
mean the logs generally, not just postgres log
On Mon, Oct 4, 2021 at 4:28 PM Mark Dilger wrote:
> I am concerned about giving the user the false impression that an index (or
> table) was checked when it was not. I don't see the logic in
>
> pg_amcheck -i idx1 -i idx2 -i idx3
>
> skipping all three indexes and then reporting success.
This
> On Oct 4, 2021, at 4:28 PM, Mark Dilger wrote:
>
> pg_amcheck -i idx1 -i idx2 -i idx3
I forgot to mention: There's a continuum between `pg_amcheck -a` which checks
everything in all databases of the cluster, and `pg_amcheck -i just_one_index`.
There are any number of combinations of obj
> On Oct 4, 2021, at 4:10 PM, Peter Geoghegan wrote:
>
> And I don't understand why you think that clearly-accidental
> implementation details (really just bugs) should be treated as
> axiomatic truths about how pg_amcheck must work. Should we now "fix"
> pg_dump so that it matches pg_amcheck?
On Mon, Oct 4, 2021 at 3:36 PM Mark Dilger wrote:
> >> I believe this is a bug in amcheck's btree checking functions. Peter, can
> >> you take a look?
> >
> > Why do you say that?
>
> Because REINDEX CONCURRENTLY and the bt_index_parent_check() function seem to
> have lock upgrade hazards that
> On Oct 4, 2021, at 10:58 AM, Peter Geoghegan wrote:
>
> On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger
> wrote:
>>> There is another issue, that maybe should be discussed separately (or
>>> this thread could be renamed to "... on checking specific relations"),
>>> but the solution could be sim
On Mon, Oct 4, 2021 at 2:00 AM Alexander Lakhin wrote:
> There is another issue, that maybe should be discussed separately (or
> this thread could be renamed to "... on checking specific relations"),
> but the solution could be similar to that.
Thanks for the report!
I wonder if verify_heapam.c
On Mon, Oct 4, 2021 at 8:10 AM Mark Dilger wrote:
> > There is another issue, that maybe should be discussed separately (or
> > this thread could be renamed to "... on checking specific relations"),
> > but the solution could be similar to that.
> > pg_amcheck also fails on checking invalid indexe
> On Oct 4, 2021, at 2:00 AM, Alexander Lakhin wrote:
Thank you, Alexander, for these bug reports.
> There is another issue, that maybe should be discussed separately (or
> this thread could be renamed to "... on checking specific relations"),
> but the solution could be similar to that.
> pg
Hello Mark,
04.10.2021 01:20, Mark Dilger wrote:
> The attached patch includes a test case for this, which shows the problems
> against the current pg_amcheck.c, and a new version of pg_amcheck.c which
> fixes the bug. Could you review it?
>
> Thanks for bringing this to my attention.
There is
> On Oct 3, 2021, at 10:04 AM, Mark Dilger wrote:
>
>> On Oct 2, 2021, at 10:32 AM, Peter Geoghegan wrote:
>>
>> On Sat, Oct 2, 2021 at 4:49 AM PG Bug reporting form
>> wrote:
>>> Although you can add --exclude-relation=*.pg_temp*.*, this behaviour differs
>>> from the behaviour of pg_dump a
80 matches
Mail list logo