Re: Corrupt Index

2022-03-15 Thread Tom Lane
Jorel Casal writes: > During a Patroni leader switchover, we had a situation where a unique index > stopped "working", the index did not have records > that were in fact in the table and the table had duplicate records which > should have been prevented by said unique index. If the index was on a

Corrupt Index

2022-03-15 Thread Jorel Casal
Hi All, has anyone experienced corrupted unique indexes? During a Patroni leader switchover, we had a situation where a unique index stopped "working", the index did not have records that were in fact in the table and the table had duplicate records which should have been prevented by said unique

Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread shauncutts
On 17.07.2019 18:14, Andres Freund wrote: To me that means that we need prioritization across databases, and between tables, and probably by multiple criteria. I suspect there need to be multiple criteria how urgent vacuuming is, and autovacuum ought to try to make progress on all of them.

Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread Peter Geoghegan
On Thu, Jul 18, 2019 at 9:06 AM Aaron Pelz wrote: > It's a simple btree expression on a geometry(Point,4326) , no expression no > partial no composite. The cause of the corruption may be a bug in a Postgis B-Tree operator class. I reported a bug in the Geography type that could lead to corrupt B

Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread Aaron Pelz
> Can you tell us more about this index? Can you share its definition > (i.e. what does \d show in psql)? > Is it an expression index, or a partial index? A composite? What > datatypes are indexed? It's a simple btree expression on a geometry(Point,4326) , no expression no partial no composite.

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Andres Freund
Hi, On 2019-07-17 13:27:23 -0400, Tom Lane wrote: > My concern here is that if we have blinders on to the extent of only > processing that one table or DB, we're unnecessarily allowing bloat to > occur in other tables, and causing that missed vacuuming work to pile > up so that there's more of it

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
On Wed, Jul 17, 2019 at 10:27 AM Peter Geoghegan wrote: > > It's possible that amcheck would have given you an accurate diagnosis > > of the problem -- especially if you used bt_index_parent_check(): > > > > https://www.postgresql.org/docs/current/amcheck.html > > BTW, be sure to use the 'heapalli

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
On Wed, Jul 17, 2019 at 11:43 AM Alvaro Herrera wrote: > This might make things worse operationally, though. If searches aren't > failing but vacuum is, we'd break a production system that currently > works. If searches aren't failing and VACUUM works, then that's probably down to dumb luck. The

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Alvaro Herrera
On 2019-Jul-17, Tom Lane wrote: > Alvaro Herrera writes: > > On 2019-Jul-17, Peter Geoghegan wrote: > >> Maybe nbtree VACUUM should do something more aggressive than give up > >> when there is a "failed to re-find parent key" or similar condition. > >> Perhaps it would make more sense to make the

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Tom Lane
Alvaro Herrera writes: > On 2019-Jul-17, Peter Geoghegan wrote: >> Maybe nbtree VACUUM should do something more aggressive than give up >> when there is a "failed to re-find parent key" or similar condition. >> Perhaps it would make more sense to make the index inactive (for some >> value of "inac

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Alvaro Herrera
On 2019-Jul-17, Peter Geoghegan wrote: > Maybe nbtree VACUUM should do something more aggressive than give up > when there is a "failed to re-find parent key" or similar condition. > Perhaps it would make more sense to make the index inactive (for some > value of "inactive") instead of just compla

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
On Wed, Jul 17, 2019 at 10:27 AM Tom Lane wrote: > Right, you're eventually going to get to a forced shutdown if vacuum never > succeeds on one table; no question that that's bad. It occurs to me that we use operator class/insertion scankey comparisons within page deletion, to relocate a leaf pag

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Tom Lane
being able to VACUUM one table in one database (because > it has a corrupt index) ultimately risks the availability of every > database in the cluster. Many installations receive little to no > supervision, so it may just be a matter of time there. That is > certainly a bad thing

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
On Wed, Jul 17, 2019 at 9:21 AM Peter Geoghegan wrote: > It's possible that amcheck would have given you an accurate diagnosis > of the problem -- especially if you used bt_index_parent_check(): > > https://www.postgresql.org/docs/current/amcheck.html BTW, be sure to use the 'heapallindexed' opti

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
hing workers into any other databases? That seems > like a bad thing; it's postponing work that will need to be done > eventually. I don't know exactly how the launcher would behave offhand, but it's clear that not being able to VACUUM one table in one database (because it has

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Tom Lane
Peter Geoghegan writes: > On Wed, Jul 17, 2019 at 8:54 AM Aaron Pelz wrote: >> To me it looks like a *single* corrupt index held up autovacuums across our >> entire server, even other in other databases on the same server. Am I >> interpreting this correctly? > Yes

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Aaron Pelz
ok into this, seems helpful. Thanks! On Wed, Jul 17, 2019 at 12:21 PM Peter Geoghegan wrote: > On Wed, Jul 17, 2019 at 8:54 AM Aaron Pelz wrote: > > To me it looks like a *single* corrupt index held up autovacuums across > our entire server, even other in other databases on the same s

Re: Corrupt index stopping autovacuum system wide

2019-07-17 Thread Peter Geoghegan
On Wed, Jul 17, 2019 at 8:54 AM Aaron Pelz wrote: > To me it looks like a *single* corrupt index held up autovacuums across our > entire server, even other in other databases on the same server. Am I > interpreting this correctly? Yes -- that is correct. What PostgreSQL version a

Corrupt index stopping autovacuum system wide

2019-07-17 Thread Aaron Pelz
the system and our XIDs started falling. To me it looks like a *single* corrupt index held up autovacuums across our entire server, even other in other databases on the same server. Am I interpreting this correctly? Would love guidance on diagnosing this type of thing and strategies for preventing it. Thanks, Aaron

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Jānis Pūris
Managed to find following in the announcement of BDR on 9.6: "BDR has always been an extension, but *on 9.4 it required a heavily patched PostgreSQL, one that isn’t fully on-disk-format compatible with stock community PostgreSQL 9.4.* The goal all along has been to allow it to run as an extension o

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Adrian Klaver
On 5/25/19 11:46 AM, Jānis Pūris wrote: Hi Ron, I can not reproduce this error on BDR_Node_2 (it is not BDR_Node_1 as stated before. Typo) I've been successful in transferring the data with pg_dump on BDR_Node_2 and then restoring it on Regular_Node_1. Then running "select * from informatio

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Jānis Pūris
Hi Ron, I can not reproduce this error on BDR_Node_2 (it is not BDR_Node_1 as stated before. Typo) I've been successful in transferring the data with pg_dump on BDR_Node_2 and then restoring it on Regular_Node_1. Then running "select * from information_schema.sequences;" all is OK. The problem w

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Jānis Pūris
Hi, Adrian Apologies, it is a typo. pg_basebackup was taken from BDR_Node_2, not BDR_Node_1 Thank you in advance. Best regards, Janis Puris On 25 May 2019 at 19:27:42, Adrian Klaver (adrian.kla...@aklaver.com) wrote: I am not clear about above: 1) You removed BDR_Node_2 from cluster 2) You t

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Adrian Klaver
On 5/25/19 9:49 AM, Jānis Pūris wrote: Hello, I'm working with a 9.4-bdr cluster and want to move away from BDR tech all together. So my idea was to follow instructions on http://bdr-project.org/docs/stable/ to first strip the node from BDR making it into "regular" node and then moving the da

Re: pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Ron
On 5/25/19 11:49 AM, Jānis Pūris wrote: Hello, I'm working with a 9.4-bdr cluster and want to move away from BDR tech all together. So my idea was to follow instructions on http://bdr-project.org/docs/stable/ to first strip the node from BDR making it into "regular" node and then moving the d

pg_basebackup from 9.4-bdr to 9.4 results in corrupt index

2019-05-25 Thread Jānis Pūris
Hello, I'm working with a 9.4-bdr cluster and want to move away from BDR tech all together. So my idea was to follow instructions on http://bdr-project.org/docs/stable/ to first strip the node from BDR making it into "regular" node and then moving the data from this node to official 9.4 instance.

Re: Possible corrupt index?

2019-04-17 Thread Adrian Klaver
On 4/17/19 7:28 AM, Zahir Lalani wrote: Any changes occur between the time it worked and the time it did not? Say a crash, change in schema, new data added and so on. Not as far as we can tell - one of my DB team noticed it day before yesterday - there were no tell tale signs of any issues b

RE: Possible corrupt index?

2019-04-17 Thread Zahir Lalani
>Any changes occur between the time it worked and the time it did not? >Say a crash, change in schema, new data added and so on. Not as far as we can tell - one of my DB team noticed it day before yesterday - there were no tell tale signs of any issues but we have seen issues yesterday with cer

Re: Possible corrupt index?

2019-04-17 Thread Adrian Klaver
On 4/17/19 12:42 AM, Zahir Lalani wrote: I am not following above. What is running version 10? Production is 9.6, others are 10 (as we are testing). Prod has always been fine, its been running for 2 years without a hitch. This issue has only occurred recently Any changes occur between the t

RE: Possible corrupt index?

2019-04-17 Thread Zahir Lalani
>I am not following above. >What is running version 10? Production is 9.6, others are 10 (as we are testing). Prod has always been fine, its been running for 2 years without a hitch. This issue has only occurred recently >Did you restore a dump from a version 10 onto a >version 9.6 machine? No

Re: Possible corrupt index?

2019-04-16 Thread Adrian Klaver
On 4/16/19 2:04 PM, Zahir Lalani wrote: Hi Is this the same for the other environments? We have recently upgraded others to 10 and are in testing before we upgrade production. However we still some 9.6 dev servers which are fine. I am not following above. What is running version 10? Did

RE: Possible corrupt index?

2019-04-16 Thread Zahir Lalani
Hi >Is this the same for the other environments? We have recently upgraded others to 10 and are in testing before we upgrade production. However we still some 9.6 dev servers which are fine. >What does: >SHOW lc_collate; en_US.UTF-8 >Are you doing the below through Navicat or psql? Using Na

Re: Possible corrupt index?

2019-04-16 Thread Adrian Klaver
On 4/16/19 10:16 AM, Zahir Lalani wrote: Which version? What are the queries you are running which give unexpected behavior? Have your run explain analyze on those to check >what plan is being used? Have your reindexed all or only the one you suspect? Hi Michael Version: PostgreSQL 9.6.12 o

RE: Possible corrupt index?

2019-04-16 Thread Zahir Lalani
>Which version? What are the queries you are running which give unexpected >behavior? Have your run explain analyze on those to check >what plan is being >used? Have your reindexed all or only the one you suspect? Hi Michael Version: PostgreSQL 9.6.12 on x86_64-pc-linux-gnu, compiled by gcc (G

Re: Possible corrupt index?

2019-04-16 Thread Adrian Klaver
On 4/16/19 10:02 AM, Zahir Lalani wrote: Hi All New on this list! We have an existing postgres deployment which is showing some odd behaviour on Live. We use Navicat to manage it, and when we open a What does Live refer to? specific table via navicat on live, it says we don’t have a primar

Re: Possible corrupt index?

2019-04-16 Thread Michael Lewis
Which version? What are the queries you are running which give unexpected behavior? Have your run explain analyze on those to check what plan is being used? Have your reindexed all or only the one you suspect? >

Possible corrupt index?

2019-04-16 Thread Zahir Lalani
Hi All New on this list! We have an existing postgres deployment which is showing some odd behaviour on Live. We use Navicat to manage it, and when we open a specific table via navicat on live, it says we don't have a primary index - although the design view does show it. We have auto deployme