On 4 October 2016 at 09:28, Benedikt Grundmann <bgrundm...@janestreet.com> wrote:
> > > On 4 October 2016 at 08:17, Benedikt Grundmann <bgrundm...@janestreet.com> > wrote: > >> >> On 3 October 2016 at 21:01, Tom Lane <t...@sss.pgh.pa.us> wrote: >> >>> Benedikt Grundmann <bgrundm...@janestreet.com> writes: >>> > proddb_testing=# SELECT >>> > conname,convalidated,conislocal,coninhcount,connoinherit >>> > proddb_testing-# FROM pg_constraint WHERE conrelid = >>> > 'js_activity_20110101'::regclass; >>> > conname | convalidated | >>> conislocal | >>> > coninhcount | connoinherit >>> > ---------------------------------------------+-------------- >>> +------------+-------------+-------------- >>> > seqno_not_null | f | t >>> | >>> > 1 | f >>> >>> After some tracing through the code, I think it's the combination of all >>> three of coninhcount>0, conislocal, and !convalidated that is producing >>> the problem, and even then possibly only in binary-upgrade mode. pg_dump >>> is jumping through some hoops to try to restore that state, and evidently >>> not getting it entirely right. >>> >>> Is there a reason you've left all these constraints in NOT VALID state? >>> They're kinda useless that way. >> >> >> Not at all. I consider the ability to add constraints in not validated >> form one of the 10 best things that happened in postgres in recent years. >> They helped us a lot when slowly improving our schemas. >> >> Often just preventing any new or modified rows to validate the constraint >> is really all we need or most that is needed. Which is the only thing I >> really care about in this case. And given the size of these tables and >> their importance validating the constraints during production hours is >> tricky. Which means to validate them one of us has to sacrifice part of >> their Saturday to do these and the marginal utility of having the >> constraint validated was just never worth it. But if that is what's >> required to do the upgrade we will do so (the upgrade itself we will have >> to do on a Saturday anyway). >> >> >> Probably if you updated them to be valid >>> (see ALTER TABLE ... VALIDATE CONSTRAINT), the upgrade would go through >>> without difficulty. >>> >>> I'm running all the upgrade attempts on our testing instance (which is >> nightly restored from the latest backup), it's not a problem to run the >> validate command there so I'll do that now and find out if you are right. >> > > It looks like you might be right but I don't know for sure yet. And it > will take me a long time to find out. Rationale: After validating > seqno_not_null I could proceed a bit further but failed at another > constraint like that (valid_counterparty). However that constraint > actually is violated by lots of rows in the past and we had no plans (or > easy way) to fix this. The constraint was put in like this to prevent > future rows. > > I guess I could drop the constraint do the restore and then put the > constraint in again. Sigh. This is all relatively sad. > > Yep I can confirm that after dropping a few more constraints and then doing the checkpoint_segments vs min_wal_size/max_wal_size foo in postgresql.conf I got the database up. So far everything seems otherwise fine. > >> I'll look into fixing this, but depending on how messy it turns out to be, >>> it might be something we choose to fix only in HEAD. >>> >>> regards, tom lane >>> >> >> >