On Sun, May 2, 2010 at 11:02 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:

>
>
> Hmm ... AFAICS the only way to get that message when the incoming TID's
> offsetNumber is only 2 is for the index page to be completely empty
> (not zeroes, else PageAddItem's sanity check would have triggered,
> but valid and empty).  What that smells like is a software bug, like
> failing to emit a WAL record in a case where it was necessary.  Can you
> identify which index this was?  (Look for relfilenode 48778276 in the
> database with OID 22362.)  If so, can you give us any hints about
> unusual things that might have been done with that index?
>

Interesting. There is no pg_class entry for 22362.  There is, however, an
entry for that filenode. It's an index I created Sat AM, about 6AM.

select oid, * from pg_class where relfilenode=48778276;
-[ RECORD 1 ]--+---------------------------
oid            | 48777488
relname        | cts_20100501_topic_date_nk
relnamespace   | 2200
reltype        | 0
relowner       | 16412
relam          | 403
relfilenode    | 48778276
reltablespace  | 48777166
relpages       | 2476
reltuples      | 58879
reltoastrelid  | 0
reltoastidxid  | 0
relhasindex    | f
relisshared    | f
relistemp      | f
relkind        | i
relnatts       | 2
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 0
relacl         | null
reloptions     | null

Possibly relevant facts:

- The WSB server went active on Friday around 3:30PM
- On Friday evening, I added about 11 tablespaces.  I noted the new files on
the WSB, no problems.
- On Sat morning, I created a partitioned table cts_20100501 (inherits from
another table) and 4 indexes.
- This morning, I was doing some table maintenance on the master and
discovered I had created this table and its indexes in the wrong tablespace.
I wanted the table in ts29, but had it in ts30.  Vice versa for the
indexes.  So I moved them. This is from my command history:

alter index cts_20100501_natural_uk set tablespace ts30;
alter index cts_20100501_pkey set tablespace ts30;
alter index cts_20100501_topic_date_nk set tablespace ts30;
alter index cts_20100501_updated_nk set tablespace ts30;
alter table cts_20100501 set tablespace ts29;

These commands worked fine on the master, yet this seems suspiciously
relevant.


> > Any suggestions?
>
> As far as recovering goes, there's probably not much you can do except
> resync the standby from scratch.  But it would be nice to get to the
> bottom of the problem, so that we can fix the bug.  Have you got an
> archive of this xlog segment and the ones before it, and would you be
> willing to let a developer look at them?
>
>
Before I received your reply, I had already started the re-sync, and
unfortunately already deleted the wal logs in question.
If it happens again, I will certainly keep them, and would be happy to share
them.

Regards,
Gordon

Reply via email to