Andres Freund wrote:
> Hi,
> 
> On 2013-12-24 12:58:04 -0300, Alvaro Herrera wrote:
> > > Shortly after this patch was committed, buildfarm member locust (running
> > > Mac OS X 10.5 apparently) started failing the pg_upgrade check:
> > > 
> > > command: 
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_ctl"
> > >  -w -l "pg_upgrade_server.log" -D 
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data"
> > >  -o "-p 57632 -b -c synchronous_commit=off -c fsync=off -c 
> > > full_page_writes=off  -c listen_addresses='' -c 
> > > unix_socket_permissions=0700 -c 
> > > unix_socket_directories='/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade'"
> > >  start >> "pg_upgrade_server.log" 2>&1
> > > waiting for server to start....LOG:  database system was shut down at 
> > > 2013-12-19 12:51:16 CET
> > > LOG:  invalid primary checkpoint record
> > > LOG:  invalid secondary checkpoint link in control file
> > > PANIC:  could not locate a valid checkpoint record
> > 
> > Any comment on this problem?  Somehow ReadRecord is unable to find a
> > checkpoint, yet there's no error message to be seen anywhere, whereas
> > pg_resetxlog does report it:
> > 
> > > command: 
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_resetxlog"
> > >  -l 000000010000000000000009 
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data"
> > >  >> "pg_upgrade_utility.log" 2>&1
> > > pg_resetxlog: could not read from directory "pg_xlog": Invalid argument
> > 
> > I cannot but think xlogreader is at fault.
> > 
> > Regardless of the solution to the Mac OS X problem, ISTM this should be
> > fixed.
> 
> I didn't look at any code, and I won't today, but it doesn't look
> surprising - the report when starting the server above is presumably the
> one in ReadCheckpoint() (or similar) and it probably just reports that
> ReadRecord() didn't return a record.

How is this not surprising?  Surely failing to find a checkpoint record
is not a problem to be taken lightly.

> pg_resetxlog (which doesn't use xlogreader!) reports that it couldn't
> read from directory "pg_xlog", so there's something wonky independently
> from xlogreader.

Yes, most likely there is.  My point is that the LOG messages above
should have logged the system error that caused the checkpoint record to
be unfindable.
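
To illustrate the kind of reporting I mean, here is a minimal sketch in
plain C.  It is not the xlog.c or pg_resetxlog code; count_dir_entries is
a hypothetical helper invented for this example.  It only shows the usual
POSIX idiom: clear errno before each readdir() call so that a NULL return
caused by an error can be told apart from end-of-directory, and include
strerror(errno) in the message whenever something does fail.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): count the entries in a
 * directory.  On any failure, print the system error and return -1.
 */
static int
count_dir_entries(const char *path)
{
	DIR		   *dir;
	struct dirent *de;
	int			n = 0;

	dir = opendir(path);
	if (dir == NULL)
	{
		fprintf(stderr, "could not open directory \"%s\": %s\n",
				path, strerror(errno));
		return -1;
	}

	/*
	 * readdir() returns NULL both at end-of-directory and on error, so
	 * reset errno before every call to be able to distinguish the two.
	 */
	while (errno = 0, (de = readdir(dir)) != NULL)
		n++;

	if (errno != 0)
	{
		int			save_errno = errno;	/* closedir() may clobber errno */

		closedir(dir);
		fprintf(stderr, "could not read from directory \"%s\": %s\n",
				path, strerror(save_errno));
		return -1;
	}

	if (closedir(dir) != 0)
	{
		fprintf(stderr, "could not close directory \"%s\": %s\n",
				path, strerror(errno));
		return -1;
	}

	return n;
}
```

Whether the server should emit such a message at LOG or at some lower
level is a separate question; the sketch only shows that the errno is
available to be reported at the point of failure.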

> I'd guess that xlog.c read_page callback errors out without reporting
> an error. IIRC we're logging some failures as DEBUG there, because
> they really aren't unexpected, and normally just signal the end of
> wal.

Hmm?  At least, I recall something like an "unexpected pageaddr" message
is sometimes logged when end-of-wal is found.  Why would other error
messages be hidden?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
