Hi Tom,

On 12/4/17 3:15 PM, Tom Lane wrote:
> While working through Michael Paquier's patch to clean up inconsistent
> usage of AllocateDir(), I noticed that ResetUnloggedRelations and its
> subroutines are not consistent about whether a directory open failure
> results in erroring out or just emitting a LOG message and continuing.
> ResetUnloggedRelations itself throws a hard error if it fails to open
> pg_tblspc, but all the rest of reinit.c thinks a LOG message is
> sufficient.

By a strange coincidence I spent a while today reading through this code...

> My first thought was to change ResetUnloggedRelations to match the
> rest, but on reflection I'm less sure about that.  What we've got
> at the moment is that a possibly-transient directory open failure
> can result in failure to reset an unlogged relation to empty,
> which to me amounts to data corruption.  

I'm wondering how this transient directory open failure is going to
happen without a bunch of other things going wrong, but I agree that if
it happens then corruption would be the likely result.

> If the contents of the
> unlogged relation are inconsistent, which is plenty likely after
> a crash, we could end up crashing later because of that; and in
> any case the user would not see what they expect in the tables.

Agreed.

> So now I'm thinking we should do the reverse and change these functions
> to give a hard error on AllocateDir failure.  That would result in
> startup-process failure if we are unable to scan the database, which is
> not great, but there's certainly something badly wrong if we can't.

+1.  If a tablespace or database directory cannot be opened then I don't
think it makes any sense to continue.
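
To make the difference concrete, here is a rough standalone sketch in plain
POSIX C. It is not the reinit.c code and does not use AllocateDir(); the
names scan_dir and DirFailPolicy are made up for illustration. It only shows
why the LOG-and-continue behavior is risky: the caller cannot tell a failed
open from an empty directory, so the reset is silently skipped.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy switch; PostgreSQL expresses this via elevel (LOG vs ERROR). */
typedef enum
{
    ON_FAIL_LOG,                /* report the failure and carry on */
    ON_FAIL_ERROR               /* report the failure and make the caller stop */
} DirFailPolicy;

/*
 * Count the entries in a directory.
 *
 * Returns the number of entries on success.  If the directory cannot be
 * opened, returns 0 under ON_FAIL_LOG (indistinguishable from "nothing to
 * do") or -1 under ON_FAIL_ERROR (the caller must abort).
 */
static int
scan_dir(const char *path, DirFailPolicy policy)
{
    DIR        *dir;
    int         nentries = 0;

    assert(path != NULL);

    dir = opendir(path);
    if (dir == NULL)
    {
        fprintf(stderr, "%s: could not open directory \"%s\": %s\n",
                policy == ON_FAIL_ERROR ? "ERROR" : "LOG",
                path, strerror(errno));
        return policy == ON_FAIL_ERROR ? -1 : 0;
    }

    while (readdir(dir) != NULL)
        nentries++;

    closedir(dir);
    return nentries;
}
```

With ON_FAIL_LOG a caller that loops over the result just sees zero entries
and moves on, which is the case where an unlogged relation never gets reset.
With ON_FAIL_ERROR the caller has to check for -1 and stop, which is roughly
the startup-failure behavior proposed above.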

Regards,
-- 
-David
da...@pgmasters.net
