> t...@sss.pgh.pa.us wrote:
> 
>> david.g.johns...@gmail.com wrote:
>> 
>> So I found where this difference in behavior is at least explicitly noted:
>> 
>> /*
>> * If it's a named composite type (or domain over one), find the typcache
>> * entry and record the current tupdesc ID, so we can detect changes
>> * (including drops). We don't currently support on-the-fly replacement
>> * of non-composite types, else we might want to do this for them too.
>> */
> 
> I'm not quite sure that that's related, really. That code is concerned with
> detecting changes to an already-identified type (that is, type OID NNN has
> different details now than it did before). It seemed to me that Bryn's
> question was more about reacting to cases where a given string of source code
> would resolve to a different type OID than it did a moment ago. We don't have
> a great story on that, I'll agree. You can get into that sort of problem
> without anywhere near the amount of complexity embodied in this example ---
> for instance, I'm pretty sure we don't re-parse type references just because
> somebody else executed an ALTER TYPE RENAME somewhere.
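
For reference, the rename case Tom mentions can be set up along these lines.
The names here are made up, and the outcome is as he describes it: a session
that has already compiled "s.g()" keeps using the type it originally resolved
rather than re-parsing the reference.

create type s.original as (a int, b int);

create function s.g()
  returns text
  language plpgsql
as $body$
declare
  v s.original := (1, 2)::s.original;
begin
  return v::text;
end;
$body$;

select s.g();   -- first call: compiles s.g() and resolves "s.original"

alter type s.original rename to retired;
create type s.original as (a int, b int, c int);

select s.g();   -- same warm session: still the originally resolved type, per Tom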

I tried a new test, inspired by what Tom wrote:

create table s.t(k int primary key, c1 int, c2 int, c3 int);
insert into s.t(k, c1, c2, c3) values(1, 17, 42, 57);
create type s.x as (c1 int, c2 int, c3 int);

create function s.f()
  returns text
  security definer
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  r s.x;
begin
  r := (select (a.c1, a.c2, a.c3)::s.x from s.t as a where a.k = 1);
  return r::text;
end;
$body$;

select s.f();

It produced the expected result:

 (17,42,57)

Then I did this (still in the same session):

alter type s.x drop attribute c3 cascade;
select s.f();

It produced this new result (with no reported error):

 (17,42)

Then I reconnected as the same user to the same database to force a fresh 
analysis of the source code of "s.f()" on its first execution:

\c - :the_user
select s.f();

Now I got a 42846 error, "cannot cast type record to s.x", with the detail 
"Input has too many columns".

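Getting "s.f()" back to a working state needs the function body itself to be 
patched to match the new shape of "s.x", for example:

create or replace function s.f()
  returns text
  security definer
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  r s.x;
begin
  r := (select (a.c1, a.c2)::s.x from s.t as a where a.k = 1);
  return r::text;
end;
$body$;

select s.f();

Now the result is sane again, showing just the two remaining attributes:

 (17,42)
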
Here's my conclusion for the scenario where PL/pgSQL subprograms are among the 
objects that your client-side app uses. It's rather obvious.

(1) If you do any DDLs that affect any of the objects that an application uses, 
then you should exit all of the client sessions (for most apps this presumably 
means stopping the connection pool) before you do the patching. The reasoning is 
simple: a few spot tests show how things can go wrong if you don't do this, and 
there's no doc to tell you what, if any, DDLs you might safely do without 
stopping all but the session(s) that do the patching. (A sketch for checking 
that only the patching session remains is after point (3) below.)

(2) You have to take full responsibility for the impact analysis so that, 
during the window when only the session(s) doing the patching are active, you 
can make all the changes needed to take you from the pre-patch mutually 
consistent state of all objects to the new post-patch mutually consistent 
state. Native PG doesn't provide much metadata or tooling to help you here 
(the second sketch after point (3) shows about the best it can do). You need 
your own reliable, human-written external documentation of your system.

(3) The same general thinking extends to client-side code. Carefully specified 
and executed testing, using a dedicated and realistic test environment, is 
critical.
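
Here are the two sketches that points (1) and (2) refer to.

For point (1): confirming (and, if need be, enforcing) that only the patching 
session is connected can be done with the stock monitoring functions. This is 
only a rough sketch, to be run by a role that is allowed to signal other 
backends; adapt the filters to your own setup.

select pid, usename, application_name, state
from pg_stat_activity
where datname = current_database()
and   backend_type = 'client backend'
and   pid <> pg_backend_pid();

-- If the pool really is stopped, that returns no rows. Any stragglers can be
-- kicked out explicitly:

select pg_terminate_backend(pid)
from pg_stat_activity
where datname = current_database()
and   backend_type = 'client backend'
and   pid <> pg_backend_pid();

For point (2): pg_depend does record some type dependencies (table columns, 
function signatures), but a type that's referenced only inside a PL/pgSQL 
body, like "s.x" inside "s.f()" above, isn't tracked at all. About the best 
the catalogs offer is crude text matching against the stored source. This too 
is only a rough sketch: it can miss dynamic SQL and give false positives.

select p.oid::regprocedure as routine
from   pg_proc as p
join   pg_namespace as n on n.oid = p.pronamespace
where  n.nspname not in ('pg_catalog', 'information_schema')
and    p.prosrc ilike '%s.x%';

Hence the need for your own reliable external doc.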
