On Mon, Oct 12, 2015 at 10:15 PM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
> On 10/12/2015 06:53 AM, Tom Lane wrote: > >> Andres Freund <and...@anarazel.de> writes: >> >>> On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote: >>> >>>> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS >>>> $$ >>>> BEGIN >>>> RETURN QUERY SELECT * FROM users WHERE id = id_; >>>> END; >>>> $$ LANGUAGE plpgsql; >>>> >>> >> My guess is that the problem here is that table level locking prevents >>> modification of the "users" type when the table is used, but there's no >>> locking preventing the columns to be dropped while the function is >>> used. So what happens is that 1) the function is parsed & planned 2) >>> DROP COLUMN is executed 3) the contained statement is executed 4) a >>> mismatch between the contained statement and the function definition is >>> detected. >>> >> >> The query plan as such does get refreshed, I believe. The problem is that >> plpgsql has no provision for the definition of a named composite type to >> change after a function's been parsed. This applies to variables of named >> composite types for sure, and based on this example I think it must apply >> to the function result type as well, though I'm too lazy to go check the >> code right now. >> > > That makes sense. The problem is that I cannot square that with Albe's > example, which I tested also: > > " > Session 1: > > test=> CREATE TABLE users (id integer PRIMARY KEY, name varchar NOT NULL, > to_be_removed integer NOT NULL); > CREATE TABLE > test=> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS > $$BEGIN RETURN QUERY SELECT * FROM users WHERE id = id_; END;$$ > LANGUAGE plpgsql; > CREATE FUNCTION > > Session 2: > > test=> SELECT id, name FROM select_users(18); > id | name > ----+------ > (0 rows) > > Ok, now the plan is cached. > > Now in Session 1: > > test=> ALTER TABLE users DROP COLUMN to_be_removed; > ALTER TABLE > > Session2: > > test=> SELECT id, name FROM select_users(18); > id | name > ----+------ > (0 rows) > > No error. This is 9.4.4. > " > > >> We have had past discussions about fixing this. I believe it would >> require getting rid of use of plpgsql's "row" infrastructure for named >> composites, at least in most cases, and going over to the "record" >> infrastructure instead. In the past the conversations have stalled as >> soon as somebody complained that that would probably make some operations >> slower. I don't entirely understand that objection, since (a) some other >> operations would probably get faster, and (b) performance does not trump >> correctness. But that's where the discussion stands at the moment. >> >> regards, tom lane >> >> >> > > -- > Adrian Klaver > adrian.kla...@aklaver.com > In case any of you are interested of recreating this problem, I today had the time to create a short example that reproduce the error every time I try. 1. Create table and function create table a( id serial primary key, x integer ); create or replace function select_a() returns setof a AS $$ begin return query select a.* from a; end; $$ language plpgsql; 2. Create loop_alter.sql with this content #!/usr/bin/env bash for i in {0..1000}; do echo "alter table a add column y text; alter table a drop column y;" done; 3. Create loop_select.sql with this content #!/usr/bin/env bash for i in {0..100000} do echo "select * from select_a() limit 1;" done; 4. Run the files from 2 and 3 simultaneous with psql: In one terminal: ./loop_alter.sql | psql In another: ./loop_select.sql | psql (Note that you need to drop and recreate the table after each run since it reaches the column limit otherwise) /Victor