limits, indexes, views and query planner

2022-08-15 Thread Marc Mamin

hello,
in the example below, we can see that the view test_ab prevents the use of 
the index to retrieve the last few rows.
This is a special case, as the WHERE clause excludes all data from the second 
table, and the EXPLAIN output does not reference it at all.
I wonder whether the planner could exclude table_b earlier in its 
plan and possibly fall back to a plan equivalent to the first one.
With a view on a single table (test_av), the index is used.

An oddity in the plan is the expected row count in the "Append" node (11) ...

(tested on PostgreSQL 14)

Regards,
Marc Mamin


create temp table table_a as (select * from generate_series(1,10)x);
create temp table table_b as (select * from generate_series(1,10)x);
create index i_a on table_a (x);
create index i_b on table_b (x);

analyze table_a;
analyze table_b;

CREATE VIEW test_ab AS

  select 'a' as src, x from table_a
  UNION
  select 'b' as src, x from table_b
;


explain analyze select * from table_a order by x desc limit 10;
Limit  (cost=0.29..0.60 rows=10 width=4) (actual time=0.056..0.060 rows=10 loops=1)
   ->  Index Only Scan Backward using i_a on table_a  (cost=0.29..3050.29 rows=10 width=4) (actual time=0.055..0.058 rows=10 loops=1)
         Heap Fetches: 10


explain analyze select * from test_ab where src='a' order by x desc limit 10;

Limit  (cost=17895.92..17895.94 rows=10 width=36) (actual time=89.678..89.681 rows=10 loops=1)
  ->  Sort  (cost=17895.92..18145.92 rows=11 width=36) (actual time=89.677..89.679 rows=10 loops=1)
        Sort Key: table_a.x DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Unique  (cost=13984.92..14734.92 rows=11 width=36) (actual time=47.684..75.574 rows=10 loops=1)
              ->  Sort  (cost=13984.92..14234.92 rows=11 width=36) (actual time=47.682..60.869 rows=10 loops=1)
                    Sort Key: ('a'::text), table_a.x
                    Sort Method: external merge  Disk: 1768kB
                    ->  Append  (cost=0.00..2943.01 rows=11 width=36) (actual time=0.012..21.268 rows=10 loops=1)
                          ->  Seq Scan on table_a  (cost=0.00..1443.00 rows=10 width=36) (actual time=0.011..14.078 rows=10 loops=1)
                          ->  Result  (cost=0.00..0.00 rows=0 width=36) (actual time=0.001..0.002 rows=0 loops=1)
                                One-Time Filter: false
Planning Time: 0.107 ms
Execution Time: 90.139 ms

CREATE VIEW test_av AS
  select 'a' as src, x from table_a;

explain analyze select * from test_av order by x desc limit 10;
 ->  Index Only Scan Backward using i_a on table_a  (cost=0.29..3050.29 rows=10 width=36) (actual time=0.017..0.019 rows=10 loops=1)
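Not part of the original thread, but worth noting as a possible workaround: the UNION in test_ab implies a DISTINCT over the combined result, which is what forces the Sort/Unique above the Append and blocks the backward index scan. A sketch of a variant the planner can optimize, assuming duplicates between the two branches are acceptable or cannot occur (view name is a placeholder):

```sql
-- Sketch (not from the thread): with UNION ALL there is no implicit
-- de-duplication, so the planner can push the ORDER BY / LIMIT down
-- into each branch and use the backward index scans.
CREATE VIEW test_ab_all AS
  SELECT 'a' AS src, x FROM table_a
  UNION ALL
  SELECT 'b' AS src, x FROM table_b;

EXPLAIN SELECT * FROM test_ab_all WHERE src = 'a' ORDER BY x DESC LIMIT 10;
```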



removing "serial" from table definitions.

2021-06-24 Thread Marc Mamin
Hi,
Is there a way to change a data type from serial to int?
I tried:
  ALTER TABLE "admin".db_jobs ALTER COLUMN id TYPE int USING id::int;
But this does not seem to change anything, as if Postgres considered the 
statement a no-op.

My problem is that "serial" is not exported by pg_dump.
Creating a database from the dump will hence result in a different table definition 
(which is equivalent, though).

We are trying a transfer/migration tool on Azure that checks the table 
definitions between the source and target before starting the data transfer, 
and it blocks on that difference.

best regards,
Marc Mamin



RE: removing "serial" from table definitions.

2021-06-24 Thread Marc Mamin
From: Joe Conway [mailto:m...@joeconway.com] 
>Sent: Donnerstag, 24. Juni 2021 14:47
>To: Marc Mamin ; pgsql-general 
>
>Subject: Re: removing "serial" from table definitions.
>
>On 6/24/21 8:33 AM, Marc Mamin wrote:
>> Hi,
>> 
>> Is there  a way to change a data type from serial to int?
>> 
>> I tried with :
>> 
>>ALTER TABLE "admin".db_jobs ALTER COLUMN id TYPE int USING id::int;
>> 
>> But this seems not to change anything, as if Posgres woud consider the 
>> statement as a no-op.
>
>serial is not an actual data type -- it is essentially an integer with a 
>default and an automatically created sequence. See:
>
>https://www.postgresql.org/docs/13/datatype-numeric.html#DATATYPE-SERIAL

Yes, I understand that serial is just a hint at table creation time, but is 
there a place in the catalog where we can see whether the table was created 
using 'serial'?
If yes, I'm looking for a way to remove that.

Another cause of my problem may lie in the way the default value 
information is stored in pg_attrdef.
The difference we see between the source and target database is that a schema 
prefix is displayed with the sequence on one side and not on the other.
I'm not sure yet whether this really comes directly from the catalog or from the 
way the client reads the table definition, perhaps along with some search_path 
differences.
Were there any changes in this area between PG 9.6 and PG 11?

example:

The Default value of column 'id' in table 'db_jobs_history' in database 
'oms_db' is different on source and target servers. 
It's 'nextval('admin.db_jobs_history_id_seq'::regclass)' on source 
 and 'nextval('db_jobs_history_id_seq'::regclass)' on target.
 
 Thanks
 
 Marc
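For context (an addition of mine, not from the thread): "serial" expands at table-creation time to an integer column plus an owned sequence and a nextval() default, and the catalog keeps no other trace of it. A sketch of the expansion and of how to detach the sequence again (table and column names here are placeholders):

```sql
-- What "id serial" expands to when the table is created:
CREATE SEQUENCE tbl_id_seq;
CREATE TABLE tbl (
    id integer NOT NULL DEFAULT nextval('tbl_id_seq'::regclass)
);
ALTER SEQUENCE tbl_id_seq OWNED BY tbl.id;

-- Removing the "serial-ness" afterwards:
ALTER TABLE tbl ALTER COLUMN id DROP DEFAULT;   -- drop the nextval() default
ALTER SEQUENCE tbl_id_seq OWNED BY NONE;        -- break the ownership link
```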


RE: removing "serial" from table definitions.

2021-06-24 Thread Marc Mamin
>serial is not an actual data type -- it is essentially an integer with a 
>default and an automatically created sequence. See:
>>
>>https://www.postgresql.org/docs/13/datatype-numeric.html#DATATYPE-SERIAL
>
>Yes, I undersand that serial is just a hint at table creation time, but is 
>there a place in catalog where we can see if the table was created using 
>'serial' ?
>if yes, I'm looking for a way to remove that.
>
>Another cause for my problem may be in the way how the default value 
>information is stored in pg_attrdef.
>The difference we see between the source and target database is that a schema 
>prefix is displayed with the sequence on one side, and not on the other..
>I'm not sure yet if this really come directly from the catalog or from the way 
>how the client read the table definition, maybe along with some search_path 
>differences Were there any change in this area between PG 9.6 and PG 11 ?
>
>example:
>
>The Default value of column 'id' in table 'db_jobs_history' in database 
>'oms_db' is different on source and target servers. 
>It's 'nextval('admin.db_jobs_history_id_seq'::regclass)' on source  and 
>'nextval('db_jobs_history_id_seq'::regclass)' on target.

I've probably found the origin of our problem:

https://www.postgresql.org/docs/9.3/release-8-1.html
 => Add proper dependencies for arguments of sequence functions (Tom)
 
But I won't be able to check that in the next few days.

best regards,
marc Mamin


schema prefixes in default values (was RE: removing "serial" from table definitions).

2021-06-28 Thread Marc Mamin
Marc Mamin  writes:
> > Yes, I undersand that serial is just a hint at table creation time, but is 
> > there a place in catalog where we can see if the table was created using 
> > 'serial' ?
> 
> No.  Where the docs say "these are equivalent", they mean that very literally.
> 
> > The difference we see between the source and target database is that a 
> > schema prefix is displayed with the sequence on one side, and not on the 
> > other..
> 
> This likely has to do with the search_path settings being different in the 
> sessions inspecting the two DBs.  I do not think it is related to serial-ness 
> at all, it's just the normal behavior of regclass_out for the OID constant 
> that's the argument of nextval().
> 
>   regards, tom lane

Hello,
it seems that our problem had nothing to do with serial, but with the way 
schema prefixes are handled in column default values.

pg_attrdef.adsrc:
  Filled when the default value is defined. Contains a schema prefix only when 
required at creation time. Is constant afterwards.

pg_get_expr(adbin, adrelid):
  The returned expression is dynamic: the schema prefix is included only when 
the sequence's schema is not part of the current search_path.

This behavior is understandable, but it makes it awkward to compare table 
definitions between different sources.
Moreover, a pg_dump/restore might in some cases modify the value of 
pg_attrdef.adsrc.

best regards,

Marc Mamin

   
As a test:

 set search_path='admin';

 create table foo1 (n1 serial);

 set search_path='oms';

 create table admin.foo2 (n2 serial);

 SELECT a.attname, ad.adsrc, pg_get_expr(adbin, adrelid)
 FROM pg_attribute a
 JOIN pg_attrdef ad ON (a.attnum = ad.adnum AND a.attrelid = ad.adrelid)
 WHERE a.attrelid IN (SELECT oid FROM pg_class WHERE relname IN ('foo1','foo2'));

 n1   nextval('foo1_n1_seq'::regclass)         nextval('admin.foo1_n1_seq'::regclass)
 n2   nextval('admin.foo2_n2_seq'::regclass)   nextval('admin.foo2_n2_seq'::regclass)
 

 set search_path='admin';

 SELECT a.attname, ad.adsrc, pg_get_expr(adbin, adrelid)
 FROM pg_attribute a
 JOIN pg_attrdef ad ON (a.attnum = ad.adnum AND a.attrelid = ad.adrelid)
 WHERE a.attrelid IN (SELECT oid FROM pg_class WHERE relname IN ('foo1','foo2'));
 
 n1   nextval('foo1_n1_seq'::regclass)         nextval('foo1_n1_seq'::regclass)
 n2   nextval('admin.foo2_n2_seq'::regclass)   nextval('foo2_n2_seq'::regclass)
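A possible workaround for the comparison problem (my own suggestion, not from the thread): pin the search_path to the same value in both inspecting sessions before reading the definitions, so pg_get_expr() renders the sequence name identically on both sides:

```sql
-- With an empty (or identical) search_path on both servers, pg_get_expr()
-- always schema-qualifies the sequence, making the defaults comparable.
SET search_path = '';
SELECT a.attname, pg_get_expr(ad.adbin, ad.adrelid) AS default_expr
FROM pg_attribute a
JOIN pg_attrdef ad ON a.attnum = ad.adnum AND a.attrelid = ad.adrelid
WHERE a.attrelid = 'admin.foo2'::regclass;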




What is the best way to redefine a trigger? (lock issue)

2021-12-10 Thread Marc Mamin
What is the best way to redefine a trigger? (lock issue)


Hello,
I have deployment/migration scripts that need to be idempotent.

When (re)defining or deleting triggers, I've lately observed locked statements 
that seemed never to release (I waited for a few hours).
Affected version: PG 10 (and probably PG 12?)

My case is similar to that old description and I wonder whether the recommendation 
to first change the trigger function to a no-op function still makes sense.

https://stackoverflow.com/questions/24738354/how-to-drop-a-trigger-in-a-resilient-manner-in-postgresql.


In the first observed case, on a test db, I killed all existing 
connections to the db and tried to drop the trigger from a fresh new connection.
This again resulted in a long-lasting lock, and I gave up, suspecting db 
corruption.

What happens in the background that can make a trigger deletion fail? 
Are there situations where row-level locks are acquired instead of table-level 
locks?
Could background processes like vacuum play a role here?

As I've observed this problem only a very few times, I guess it is not easily 
reproducible.

Attached is a picture of pg_stat_activity during such a lock.

thanks,
Marc Mamin

Here is an example of such a deployment/migration script; all of these scripts 
are applied sequentially in separate transactions:
===
SET client_min_messages=error;

CREATE OR REPLACE FUNCTION block_item_cancel()
  RETURNS TRIGGER AS
$BODY$
DECLARE
blockedItemLevel int;
client int;

BEGIN
    WITH RECURSIVE rec AS
    (
        SELECT s.id AS clientref, s."parentRef", a."fruitRef"
        FROM "ClientDO" s
        LEFT JOIN "Fruit2ClientDO" a
          ON (s.id = a."clientRef" AND a."fruitRef" = NEW."fruitRef")
        WHERE s.id = (SELECT "clientRef" FROM "SeenDO" WHERE "id" = NEW."SeenRef")
        UNION ALL
        SELECT s2.id AS clientref, s2."parentRef", a2."fruitRef"
        FROM rec
        JOIN "ClientDO" s2 ON (s2.id = rec."parentRef")
        LEFT JOIN LATERAL (SELECT "fruitRef" FROM "Fruit2ClientDO" ax
                           WHERE rec."parentRef" = ax."clientRef"
                             AND ax."fruitRef" = NEW."fruitRef") a2
          ON TRUE
        WHERE rec."parentRef" IS NOT NULL
        --Only first matching client should be used
        AND rec."fruitRef" IS NULL
    )
    SELECT clientref
    FROM rec
    WHERE "fruitRef" IS NOT NULL
    INTO client;

blockedItemLevel = (NEW."quantitySeened" - NEW."quantityCanceled");
IF blockedItemLevel > 0 THEN

UPDATE "BlockedItemAO" SET
"blockedItem" = blockedItemLevel,
"modificationDate" = now()
WHERE "SeenPosRef" = NEW."id";
ELSE
DELETE FROM "BlockedItemAO" WHERE "SeenPosRef" = NEW."id";
END IF;
RETURN NEW;
END;
$BODY$
  LANGUAGE plpgsql
  COST 100;


DROP TRIGGER IF EXISTS block_item_cancel ON "SeenPosDO";

CREATE TRIGGER block_item_cancel
  AFTER UPDATE OF "quantityCanceled"
  ON "SeenPosDO"
  FOR EACH ROW
  WHEN ( NEW."providerRef" <> 1
 AND 
   (
  NEW."quantityCanceled" IS DISTINCT FROM OLD."quantityCanceled"
  
   )
   )
  EXECUTE PROCEDURE block_item_cancel();
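One defensive pattern for such scripts (my assumption, not something discussed in the thread): set lock_timeout so a DROP TRIGGER that cannot acquire its AccessExclusiveLock fails fast, instead of queueing behind long-running sessions for hours:

```sql
-- Fail fast instead of waiting indefinitely for the AccessExclusiveLock.
SET lock_timeout = '5s';
DROP TRIGGER IF EXISTS block_item_cancel ON "SeenPosDO";
-- ... recreate the trigger here, then restore the default:
RESET lock_timeout;
```

If the timeout fires, the migration can be retried in a quieter window rather than stalling the whole deployment.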




RE: What is the best way to redefine a trigger? (lock issue)

2021-12-10 Thread Marc Mamin


   >> Hello,
   >> I have deployment/migration scripts that require to be idempotent.
   >>
   >> When (re)defining or deleting triggers,  I've lately observed locked 
statements that seemed never to release (waited for a few hours).
   >> affected version: PG 10 (and probably PG 12 ?)
   >>
   >> My case is similar to that old description and I wonder if the 
recommendation to first change the trigger function to a no-op function still 
make sense.
   >>
   >> 
https://stackoverflow.com/questions/24738354/how-to-drop-a-trigger-in-a-resilient-manner-in-postgresql.
   >>
   >>
   >> In the first observed case, with a test db, I did kill all  existing 
connections to the db and tried to drop the trigger with a fresh new connection.
   >> This again resulted in a long lasting lock and I gave up, tipping on a db 
corruption.
   >>
   >> What does happen in the background, that can make a trigger deletion fail?
   >A DROP TRIGGER will try to acquire an AccessExclusiveLock on the table, 
which conflicts with any table-level lock (e.g. a SELECT acquires an access 
share lock), so it would cause the DROP TRIGGER to wait.
   >
   >Unfortunately I don't see this in the official docs: 
https://www.postgresql.org/docs/11/explicit-locking.html .
   >
   >> Are there situation where row level locks instead of table level locks 
are acquired?
   >> Coul background processeslike vacuumplay a role here?
   >>
   >> As I've observed this problem only a very few times, I guess it is not 
easily reproducable.
 
   >It is very easily reproducible. begin; select .. in one session, begin; 
drop trigger in a second session.

You can see in the attachment that the lock exists without any other apparent 
conflicting session.

   >Do you need to drop/create the trigger or a CREATE OR REPLACE function 
would suffice?

There are different use cases. Sometimes I only need to drop a trigger or 
modify its definition (not the function).


   >>
   >> attached is an picture of pg_stat_activity during such a lock,
   >>
   >> thanks,
   >> Marc Mamin
   >>
   >> here an example of a such a deployment/migration script, all of these 
scripts are applied  sequentially in separate transactions:
   >> ===
   >> SET client_min_messages=error;
   >>
   >> CREATE OR REPLACE FUNCTION block_item_cancel()
   >>RETURNS TRIGGER AS
   >> $BODY$
   >> DECLARE
   >>   blockedItemLevel int;
   >>   client int;
   >>
   >> BEGIN
   >>   WITH RECURSIVE rec as
   >>   (
   >>   SELECT s.id as clientref, s."parentRef", 
a."fruitRef"
   >>   FROM "ClientDO" s LEFT JOIN "Fruit2ClientDO" a 
ON (s.id=a."clientRef" and a."fruitRef" = NEW."fruitRef")
   >>   WHERE s.id = (select "clientRef" from "SeenDO" 
where "id" = NEW."SeenRef")
   >>   UNION ALL
   >>   SELECT s2.id as clientref, s2."parentRef", 
a2."fruitRef"
   >>   FROM rec
   >>   JOIN "ClientDO" s2 on (s2.id=rec."parentRef")
   >>   LEFT JOIN LATERAL (select"fruitRef" from 
"Fruit2ClientDO" ax WHERE rec."parentRef"=ax."clientRef" and ax."fruitRef" = 
NEW."fruitRef")  a2
   >>  ON TRUE
   >>   WHERE rec."parentRef" IS NOT NULL
   >>   --Only first matching client should be used
   >>   AND rec."fruitRef" IS NULL
   >>   )
   >>   SELECT clientref
   >>   FROM rec
   >>   WHERE "fruitRef" is not null
   >>   INTO client;
   >>
   >>   blockedItemLevel = (NEW."quantitySeened" - NEW."quantityCanceled");
   >>   IF blockedItemLevel > 0 THEN
   >>
   >>   UPDATE "BlockedItemAO" SET
   >>   "blockedItem" = blockedItemLevel,
   >>   "modificationDate" = now()
   >>   WHERE "SeenPosRef" = NEW."id";
   >>   ELSE
   >>   DELETE FROM "BlockedItemAO" WHERE "SeenPosRef" = NEW."id";
   >>   END IF;
   >>   RETURN NEW;
   >> END;
   >> $BODY$
   >>LANGUAGE plpgsql
   >>COST 100;
   >>
   >>
   >> DROP TRIGGER IF EXISTS block_item_cancel ON "SeenPosDO";
   >>
   >> CREATE TRIGGER block_item_cancel
   >>AFTER UPDATE OF "quantityCanceled"
   >>ON "SeenPosDO"
   >>FOR EACH ROW
   >>WHEN ( NEW."providerRef" <> 1
   >>   AND
   >> (
   >>NEW."quantityCanceled" IS DISTINCT FROM 
OLD."quantityCanceled"
   >> )
   >> )
   >>EXECUTE PROCEDURE block_item_cancel();
   >>
   >>
   >
   >
   >-- 
   >Achilleas Mantzios
   >DBA, Analyst, IT Lead
   >IT DEPT
   >Dynacom Tankers Mgmt
   >
   >
   >
   >


RE: What is the best way to redefine a trigger? (lock issue)

2021-12-10 Thread Marc Mamin

   >
   >-Original Message-
   >From: Achilleas Mantzios  
   >Sent: Freitag, 10. Dezember 2021 11:36
   >To: pgsql-general@lists.postgresql.org
   >Subject: Re: What is the best way to redefine a trigger? (lock issue)
   >
   >On 10/12/21 12:20 μ.μ., Marc Mamin wrote:
   >>
   >> >> Hello,
   >> >> I have deployment/migration scripts that require to be idempotent.
   >> >>
   >> >> When (re)defining or deleting triggers,  I've lately observed 
locked statements that seemed never to release (waited for a few hours).
   >> >> affected version: PG 10 (and probably PG 12 ?)
   >> >>
   >> >> My case is similar to that old description and I wonder if the 
recommendation to first change the trigger function to a no-op function still 
make sense.
   >> >>
   >> >> 
https://stackoverflow.com/questions/24738354/how-to-drop-a-trigger-in-a-resilient-manner-in-postgresql.
   >> >>
   >> >>
   >> >> In the first observed case, with a test db, I did kill all  
existing connections to the db and tried to drop the trigger with a fresh new 
connection.
   >> >> This again resulted in a long lasting lock and I gave up, tipping 
on a db corruption.
   >> >>
   >> >> What does happen in the background, that can make a trigger 
deletion fail?
   >> >A DROP TRIGGER will try to acquire an AccessExclusiveLock on the 
table, which conflicts with any table level lock (e.g a select acquires an 
access share lock, so it would cause the DROP TRIGGER to wait.
   >> >
   >> >Unfortunately I don't see this in the official docs: 
https://www.postgresql.org/docs/11/explicit-locking.html .
   >> >
   >> >> Are there situation where row level locks instead of table level 
locks are acquired?
   >> >> Coul background processeslike vacuumplay a role here?
   >> >>
   >> >> As I've observed this problem only a very few times, I guess it is 
not easily reproducable.
   >>   
   >> >It is very easily reproducible. begin; select .. in one session, 
begin; drop trigger in a second session.
   >>
   >> You can see in the attachment, that the lock exists without any other 
apparent conflicting  session.
   >It takes two or more to tango. Next time it happens query the pg_locks 
view, it contains info about locks on objects.
   >The holding lock is shown as granted, the waiting lock as not granted.
   
Yes, it looks to me like a situation that should not be possible at all.
I'll try to gather more information if the problem pops up again.
Thanks.
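A query along these lines should show the blocker next time it happens (a generic sketch on my part; pg_blocking_pids() is available from PostgreSQL 9.6 on):

```sql
-- For each waiting backend, show who blocks it and what both are running.
SELECT w.pid   AS waiting_pid,
       w.query AS waiting_query,
       b.pid   AS blocking_pid,
       b.query AS blocking_query
FROM pg_stat_activity w
JOIN LATERAL unnest(pg_blocking_pids(w.pid)) AS blocker(pid) ON TRUE
JOIN pg_stat_activity b ON b.pid = blocker.pid;
```

Unlike a plain look at pg_stat_activity, this resolves the actual lock-queue dependency, including blockers that appear idle in a transaction.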
   
   >> >Do you need to drop/create the trigger or a CREATE OR REPLACE 
function would suffice?
   >>
   >> There are different use cases. Sometimes I only need to drop a trigger or 
modify its definition (not the function)
   >Are you using pgbouncer or some other means of suspending traffic into the 
DB?

No. In my attempts to repair the situation, there were no clients other than 
psql and DBeaver.

   >> >>
   >> >> attached is an picture of pg_stat_activity during such a lock,
   >> >>
   >> >> thanks,
   >> >> Marc Mamin
   >> >>
   >> >> here an example of a such a deployment/migration script, all of 
these scripts are applied  sequentially in separate transactions:
   >> >> ===
   >> >> SET client_min_messages=error;
   >> >>
   >> >> CREATE OR REPLACE FUNCTION block_item_cancel()
   >> >>RETURNS TRIGGER AS
   >> >> $BODY$
   >> >> DECLARE
   >> >>blockedItemLevel int;
   >> >>client int;
   >> >>
   >> >> BEGIN
   >> >>WITH RECURSIVE rec as
   >> >>(
   >> >>SELECT s.id as clientref, 
s."parentRef", a."fruitRef"
   >> >>FROM "ClientDO" s LEFT JOIN 
"Fruit2ClientDO" a ON (s.id=a."clientRef" and a."fruitRef" = NEW."fruitRef")
   >> >>WHERE s.id = (select "clientRef" from 
"SeenDO" where "id" = NEW."SeenRef")
   >> >>UNION ALL
   >> >>SELECT s2.id as clientref, 
s2."parentRef", a2."fruitRef"
   >> >>