Re: Locks analysis after-the-fact

2018-05-01 Thread Merlin Moncure
On Fri, Apr 27, 2018 at 2:55 AM, Olivier Macchioni
 wrote:
> Hello all,
>
> PostgreSQL version 9.4.17
>
> We have a number of queries running on the same DB from many systems. Among 
> other things, we need to INSERT / UPDATE on a table based on external events 
> - this is done via a pool of 5 SQL connections which process an average of 
> 500 events / second. The table is used by other systems as well...
>
> We have more and more locks on this table, which prevent it from being 
> updated as it should - and I'm trying to see how to improve things without 
> setting up an external queue system just to manage those events.
>
> I've enabled "log_lock_waits = on" in the configuration, which generates the 
> following kind of log entries:
>
> 2018-04-27 07:48:07 CEST [1419.269] "[unknown]" xx@yy LOG:  process 1419 
> still waiting for ExclusiveLock on advisory lock 
> [16417,1653339453,2672182422,2] after 1000.166 ms
> 2018-04-27 07:48:07 CEST [1419.270] "[unknown]" xx@yy DETAIL:  Process 
> holding the lock: 16418. Wait queue: 1419.
> 2018-04-27 07:48:07 CEST [1419.271] "[unknown]" xx@yy STATEMENT:  SELECT 
> pg_advisory_xact_lock(1653339453,  -1622784874);
> 2018-04-27 07:48:07 CEST [1419.272] "[unknown]" xx@yy LOG:  process 1419 
> acquired ExclusiveLock on advisory lock [16417,1653339453,2672182422,2] after 
> 1318.721 ms
> 2018-04-27 07:48:07 CEST [1419.273] "[unknown]" xx@yy STATEMENT:  SELECT 
> pg_advisory_xact_lock(1653339453,  -1622784874);
>
> I can easily identify the session owner of the different processes, but the 
> process holding the lock belongs to an ERP which generates a lot of different 
> kinds of queries - I'd like to identify precisely which one is causing this 
> long-lasting lock.
>
> It doesn't look like this is possible via the PostgreSQL logging features, 
> and doing the usual lock troubleshooting "on the spot" using pg_locks is not 
> practical due to the short duration and odd timing of those locks.
>
> Does anyone have an idea on how to proceed in such a case?

Here, the locking query is obvious: "SELECT pg_advisory_xact_lock".
If this is an ERP application, I bet advisory locks are being used to
simulate pessimistic locking in some way. Unfortunately that is no
help, as the real operation is elsewhere in the transaction. Echoing
the advice above, you need to enable full statement logging and make
sure your log_line_prefix contains the pid in some easily searched-for
way, such as [%p]. Once you have that, you can trace back through the
log based on the locking pid and get to the bottom of the situation
pretty easily.

Be mindful of statement logging: it can consume disk space really
quickly and add a lot of overhead to the database.

merlin



Surprised by index choice for count(*)

2018-05-01 Thread Rob Sargent
Should I be?  I would have thought the pk would have been chosen v. 
function index?


explain analyse select count(*) from bc.segment s;
QUERY PLAN
--------------------------------------------------------------------------------
 Finalize Aggregate  (cost=4428009.24..4428009.25 rows=1 width=8) (actual time=14786.395..14786.395 rows=1 loops=1)
   ->  Gather  (cost=4428009.03..4428009.24 rows=2 width=8) (actual time=14786.358..14786.386 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=4427009.03..4427009.04 rows=1 width=8) (actual time=14782.167..14782.167 rows=1 loops=3)
               ->  Parallel Index Only Scan using fpv on segment s  (cost=0.57..4210177.14 rows=86732753 width=0) (actual time=0.061..11352.855 rows=69386204 loops=3)
                     Heap Fetches: 1780
 Planning time: 0.221 ms
 Execution time: 14815.939 ms
(9 rows)

\d bc.segment
              Table "bc.segment"
     Column     |  Type   |      Modifiers
----------------+---------+---------------------
 id             | uuid    | not null
 chrom          | integer | not null
 markerset_id   | uuid    | not null
 probandset_id  | uuid    | not null
 startbase      | integer | not null
 endbase        | integer | not null
 firstmarker    | integer | not null
 lastmarker     | integer | not null
 events_less    | bigint  | not null default 0
 events_equal   | bigint  | not null default 0
 events_greater | bigint  | not null default 0
Indexes:
    "segment_pkey" PRIMARY KEY, btree (id)
    "useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
    "fpv" btree (pv(events_less, events_equal, events_greater, 0))
    "segment_markerset_id_probandset_id_idx" btree (markerset_id, probandset_id)


create or replace function public.pv(l bigint, e bigint, g bigint, o int)
returns float
as
$$
select 1.0*(g+e+o)/(l+e+g+o)::float;
$$
language sql
;
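For reference, pv computes the fraction of "equal or greater" events (plus the offset o) over the total event count, e.g. (hypothetical values):

```sql
-- (g + e + o) / (l + e + g + o): with l=10, e=5, g=85, o=0
-- this is (85 + 5 + 0) / (10 + 5 + 85 + 0) = 90 / 100
SELECT pv(10, 5, 85, 0);  -- 0.9
```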





Re: Surprised by index choice for count(*)

2018-05-01 Thread David G. Johnston
On Tue, May 1, 2018 at 8:46 AM, Rob Sargent  wrote:

> Should I be?  I would have thought the pk would have been chosen v.
> function index?
> Indexes:
> "segment_pkey" PRIMARY KEY, btree (id)
> "useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
> "fpv" btree (pv(events_less, events_equal, events_greater, 0))
> "segment_markerset_id_probandset_id_idx" btree (markerset_id,
> probandset_id)
>

Haven't looked at the code, but among those 4 indexes the one that was
chosen is the one that comes first alphabetically. Ideally it would avoid
the multi-column indexes since they are larger; and I believe a float is
smaller than a uuid, so the fpv index wins not only alphabetically but on
size too.

That the index is functional is immaterial here.

David J.


Re: Surprised by index choice for count(*)

2018-05-01 Thread Tom Lane
Rob Sargent  writes:
> Should I be?  I would have thought the pk would have been chosen v. 
> function index?

If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size.  So scanning it is a lot cheaper, at least according to
the planner's cost model.

regards, tom lane
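One way to check the size hypothesis directly (a sketch, using the standard pg_relation_size/pg_size_pretty functions and the index names from the \d output above):

```sql
-- Compare the on-disk size of the PK index and the functional index
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS index_size
FROM pg_class
WHERE relname IN ('segment_pkey', 'fpv')
ORDER BY pg_relation_size(oid);
```

If the planner's cost model is right, fpv should come out roughly half the size of segment_pkey.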



Re: Surprised by index choice for count(*)

2018-05-01 Thread Rob Sargent
Thank you both.  Simple, as expected.  And I’m easily surprised.  
Version 10 (perhaps obviously) for those scoring at home.

> On May 1, 2018, at 10:11 AM, Tom Lane  wrote:
> 
> Rob Sargent  writes:
>> Should I be?  I would have thought the pk would have been chosen v. 
>> function index?
> 
> If I'm reading this correctly, the PK index contains uuids while
> the fpv index contains float4s, meaning the latter is probably half
> the size.  So scanning it is a lot cheaper, at least according to
> the planner's cost model.
> 
>   regards, tom lane




extension dependencies with 'requires'

2018-05-01 Thread Eric Hanson
Greetings,

I'm trying to author an extension and running into troubles with
dependencies.  The extension depends on the uuid-ossp, pgcrypto and
postgres_fdw extensions, but I would like the dependencies to be installed
in the public schema, though the extension itself lives in its own schema.
Is there a way to use CREATE EXTENSION ... CASCADE and specify in the
control file which schema the dependencies are installed in?  How does the
extension know where to install its dependencies?

Thanks,
Eric
--
Eric Hanson
CEO, Aquameta Labs
503-929-1073
www.aquameta.com



Re: extension dependencies with 'requires'

2018-05-01 Thread Tom Lane
Eric Hanson  writes:
> I'm trying to author an extension and running into troubles with
> dependencies.  The extension depends on the uuid-ossp, pgcrypto and
> postgres_fdw extensions, but I would like the dependencies to be installed
> in the public schema, though the extension itself lives in its own schema.
> Is there a way to use CREATE EXTENSION ... CASCADE and specify in the
> control file which schema the dependencies are installed in?

Afraid not.  CASCADE will pass down the same target-schema option to the
child CREATE EXTENSION operations that appeared in the initial command.

regards, tom lane
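In other words (a sketch; "my_ext" and "myschema" are hypothetical names), the schema named in the outer command is what the cascaded children receive:

```sql
-- Both my_ext and every extension listed in its control file's
-- "requires" line are targeted at schema "myschema":
CREATE EXTENSION my_ext SCHEMA myschema CASCADE;
```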



Re: extension dependencies with 'requires'

2018-05-01 Thread Eric Hanson
Ok thanks.

FWIW, this pretty profoundly limits what you can do with dependencies in
extensions that define their own schema: when two extensions that each
define their own schema need to share the same dependency, it sounds like
that is impossible, because the dependency extension can't be installed in
two schemas at the same time. Whichever extension is installed first will
create the dependency in its target schema, and the second extension will
then fail because the dependency is already installed.

Thanks,
Eric

--
Eric Hanson
CEO, Aquameta Labs
503-929-1073
www.aquameta.com


On Tue, May 1, 2018 at 11:00 AM, Tom Lane  wrote:

> Eric Hanson  writes:
> > I'm trying to author an extension and running into troubles with
> > dependencies.  The extension depends on the uuid-ossp, pgcrypto and
> > postgres_fdw extensions, but I would like the dependencies to be
> installed
> > in the public schema, though the extension itself lives in its own
> schema.
> > Is there a way to use CREATE EXTENSION ... CASCADE and specify in the
> > control file which schema the dependencies are installed in?
>
> Afraid not.  CASCADE will pass down the same target-schema option to the
> child CREATE EXTENSION operations that appeared in the initial command.
>
> regards, tom lane
>
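A possible workaround for the shared-dependency problem discussed above is to install the common dependencies explicitly before the dependent extensions, so CASCADE never has to create them (a sketch; "my_ext_a"/"my_ext_b" and their schemas are hypothetical, and each dependency must be relocatable for the SCHEMA option to take effect):

```sql
-- Pre-install the shared dependencies where everyone expects them...
CREATE EXTENSION IF NOT EXISTS pgcrypto     SCHEMA public;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"  SCHEMA public;
CREATE EXTENSION IF NOT EXISTS postgres_fdw SCHEMA public;

-- ...then install each dependent extension in its own schema; the
-- "requires" entries are already satisfied, so nothing cascades.
CREATE EXTENSION my_ext_a SCHEMA ext_a;
CREATE EXTENSION my_ext_b SCHEMA ext_b;
```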