Re: Bad estimates
Artur Zając wrote: > We have table created like this: > > CREATE TABLE xyz AS SELECT generate_series(1,1000,1) AS gs; > > Now: > > explain analyze select * from xyz where gs&1=1; > Seq Scan on xyz (cost=0.00..260815.38 rows=68920 width=4) > (actual time=0.044..2959.728 rows=500 loops=1) >Filter: ((gs & 1) = 1) >Rows Removed by Filter: 500 [...] > And one more clause: > > explain analyze select * from xyz where gs&1=1 and gs&2=2 and gs&4=4; > Seq Scan on xyz (cost=0.00..398655.62 rows=2 width=4) > (actual time=0.052..3329.422 rows=125 loops=1) >Filter: (((gs & 1) = 1) AND ((gs & 2) = 2) AND ((gs & 4) = 4)) >Rows Removed by Filter: 875 > As we can see estimates differs significally from the actual records count - > only three clauses are reducing estimated number of records from 1000 to > 2. > > I noticed that each additional clause reduces the number about 200 times and > define DEFAULT_NUM_DISTINCT is responsible for this behaviur. > > I think that this variable should be lower or maybe estimation using > DEFAULT_NUM_DISTTINCT should be done once per table. The problem is that the expression "gs & 1" is a black box for the optimizer; it cannot estimate how selective the condition is and falls back to a default value that is too low. You can create an index to a) improve the estimate and b) speed up the queries: CREATE INDEX ON xyz ((gs & 1), (gs & 2), (gs & 4)); Don't forget to ANALYZE afterwards. Yours, Laurenz Albe
Re: Half billion records in one table? RDS
Jean Baro wrote: > Hi there, > > We are creating a new DB which will behave most like a file system, > I mean, there will be no complex queries or joins running in the DB. > The idea is to grab the WHOLE set of messages for a particular user > and then filter, order, combine or full text search in the function itself > (AWS Lambda). > The maximum number of messages is limited to 1.000 messages per user. > So we expect Postgres to have an amazing performance for this scenario. > [...] > > Sizing and worst case scenario: > > · 500MM messages in the main DB > · 4K queries per second (by UserID) – Max time of 500ms per query. > Simples SELECT, > with no ORDER, WHERE OR GROUP BY. Just grab all the messages for a > particular user. MAX 1000 messages per USER. > · 1K inserts per second on average (So that in 1 hour we can insert > around 3MM messages) > · 1K deletes per second on average (So that in 1 hour we can remove > around 3MM messages) > > My question is: > Can we use any kind of compression for PostgreSQL which would result in > reduced IO and disk size? > We are not relying on any kind of table partitioning, is that the best > approach for this scenario? > Is PG on RDS capable of delivering this type of performance while requiring > low maintenance? > What about Auto Vacuum? Any suggestion how to optimize it for such a work load > (we will insert and delete millions of rows every day). It sounds like your JSON data, which are your chief concern, are not processed inside the database. For that, the type "json" is best. Such data are automatically stored in a compressed format if their size exceeds 2KB. The compression is not amazingly good, but fast. If your application removes data by deleting them from the table, partitioning won't help. It is useful if data get removed in bulk, e.g. if you want to delete all yesterday's data at once. The workload does not sound amazingly large, so I'd expect PostgreSQL to have no problems with it with decent storage and CPU power, but you'd have to test that. Tune autovacuum if it cannot keep up (tables get bloated). The first knob to twiddle is probably lowering "autovacuum_vacuum_cost_delay". Autovacuum might be your biggest problem (only guessing). Yours, Laurenz Albe
Re: vacuum after truncate
Mariel Cherkassky wrote: > Hi, > I have a big function that includes many truncates on different tables. > In the documentation is is written that truncates creates a new file > and resign the old filenode to the new filenode and the old file > (old data of the table) is deleted in commit. > > In order to execute my function I run psql -d 'aa' -U 'bb' -C "select > function()"; > > I have a few questions about it ? > > 1.When I start the function, it means that the whole function is one big > transaction right ? Right. > 2..Because the truncates I preform are part of a transaction it means that > only at the end > of the transaction that space will be freed ? Which mean only when the > function is finished? Exactly. The old file has to be retained, because there could be a ROLLBACK. > 3..Does running vacuum at the end of the function on the tables that were > truncated and > then populated with data will have any impact or is it better just to > analyze them ? For up-to-date statistics, ANALYZE is enough. If you want to set hint bits so that the first reader doesn't have to do it, VACUUM will help. But that is not necessary. Yours, Laurenz Albe
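If you do want to run both after the function, a minimal sketch (the table name "mytable" is an assumption; note that VACUUM cannot run inside a transaction block, so it has to happen after the function's transaction has committed):

ANALYZE mytable;             -- refresh planner statistics after the reload
VACUUM (ANALYZE) mytable;    -- optional: also sets hint bits for the first reader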
Re: Different plan chosen when in lateral subquery
Alex Reece wrote: > I get very different plan chosen when my query is in a lateral subquery vs > standalone -- > it doesn't use a key when joining on a table, instead opting to do a hash > join. Here is the query: > > select distinct on (sub.entity_id, sub.note_id, sub.series_id) > entity_id, note_id, series_id > from > ( > select alloc.entity_id, alloc.note_id, alloc.series_id, > alloc.amount, inv.name > from public.portfolio_allocations alloc > JOIN contributions contrib on contrib.id = alloc.note_id > JOIN investments inv on inv.id = contrib.investment_id > where entity_id = '\x5787f132f50f7b03002cf835' and > alloc.allocated_on <= dates.date > ) sub > > And wrapped inside the lateral: > > explain analyze > select * > from generate_series('2017-03-14 20:59:59.999'::TIMESTAMPTZ, > current_timestamp::TIMESTAMP + INTERVAL '1 day', '24 hours') dates, > LATERAL ( > ... ... > ) lat > > Run by itself injecting a hard coded value for dates.date, I get the expected > plan which uses a key index on contributions: [...] > -> Nested Loop (cost=0.17..14.23 rows=2 width=52) > (actual time=0.022..0.028 rows=2 loops=1) >-> Index Scan using > portfolio_allocations_entity_id_allocated_on_idx on portfolio_allocations > alloc (cost=0.09..6.05 rows=2 width=39) (actual time=0.012..0.014 > Index Cond: ((entity_id = > '\x5787f132f50f7b03002cf835'::bytea) AND (allocated_on <= '2017-03-14 > 20:59:59.999+00'::timestamp with time zone)) >-> Index Scan using > contributions_id_accrue_from_idx on contributions contrib (cost=0.08..4.09 > rows=1 width=26) (actual time=0.005..0.005 rows=1 loops=2) > Index Cond: (id = alloc.note_id) [...] > But run in the lateral, it doesn't use the index: [...] > -> Hash Join (cost=10775.83..20355.61 rows=5724 > width=52) (actual time=1.657..5.980 rows=6713 loops=267) >Hash Cond: (alloc.note_id = contrib.id) >-> Bitmap Heap Scan on portfolio_allocations > alloc (cost=69.82..9628.13 rows=5724 width=39) (actual time=1.010..2.278 > rows=6713 loops=267) > Recheck Cond: ((entity_id = > '\x5787f132f50f7b03002cf835'::bytea) AND (allocated_on <= date(dates.dates))) > Heap Blocks: exact=118074 > -> Bitmap Index Scan on > portfolio_allocations_entity_id_allocated_on_idx (cost=0.00..69.53 rows=5724 > width=0) (actual time=0.956..0.956 rows=6713 lo >Index Cond: ((entity_id = > '\x5787f132f50f7b03002cf835'::bytea) AND (allocated_on <= date(dates.dates))) >-> Hash (cost=9464.85..9464.85 rows=354617 > width=26) (actual time=169.792..169.792 rows=354617 loops=1) > Buckets: 524288 Batches: 1 Memory Usage: > 24296kB > -> Seq Scan on contributions contrib > (cost=0.00..9464.85 rows=354617 width=26) (actual time=0.007..83.246 > rows=354617 loops=1) [...] > I have a few questions here: > - Why doesn't it use the primary key index in either case? I don't know about the first query; perhaps the primary key index is fragmented. Compare the size of the indexes on disk. In the second query a sequential scan is used because PostgreSQL chooses a hash join. That choice is made because the index scans returns 6713 rows rather than the 2 from the first query, probably because the date is different. > - Why isn't it choosing portfolio_allocations_pnsa, which seems like it > would prevent it from having to sort? In a bitmap index scan, the table is scanned in physical order, so the result is not sorted in index order. I don't know if PostgreSQL is smart enough to figure out that it could use an index scan and preserve the order through the joins to obviate the sort. 
You could try to set enable_bitmapscan=off and see if things are different then. Perhaps the slower index scan would outweigh the advantage of avoiding the sort. Yours, Laurenz Albe
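For that experiment, something along these lines (session-local only, not in postgresql.conf):

SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;   -- the lateral query from above
RESET enable_bitmapscan;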
Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
Gunther wrote: > Something is wrong with the dump thing. And no, it's not SSL or whatever, > I am doing it on a local system with local connections. Version 9.5 something. That's a lot of useful information. Try to profile where the time is spent, using "perf" or similar. Do you connect via the network, TCP localhost or UNIX sockets? The last option should be the fastest. Yours, Laurenz Albe
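If "perf" is available, a minimal way to do that is to attach it to the backend process serving pg_dump (find the PID in pg_stat_activity or with "ps"); the 30-second sampling window below is arbitrary:

perf top -p <backend PID>                  # live view of where the CPU time goes
# or record a sample and look at it afterwards:
perf record -p <backend PID> -- sleep 30
perf report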
Re: CPU 100% usage caused by unknown postgres process..
Dinesh Chandra 12108 wrote: > My CPU utilization is going to 100% in PostgreSQL because of one unknown > process /x330341 is running from postgres user. > > PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND > 19885 postgres 20 0 192684 3916 1420 S 99.3 0.1 5689:04 > x330341 > > The same file is automatically created in Postgres Cluster also. I am using > Postgresql-9.3. > > Kindly suggest how can I resolve this issue. I don't know, but the same problem has been reported on Stackoverflow: https://stackoverflow.com/q/46617329/6464308 If your queries look similar, then you might indeed be the victim of an attack. Figure out where the function and the executable come from. In case of doubt, disconnect the server from the network. Yours, Laurenz Albe
Re: Autoanalyze CPU usage
Habib Nahas wrote: > The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the table > shows a time of 12:49; last_autovacuum does not show any activity around > this time for any table. Checkpoint logs are also normal around this time. > I'd like to understand if there are any other sources of activity I > should be checking for that would account for the spike. last_autoanalyze is set after autoanalyze is done, so that would suggest that autoanalyze is not the problem. It can be tough to figure out where the activity is coming from unless you can catch it in the act. You could log all statements (though the amount of log may be prohibitive and can cripple performance), you could log just long running statements in the hope that these are at fault, you could log connections and disconnections and hope to find the problem that way. Maybe logging your applications can help too. Yours, Laurenz Albe
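For reference, the relevant postgresql.conf settings could look like this (the one-second threshold is just an example):

log_min_duration_statement = 1000   # log every statement that takes longer than 1s
log_connections = on
log_disconnections = on
#log_statement = 'all'              # logs everything; may be prohibitively verbose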
Re: Performance of a Query
Scott Marlowe wrote: > On Tue, Jan 9, 2018 at 2:18 PM, Kumar, Virendra > wrote: > > Hello Gurus, > > > > I am struggling to tune a query which is doing join on top of aggregate for > > around 3 million rows. The plan and SQL is attached to the email. > > > > Below is system Details: > > > > PGSQL version – 10.1 > > > > OS – RHEL 3.10.0-693.5.2.el7.x86_64 > > > > Binary – Dowloaded from postgres.org compiled and installed. > > > > Hardware – Virtual Machine with 8vCPU and 32GB of RAM, on XFS filesystem. > > I uploaded your query plan here: https://explain.depesz.com/s/14r6 > > The most expensive part is the merge join at the end. > > Lines like this one: "Buffers: shared hit=676 read=306596, temp > read=135840 written=135972" > > Tell me that your sorts etc are spilling to disk, so the first thing > to try is upping work_mem a bit. Don't go crazy, as it can run your > machine out of memory if you do. but doubling or tripling it and > seeing the effect on the query performance is a good place to start. > > The good news is that most of your row estimates are about right, so > the query planner is doing what it can to make the query fast, but I'm > guessing if you get the work_mem high enough it will switch from a > merge join to a hash_join or something more efficient for large > numbers of rows. Looking at the plan, I'd guess that the following index could be helpful: CREATE INDEX ON ap.site_exposure(portfolio_id, peril_id, account_id); Don't know how much it would buy you, but you could avoid the sequential scan and the sort that way. Yours, Laurenz Albe
Re: Unable to connect Postgres using psql while postgres is already running.
Dinesh Chandra 12108 wrote: > After restarting PostgreSQL Server, I am unable to connect postgres from > putty, I am getting error > > psql: could not connect to server: No such file or directory > Is the server running locally and accepting > connections on Unix domain socket "/tmp/.s.PGSQL.5432"? > > While postgres is already running and also I am able to connect databases > from PGAdmin tool but not from command prompt. You know that a local connection only works when you are logged in on the database machine, right? Is your database listening on port 5432? Connect as user "postgres" and run the following queries: SHOW port; SHOW unix_socket_directories; That will show the port and the directories where UNIX sockets are created. You can use a socket directory name with the -h option of psql. Yours, Laurenz Albe
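For example, if those queries returned 5432 and /var/run/postgresql (assumed values), the connection attempt from the command line would be:

psql -h /var/run/postgresql -p 5432 -U postgres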
Re: View preformance oracle to postgresql
Pavan Reddygari wrote: > A view got converted to postgresql, performance while querying the view in > postgresql is 10X longer compared to oracle. > Hardware resources are matching between oracle and postgresql. > > V_item_attributes view code as below, same in oracle and postgresql. > - > SELECT a.iav_id, > a.iav_itm_id, > a.iav_iat_id, > a.iav_value, > a.iav_version, > a.iav_approved, > a.iav_create_date, > a.iav_created_by, > a.iav_modify_date, > a.iav_modified_by, > item_attribute.iat_id, > item_attribute.iat_name, > item_attribute.iat_type, > item_attribute.iat_status, > item_attribute.iat_requires_approval, > item_attribute.iat_multi_valued, > item_attribute.iat_inheritable, > item_attribute.iat_create_date, > item_attribute.iat_created_by, > item_attribute.iat_modify_date, > item_attribute.iat_modified_by, > item_attribute.iat_translated >FROM (item_attribute_value a > JOIN item_attribute ON ((a.iav_iat_id = item_attribute.iat_id))) > WHERE (a.iav_version = ( SELECT max(b.iav_version) AS max >FROM item_attribute_value b > WHERE ((b.iav_itm_id = a.iav_itm_id) AND (b.iav_iat_id = > a.iav_iat_id; > > > Oracle is using push predicate of IAV_ITM_ID column wherever > item_attribute_values table being used. > Any alternatives available to reduce view execution time in postgresql > database or any hints, thoughts would be appreciated. If (iav_version, iav_itm_id, iav_iat_id) is unique, you could use SELECT DISTINCT ON (a.iav_itm_id, a.iav_iat_id) ... FROM item_attribute_value a JOIN item_attribute b ON ... ORDER BY a.iav_version DESC; Yours, Laurenz Albe
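Spelled out a bit more, using the column names from your view (note that with DISTINCT ON, the ORDER BY has to start with the DISTINCT ON expressions, so the version sort key comes last):

SELECT DISTINCT ON (a.iav_itm_id, a.iav_iat_id)
       a.iav_id, a.iav_itm_id, a.iav_iat_id, a.iav_value, a.iav_version,
       b.iat_id, b.iat_name, b.iat_type, b.iat_status
FROM item_attribute_value a
     JOIN item_attribute b ON a.iav_iat_id = b.iat_id
ORDER BY a.iav_itm_id, a.iav_iat_id, a.iav_version DESC;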
Re: need help on memory allocation
Rambabu V wrote: > we are seeing idle sessions consuming memory in our database, could you > please help me > how much memory an idle session can use max and how can we find how much > work_mem > consuming for single process. > > we are getting out of memory error,for this i'm asking above questions. Are you sure that you see the private memory of the process and not the shared memory common to all processes? An "idle" connection should not have a lot of private memory. If you get OOM on the server, the log entry with the memory context dump might be useful information. Yours, Laurenz Albe
Re: need help on memory allocation
On Tue, 2018-01-23 at 19:29 +0530, Rambabu V wrote: > Any Update, this is continuously hitting our production database. > > > OOM error not recording in server level, it is only recording in our > > database logs. > > > > below is the error message: > > > > cat PostgreSQL-2018-01-23_06.csv|grep FATAL > > 2018-01-23 06:08:01.684 > > UTC,"postgres","rpx",68034,"[local]",5a66d141.109c2,2,"authentication",2018-01-23 > > 06:08:01 UTC,174/89066,0,FATAL,28000,"Peer authentication failed for user > > ""postgres""","Connection matched pg_hba.conf line 5: ""local all > > all peer map=supers""""" > > 2018-01-23 06:25:52.286 > > UTC,"postgres","rpx",22342,"[local]",5a66d570.5746,2,"authentication",2018-01-23 > > 06:25:52 UTC,173/107122,0,FATAL,28000,"Peer authentication failed for user > > ""postgres""","Connection matched pg_hba.conf line 5: ""local all > > all peer map=supers""""" > > 2018-01-23 06:37:10.916 > > UTC,"portal_etl_app","rpx",31226,"10.50.13.151:41052",5a66d816.79fa,1,"authentication",2018-01-23 > > 06:37:10 UTC,,0,FATAL,53200,"out of memory","Failed on request of size > > 78336.""" > > > > $ free -mh > > total used free sharedbuffers cached > > Mem: 58G58G 358M16G 3.6M41G > > -/+ buffers/cache:16G42G > > Swap: 9.5G 687M 8.9G > > > > postgresql.conf parametes: > > = > > work_mem = 256MB# min 64kB > > maintenance_work_mem = 256MB# min 1MB > > shared_buffers = 16GB # min 128kB > > temp_buffers = 16MB # min 800kB > > wal_buffers = 64MB > > effective_cache_size = 64GB > > max_connections = 600 It would be interesting to know the output from sysctl vm.overcommit_memory sysctl vm.overcommit_ratio Also interesting: sar -r 1 1 I think that max_connections = 600 is way to high. Are you running large, complicated queries on that machine? That could be a problem with such a high connection limit. Is the machine dedicated to PostgreSQL? Yours, Laurenz Albe
Re: pg_xlog unbounded growth
Stefan Petrea wrote: > During some database imports(using pg_restore), we're noticing fast > and unbounded growth of pg_xlog up to the point where the > partition(280G in size for us) that stores it fills up and PostgreSQL > shuts down. What do you see in pg_stat_archiver? Yours, Laurenz Albe
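That is:

SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;

A steadily growing failed_count means that archive_command keeps failing, in which case PostgreSQL has to retain the WAL in pg_xlog and the directory will keep growing.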
Re: Nested Loops
Kumar, Virendra wrote: > Can somebody help me avoid nested loops in below query: > -- > ap_poc_db=# explain (analyze,buffers) > ap_poc_db-# select site_id, account_id FROM ap.site_exposure se > ap_poc_db-# WHERE se.portfolio_id=-1191836 > ap_poc_db-# AND EXISTS (select 1 from ap.catevent_flood_sc_split > sp where sp.migration_sourcename= 'KatRisk_SC_Flood_2015_v9' AND > ST_Intersects(se.shape, sp.shape)) > ap_poc_db-# group by site_id, account_id; > > QUERY PLAN > -- [...] >Buffers: shared hit=172041 >-> Gather Merge (cost=23479854.04..23479879.04 rows=205 width=16) > (actual time=1387.823..1388.676 rows=1532 loops=1) > Workers Planned: 5 > Workers Launched: 5 > Buffers: shared hit=172041 [...] > -> Nested Loop Semi Join (cost=4.53..23478852.87 > rows=41 width=16) (actual time=34.772..1345.489 rows=255 loops=6) >Buffers: shared hit=864235 >-> Append (cost=0.00..156424.56 rows=123645 > width=48) (actual time=1.011..204.748 rows=102990 loops=6) > Buffers: shared hit=154879 > -> Parallel Seq Scan on > site_exposure_1191836 se (cost=0.00..156424.56 rows=123645 width=48) (actual > time=1.004..187.702 rows=102990 loops=6) >Filter: (portfolio_id = > '-1191836'::integer) >Buffers: shared hit=154879 >-> Bitmap Heap Scan on catevent_flood_sc_split sp > (cost=4.53..188.54 rows=15 width=492) (actual time=0.007..0.007 rows=0 > loops=617937) > Recheck Cond: (se.shape && shape) > Filter: ((migration_sourcename = > 'KatRisk_SC_Flood_2015_v9'::bpchar) AND _st_intersects(se.shape, shape)) > Rows Removed by Filter: 0 > Heap Blocks: exact=1060 > Buffers: shared hit=709356 > -> Bitmap Index Scan on > catevent_flood_sc_split_shape_mig_src_gix (cost=0.00..4.52 rows=45 width=0) > (actual time=0.005..0.005 rows=0 loops=617937) >Index Cond: (se.shape && shape) >Buffers: shared hit=691115 > Planning time: 116.141 ms > Execution time: 1391.785 ms With a join condition like that (using on a function result), only a nested loop join is possible. I don't know how selective sp.migration_sourcename= 'KatRisk_SC_Flood_2015_v9' is; perhaps an index on the column can help a little. But you won't get around the 617937 loops, which is the cause of the long query duration. I don't think there is a lot of potential for optimization. Yours, Laurenz Albe
Re: Query optimiser is not using 'not null' constraint when 'order by nulls last' clause is used
On Thu, 2018-02-01 at 20:00 +0530, Nandakumar M wrote: > Hi, > > I am using Postgres version 9.4.4 on a Mac machine. > I have 2 queries that differ only in the order by clause. > One of it has 'nulls last' and the other one does not have it. > The performance difference between the two is considerable. > > The slower of the two queries is > > SELECT [...] > FROM workorder wo > left join workorder_fields wof > ON wo.workorderid=wof.workorderid > left join servicecatalog_fields scf > ON wo.workorderid=scf.workorderid [...] > ORDER BY 7 DESC nulls last limit 25 > > > > On removing 'nulls last' from the order by clause the query becomes very fast. > I have attached the query plan for both the queries. In the above case, the optimizer does not know that it will get the rows in the correct order: indexes are sorted ASC NULLS LAST by default, so a backwards index scan will produce the results NULLS FIRST, which is the default for ORDER BY ... DESC. If you want the nulls last, PostgreSQL has to retrieve *all* the rows and sort them rather than using the first 25 results it gets by scanning then indexes. To have the above query perform fast, add additional indexes with either ASC NULLS FIRST or DESC NULLS LAST for all used keys. Yours, Laurenz Albe
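As an illustration only: the real sort column behind "ORDER BY 7" is not visible in the excerpt, so "createdtime" below is a made-up name standing in for it:

CREATE INDEX ON workorder (createdtime DESC NULLS LAST);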
Re: need advice to tune postgresql
Darius Pėža wrote: > I have issue that update queries is slow, I need some advice how improve > speed. I don't have much control to change queries. But I can change > postresql server configuration > > query example: > > UPDATE "project_work" SET "left" = ("project_work"."left" + 2) WHERE > ("project_work"."left" >= 8366) > > sometimes updated lines count is up to 10k > > postgresql version 9.3 > > postgresl.conf > max_connections = 100 > shared_buffers = 6GB # min 128kB > work_mem = 100MB # min 64kB > > all other values are default > > server hardware > Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz > 16GB RAM > disk is HDD > > about half of resource I can dedicate for postgresql server. If the number of updated lines is that big, you should try to get HOT updates as much as possible. For that, make sure that there is *no* index on the column, and that the fillfactor for the table is suitably low (perhaps 50). During a HOT update, when the new row version fits into the same page as the old one, the indexes don't have to be updated. That will speed up the UPDATE considerably. On the other hand, an UPDATE like yours would then always use a sequential scan, but that may still be a net win. Other than that, setting checkpoint_segments high enough that you don't get too many checkpoints can help. Of course, more RAM and fast storage are always good. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
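A sketch of the fillfactor part (50 is only a starting point, and the new setting affects only newly written pages, so the existing data benefit after the table is rewritten or as rows get updated over time):

ALTER TABLE project_work SET (fillfactor = 50);
-- and make sure there is no index on "left", otherwise the updates cannot be HOT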
Re: Performance degrade in Planning Time to find appropriate Partial Index
Meenatchi Sandanam wrote: > I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The table > contains > multiple form data differentiated by ID range. Hence a column contains more > than one form data. > To achieve Unique Constraint and Indexing per form, I chose PostgreSQL > Partial Indexes > which suits my requirement. I have created Partial Indexes with ID Range as > criteria and > it provides Uniqueness and Indexing per form basis as expected. But DML > operations on a > particular form scans all the Indexes created for the entire table instead of > scanning > the Indexes created for that particular form ID Range. This degrades Planner > Performance > and Query Time more than 10 times as below, > > Query Result for the table with 3000 Partial Indexes(15 Indexes per form) : It is crazy to create 3000 partial indexes on one table. No wonder planning and DML statements take very long, they have to consider all the indexes. > explain analyse select id from form_data_copy where id between 3001 and 4000 > and bigint50=789; Use a single index on (bigint50, id) for best performance. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: GIST index (polygon, point)
ghiureai wrote: > I have a short description bellow from Dev team regarding the behaviour of > gist index on the polygon column, looking to get some feedback from you: > > " I was expecting the <@(point,polygon) and @>(polygon,point) to be > indexable but they are not. see bellow query output , > the column is a polygon and the index is a gist index on the polygon column; > my understanding of the above query is that it says which operators would > cause that index to be used > > This SQL shows which operators are indexable:SELECT > pg_get_indexdef(ss.indexrelid, (ss.iopc).n, TRUE) AS index_col, > amop.amopopr::regoperator AS indexable_operator > FROM pg_opclass opc, pg_amop amop, > (SELECT indexrelid, information_schema._pg_expandarray(indclass) AS iopc > FROM pg_index > WHERE indexrelid = 'caom2.Plane_energy_ib'::regclass) ss > WHERE amop.amopfamily = opc.opcfamily AND opc.oid = (ss.iopc).x > ORDER BY (ss.iopc).n, indexable_operator; > > We run the SQL in PG 9.5.3 and PG 10.2 we the same result: only polygon vs > polygon is indexable (except the last entry which is distance operator). > The work around for us was to change interval-contains-value from > polygon-contains-point (@> or <@ operator) to > polygn-intersects-really-small-polygon (&&) in order to use the index, but I > was quite surprised that contains operators are not indexable! > Note that this is using the built in polygon and not pgsphere (spoly)" That sounds about right. You could use a single-point polygon like '((1,1))'::polygon and the <@ or && operator. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Too many .history file in pg_xlog takes lots of space
彭昱傑 wrote: > My postgre version is 9.4.9, and I face a space issue. > > Every time I restart postgre server, it generates a new history file: > 156A.history => 156B.history > > Now it takes a lot of space about 800MB (5787 history file): > [...] > Is file 1570.history important? A new history file is created when a new timeline is opened, which happens after point-in-time-recovery or promotion of a physical standby server. There must be something weird in the way you start PostgreSQL. Examine the start script, maybe you can fix the problem. These files are only necessary for point-in-time-recovery, so you don't have to retain them any longer than you retain your WAL archives. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Should from_collapse be switched off? (queries 10 times faster)
Peter wrote: > My queries get up to 10 times faster when I disable from_collapse > (setting from_collapse_limit=1). > > After this finding, The pramatic solution is easy: it needs to be > switched off. > > BUT: > I found this perchance, accidentally (after the queries had been > running for years). And this gives me some questions about > documentation and best practices. > > I could not find any documentation or evaluation that would say > that from_collapse can have detrimental effects. Even less, which > type of queries may suffer from that. https://www.postgresql.org/docs/current/static/explicit-joins.html states towards the end of the page that the search tree grows exponentially with the number of relations, and from_collapse_limit can be set to control that. > In my case, planning uses 1 or 2% of the cycles needed for > execution; that seems alright to me. > And, as said above, I cannot see why my queries might be an > atypical case (I don't think they are). > > If somebody would like to get a hands-on look onto the actual > case, I'd be happy to put it online. It seems like you are barking up the wrong tree. Your query does not take long because of the many relations in the FROM list, but because the optimizer makes a wrong choice. If you set from_collapse_limit to 1, you force the optimizer to join the tables in the order in which they appear in the query, and by accident this yields a better plan than the one generated if the optimizer is free to do what it thinks is best. The correct solution is *not* to set from_collapse_limit = 1, but to find and fix the problem that causes the optimizer to make a wrong choice. If you send the query and the output of EXPLAIN (ANALYZE, BUFFERS) SELECT ... we have a chance of telling you what's wrong. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: functions: VOLATILE performs better than STABLE
Peter wrote: > Given an arbitrary function fn(x) returning numeric. > > Question: how often is the function executed? > [...] > C. > select v.v,v.v from (select fn('const') as v) as v; > > Answer: > Once if declared VOLATILE. > Twice if declared STABLE. > > Now this IS a surprize. It is clear that the system is not allowed to > execute the function twice when declared VOLATILE. It IS ALLOWED to > execute it twice when STABLE - but to what point, except prolonging > execution time? > > Over all, VOLATILE performs better than STABLE. The reason is that the subquery with the VOLATILE function can be flattened; see the EXPLAIN (VERBOSE) output. There is no guarantee that less volatility means better performance. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Should from_collapse be switched off? (queries 10 times faster)
Peter wrote: > On Fri, Mar 23, 2018 at 10:14:19AM -0400, Tom Lane wrote: > > ! It's conceivable that the OP's problem is actually planning time > ! (if the query joins sufficiently many tables) and that restricting > ! the cost of the join plan search is really what he needs to do. > > Negative. Plnning time 10 to 27 ms. Execution time 600 to 6300 ms. > > ! Lacking any further information about the problem, we can't say. > ! We can, however, point to > ! https://wiki.postgresql.org/wiki/Slow_Query_Questions > ! concerning how to ask this type of question effectively. > > I strongly hope the data that I sent as followup will now > suffice Your expectations. Your reported execution times don't match the time reported in the EXPLAIN output... The cause of the long execution time is clear: The row count of the join between "places" (WHERE platz = 'WAEHR'), "wpnames" and "places AS places_1" is underestimated by a factor of 10 (1 row instead of 10). The nested loop join that is chosen as a consequence is now executed 10 times instead of the estimated 1 time, which is where almost all the execution time is spent. The question how to fix that is more complicated, and I cannot solve it off-hand with a complicated query like that. Setting "enable_nestloop = off" is as coarse as forcing "from_collapse = 1" and will negatively impact other queries - if it helps at all. You'll probably have to rewrite the query. Sorry that I cannot be of more help. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Should from_collapse be switched off? (queries 10 times faster)
Peter wrote: > ! Your reported execution times don't match the time reported in the > ! EXPLAIN output... > > Should these match? > It seems the EXPLAIN (ANALYZE, BUFFERS) does additional things, not > just execute the query. True. I had assumed you were speaking about the duration of the EXPLAIN (ANALYZE). > ! Setting "enable_nestloop = off" is as coarse as forcing "from_collapse = 1" > ! and will negatively impact other queries - if it helps at all. > > Since this query is already put into a function, I found I can easily > set from_collapse=1 only for this function, by means of "ALTER > FUNCTION ... SET ...", so it does only influence this query. > It seems this is the most straight-forward solution here. It is an option, although not one that makes one happy. You might have to revisit the decision if the data distribution changes and the chosen query plan becomes inefficient. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
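For the record, that looks roughly like this (the function name is made up):

ALTER FUNCTION my_report_function() SET from_collapse_limit = 1;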
Re: View taking time to show records
On Thu, 2022-03-24 at 15:59 +, Kumar, Mukesh wrote: > We have recently migrated from Oracle to PostgreSQL on version 11.4 on azure > postgres PaaS instance. > > There is 1 query which is taking approx. 10 secs in Oracle and when we ran > the same query it is taking approx. 1 min > > Can anyone suggest to improve the query as from application end 1 min time is > not accepted by client. > > Please find the query and explain analyze report from below link > > https://explain.depesz.com/s/RLJn#stats I would split the query in two parts: the one from line 3 to line 49 of your execution plan, and the rest. The problem is the bad estimate of that first part, so execute only that, write the result to a temporary table and ANALYZE that. Then execute the rest of the query using that temporary table. Perhaps it is also enough to blindly disable nested loop joins for the whole query, rather than doing the right thing and fixing the estimates: BEGIN; SET LOCAL enable_nestloop = off; SELECT ...; COMMIT; Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
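A generic illustration of that split, with made-up tables t1, t2 and t3 standing in for the real ones:

BEGIN;
CREATE TEMPORARY TABLE part1 ON COMMIT DROP AS
    SELECT t1.id, t1.val
    FROM t1 JOIN t2 ON t2.t1_id = t1.id
    WHERE t2.flag;
ANALYZE part1;    -- now the planner has real row counts for the rest of the query
SELECT p.id, p.val, t3.other_col
FROM part1 p JOIN t3 ON t3.id = p.id;
COMMIT;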
Re: View taking time to show records
On Fri, 2022-03-25 at 14:07 +, Kumar, Mukesh wrote: > > [recommendation to fix the estimate] > > > > Perhaps it is also enough to blindly disable nested loop joins for the > > whole query, > > rather than doing the right thing and fixing the estimates: > > > > BEGIN; > > SET LOCAL enable_nestloop = off; > > SELECT ...; > > COMMIT; > > Thanks for the below suggestion , When I ran the query with the parameter , > it is taking only 1 sec. > > So could you please let me know if I can put this parameter to OFF . at > database and it will not > create any issues to queries running in database. That will very likely cause problems in your database, because sometimes a nested loop join is by far the most efficient way to run a query. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Performance for SQL queries on Azure PostgreSQL PaaS instance
On Tue, 2022-04-12 at 09:10 +, Kumar, Mukesh wrote: > We have recently done the migration from Oracle Database Version 12C to Azure > PostgreSQL PaaS instance version 11.4 and most of the application > functionality > testing has been over and tested successfully > > However, there is 1 process at application level which is taking approx. 10 > mins > in PostgreSQL and in oracle it is taking only 3 mins. > > After investigating further we identified that process which is executed from > application end contains 500 to 600 no of short SQL queries into the database. > We tried to run the few queries individually on database and they are taking > less than sec in Postgres Database to execute, and we noticed that in Oracle > taking half of the time as is taking in PostgreSQL. for ex . in oracle same > select statement is taking 300 millisecond and in PostgreSQL it is taking > approx. 600 millisecond which over increases the execution of the process. > > Oracle Database are hosted on ON- Prem DC with dedicated application server on > OnPrem and same for PostgreSQL. How can a database hosted with Microsoft be on your premises? Apart from all other things, compare the network latency. If a single request results in 500 database queries, you will be paying 1000 times the network latency per request. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: DB connection issue suggestions
On Wed, 2022-05-11 at 00:59 +0800, Sudhir Guna wrote: > We have recently upgraded Postgresql 9.4 standalone server to Postgresql 11.2 > with High Availability (2 servers : Master and Standby). > > While trying to test using ETL applications and reports, we observe that the > ETL jobs fails with below error, > > 2022/05/06 16:27:36 - Error occurred while trying to connect to the database > 2022/05/06 16:27:36 - Error connecting to database: (using class > org.postgresql.Driver) > 2022/05/06 16:27:36 - FATAL: Sorry, too many clients already > > We have increased the max_connections = 1000 in postgresql.conf file. > > It worked ok for a day and later we get the same error message. > > Please help to advise on any additional settings required. The prior > Postgresql 9.4 had the default max_connections = 100and the applications > worked fine. Some application that uses the database has a connection leak: it opens new connections without closing old ones. Examine "pg_stat_activity" to find out which application is at fault, and then go and fix that application. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
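A query that usually makes the culprit obvious:

SELECT datname, usename, client_addr, application_name, state, count(*)
FROM pg_stat_activity
GROUP BY datname, usename, client_addr, application_name, state
ORDER BY count(*) DESC;

A large and growing number of connections that sit in state "idle" forever points to the application that leaks them.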
Re: Selecting RAM and CPU based on max_connections
On Fri, 2022-05-20 at 12:15 +0200, Andreas Kretschmer wrote: > On 20 May 2022 10:27:50 CEST, aditya desai wrote: > > One of our applications needs 3000 max_connections to the database. > > Connection pooler like pgbouncer or pgpool is not certified within the > > organization yet. So they are looking for setting up high configuration > > Hardware with CPU and Memory. Can someone advise how much memory and CPU > > they will need if they want max_conenction value=3000. > > Pgbouncer would be the best solution. CPU: number of concurrent connections. > RAM: shared_buffer + max_connections * work_mem + maintenance_mem + operating > system + ... Right. And then hope and pray that a) the database doesn't get overloaded and b) you don't hit any of the database-internal bottlenecks caused by many connections. I also got the feeling that the Linux kernel's memory accounting somehow lags. I have seen cases where every snapshot of "pg_stat_activity" I took showed only a few active connections (but each time different ones), but the amount of allocated memory exceeded what the currently active sessions could consume. I may have made a mistake, and I have no reproducer, but I would be curious to know if there is an explanation for that. (I am aware that "top" shows shared buffers multiple times). Yours, Laurenz Albe
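To put rough numbers on that formula (made-up, fairly conservative values): with shared_buffers = 8GB, work_mem = 4MB and maintenance_work_mem = 512MB, 3000 connections mean at least 8GB + 3000 * 4MB + 0.5GB, or about 20.5GB, before the operating system and file system cache are counted. And a single query can use work_mem several times over (once per sort or hash node), so the real peak can be considerably higher.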
Re: Need help on Query Tunning and Not using the Index Scan
On Fri, 2022-05-20 at 07:37 +, Kumar, Mukesh wrote: > We are facing an issue in running the query which takes at least 30 sec to > run in PostgreSQL. > > We have tried to create the indexes and done the maintenance and still that > query is taking same time. > > Below are the explain plan for the query. > > https://explain.depesz.com/s/sPo2#html > > We have noticed that maximum time it is takin is do a Seq Scan on Table > ps_delay_statistic which consist of approx. 35344812 records . > > Can anyone please help on the above issue. The problem is probably here: -> GroupAggregate (cost=0.57..18153.25 rows=2052 width=23) (actual time=13.764..13.765 rows=1 loops=1) Group Key: ds_1.fleet_object_number_f" -> Index Scan using ndx_delay_stat_equipment on ps_delay_statistic ds_1 (cost=0.57..18050.67 rows=16412 width=23) (actual time=0.026..10.991 rows=18180 loops=1) Index Cond: (fleet_object_number_f = (COALESCE(NULLIF('40001000277313'::text, ''::text)))::numeric) Filter: (activity_code_f IS NOT NULL) which comes from this subquery: SELECT max(dp1.daily_production_id) prodId FROM ps_daily_production_v dp1 WHERE dp1.fleet_object_number = cast(coalesce(nullif (cast(40001000277313 AS varchar), ''), NULL) AS numeric) AND dp1.activity_code IS NOT NULL GROUP BY dp1.fleet_object_number Remove the superfluous GROUP BY clause that confuses the optimizer. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
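That is, the subquery becomes simply:

SELECT max(dp1.daily_production_id) AS prodid
FROM ps_daily_production_v dp1
WHERE dp1.fleet_object_number =
      cast(coalesce(nullif(cast(40001000277313 AS varchar), ''), NULL) AS numeric)
  AND dp1.activity_code IS NOT NULL;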
Re: Oracle_FDW table performance issue
On Mon, 2022-07-11 at 17:38 +0530, aditya desai wrote: > I have one Oracle fdw table which is giving performance issue when joined > local temp table gives performance issue. > > select * from oracle_fdw_table where transaction_id in ( select > transaction_id from temp_table) > 54 seconds. Seeing HASH SEMI JOIN in EXPLAIN PLAN. temp_table has only > 74 records. > > select * from from oracle_fdw_table where transaction_id in ( > 1,2,3,.,75)--- 23ms. > > Could you please help me understand this drastic behaviour change? The first query joins a local table with a remote Oracle table. The only way for such a join to avoid fetching the whole Oracle table would be to have the foreign scan on the inner side of a nested loop join. But that would incur many round trips to Oracle and is therefore perhaps not a great plan either. In the second case, the whole IN list is shipped to the remote side. In short, the queries are quite different, and I don't think it is possible to get the first query to perform as well as the second. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Select on partitioned table is very slow
On Thu, 2022-08-25 at 11:10 +0200, hubert depesz lubaczewski wrote: > Hi, > > On Thu, Aug 25, 2022 at 10:49:51AM +0200, Jose Osinde wrote: > > select logical_identifier, version_id, lastproduct > > from test_product_ui_partition.product_ui pui > > where pui.mission_id='urn:esa:psa:context:investigation:mission.em16' > > and > > pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959' > > > EXPLAIN ANALYZE FROM PGADMIN > > > > Index Scan using product_ui_em16_logical_identifier_idx on > > product_ui_em16 pui (cost=0.69..19.75 rows=7 width=112) (actual > > time=0.133..0.134 rows=1 loops=1) > > [...] Index Cond: (logical_identifier = > > 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::citext)" > > [...] Filter: (mission_id = > > 'urn:esa:psa:context:investigation:mission.em16'::citext)" > > Planning Time: 0.237 ms > > Execution Time: 0.149 ms > > I really wish you didn't butcher explains like this, but we can work > with it. > > Please note that the condition for filter is: > > mission_id = 'urn:esa:psa:context:investigation:mission.em16'::citext > > Specifically, column mission_id (which is partition key) is compared > with some value that is in citext type - same as column. > This means that pg can take this value, compare with partitioning > schema, and pick one partition. > > Now look at the explain from java: > > > Filter: (((mission_id)::text = > > 'urn:esa:psa:context:investigation:mission.em16'::text) AND > > The rest is irrelevant. > > The important part is that java sent query that doesn't compare value of > column mission_id with some value, but rather compares *cast* of the > column. > > Since it's not column value, then partitioning can't check what's going > on (cast can just as well make it totally different value), and it also > can't really use index on mission_id. > > Why it happens - no idea, sorry, I don't grok java. > > But you should be able to test/work on fix with simple, non-partitioned > table, just make there citext column, and try searching for value in it, > and check explain from the search. If it will cast column - it's no > good. > > Sorry I can't tell you what to fix, but perhaps this will be enough for > you to find solution. Quite so. You are probably using a prepared statement in JDBC. You probably have to use explicit type casts, like: select logical_identifier, version_id, lastproduct from test_product_ui_partition.product_ui pui where pui.mission_id = ? :: citext and pui.logical_identifier = ? :: citext Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Milions of views - performance, stability
On Sat, 2022-09-17 at 01:05 +0200, Hubert Rutkowski wrote: > Hello! I have written python program to benchmark view efficiency, because in > our platform > they have a role to play and we noticed the performance is less than expected. If your platform plans to use millions of views, you should revise your design. As you see, that is not going to fly. And no, I don't consider that a bug. > Basically, benchmark creates table: > > CREATE TABLE IF NOT EXISTS foobar ( id int, text varchar(40) ); > > for i in range(1200300): > INSERT INTO foobar (id, text) VALUES ({i}, 'some string'); > CREATE VIEW foobar_{i} as select * from foobar where id={i}; > > Couldn't be any simpler. > [general slowness] > > What's even stranger is dropping performance: DROP TABLE foobar CASCADE;. > First of all, had to > increase locks to allow it to finish, otherwise it was quickly bailing > because of "too little shared memory". > alter system set max_locks_per_transaction=4; > > But even after that, it took almost 7 hours and crashed: > > 2022-09-13 23:16:31.113 UTC [1] LOG: server process (PID 404) was terminated > by signal 9: Killed > > After updating Postgres to 14.5, it crashed in a bit different way: > > 2022-09-15 19:20:26.000 UTC [67] LOG: checkpoints are occurring too > frequently (23 seconds apart) > 2022-09-15 19:20:26.000 UTC [67] HINT: Consider increasing the configuration > parameter "max_wal_size". > 2022-09-15 19:20:39.058 UTC [1] LOG: server process (PID 223) was terminated > by signal 9: Killed > 2022-09-15 19:20:39.058 UTC [1] DETAIL: Failed process was running: drop > table foobar cascade; > > Wihout the views, table can be dropped in 20ms. You misconfigured your operating system and didn't disable memory overcommit, so you got killed by the OOM killer. Basically, the operation ran out of memory. Yours, Laurenz Albe
Re: time sorted UUIDs
On Thu, 2022-12-15 at 10:56 +1300, Tim Jones wrote: > could someone please comment on this article > https://vladmihalcea.com/uuid-database-primary-key/ > specifically re the comments (copied below) in regards to a Postgres database. > > ... > But, using a random UUID as a database table Primary Key is a bad idea for > multiple reasons. > First, the UUID is huge. Every single record will need 16 bytes for the > database identifier, > and this impacts all associated Foreign Key columns as well. > Second, the Primary Key column usually has an associated B+Tree index to > speed up lookups or > joins, and B+Tree indexes store data in sorted order. > However, indexing random values using B+Tree causes a lot of problems: > * Index pages will have a very low fill factor because the values come > randomly. So, a page >of 8kB will end up storing just a few elements, therefore wasting a lot of > space, both >on the disk and in the database memory, as index pages could be cached in > the Buffer Pool. > * Because the B+Tree index needs to rebalance itself in order to maintain > its equidistant >tree structure, the random key values will cause more index page splits > and merges as >there is no pre-determined order of filling the tree structure. I'd say that is quite accurate. Yours, Laurenz Albe
Re: Domain check taking place unnecessarily?
On Wed, 2023-02-08 at 18:01 +, Mark Hills wrote: > I've ruled out waiting on a lock; nothing is reported with > log_lock_waits=on. This is a test database with exclusive access (2.5 > million rows): > > This is PostgreSQL 14.5 on Alpine Linux. Thanks. > > CREATE DOMAIN hash AS text > CHECK (VALUE ~ E'^[a-zA-Z0-9]{8,32}$'); > > devstats=> ALTER TABLE invite ADD COLUMN test text; > ALTER TABLE > Time: 8.988 ms > > devstats=> ALTER TABLE invite ADD COLUMN test hash; > ALTER TABLE > Time: 30923.380 ms (00:30.923) > > devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT NULL; > ALTER TABLE > Time: 30344.272 ms (00:30.344) > > devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT '123abc123'::hash; > ALTER TABLE > Time: 67439.232 ms (01:07.439) It takes 30 seconds to scan the table and determine that all existing rows satisfy the constraint. The last example is slower, because there is actually a non-NULL value to check. If that were not a domain, but a normal check constraint, you could first add the constraint as NOT VALID and later run ALTER TABLE ... VALIDATE CONSTRAINT ..., which takes a while too, but does not lock the table quite that much. But I don't think there is a way to do that with a domain. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
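For completeness, the check-constraint variant would look like this (the constraint name is arbitrary):

ALTER TABLE invite ADD COLUMN test text;
ALTER TABLE invite ADD CONSTRAINT invite_test_hash_chk
    CHECK (test ~ '^[a-zA-Z0-9]{8,32}$') NOT VALID;
-- later; this still scans the table, but allows concurrent reads and writes:
ALTER TABLE invite VALIDATE CONSTRAINT invite_test_hash_chk;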
Re: max_wal_senders
On Wed, 2023-02-08 at 18:07 -0500, Rick Otten wrote: > I've been thinking about the max_wal_senders parameter lately and wondering > if there > is any harm in setting it too high. No, there isn't, except that if you end up having too many *actual* WAL senders, it will cause load. A high limit is no problem as such. > The documentation mentions an orphaned connection slot that may take a while > to time out. > How can I tell if I have any of those? I was looking for a `pg_wal_slots` > table > similar to the `pg_replication_slots` table, but don't see anything obvious > in the catalog. The view is "pg_stat_replication", but you won't see an abandoned entry there before PostgreSQL itself notices the problem and terminates it. You can set "tcp_keepalives_idle" low enough so that the kernel will detect broken connections early on. Yours, Laurenz Albe
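For example (60 seconds is just a sample value):

-- the currently active WAL senders:
SELECT pid, application_name, client_addr, state, sync_state
FROM pg_stat_replication;

-- make the kernel probe idle connections after one minute:
ALTER SYSTEM SET tcp_keepalives_idle = 60;
SELECT pg_reload_conf();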
Re: Performance of UPDATE operation
On Mon, 2023-02-13 at 16:09 +0100, Mkrtchyan, Tigran wrote: > Typically we expect that UPDATE is a slow operation in PostgreSQL, however, > there are cases where it's hard to understand why. In particular, I have a > table like > > ``` > CREATE SEQUENCE t_inodes_inumber_seq > START WITH 1 > INCREMENT BY 1 > NO MINVALUE > NO MAXVALUE > CACHE 1; > > > CREATE TABLE t_inodes ( > inumber bigint PRIMARY KEY, > icrtime timestamp with time zone NOT NULL, > igeneration bigint NOT NULL > ); > ``` > > and a transaction that inserts and update an entry in that table: > > ``` > BEGIN; > INSERT INTO t_inodes (inumber, icrtime, igeneration) > VALUES (nextval('t_inodes_inumber_seq'), now(), 0) RETURNING inumber \gset > > UPDATE t_inodes SET igeneration = igeneration + 1 where inumber = :inumber; > END; > ``` > > The pgbench shows the following result: > > ``` > $ pgbench -h localhost -n -r -f update.sql -t 1 -c 64 -j 64 testdb > pgbench (15.0 (Debian 15.0-1.pgdg110+1)) > transaction type: update.sql > scaling factor: 1 > query mode: simple > number of clients: 64 > number of threads: 64 > maximum number of tries: 1 > number of transactions per client: 1 > number of transactions actually processed: 64/64 > number of failed transactions: 0 (0.000%) > latency average = 11.559 ms > initial connection time = 86.038 ms > tps = 5536.736898 (without initial connection time) > statement latencies in milliseconds and failures: > 0.524 0 BEGIN; > 0.819 0 INSERT INTO t_inodes (inumber, icrtime, > igeneration) > 0.962 0 UPDATE t_inodes SET igeneration = igeneration + 1 > where inumber = :inumber; > 9.203 0 END; > ``` > > My naive expectation will be that updating the newly inserted record should > cost nothing... Are there ways > to make it less expensive? Updating a newly inserted row is about as expensive as inserting the row in the first place. You can reduce the overall impact somewhat by creating the table with a "fillfactor" below 100, in your case 90 would probably be enough. That won't speed up the UPDATE itself, but it should greatly reduce the need for VACUUM. Yours, Laurenz Albe
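That would be (90 leaves room on each page for a subsequent row version):

CREATE TABLE t_inodes (
    inumber     bigint PRIMARY KEY,
    icrtime     timestamp with time zone NOT NULL,
    igeneration bigint NOT NULL
) WITH (fillfactor = 90);

-- or, for an existing table (only affects newly written pages):
ALTER TABLE t_inodes SET (fillfactor = 90);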
Re: multicolumn partitioning help
On Sun, 2023-03-12 at 13:59 -0400, James Robertson wrote: > I am having issues with multicolumn partitioning. For reference I am using > the following link as my guide: > https://www.postgresql.org/docs/devel/sql-createtable.html > > To demonstrate my problem, I created a simple table called humans. I want to > partition by the year > of the human birth and then the first character of the hash. So for each year > I'll have year*16 partitions. (hex) > > CREATE TABLE humans ( > hash bytea, > fname text, > dob date > )PARTITION BY RANGE (EXTRACT(YEAR FROM dob),substring(hash::text, 1, 1)); > > Reading the documentation: "When creating a range partition, the lower > bound specified with > FROM is an inclusive bound, whereas the upper bound specified with TO is an > exclusive bound". > > However I can't insert any of the following after the first one, because it > says it overlaps. > Do I need to do anything different when defining multi-column partitions? > > > This works: > CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968, '0') TO > (1969, '1'); > > > These fail: > CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968, '1') TO > (1969, '2'); Justin has explained what the problem is, let me supply a solution. I think you want subpartitioning, like CREATE TABLE humans ( hash bytea, fname text, dob date ) PARTITION BY LIST (EXTRACT (YEAR FROM dob)); CREATE TABLE humans_2002 PARTITION OF humans FOR VALUES IN (2002) PARTITION BY HASH (hash); CREATE TABLE humans_2002_0 PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 0); [...] CREATE TABLE humans_2002_25 PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 25); and so on for the other years. Yours, Laurenz Albe
Re: multicolumn partitioning help
On Tue, 2023-03-14 at 19:33 -0400, James Robertson wrote: > Laurenz your solution is how I thought I would work around my (lack of) > understanding > of partitioning. (nested partitions). > I was hesitant because I didn't know what sort of performance problems I > would create for myself. > > [...] more performance [...] If you are thinking of subpartitioning primarily in terms of boosting performance, you should know that you only get performance benefits from partitioning with very special queries that effectively have to be designed together with the partitioning strategy. Other statements typically become somewhat slower through partitioning. So it is really impossible to discuss performance benefits without knowing the exact query. It may be best if you build a play database with realistic amounts of test data and use EXPLAIN and EXPLAIN (ANALYZE) to see the effects that partitioning has on your queries. Yours, Laurenz Albe
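One thing worth checking in those EXPLAIN runs: with the scheme sketched above, plan-time pruning only happens when the query filters on the partition key expressions themselves, for example:

-- can be pruned to a single year's partitions:
EXPLAIN SELECT * FROM humans
WHERE EXTRACT (YEAR FROM dob) = 2002
  AND hash = '\x00'::bytea;

-- cannot be pruned, although the year part is logically equivalent:
EXPLAIN SELECT * FROM humans
WHERE dob BETWEEN DATE '2002-01-01' AND DATE '2002-12-31';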
Re: Performance issues in query with multiple joins
On Fri, 2023-04-28 at 15:19 +0300, Παρασκευη Πασσαρη wrote: > We are facing a performance issue with the following query. Executing this > query takes about 20 seconds. > (the database version is 14.1) The execution plan seems to be incomplete. Yours, Laurenz Albe
Re: thousands of CachedPlan entry per backend
On Thu, 2023-06-01 at 03:36 +, James Pang (chaolpan) wrote: > PG V14.8-1 , client using Postgresql JDBC driver we found 40MB process memory > per > backend, from Operating system and memorycontext dump “Grand total:”, both > mached. > But from details, we found almost of entry belong to “CacheMemoryContext”, > from this line CacheMemoryContext: 8737352 total in 42 blocks; 1021944 free > (215 chunks); 7715408 used, > but there are thousands of lines of it’s child, the sum of blocks much more > than “8737352” total in 42 blocks > > Our application use Postgresql JDBC driver with default > parameters(maxprepared statement 256), > there are many triggers, functions in this database, and a few functions run > sql by an extension > pg_background. We have thousands of connections and have big concern why > have thousands of entrys > of cached SQL ? that will consume huge memory , anyway to limit the cached > plan entry to save memory > consumption? Or it looks like an abnormal behavior or bug to see so many > cached plan lines. If you have thousands of connections, that's your problem. You need effective connection pooling. Then 40MB per backend won't be a problem at all. Having thousands of connections will cause other, worse, problems for you. See for example https://www.cybertec-postgresql.com/en/tuning-max_connections-in-postgresql/ If you want to use functions, but don't want to benefit from plan caching, you can set the configuration parameter "plan_cache_mode" to "force_custom_plan". Yours, Laurenz Albe
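That setting can be attached globally, per database, or just to the functions in question; the object names below are placeholders:

ALTER FUNCTION my_trigger_func() SET plan_cache_mode = force_custom_plan;
-- or, more broadly:
ALTER DATABASE mydb SET plan_cache_mode = force_custom_plan;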
Re: thousands of CachedPlan entry per backend
On Thu, 2023-06-01 at 08:50 +, James Pang (chaolpan) wrote: > we found thousands of cached plan , since JDBC driver only allow max 256 > cached > prepared statements, how backend cache so many sql plans. If we have one > function, > when application call that function will make backend to cache every SQL > statement > plan in that function too? and for table triggers, have similar caching > behavior ? Yes, as long as the functions are written in PL/pgSQL. It only affects static SQL, that is, nothing that is run with EXECUTE. Yours, Laurenz Albe
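To illustrate the difference inside a PL/pgSQL function (a generic example with a made-up table):

CREATE FUNCTION count_matching(p_val integer) RETURNS bigint
LANGUAGE plpgsql AS
$$DECLARE
   n bigint;
BEGIN
   -- static SQL: the plan can be cached for the session
   SELECT count(*) INTO n FROM some_table WHERE col = p_val;

   -- dynamic SQL: planned anew on every execution, nothing is cached
   EXECUTE 'SELECT count(*) FROM some_table WHERE col = $1'
      INTO n USING p_val;

   RETURN n;
END;$$;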
Re: Index on (fixed size) bytea value
On Tue, 2023-06-20 at 08:13 +0200, Les wrote: > I'm aware of the TOAST, and how it works. I was referring to it ("I think > that it should > be as large as possible, without hitting the toast. ") I have designed a > separate "block" > table specifically to avoid storing binary data in the TOAST. So my plan is > not going to > involve out-of-band storage. > > Just to make this very clear: a record in the block table would store a > block, not the > whole file. My question is to finding the optimal block size (without hitting > the toast), > and finding the optimal hash algorithm for block de-duplication. Then you would ALTER the column and SET STORAGE MAIN, so that it does not ever use TOAST. The size limit for a row would then be 8kB minus page header minus row header, which should be somewhere in the vicinity of 8140 bytes. If you want your block size to be a power of two, the limit would be 4kB, which would waste almost half your storage space. Yours, Laurenz Albe
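The corresponding statement, assuming the table is called "block" and the bytea column "data":

ALTER TABLE block ALTER COLUMN data SET STORAGE MAIN;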
Re: TOAST Fields serialisation/deserialization performance
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote: > I have a few queries regarding the TOAST Fields serialisation/deserialization > performance. > > The use case i am trying to solve here is to have millions of partitions and > aggregate the data in array field. > > I wish to know if i declare certain column in table as "array of UDT/JSONB" > and enable > either lz4 or zstd compression on it, does appending or prepending to that > array or even > changing the intermediate fields of UDT/JSONB objects. in that array has a > runtime cost > of full array data de-serialization every single time. If i perform any > UPDATE operation > on its elements or add/remove new elements from any position, does PG > rewrites the new > version of the column value regardless of its size. Updating even a small part of a large JSONB value requires that the entire thing is read and written, causing a lot of data churn. This is inefficient, and you shouldn't use large JSONB values if you plan to do that. If the data have a regular structure, use a regular relational data model. Otherwise, one idea might be to split the JSONB in several parts and store each of those parts in a different table row. That would reduce the impact. Yours, Laurenz Albe
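For illustration, a sketch of the "split into several rows" idea with made-up names; an update then only rewrites the slice it touches:

CREATE TABLE doc_part (
   doc_id  bigint,
   part_no integer,
   part    jsonb,
   PRIMARY KEY (doc_id, part_no)
);

-- touches only one slice of the document
UPDATE doc_part
SET part = jsonb_set(part, '{status}', '"done"')
WHERE doc_id = 1 AND part_no = 3;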
Re: Table copy with SERIALIZABLE is incredibly slow
On Sun, 2023-07-30 at 23:00 -0600, peter plachta wrote: > Background is we're trying a pg_repack-like functionality to compact a > 500Gb/145Gb > index (x2) table from which we deleted 80% rows. Offline is not an option. > The table > has a moderate (let's say 100QPS) I/D workload running. > > The typical procedure for this type of thing is basically CDC: > > 1. create 'log' table/create trigger > 2. under SERIALIZABLE: select * from current_table insert into new_table > > What we're finding is that for the 1st 30 mins the rate is 10Gb/s, then it > drops to > 1Mb/s and stays there and 22 hours later the copy is still going and now > the log > table is huge so we know the replay will also take a very long time. > > === > > Q: what are some ways in which we could optimize the copy? > > Btw this is Postgres 9.6 > > (we tried unlogged table (that did nothing), we tried creating indexes after > (that helped), we're experimenting with RRI) Why are you doing this the hard way, when pg_squeeze or pg_repack could do it? You definitely should not be using PostgreSQL 9.6 at this time. Yours, Laurenz Albe
Re: Planning time is time-consuming
On Fri, 2023-09-08 at 18:51 +0800, Mikhail Balayan wrote: > I have three tables: > - test_db_bench_1 > - test_db_bench_tenants > - test_db_bench_tenant_closure > > And the query to join them: > SELECT "test_db_bench_1"."id" id, "test_db_bench_1"."tenant_id" > FROM "test_db_bench_1" > JOIN "test_db_bench_tenants" AS "tenants_child" ON (("tenants_child"."uuid" > = "test_db_bench_1"."tenant_id") > AND > ("tenants_child"."is_deleted" != true)) > JOIN "test_db_bench_tenant_closure" AS "tenants_closure" ON > (("tenants_closure"."child_id" = "tenants_child"."id") > AND > ("tenants_closure"."barrier" <= 0)) > JOIN "test_db_bench_tenants" AS "tenants_parent" ON (("tenants_parent"."id" > = "tenants_closure"."parent_id") > AND > ("tenants_parent"."uuid" IN ('4c79c1c5-21ae-45a0-8734-75d67abd0330')) > AND > ("tenants_parent"."is_deleted" != true)) > LIMIT 1 > > > With following execution plan: > > > QUERY PLAN > -- > --- > Limit (cost=1.56..1.92 rows=1 width=44) (actual time=0.010..0.011 rows=0 > loops=1) > -> Nested Loop (cost=1.56..162.42 rows=438 width=44) (actual > time=0.009..0.009 rows=0 loops=1) > -> Nested Loop (cost=1.13..50.27 rows=7 width=36) (actual > time=0.008..0.009 rows=0 loops=1) > -> Nested Loop (cost=0.84..48.09 rows=7 width=8) (actual > time=0.008..0.009 rows=0 loops=1) > -> Index Scan using test_db_bench_tenants_uuid on > test_db_bench_tenants tenants_parent (cost=0.41..2.63 rows=1 width=8) > (actual time=0.008..0.008 rows=0 loops=1) > Index Cond: ((uuid)::text = > '4c79c1c5-21ae-45a0-8734-75d67abd0330'::text) > Filter: (NOT is_deleted) > -> Index Scan using test_db_bench_tenant_closure_pkey > on test_db_bench_tenant_closure tenants_closure (cost=0.42..45.06 rows=40 > width=16) (never executed) > Index Cond: (parent_id = tenants_parent.id) > Filter: (barrier <= 0) > -> Index Scan using test_db_bench_tenants_pkey on > test_db_bench_tenants tenants_child (cost=0.29..0.31 rows=1 width=44) (never > executed) > Index Cond: (id = tenants_closure.child_id) > Filter: (NOT is_deleted) > -> Index Scan using test_db_bench_1_idx_tenant_id_3 on > acronis_db_bench_heavy (cost=0.43..14.66 rows=136 width=44) (never executed) > Index Cond: ((tenant_id)::text = (tenants_child.uuid)::text) > Planning Time: 0.732 ms > Execution Time: 0.039 ms > > > Where the planning time gets in the way as it takes an order of magnitude > more time than the actual execution. > > Is there a possibility to reduce this time? And, in general, to understand > why planning takes so much time. You could try to VACUUM the involved tables; indexes with many entries pointing to dead tuples can cause a long planing time. Also, there are quite a lot of indexes on "test_db_bench_1". On a test database, drop some indexes and see if that makes a difference. Finally, check if "default_statistics_target" is set to a high value, or if the "Stats target" for some column in the "\d+ tablename" output is set higher than 100. Yours, Laurenz Albe
Re: Fwd: Planning time is time-consuming
On Mon, 2023-09-11 at 12:57 +0800, Mikhail Balayan wrote: > Thanks for the idea. I was surprised to find that this is not the way it > works and the planning time remains the same. To benefit from the speed gains of a prepared statement, you'd have to execute it at least seven times. If a generic plan is used (which should happen), you will see $1 instead of the literal argument in the execution plan. Prepared statements are probably your best bet. Yours, Laurenz Albe
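A sketch of what that looks like at the SQL level, reusing the tables from the query above (column types are assumed):

PREPARE get_parent(text) AS
   SELECT id FROM test_db_bench_tenants
   WHERE uuid = $1 AND NOT is_deleted;

EXECUTE get_parent('4c79c1c5-21ae-45a0-8734-75d67abd0330');  -- repeat several times
EXPLAIN EXECUTE get_parent('4c79c1c5-21ae-45a0-8734-75d67abd0330');

Once the generic plan is kept, the index condition in the plan shows $1 instead of the literal value.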
Re: Dirty reads on index scan,
On Thu, 2023-09-21 at 17:05 +0200, Koen De Groote wrote: > I'm doing the following query: > select * from my_table where hasbeenchecked = true and hasbeenverified = true > and insert_timestamp <= '2023-09-01 00:00:00.000' limit 1000; > > The date is an example, it is the format that is used in the query. > > The table has 81M rows. Is 50GB in size. And the index is 34MB > > The index is as follows: > btree (insert_timestamp DESC) WHERE hasbeenchecked = true AND hasbeenverified > = true > > I'm seeing a slow query first, then a fast one, and if I move the date, a > slow query again. > > What I'm seeing is: > Attempt 1: > Hit: 5171(40MB) > Read: 16571(130MB) > Dirtied: 3940(31MB) > > Attempt 2: > Hit: 21745 (170MB) > Read: Nothing > Dirtied: Nothing. > > It's slow once, then consistently fast, and then slow again if I move the > date around. > And by slow I mean: around 60 seconds. And fast is below 1 second. That's normal behavior: after the first execution, the data are cached, so the query becomes much faster. Dirtying pages happens because the first reader has to set hint bits, which is an extra chore. You can avoid that if you VACUUM the table before you query it. Yours, Laurenz Albe
Re: Dirty reads on index scan,
On Fri, 2023-09-22 at 10:35 +0200, Koen De Groote wrote: > On Thu, Sep 21, 2023 at 9:30 PM Laurenz Albe wrote: > > On Thu, 2023-09-21 at 17:05 +0200, Koen De Groote wrote: > > > I'm doing the following query: > > > select * from my_table where hasbeenchecked = true and hasbeenverified = > > > true and insert_timestamp <= '2023-09-01 00:00:00.000' limit 1000; > > > > > > The date is an example, it is the format that is used in the query. > > > > > > The table has 81M rows. Is 50GB in size. And the index is 34MB > > > > > > The index is as follows: > > > btree (insert_timestamp DESC) WHERE hasbeenchecked = true AND > > > hasbeenverified = true > > > > > > I'm seeing a slow query first, then a fast one, and if I move the date, a > > > slow query again. > > > > > > What I'm seeing is: > > > Attempt 1: > > > Hit: 5171(40MB) > > > Read: 16571(130MB) > > > Dirtied: 3940(31MB) > > > > > > Attempt 2: > > > Hit: 21745 (170MB) > > > Read: Nothing > > > Dirtied: Nothing. > > > > > > It's slow once, then consistently fast, and then slow again if I move the > > > date around. > > > And by slow I mean: around 60 seconds. And fast is below 1 second. > > > > That's normal behavior: after the first execution, the data are cached, so > > the query > > becomes much faster. > > > > Dirtying pages happens because the first reader has to set hint bits, which > > is an extra > > chore. You can avoid that if you VACUUM the table before you query it. > > So, if I want to speed up the query, apart from trying to vacuum it > beforehand, I suspect > I've hit the limit of what this query can do? > > Because, the table is just going to keep growing. And it's a usually a query > that runs one > time per day, so it's a cold run each time. > > Is this just going to get slower and slower and there's nothing that can be > done about it? Essentially yes. If the table does not have too many columns, or you can be more selective than "SELECT *", you could use an index-only scan with an index like CREATE INDEX ON my_table (insert_timestamp) INCLUDE (/* all the columns in the SELECT list */) WHERE hasbeenchecked AND hasbeenverified; VACUUM my_table; You need to configure autovacuum so that it vacuums the table often enough if you want an efficient index-only scan. If that is not feasible, you can gain speed by clustering the table. For that, you need a different index: CREATE INDEX cluster_idx ON my_table (hasbeenchecked, hasbeenverified, insert_timestamp); CLUSTER my_table USING cluster_idx; -- attention: rewrites the table That should speed up the query considerably, because it will have to read way fewer pages from disk. However, CLUSTER is not without problems. Look at the documentation for the caveats. Yours, Laurenz Albe
Re: GIN JSONB path index is not always used
On Tue, 2023-10-17 at 15:48 +0200, Tomasz Szymański wrote: > Hello! We have an issue with database planner choosing really expensive > sequence scan instead of an index scan in some cases. To analyze that, we'd need the output from EXPLAIN (ANALYZE, BUFFERS) SELECT ... Plain text format please, no JSON. Yours, Laurenz Albe
Re: Performance down with JDBC 42
On Sat, 2023-11-04 at 19:08 +, Abraham, Danny wrote: > Asking for help with a JDBC related issue. > Environment: Linux 7.9 PG 14.9 , very busy PG Server. > > A big query - 3 unions and about 10 joins runs : > - 70ms on psql , DBeaver with JDBC 42 and in our Server using old JDBC 9.2 > - 2500 ms in our Server using new JDBC 42 driver. ( and this is running many > times) > > Question: Is there a structured way to identify optimization setup ( Planner > Method s ) changes? > Are there any known changes specific to JDBC 42. What I would do is enable auto_explain and look at the execution plan when the statement is run by the JDBC driver. Then you can compare the execution plans and spot the difference. Yours, Laurenz Albe
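For reference, a minimal auto_explain setup in postgresql.conf could look like this (the threshold is only an example; log_analyze adds measurement overhead):

shared_preload_libraries = 'auto_explain'   # requires a server restart
auto_explain.log_min_duration = '500ms'     # log plans of statements slower than this
auto_explain.log_analyze = on               # include actual row counts and timings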
Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it
On Tue, 2024-01-30 at 11:40 +0200, Pavlos Kallis wrote: > I have the following table: > > CREATE TABLE IF NOT EXISTS public.shortened_url > ( > id character varying(12) COLLATE pg_catalog."default" NOT NULL, > created_at timestamp without time zone, > expires_at timestamp without time zone, > url text COLLATE pg_catalog."default" NOT NULL, > CONSTRAINT shortened_url_pkey PRIMARY KEY (id) > ) > > The table contains only the following index on PRIMARY KEY: > > CREATE UNIQUE INDEX IF NOT EXISTS shortened_url_pkey > ON public.shortened_url USING btree > (id COLLATE pg_catalog."default" ASC NULLS LAST) > TABLESPACE pg_default; > > This table has approximately 5 million rows of expired URLs (expires_at < > now()), and 5 thousand rows of non-expired URLs (expires_at > now()) > > I deleted all expired URLs with this query: > > DELETE FROM shortened_url WHERE expires_at < now(). > > Then, I tried to query the table for expired URLs: > > SELECT * FROM shortened_url WHERE expires_at < now(); > > This query was very slow. It took around 1-2 minutes to run, while it had to > fetch only 5000 rows (the non-expired URLs, since the other ones were > deleted). > > After that, I tried to run VACUUM ANALYZE and REINDEX to the table. > The query was still slow. > > Finally, I ran VACUUM FULL and re-executed the query. Only then, it started > running fast (1-2 seconds). > > Do you have observed a similar behavior with VACUUM ANALYZE / VACUUM FULL and > why this can happen? > Is this because data is compacted after VACUUM FULL and sequential disk reads > are faster? > Shouldn't VACUUM ANALYZE reclaim the disk space and make the query run fast? > Is this because RDS might do some magic? Is it something I am missing? There are too many unknowns here. Please enable "track_io_timing" and send us the output of EXPLAIN (ANALYZE, BUFFERS) for the slow statements. One theory could be that there was a long running transaction or something else that prevented VACUUM from cleaning up. For that, the output of "VACUUM (VERBOSE) shortened_url" would be interesting. > Additional details > PostgreSQL version: 14.7 on db.t3.micro RDS > PG configuration: Default of RDS We can only speak about real PostgreSQL... Yours, Laurenz Albe
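Concretely, the requested diagnostics could be gathered like this in the affected database:

SET track_io_timing = on;   -- needs appropriate privileges (parameter group on RDS)
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM shortened_url WHERE expires_at < now();
VACUUM (VERBOSE) shortened_url;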
Re: Weird performance differences between cloud vendors
On Thu, 2024-02-01 at 10:23 +0100, Dirk Krautschick wrote: > I have run a test with pgbench against two cloud vendors (settings, > parameters almost the same). > Both Postgres (or whatever they do internally when they call it as Postgres > offering, NOT Aurora or so :-) ) > > I have got a strange result that cloud vendor 1 is performing almost > everywhere better in matter of > read and write but except in the init phase of pgbench it took almost double > the time. Nobody except those vendors could tell you for certain, but perhaps on the one system the initial data load is fast, because you have not yet exceeded your I/O quota, and then I/O is throttled. Yours, Laurenz Albe
Re: huge SubtransSLRU and SubtransBuffer wait_event
On Thu, 2024-02-01 at 11:50 +, James Pang (chaolpan) wrote: > We have a Postgresqlv14.8 server, client use Postgresql JDBC connections, > today, > our server see a lot of “SubtransBuffer” and “SubtransSLRU” wait_event. > Could you help direct me what’s the possible cause and how to resolve this > waits ? Today, the only feasible solution is not to create more than 64 subtransactions (savepoints or PL/pgSQL EXCEPTION clauses) per transaction. Don't use extensions or the JDBC driver option to simulate statement level rollback, that is the road to hell. Yours, Laurenz Albe
Re: huge SubtransSLRU and SubtransBuffer wait_event
On Fri, 2024-02-02 at 02:04 -0800, Nikolay Samokhvalov wrote: > On Thu, Feb 1, 2024 at 04:42 Laurenz Albe wrote: > > Today, the only feasible solution is not to create more than 64 > > subtransactions > > (savepoints or PL/pgSQL EXCEPTION clauses) per transaction. > > I think 64+ nesting level is quite rare It doesn't have to be 64 *nested* subtransactions. This is enough: CREATE TABLE tab (x integer); DO $$DECLARE i integer; BEGIN FOR i IN 1..70 LOOP BEGIN INSERT INTO tab VALUES (i); EXCEPTION WHEN unique_violation THEN NULL; -- ignore END; END LOOP; END;$$; Yours, Laurenz Albe
Re: PostgreSQL doesn't use index-only scan if there is an expression in index
On Thu, 2024-02-15 at 17:37 +0300, Pavel Kulakov wrote: > Hello, > > PostgreSQL doesn't use 'Index Only Scan' if there is an expression in index. > > The documentation says that PostgreSQL's planner considers a query to be > potentially > executable by index-only scan only when all columns needed by the query are > available from the index. > I think an example on > https://www.postgresql.org/docs/16/indexes-index-only-scans.html : > > SELECT f(x) FROM tab WHERE f(x) < 1; > > is a bit confusing. Even the following query does not use 'Index Only Scan' > > SELECT 1 FROM tab WHERE f(x) < 1; > > Demonstration: > --- > drop table if exists test; > > create table test(s text); > create index ix_test_upper on test (upper(s)); > create index ix_test_normal on test (s); > > insert into test (s) > select 'Item' || t.i > from pg_catalog.generate_series(1, 10, 1) t(i); > > analyze verbose "test"; > > explain select 1 from test where s = 'Item123'; > explain select 1 from test where upper(s) = upper('Item123'); > -- > Query plan 1: > Index Only Scan using ix_test_normal on test (cost=0.42..8.44 rows=1 width=4) > Index Cond: (s = 'Item123'::text) > > Query plan 2 (SHOULD BE 'Index Only Scan'): > Index Scan using ix_test_upper on test (cost=0.42..8.44 rows=1 width=4) > Index Cond: (upper(s) = 'ITEM123'::text) > > > If I add 's' as included column to ix_test_upper the plan does use 'Index > Only Scan'. > That looks strange to me: there is no 's' in SELECT-clause, only in > WHERE-clause in > the form of 'upper(s)' and this is why ix_test_upper is choosen by the > planner. You need to create the index like this: CREATE INDEX ix_test_upper ON test (upper(s)) INCLUDE (s); See https://www.postgresql.org/docs/current/indexes-index-only-scans.html: "In principle, index-only scans can be used with expression indexes. For example, given an index on f(x) where x is a table column, it should be possible to execute SELECT f(x) FROM tab WHERE f(x) < 1; as an index-only scan; and this is very attractive if f() is an expensive-to-compute function. However, PostgreSQL's planner is currently not very smart about such cases. It considers a query to be potentially executable by index-only scan only when all columns needed by the query are available from the index. In this example, x is not needed except in the context f(x), but the planner does not notice that and concludes that an index-only scan is not possible. If an index-only scan seems sufficiently worthwhile, this can be worked around by adding x as an included column, for example CREATE INDEX tab_f_x ON tab (f(x)) INCLUDE (x); Yours, Laurenz Albe
Re: "not related" code blocks for removal of dead rows when using vacuum and this kills the performance
On Mon, 2024-02-19 at 16:14 +, Lars Aksel Opsahl wrote: > Then we start testing VACUUM and very simple SQL testing in another window. > > We can now show we have performance of "3343.794 ms" and not "0.123 ms", which > is what we get when we are able to remove dead rows and run a new analyze. > > The problem is that as long as the master code is active, we cannot remove > alle dead rows and that what seems to be killing the performance. > > With active I mean in hanging on pg_sleep and remember that this master has > not created the test table or inserted any data in this test table it self. > > Is the expected behavior ? It is not entirely clear what you are doing, but it seems like you are holding a database transaction open, and yes, then it is expected behavior that VACUUM cannot clean up dead rows in the table. Make sure that your database transactions are short. Don't use table or row locks to synchronize application threads. What you could use to synchronize your application threads are advisory locks, they are not tied to a database transaction. Yours, Laurenz Albe
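For illustration, a sketch of synchronizing with advisory locks (the key 42 is arbitrary):

-- worker: take the lock, do the work, release it
SELECT pg_advisory_lock(42);
-- ... critical section ...
SELECT pg_advisory_unlock(42);

-- non-blocking variant: returns false immediately instead of waiting
SELECT pg_try_advisory_lock(42);

Because these locks are session-level, they can be held across transactions without keeping a snapshot open.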
Re: "not related" code blocks for removal of dead rows when using vacuum and this kills the performance
On Tue, 2024-02-20 at 05:46 +, Lars Aksel Opsahl wrote: > If this is expected behavior it means that any user on the database that > writes > a long running sql that does not even insert any data can kill performance for > any other user in the database. Yes, that is the case. A long running query will hold a snapshot, and no data visible in that snapshot can be deleted. That can cause bloat, which can impact performance. > So applications like QGIS who seems to keep open connections for a while can > then also kill the performance for any other user in the data. No, that is not a problem. Keeping *connections* open is a good thing. It is keeping data modifying transactions, cursors or long-running queries open that constitutes a problem. Yours, Laurenz Albe
Re: sql statement not using all primary key values and poor performance
On Fri, 2024-02-23 at 15:20 +0800, James Pang wrote: > we have a SQL from Postgresql JDBC, primary is based on > (bigint,varchar2,bigint), > but from sql plan, it convert to ::numeric so the plan just use one "varchar" > key column and use the other 2 bigint keys as filters. what's the cause about > that ? > > Table "test.xx" > Column | Type | Collation | Nullable | > Default > --++---+--+- > xxxid | bigint | | not null | > paramname | character varying(512) | | not null | > paramvalue | character varying(1536) | | | > sssid | bigint | | not null | > createtime | timestamp(0) without time zone | | | > lastmodifiedtime | timestamp(0) without time zone | | | > mmmuuid | character varying(32) | | | > Indexes: > "pk_xx" PRIMARY KEY, btree (xxxid, paramname, sssid) > "idx_xx_mmmuuid" btree (sssid, mmmuuid, paramname) > > SET extra_float_digits = 3 > > duration: 7086.014 ms plan: > Query Text: SELECT XXXFID, PARAMNAME, PARAMVALUE, SSSID, CREATETIME, > LASTMODIFIEDTIME, MMMUUID FROM test.XX WHERE ( ( XXXID = $1 ) ) AND ( > ( PARAMNAME = $2 ) ) AND ( ( SSSID = $3 ) ) > Index Scan using pk_xx on test.xx (cost=0.57..2065259.09 > rows=1 width=86) (actual time=7086.010..7086.011 rows=0 loops=1) > Output: confid, paramname, paramvalue, sssid, createtime, > lastmodifiedtime, mmmuuid > Index Cond: ((xx.paramname)::text = 'cdkkif'::text) <<< > just use only one key instead all primary keys. > Filter: (((xx.xxxid)::numeric = '18174044'::numeric) AND > ((xx.sssid)::numeric = '253352'::numeric)) <<< it's bigint but > converted to numeric > Buffers: shared read=1063470 > I/O Timings: read=4402.029 > > it's from JDBC, we saw this JDBC driver try to set extra_float_digits = 3 > before > running the SQL ,does that make planner to convert bigint to numeric ? Setting "extra_float_digits" is just something the JDBC driver does so as to not lose precision with "real" and "double precision" values on old versions of PostgreSQL. The problem is that you bind the query parameters with the wrong data types. Don't use "setBigDecimal()", but "setLong()" if you want to bind a "bigint". An alternative is "setObject()" with "targetSqlType" set to "Types.BIGINT". Yours, Laurenz Albe
Re: sql statement not using all primary key values and poor performance
On Fri, 2024-02-23 at 18:21 +0800, James Pang wrote: > it's a third-party vendor application, not easy to change their code. Then the application is broken, and you should make the vendor fix it. > is it possible to 1) in Postgresql JDBC driver connection, set > plan_cache_mode=force_custom_plan or 2) some other parameters can > workaround this issue? You can set "prepareThreshold" to 0 to keep the JDBC driver from using prepared statements in PostgreSQL. I am not sure if that is enough to fix the problem. Yours, Laurenz Albe
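For reference, that is done with the prepareThreshold connection property, e.g. in the connection URL: jdbc:postgresql://host:5432/mydb?prepareThreshold=0 (host and database name are placeholders).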
Re: Optimizing count(), but Explain estimates wildly off
On Mon, 2024-02-26 at 18:25 -0600, Chema wrote: > I'm trying to optimize simple queries on two tables (tenders & items) with a > couple > million records. Besides the resulting records, the app also displays the > count of > total results. Doing count() takes as much time as the other query (which > can be > 30+ secs), so it's an obvious target for optimization. > > Reading around, seems many people are still using this 2005 snippet to obtain > the > row count estimate from Explain: I recommend using FORMAT JSON and extracting the top row count from that. It is simpler and less error-prone. > Is this still the current best practice? Any tips to increase precision? > Currently it can estimate the actual number of rows for over or under a > million, > as seen on the sample queries (1,955,297 instead of 1,001,200; 162,080 instead > of 1,292,010). Looking at the samples you provided, I get the impression that the statistics for the table are quite outdated. That will affect the estimates. Try running ANALYZE and see if that improves the estimates. Yours, Laurenz Albe
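A sketch of that approach as a helper function (the function name is made up; only pass trusted query strings, since they are interpolated into EXPLAIN):

CREATE FUNCTION row_estimate(query text) RETURNS bigint
   LANGUAGE plpgsql AS
$$DECLARE
   plan jsonb;
BEGIN
   EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;
   RETURN (plan -> 0 -> 'Plan' ->> 'Plan Rows')::bigint;
END;$$;

SELECT row_estimate('SELECT * FROM tenders');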
Re: Table Partitioning and Indexes Performance Questions
On Thu, 2024-02-29 at 11:42 -0500, David Kelly wrote: > I was told that partitioned table indexed must always start with the > partition key columns. That's not true. Only unique indexes (as used by primary key and unique constraints) must contain the partitioning key (but they don't have to start with it). > Any other performance considerations when it comes to partitioned table > indexing? > Specifically, partitioning by range where the range is a single value. Not particularly - selecting from a partitioned table is like selecting from a UNION ALL of all partitions, except that sometimes PostgreSQL can forgo scanning some of the partitions. If you use very many partitions, the overhead for query planning and execution can become noticeable. Yours, Laurenz Albe
Re: Optimizing count(), but Explain estimates wildly off
On Thu, 2024-02-29 at 17:15 -0600, Chema wrote: > No major changes after doing Analyze, and also Vacuum Analyze. Indeed. This caught my attention: > -> Parallel Seq Scan on pricescope_items (cost=0.00..1027794.01 > rows=3277101 width=522) (actual time=0.753..41654.507 rows=2621681 loops=3) Why does it take over 41 seconds to read a table with less than 3 million rows? Are the rows so large? Is the table bloated? What is the size of the table as measured with pg_relation_size() and pg_table_size()? Yours, Laurenz Albe
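For example:

SELECT pg_size_pretty(pg_relation_size('pricescope_items')) AS heap_only,
       pg_size_pretty(pg_table_size('pricescope_items'))    AS with_toast_and_maps;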
Re: maintenance_work_mem impact?
On Tue, 2024-03-19 at 16:05 +, Adithya Kumaranchath wrote: > I have a table approx. 20GB. > > CREATE UNIQUE INDEX testindex_v1 ON testtable1 (test_index); > > My observations: > maintenance_work_mem = 2G > max_parallel_workers = '16' > > The create index completes in 20 minutes. > > When I change this: > maintenance_work_mem = 16G > max_parallel_workers = '16' > > It completes in 9 minutes. So I can see that I can gain performance by > changing this number. > > So it is faster but the question I have is it safe to set it to such a high > number? > I am aware that onlyone of these operations can be executed at a time by a > database > session, and an installation normally doesn't have many of them running > concurrently, > so it's safe to set this value significantly larger. I have 128GB memory. > 1. Any advice or thoughts? > 2. Is there any other parameter that can accelerate index creation? It is safe as long as you have enough free memory on the machine. You can verify with tools like "free" on Linux (look for "available" memory). Yours, Laurenz Albe
Re: LWlock:LockManager waits
On Tue, 2024-04-09 at 11:07 +0800, James Pang wrote: > we found sometimes , with many sessions running same query "select ..." at > the same time, saw many sessions waiting on "LockManager". for example, > pg_stat_activity show. It's a production server, so no enable trace_lwlocks > flag. could you direct me what's the possible reason and how to reduce this > "lockmanager" lock? all the sql statement are "select " ,no DML. > > time wait_event > count(pid) > 2024-04-08 09:00:06.043996+00 | DataFileRead | 42 > 2024-04-08 09:00:06.043996+00 | | 15 > 2024-04-08 09:00:06.043996+00 | LockManager | 31 > 2024-04-08 09:00:06.043996+00 | BufferMapping | 46 > 2024-04-08 09:00:07.114015+00 | LockManager | 43 > 2024-04-08 09:00:07.114015+00 | DataFileRead | 28 > 2024-04-08 09:00:07.114015+00 | ClientRead | 11 > 2024-04-08 09:00:07.114015+00 | | 11 That's quite obvious: too many connections cause internal contention in the database. Reduce the number of connections by using a reasonably sized connection pool. Yours, Laurenz Albe
Re: Slow join
Roman Kushnir wrote: > The following basic inner join is taking too much time for me. (I’m using > count(videos.id) > instead of count(*) because my actual query looks different, but I simplified > it here to the essence). > I’ve tried following random people's suggestions and adjusting the > random_page_cost > (decreasing it from 4 to 1.1) without a stable improvement. Any hints on what > is wrong here? Thank you. > > The query > > SELECT COUNT(videos.id) FROM videos JOIN accounts ON accounts.channel_id = > videos.channel_id; > > The accounts table has 744 rows, videos table has 2.2M rows, the join > produces 135k rows. > > Running on Amazon RDS, with default 10.1 parameters > > version > - > PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 > (Red Hat 4.8.2-16), 64-bit > > Execution plan https://explain.depesz.com/s/gf7 Your time is spent here: > -> Parallel Seq Scan on videos (cost=0.00..480898.90 rows=940390 width=16) > (actual time=0.687..5.774 rows=764042 loops=3) > Buffers: shared hit=7138 read=464357 55 seconds to scan 3.5 GB is not so bad. What I wonder is how it is that you have less than two rows per table block. Could it be that the table is very bloated? If you can, you could "VACUUM (FULL) videos" and see if that makes a difference. If you can bring the table size down, it will speed up query performance. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Bug in PostgreSQL
Rambabu V wrote: > While taking pgdump we are getting error message cache lookup failed for > function 7418447. > While trying select * from pg_proc where oid=7418447 returns zero rows. > Please help us on this. That means that some catalog data are corrupted. If possible, restore from a backup. Did you experience any crashes recently? Is your storage reliable? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Inconsistent query times and spiky CPU with GIN tsvector search
Scott Rankin wrote: > We are running postgresql 9.4 and we have a table where we do some > full-text searching using a GIN index on a tsvector column: > > CREATE INDEX location_search_tsvector_idx > ON public.location_search USING gin > (search_field_tsvector) > TABLESPACE pg_default; > > This setup has been running very well, but as our load is getting heavier, > the performance seems to be getting much more inconsistent. > Our searches are run on a dedicated read replica, so this server is only > doing queries against this one table. IO is very low, indicating to me > that the data is all in memory. However, we're getting some queries taking > upwards of 15-20 seconds, while the average is closer to 1 second. > > A sample query that's running slowly is > > explain (analyze, buffers) > SELECT ls.location AS locationId FROM location_search ls > WHERE ls.client = 1363 > AND ls.favorite = TRUE > AND search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*') > LIMIT 4; > > And the explain analyze is: > > Limit (cost=39865.85..39877.29 rows=1 width=8) (actual > time=4471.120..4471.120 rows=0 loops=1) > Buffers: shared hit=25613 > -> Bitmap Heap Scan on location_search ls (cost=39865.85..39877.29 rows=1 > width=8) (actual time=4471.117..4471.117 rows=0 loops=1) > Recheck Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E > &San:*'::text)) > Filter: (favorite AND (client = 1363)) > Rows Removed by Filter: 74 > Heap Blocks: exact=84 > Buffers: shared hit=25613 > -> Bitmap Index Scan on location_search_tsvector_idx > (cost=0.00..39865.85 rows=6 width=0) (actual time=4470.895..4470.895 rows=84 > loops=1) > Index Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E > &San:*'::text)) > Buffers: shared hit=25529 > Planning time: 0.335 ms > Execution time: 4487.224 ms Not sure, but maybe you are suffering from bad performance because of a long "GIN pending list". If yes, then the following can help: ALTER INDEX location_search_tsvector_idx SET (gin_pending_list_limit = 512); Or you can disable the feature altogether: ALTER INDEX location_search_tsvector_idx SET (fastupdate = off); Then clean the pending list with SELECT gin_clean_pending_list('location_search_tsvector_idx'::regclass); Disabling the pending list will slow down data modification, but should keep the SELECT performance stable. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Select count(*) on a 2B Rows Tables Takes ~20 Hours
Fred Habash wrote: > If I'm reading this correctly, it took 57M ms out of an elapsed time of 61M > ms to read 45M pages from the filesystem? > If the average service time per sarr is < 5 ms, Is this a case of bloated > index where re-indexing is warranted? > > explain (analyze,buffers,timing,verbose,costs) > select count(*) from jim.pitations ; > >QUERY PLAN > > - > Aggregate (cost=72893810.73..72893810.74 rows=1 width=8) (actual > time=61141110.437..61141110.437 rows=1 loops=1) >Output: count(*) >Buffers: shared hit=72620045 read=45297330 >I/O Timings: read=57489958.088 >-> Index Only Scan using pit_indx_fk03 on jim.pitations > (cost=0.58..67227187.37 rows=2266649344 width=0) (actual > time=42.327..60950272.189 rows=2269623575 loops=1) > Output: vsr_number > Heap Fetches: 499950392 > Buffers: shared hit=72620045 read=45297330 > I/O Timings: read=57489958.088 > Planning time: 14.014 ms > Execution time: 61,141,110.516 ms > (11 rows) 2269623575 / (45297330 + 72620045) ~ 20, so you have an average 20 items per block. That is few, and the index seems indeed bloated. Looking at the read times, you average out at about 1 ms per block read from I/O, but with that many blocks that's of course still a long time. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Why could different data in a table be processed with different performance?
Vladimir Ryabtsev wrote: > explain (analyze, buffers) > select count(*), sum(length(content::text)) from articles where article_id > between %s and %s > > Sample output: > > Aggregate (cost=8635.91..8635.92 rows=1 width=16) (actual > time=6625.993..6625.995 rows=1 loops=1) > Buffers: shared hit=26847 read=3914 > -> Index Scan using articles_pkey on articles (cost=0.57..8573.35 > rows=5005 width=107) (actual time=21.649..1128.004 rows=5000 loops=1) > Index Cond: ((article_id >= 43800) AND (article_id <= 438005000)) > Buffers: shared hit=4342 read=671 > Planning time: 0.393 ms > Execution time: 6626.136 ms > > Aggregate (cost=5533.02..5533.03 rows=1 width=16) (actual > time=33219.100..33219.102 rows=1 loops=1) > Buffers: shared hit=6568 read=7104 > -> Index Scan using articles_pkey on articles (cost=0.57..5492.96 > rows=3205 width=107) (actual time=22.167..12082.624 rows=2416 loops=1) > Index Cond: ((article_id >= '10002100'::bigint) AND (article_id > <= '10002101'::bigint)) > Buffers: shared hit=50 read=2378 > Planning time: 0.517 ms > Execution time: 33219.218 ms > > During iteration, I parse the result of EXPLAIN and collect series of > following metrics: > > - buffer hits/reads for the table, > - buffer hits/reads for the index, > - number of rows (from "Index Scan..."), > - duration of execution. > > Based on metrics above I calculate inherited metrics: > > - disk read rate: (index reads + table reads) * 8192 / duration, > - reads ratio: (index reads + table reads) / (index reads + table reads + > index hits + table hits), > - data rate: (index reads + table reads + index hits + table hits) * 8192 / > duration, > - rows rate: number of rows / duration. > > Since "density" of IDs is different in "small" and "big" ranges, I adjusted > size of chunks in order to get around 5000 rows on each iteration in both > cases, > though my experiments show that chunk size does not really matter a lot. > > The issue posted at the very beginning of my message was confirmed for the > *whole* first and second ranges (so it was not just caused by randomly cached > data). > > To eliminate cache influence, I restarted Postgres server with flushing > buffers: > > /$ postgresql stop; sync; echo 3 > /proc/sys/vm/drop_caches; postgresql start > > After this I repeated the test and got next-to-same picture. > > "Small' range: disk read rate is around 10-11 MB/s uniformly across the test. > Output rate was 1300-1700 rows/s. Read ratio is around 13% (why? Shouldn't it > be > ~ 100% after drop_caches?). > "Big" range: In most of time disk read speed was about 2 MB/s but sometimes > it jumped to 26-30 MB/s. Output rate was 70-80 rows/s (but varied a lot and > reached 8000 rows/s). Read ratio also varied a lot. > > I rendered series from the last test into charts: > "Small" range: https://i.stack.imgur.com/3Zfml.png > "Big" range (insane): https://i.stack.imgur.com/VXdID.png > > During the tests I verified disk read speed with iotop and found its > indications > very close to ones calculated by me based on EXPLAIN BUFFERS. I cannot say I > was > monitoring it all the time, but I confirmed it when it was 2 MB/s and 22 MB/s > on > the second range and 10 MB/s on the first range. I also checked with htop that > CPU was not a bottleneck and was around 3% during the tests. > > The issue is reproducible on both master and slave servers. My tests were > conducted > on slave, while there were no any other load on DBMS, or disk activity on the > host unrelated to DBMS. 
> > My only assumption is that different fragments of data are being read with > different > speed due to virtualization or something, but... why is it so strictly bound > to these ranges? Why is it the same on two different machines? What is the storage system? Setting "track_io_timing = on" should measure the time spent doing I/O more accurately. One problem with measuring read speed that way is that "buffers read" can mean "buffers read from storage" or "buffers read from the file system cache", but you say you observe a difference even after dropping the cache. To verify if the difference comes from the physical placement, you could run VACUUM (FULL) which rewrites the table and see if that changes the behavior. Another idea is that the operating system rearranges I/O in a way that is not ideal for your storage. Try a different I/O scheduler by running echo deadline > /sys/block/sda/queue/scheduler (replace "sda" with the disk where your database resides) See if that changes the observed I/O speed. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: autovacuum is running but pg_stat_all_tables empty
Mariel Cherkassky wrote: > I'm trying to understand something that is weird on one of my environments. > When I query pg_stat_all_tables I see that most of the tables dont have any > value in the last_autovacuum/analyze column. In addition the columns > autovacuum_count/analyze_count is set to 0. However, when checking the logs, > I see that on some of those tables autovacuum run. I think that there is > something wrong with the database statistics collector. In addition, the > column n_dead_tup and n_live_tup are set and in some of the cases n_dead_tup > is more then 20% of the table tuples. In addition, all tables have default > vacuum threshold. > > Any idea what else I can check ? > The problem isnt only that dead tuples arent deleted (I dont have long running > transaction that might cause it) but the fact that the statistics arent > accurate/wrong. You can use the "pgstattuple" extension to check that table for the actual dead tuple percentage to see if the statistics are accurate or not. To see the statistic collector's UDP socket, run netstat -a|grep udp|grep ESTABLISHED Check if it is there. If it is on IPv6, make sure that IPv6 is up, otherwise that would explain why you have no accurate statistics. Are there any log messages about statistics collection? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
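The pgstattuple check could look like this (substitute one of the suspect tables for the made-up name):

CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('some_table');
-- compare dead_tuple_count / dead_tuple_percent with n_dead_tup in pg_stat_all_tables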
Re: No matching tables have ever been vacuumed
Daulat Ram wrote: > We have enabled the monitoring to monitor the vacuuming of tables via > check_postgres_last_vacuum plugin but we are getting the below warning > message. > > Notification Type: PROBLEM > Service: PostgreSQL last vacuum () > Host Alias: vmshowcasedb2.vpc.prod.scl1.us.tribalfusion.net > Address: 10.26.12.89 > State: UNKNOWN > Info: POSTGRES_LAST_VACUUM UNKNOWN: DB postgres > (host:vmshowcasedb2.vpc.prod.scl1.us.tribalfusion.net) No matching tables > have ever been vacuumed > > Kindly suggest how we can overcome on this. Disable the test, it is mostly pointless. Only tables that regularly receive updates and deletes need to be vacuumed. A table that is never modified needs to be vacuumed at most once during its lifetime for transaction wraparound, but there are other checks for problems with that. Alternatively, you can just manually vacuum all tables once - if all it checks is if it *ever* has been vacuumed. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: SELECT performance drop
On Wed, 2019-01-23 at 10:28 -0700, Jan Nielsen wrote: > select > order0_.id as id1_7_, > order0_.created_by as created_2_7_, > order0_.created_date as created_3_7_, > order0_.last_modified_by as last_mod4_7_, > order0_.last_modified_date as last_mod5_7_, > order0_.consumer as consumer6_7_, > order0_.market_id as market_14_7_, > order0_.original as original7_7_, > order0_.owner_id as owner_i15_7_, > order0_.owner_target as owner_ta8_7_, > order0_.price as price9_7_, > order0_.session_id as session16_7_, > order0_.side as side10_7_, > order0_.supplier as supplie11_7_, > order0_.type as type12_7_, > order0_.units as units13_7_ > from > fm_order order0_ cross > join > fm_session session1_ > where > order0_.session_id=session1_.id > and order0_.type='LIMIT' > and session1_.original=7569 > and ( > order0_.consumer is null > or order0_.consumer>0 > and ( > exists ( > select > 1 > from > fm_order order2_ cross > join > fm_session session3_ > where > order2_.session_id=session3_.id > and order2_.id=order0_.consumer > and session3_.original=7569 > and order2_.type='LIMIT' > and order2_.owner_id<>order0_.owner_id > ) > ) > ) > order by > order0_.last_modified_date DESC; It might be more efficient to rewrite that along these lines: SELECT DISTINCT order0_.* FROM fm_order order0_ JOIN fm_session session1_ ON order0_.session_id = session1_.id LEFT JOIN fm_order order2_ ON order2_.id = order0_.consumer LEFT JOIN fm_session session3_ ON order2_.session_id = session3_.id WHERE coalesce(order2_.id, 1) > 0 AND /* all the other conditions */; Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: upgrade from 9.6 to 10/11
Mariel Cherkassky wrote: > I'm planning our db upgrade from 9.6. Basically I wanted to check how stable > is pg11 version. I'm considering upgrading from 9.6 to 10 and then to 11 > immediatly. > Is there a way to upgrade directly to 11 and jump on 10. v11 is stable, else the PGDG would not release it. There is no need to upgrade via v10, I recommend that you upgrade from 9.6 to v11 directly, either via dump/restore or with pg_upgrade. https://www.postgresql.org/docs/current/upgrading.html Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Setting effective_cache size
Nandakumar M wrote: > According to https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server : > > > effective_cache_size should be set to an estimate of how much memory is > > available for disk caching by the operating system and within the database > > itself, after taking into account what's used by the OS itself and other > > applications. > > I intend to run a java application and postgres server in the same > server machine. The java application requires 2 GB RAM max. > > Considering that our server machine has 4 GB RAM, should I reduce the > effective_cache_size to say 768 MB or am I better off with the default > 4 GB value? > > This is particularly confusing because in this thread Tom Lane says > the following > > > I see no problem with a value of say 4GB; > > that's very unlikely to be worse than the pre-9.4 default (128MB) on any > > modern machine. > > PS : I got the value 768 MB from https://pgtune.leopard.in.ua/#/ by > giving 1 GB as the amount of memory postgres can use. I would set effective_cache_size to 2GB or a little lower. This is a number that tells the optimizer how likely it is to find index data cached if the index is used repeatedly, so it is not important to get the value exactly right. Yours, Laurenz Albe
Re: How can sort performance be so different
Merlin Moncure wrote: > yeah, probably. Having said that, I'm really struggling that it can > take take several minutes to sort such a small number of rows even > with location issues. I can sort rocks faster than that :-). > > Switching between various european collations, I'm seeing subsecond > sort responses for 44k records on my test box. I don't have the laos > collation installed unfortunately. Are you seeing kind of penalty in > other conversions? I find that it makes a lot of difference what you sort: CREATE TABLE sort(t text); INSERT INTO sort SELECT 'ຕົວອັກສອນລາວ... ງ່າຍຂື້ນ' || i FROM generate_series(1, 10) AS i; SET work_mem = '1GB'; EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "C"; [...] Execution Time: 288.752 ms EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "lo_LA.utf8"; [...] Execution Time: 47006.683 ms EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "en_US.utf8"; [...] Execution Time: 73962.934 ms CREATE TABLE sort2(t text); INSERT INTO sort2 SELECT 'this is plain old English' || i FROM generate_series(1, 10) AS i; SET work_mem = '1GB'; EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "C"; [...] Execution Time: 237.615 ms EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "lo_LA.utf8"; [...] Execution Time: 2467.848 ms EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "en_US.utf8"; [...] Execution Time: 2927.667 ms This is on my x86_64 Fedora 29 system, kernel 4.20.6, glibc 2.28. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: autovacuum big table taking hours and sometimes seconds
On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote: > Hi, > I have a table with a bytea column and its size is huge and thats why > postgres created a toasted table for that column. > The original table contains about 1K-10K rows but the toasted can contain up > to 20M rows. > I assigned the next two settings for the toasted table : > alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0); > alter table orig_table set (toast.autovacuum_vacuum_threshold =1); > > Therefore I hoped that after deletion of 10K rows from the toasted table > autovacuum will launch vacuum on the toasted table. > From the logs I see that sometimes the autovacuum is running once in a few > hours (3-4 hours) and sometimes it runs even every few minutes. > Now I wanted to check if only depends on the thresholds and on the frequency > of the deletes/updates on the table ? > In some cases the autovacuum is taking a few hours (4+) it finishes and then > immediatly is starting to run vacuum again on the table : > > Now the question is how to handle or tune it ? Is there any change that I > need to increase the cost_limit / cost_delay ? Maybe configuring autovacuum to run faster will help: alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000); Or, more extreme: alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0); Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: understanding max_wal_size,wal_keep_segments and checkpoints
Mariel Cherkassky wrote: > I'm trying to understand the logic behind all of these so I would be happy > if you can confirm what I understood or correct me if I'm wrong : > -The commit command writes all the data in the wal_buffers is written into > the wal files. All the transaction log for the transaction has to be written to file, and the files have to be sync'ed to storage before COMMIT completes. That way the transaction can be replayed in case of a crash. > -Checkpoints writes the data itself (blocks that were changed) into the data > files > in the base dir. Just to make sure, as part of the checkpoint, it needs to > read the > wal files that were generated since the last checkpoint right ? No WAL file has to be read during a checkpoint. When data in the database are modified, they are modified in the "shared buffers" RAM cache. Later, these "dirty blocks" are written to disk by the background writer process or the checkpoint. > -max_wal_size is a soft limit for the total size of all the wals that were > generated. > When the total_size of the pg_xlog dir reaches max_wal_size(can increase it > because > of peaks and some other issues..) the db will force a checkpoint to write > the changes > from the wals into the disk and then it will start recycling old wals (all > of them ? > or only those who were written ?). A checkpoint is forced when more than max_wal_size WAL has been written since the last checkpoint. After a checkpoint, unneeded WAL segments are either recycled (renamed and reused) or deleted (if max_wal_size has been exceeded). WAL segments are unneeded if they are older than the checkpoint, have been archived (if archiving is configured), don't need to be kept around because of wal_keep_segments and are older than the position of any active replication slot. > -wal_keep_segments is meant to help standbys that didn't receive the wals, so > it allow > us to keep wal_keep_segments wals in our pg_xlog dir. Yes. > - in case we have a collision between wal_keep_segments and max_wal_size the > wal_keep_segments will be the one that be used right ?. For example, lets > say my > wal_size is default(16MB). I set max_wal_size to 1GB which is 1600/16=100 > wals. > However, my wal_keep_segments is set to 300. It means that when the > total_size of > the pg_xlog directory will reach 1GB, checkpoint will be forced but old wal > files > wont be recycled/deleted ? Checkpoints are not forced by the size of pg_xlog, but by the amount of WAL created since the last checkpoint. The last wal_keep_segments WAL segments are always kept around, even if that exceeds max_wal_size. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: understanding max_wal_size,wal_keep_segments and checkpoints
Mariel Cherkassky wrote: > Yeah, so basically if we open a transaction and we do some insert queries, > until the transaction > is commited the changes(the wal data and not the blocked that are chaned) > are kept in the wal buffers ? > . When the user commits the transaction, the wal buffer(only the > transaction log of that specific > transaction ?) is written to wal files. When the database completes saving > the content of the > transaction log into the wal files, the commit completes. Did I got it right > ? WAL can be written to file before the transaction commits. Otherwise the size of a transaction would be limited. Only at commit time, it has to be written out and flushed to disk. > What I meant, when checkpoint occurs, it reads the wal files created since > last checkpoint, > and does those changing on the data blocks on the disk ? I was not talking > about dirty blocks > from shared_buffer. No, PostgreSQL does not read the WAL files when it performs a checkpoint. When data are modified, first WAL is written, then it is written to shared buffers. The checkpoint flushes dirty pages in shared buffers to disk. > > so I'f I want have replication slot and wal_keep_segment is 0 after the > > archiving of > > the wal it should be recycled/deleted ? Only if it is older than the position of the replication slot. > > So basically having wal_keep_segments and replication slot configured > > together is a mistake right ? > > In that case, if you have both configured, and you set wal_keep_segments to > > 0, the db should > > delete all the unused wals ? It is pointless to have both a replication slot and wal_keep_segments, yes. Setting wal_keep_segments to 0 is the right move in that case and should reduce pg_xlog size in time. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: partition pruning
suganthi Sekar wrote: > i am using Postgresql 11, i have 2 partition table , when i joined both > table in query > a table its goes exact partition table , but other table scan all partition > > please clarify on this . > > i have enabled below parameter on in configuration file > Note : alter system set enable_partitionwise_join to 'on'; > > > Example : > > explain analyze > select * from call_report1 as a inner join call_report2 as b on > a.call_id=b.call_id > where a.call_created_date ='2017-11-01' and '2017-11-30' > > > > "Hash Right Join (cost=8.19..50.47 rows=2 width=3635) (actual > time=0.426..0.447 rows=7 loops=1)" > " Hash Cond: (b.call_id = a.call_id)" > " -> Append (cost=0.00..41.81 rows=121 width=2319) (actual > time=0.040..0.170 rows=104 loops=1)" > "-> Seq Scan on call_report2 b (cost=0.00..0.00 rows=1 width=528) > (actual time=0.010..0.010 rows=0 loops=1)" > "-> Seq Scan on call_report2_201803 b_1 (cost=0.00..10.30 rows=30 > width=2334) (actual time=0.029..0.031 rows=14 loops=1)" > "-> Seq Scan on call_report2_201711 b_2 (cost=0.00..10.30 rows=30 > width=2334) (actual time=0.014..0.015 rows=7 loops=1)" > "-> Seq Scan on call_report2_201712 b_3 (cost=0.00..10.30 rows=30 > width=2334) (actual time=0.017..0.047 rows=34 loops=1)" > "-> Seq Scan on call_report2_201801 b_4 (cost=0.00..10.30 rows=30 > width=2334) (actual time=0.017..0.058 rows=49 loops=1)" > " -> Hash (cost=8.17..8.17 rows=2 width=1314) (actual time=0.104..0.104 > rows=7 loops=1)" > "Buckets: 1024 Batches: 1 Memory Usage: 12kB" > "-> Append (cost=0.00..8.17 rows=2 width=1314) (actual > time=0.053..0.060 rows=7 loops=1)" > " -> Seq Scan on call_report1 a (cost=0.00..0.00 rows=1 > width=437) (actual time=0.022..0.022 rows=0 loops=1)" > "Filter: ((call_created_date >= '2017-11-01'::date) AND > (call_created_date <= '2017-11-30'::date))" > " -> Index Scan using idx_call_report1_201711_ccd on > call_report1_201711 a_1 (cost=0.14..8.16 rows=1 width=2190) (actual > time=0.029..0.034 rows=7 loops=1)" > "Index Cond: ((call_created_date >= '2017-11-01'::date) > AND (call_created_date <= '2017-11-30'::date))" > "Planning Time: 20.866 ms" > "Execution Time: 1.205 ms" There is no condition on the table "call_report2" in your query, so it is not surprising that all partitions are scanned, right? You have to add a WHERE condition that filters on the partitioning column(s) of "call_report2". Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
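For example, assuming call_report2 is partitioned by call_created_date as well, adding a matching filter on b should let both sides be pruned:

EXPLAIN ANALYZE
SELECT *
FROM call_report1 AS a
   JOIN call_report2 AS b ON a.call_id = b.call_id
WHERE a.call_created_date BETWEEN '2017-11-01' AND '2017-11-30'
  AND b.call_created_date BETWEEN '2017-11-01' AND '2017-11-30';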
Re: autovacuum big table taking hours and sometimes seconds
Mariel Cherkassky wrote: > Lets focus for example on one of the outputs : > postgresql-Fri.log:2019-02-08 05:05:53 EST 24776 LOG: automatic vacuum of > table "myDB.pg_toast.pg_toast_1958391": index scans: 8 > postgresql-Fri.log- pages: 2253 removed, 13737828 remain > postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain > postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses, 19274530 > dirtied > postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469 MiB/s > > The cost_limit is set to 200 (default) and the cost_delay is set to 20ms. > The calculation I did : (1*15031267+10*21081633+20*19274530)/200*20/1000 = > 61133.8197 seconds ~ 17H > So autovacuum was laying down for 17h ? I think that I should increase the > cost_limit to max specifically on the toasted table. What do you think ? Am I > wrong here ? Increasing cost_limit or reducing cost_delay improves the situation. cost_delay = 0 makes autovacuum as fast as possible. Yours, Laurenz Albe
Re: Query slow for new participants
supp...@mekong.be wrote: > EXPLAIN (ANALYZE, BUFFERS), not just EXPLAIN > "Index Scan using ix_companyarticledb_company on companyarticledb > (cost=0.57..2.80 rows=1 width=193) (actual time=1011.335..1011.454 rows=1 > loops=1)" > " Index Cond: (companyid = 77)" > " Filter: (articleid = 7869071)" > " Rows Removed by Filter: 2674361" > " Buffers: shared hit=30287" > "Planning time: 0.220 ms" > "Execution time: 1011.502 ms" Your problem is the "Rows Removed by Filter: 2674361". The first thing I would try is: ALTER TABLE public.companyarticledb ALTER companyid SET STATISTICS 1000; ALTER TABLE public.companyarticledb ALTER articleid SET STATISTICS 1000; ANALYZE public.companyarticledb; Then PostgreSQL has a better idea which condition is selective. You can set STATISTICS up to 10000, but don't forget that high values make ANALYZE and planning slower. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: What is pg_stat_user_tables Showing NULL for last_autoanalyze & last_autovacuum
On Wed, 2019-02-27 at 09:47 -0500, Fd Habash wrote: > I have been able to locate four google search results with the same inquiry. > What’ve been able to understand is … > > If auto-vaccum is working as expected, stats collector does not nullify these > values as part of a > startup sequence or regular Maitenance. If a relation gets > auto[vacuumed|analyzed], the timestamps should remain. > A database engine crash or restart with ‘immediate’ option will cause the > timestamps to nullify. > Table never qualified for vacuuming based on auto-vacuum settings. > > I can rule out all three scenarios above, but I still see null values. What > else could be at play here? The obvious suspicion is that autovacuum starts, but cannot finish because it either cannot keep up with the change rate or gives up because it is blocking a concurrent session. What is "n_live_tup" and "n_dead_tup" in "pg_stat_user_tables" for these tables? Are there any autovacuum workers running currently? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
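Queries along these lines would answer that (pg_stat_activity.backend_type exists from v10 on):

SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';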
Re: How to get the content of Bind variables
ROS Didier wrote: >In the log file of my PostgreSQL cluster, I find : > >> > Statement: update t_shared_liste_valeurs set deletion_date=$1, > deletion_login=$2, modification_date=$3, modification_login=$4, > administrable=$5, libelle=$6, niveau=$7 where code=$8 > << > > how to get the content of the bind variables ? Can we see the whole log entry and the following one? Perhaps there was a syntax error or similar, and the statement was never executed. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: How to get the content of Bind variables
ROS Didier wrote: > Here is a biggest part of my log file : > > 2019-02-27 14:41:28 CET [16239]: [5696-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: > duration: 1.604 ms > 2019-02-27 14:41:28 CET [16239]: [5697-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: > duration: 0.084 ms parse : update t_shared_liste_valeurs set > deletion_date=$1, deletion_login=$2, modification_date=$3, > modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8 > 2019-02-27 14:41:28 CET [16239]: [5698-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: > plan: > 2019-02-27 14:41:28 CET [16239]: [5699-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainSTATEMENT: > update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, > modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, > niveau=$7 where code=$8 > 2019-02-27 14:41:28 CET [16239]: [5700-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: > duration: 0.288 ms bind : update t_shared_liste_valeurs set > deletion_date=$1, deletion_login=$2, modification_date=$3, > modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8 > 2019-02-27 14:41:28 CET [16239]: [5701-1] [10086] > user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: > execute : update t_shared_liste_valeurs set deletion_date=$1, > deletion_login=$2, modification_date=$3, modification_login=$4, > administrable=$5, libelle=$6, niveau=$7 where code=$8 > << > The statement has been executed > It is the same problem for all the statements. > I can not get the content of the bind variables. You should set "log_error_verbosity" back from "terse" to "default". Then you will see the DETAIL messages. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
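P.S. For example:

ALTER SYSTEM SET log_error_verbosity = 'default';
SELECT pg_reload_conf();

After that, the parameter values should show up in a DETAIL line attached to the "execute" log entry.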
Re: pgstattuple_approx for toasted table
Mariel Cherkassky wrote: > I was testing pgstattuple and I realized that pgstattuple is working on > toasted table but pgstattuple_approx is raising the next error msg : > > ERROR: "pg_toast_18292" is not a table or materialized view > > ahm, is that because the pgstattuple_approx uses visibility map ? Can someone > explain ? tnx. You are right; here is the code: /* * We support only ordinary relations and materialised views, because we * depend on the visibility map and free space map for our estimates about * unscanned pages. */ if (!(rel->rd_rel->relkind == RELKIND_RELATION || rel->rd_rel->relkind == RELKIND_MATVIEW)) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("\"%s\" is not a table or materialized view", RelationGetRelationName(rel)))); Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
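P.S. As a workaround you can run pgstattuple() directly on the TOAST table, or pgstattuple_approx() on its parent table (the parent table name here is just a placeholder):

SELECT * FROM pgstattuple('pg_toast.pg_toast_18292');
SELECT * FROM pgstattuple_approx('my_schema.my_table');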
Re: IS NOT DISTINCT FROM statement
Artur Zając wrote: > Is there any reason that „NOT DISTINCT FROM” can’t be autotransformed to „=” > when value > on right side of expression is not NULL or is this any way to use index with > „IS NOT DISTINCT FROM” statement? That would subtly change the semantics of the expression: test=> SELECT NULL IS NOT DISTINCT FROM 21580; ?column? -- f (1 row) test=> SELECT NULL = 21580; ?column? -- (1 row) One expression is FALSE, the other NULL. It doesn't matter in the context of your specific query, but it could matter. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Shared_buffers
Daulat Ram wrote: > I want to know about the working and importance of shared_buffers in > Postgresql? > is it similar to the oracle database buffer cache? Yes, exactly. The main difference is that PostgreSQL uses buffered I/O, while Oracle usually uses direct I/O. Usually you start with shared_buffers being the minimum of a quarter of the available RAM and 8 GB. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
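P.S. A sketch, assuming a machine with 64 GB of RAM:

ALTER SYSTEM SET shared_buffers = '8GB';

Note that changing shared_buffers requires a server restart to take effect.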
Re: trying to analyze deadlock
Mariel Cherkassky wrote: > Hi all, > I'm trying to analyze a deadlock that I have in one of our environments. > The deadlock message : > > 06:15:49 EET db 14563 DETAIL: Process 14563 waits for ShareLock on > transaction 1017405468; blocked by process 36589. > Process 36589 waits for ShareLock on transaction 1017403840; blocked by > process 14563. > Process 14563: delete from tableB where a in (select id from tableA > where c in () > Process 36589: delete from tableA where c in () > 06:15:49 EET db 14563 HINT: See server log for query details. > 06:15:49 EET db 14563 STATEMENT: delete from tableB where a in (select id > from tableA where c in () > 06:15:49 EET db 36589 LOG: process 36589 acquired ShareLock on transaction > 1017403840 after 1110158.778 ms > 06:15:49 EET db 36589 STATEMENT: delete from tableA where c in () > 06:15:49 EET db 36589 LOG: duration: 1110299.539 ms execute : > delete from tableA where c in (...) > > tableA : (id int, c int references c(id)) > tableB : (id int, a int references a(id) on delete cascade) > tableC(id int...) > > One A can have Many B`s connected to (One A to Many B). > > deadlock_timeout is set to 5s. > > Now I'm trying to understand what might cause this deadlock. I think that its > related to the foreign keys... You can get that if the foreign key is defined as ON DELETE CASCADE or ON DELETE SET NULL: CREATE TABLE a (a_id integer PRIMARY KEY); INSERT INTO a VALUES (1), (2); CREATE TABLE b (b_id integer PRIMARY KEY, a_id integer NOT NULL REFERENCES a ON DELETE CASCADE); INSERT INTO b VALUES (100, 1), (101, 1), (102, 2), (103, 2); Transaction 1: BEGIN; DELETE FROM b WHERE b_id = 100; Transaction 2: BEGIN; DELETE FROM a WHERE a_id = 2; DELETE FROM a WHERE a_id = 1; -- hangs Transaction 1: DELETE FROM b WHERE b_id = 102; ERROR: deadlock detected DETAIL: Process 10517 waits for ShareLock on transaction 77325; blocked by process 10541. Process 10541 waits for ShareLock on transaction 77323; blocked by process 10517. HINT: See server log for query details. CONTEXT: while deleting tuple (0,3) in relation "b" Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: parallel query
Mariel Cherkassky wrote: > I wanted to a few questions regarding the parallel parameters : > max_worker_processes and max_parallel_workers_per_gather. > > 1)Basically, max_worker_processes should be set to the number of cpus I have > in the machine ? Setting it higher would not be smart. Setting it lower can also be a good idea; it depends on your workload. > 2)If I set max_worker_processes to X and max_parallel_workers_per_gather to Y > (X>Y) > it means that I will have at max (X/2) queries that can run in parallel. Am I > right ? > For example, max_worker_processes =8,max_parallel_workers_per_gather =4, it > means > that at max I can have 4 queries that are running in parallel ? and at min 2 > queries > (or none) can run in parallel ? That is correct, but unless you set "max_parallel_workers_per_gather" to 1, one query can use more than one parallel worker, and then you can have fewer concurrent queries. How many workers PostgreSQL will actually use also depends on the size of the table or index. > 3)So If I calculate my work_mem based on the number of sessions I have : > (TOTAL_MEM/2/NUM_OF_CONNECTIONS) > I should add 8 to the NUM_OF_CONNECTIONS to have a new value for the work_mem > in order to consider queries that run in parallel.. Yes, but don't forget that one query can use "work_mem" several times if the execution plan has several memory-intensive nodes. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
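P.S. A configuration sketch for an 8-core machine (adjust to your workload; max_worker_processes needs a restart, and max_parallel_workers exists from v10 on):

ALTER SYSTEM SET max_worker_processes = 8;
ALTER SYSTEM SET max_parallel_workers = 8;
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;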
Re: RAM usage of PostgreSql
Prasad wrote: > I have installed PostgreSQL 9.4 (open source) version on my CentOS > Linux Red Hat 7 production server and kept default parameters which > are in postgresql.conf file.So my basic question is, once I start > using postgres how much RAM the postgres processes consumes > (postgres related processes only). > > There are lot of allocations in postgresql.conf file, for example > shared_buffers, work_mem...etc. > > As per my knowledge, all postgres processes should not consume the > RAM more than the value assigned in shared_buffers.Please clarify > and let me know if I misunderstand the concept.. shared_buffers only determines the shared memory cache; each database process still needs private memory in addition. As a rule of thumb, start with shared_buffers set to 1/4 of your available RAM, but no more than 8 GB. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Commit(?) overhead
Duncan Kinnear wrote: > Bingo! Adding 'SET LOCAL synchronous_commit TO OFF;' to my 'BEGIN; UPDATE > ; COMMIT;' > block has given me sub-1ms timings! Thanks Andres. That's a pretty clear indication that your I/O subsystem was overloaded. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
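P.S. Keep the trade-off in mind: with synchronous_commit = off a server crash can lose the most recently committed transactions (a window of at most about three times wal_writer_delay), but it cannot corrupt the database. If you want to measure the commit (flush) latency of your storage directly, the pg_test_fsync utility that ships with PostgreSQL is useful, for example (the path is a placeholder on the same disk as pg_wal):

pg_test_fsync -f /path/on/the/wal/disk/testfile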
Re: Pg10 : Client Configuration for Parallelism ?
laurent.decha...@orange.com wrote: > There is something in documentation that says that there won't be parallelism > if " The client sends an Execute message with a non-zero fetch count." > I am not sure what this sentence means. The JDBC driver sends an "Execute" message to the server. https://www.postgresql.org/docs/current/protocol-message-formats.html says: Execute (F) Byte1('E') Identifies the message as an Execute command. Int32 Length of message contents in bytes, including self. String The name of the portal to execute (an empty string selects the unnamed portal). Int32 Maximum number of rows to return, if portal contains a query that returns rows (ignored otherwise). Zero denotes “no limit”. If you use setMaxRows non-zero, that number is sent as the "maximum number of rows". Parallelism currently cannot be used if there is a limit on the row count. Imagine you want ten rows and already have nine, now if two workers are busy calculating the next row, there is no good way to stop one of them when the other returns a row. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: integrate Postgres Users Authentication with our own LDAP Server
On Thu, 2019-05-09 at 04:51 +, M Tarkeshwar Rao wrote: > We would need to integrate Postgres Users Authentication with our own LDAP > Server. > > Basically as of now we are able to login to Postgress DB with a user/password > credential. > > [roles "pg_signal_backend" and "postgres"] > > These user objects are the part of Postgres DB server. Now we want that these > users should be authenticated by LDAP server. > We would want the authentication to be done with LDAP, so basically the user > credentials should be store in LDAP server > > Can you mention the prescribed steps in Postgres needed for this integration > with LDAP Server? LDAP authentication is well documented: https://www.postgresql.org/docs/current/auth-ldap.html But I don't think you are on the right track. "pg_signal_backend" cannot login, it is a role to which you add a login user to give it certain privileges. So you don't need to authenticate the role. "postgres" is the installation superuser. If security is important for you, you won't set a password for that user and you won't allow remote logins with that user. But for your application users LDAP authentication is a fine thing, and not hard to set up if you know a little bit about LDAP. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
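P.S. A sketch of an LDAP (simple bind) entry in pg_hba.conf; the server name, prefix and suffix are placeholders you'd adapt to your directory:

host  all  all  0.0.0.0/0  ldap ldapserver=ldap.example.com ldapprefix="uid=" ldapsuffix=",ou=people,dc=example,dc=com"

Note that the roles still have to exist in PostgreSQL (CREATE ROLE ... LOGIN); LDAP only takes care of checking the password.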
Re: Log size in bytes of query result
Franklin Haut wrote: > How can we generate in the log of executed querys (directory pg_log) > the amount of bytes transferred between the server and the client > of the result of a query? As far as I know, there is no parameter to do that. You'd have to write an extension that hooks into PostgreSQL, but I have no idea how hard that would be. Yours, Laurenz Albe
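P.S. If a rough number is enough, you can measure the size of the textual representation of a result at the SQL level, for example ("some_table" is a placeholder):

SELECT sum(octet_length(q::text)) FROM (SELECT * FROM some_table) AS q;

That is not exactly what goes over the wire (protocol overhead and binary formats differ), but it gives you an idea.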
Re: Max_connections limit
Daulat Ram wrote: > We have migrated our database from Oracle 12c to Postgres 11. I need your > suggestions , > we have sessions limit in Oracle = 3024 . Do we need to set the same > connection limit > in Postgres as well. How we can decide the max_connections limit for postgres. > Are there any differences in managing connections in Oracle and postgres. I'd say that is way too high in both Oracle and PostgreSQL. Set the value to 50 or 100 and get a connection pooler if the application cannot do that itself. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
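P.S. A sketch (changing max_connections requires a restart):

ALTER SYSTEM SET max_connections = 100;

pgBouncer in transaction pooling mode is a common choice for the external connection pooler.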
Re: performance bottlenecks on lock transactionid
王若楠 wrote: > We used benchmarksql 4.1.0 to test the performance of PG12 beta TPCC. > We found performance bottlenecks on lock transactionid. You included an attachment with results from the "pg_locks" view where "granted" is FALSE for all entries. I'll assume that these are not *all* the entries in the view, right? Since the locks are waiting for different transaction IDs, I'd assume that this is just a case of contention: many transactions are trying to modify the same rows concurrently. This is to be expected. Perhaps your benchmark is running with too many connections on too few table rows? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
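P.S. A sketch for looking at the blocking situation while the benchmark is running:

SELECT pid, pg_blocking_pids(pid) AS blocked_by,
       wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;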
Re: Re: performance bottlenecks on lock transactionid
王若楠 wrote: > I want to find a way to reduce the lock waiting and improve the > performance. You either have to make the transactions shorter, or have the different clients modify different rows so that they don't lock each other. It is unavoidable that concurrent writers on the same data lock each other; all database management systems I know handle it the same way. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Surprising benchmark count(1) vs. count(*)
On Thu, 2019-09-19 at 12:09 +0200, Thomas Kellerer wrote: > https://blog.jooq.org/2019/09/19/whats-faster-count-or-count1/ > > Is there a reason why count(*) seems to be faster? "count(*)" is just the SQL standard's way of saying what you'd normally call "count()", that is, an aggregate without arguments. "count(1)" has to check if 1 IS NULL for each row, because NULL values are not counted. "count(*)" doesn't have to do that. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
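P.S. A quick way to see it for yourself, as a sketch (the absolute numbers depend on your machine, and the difference is usually small):

CREATE TEMP TABLE t AS SELECT generate_series(1, 5000000) AS x;
EXPLAIN (ANALYZE) SELECT count(*) FROM t;
EXPLAIN (ANALYZE) SELECT count(1) FROM t;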
Re: sequence depends on many tables
On Wed, 2019-09-25 at 15:39 +0300, Mariel Cherkassky wrote: > select objid,refobjid::regclass from pg_depend where objid=16704; > objid |refobjid > ---+- > 16704 | 2200 > 16704 | table_A > 16704 | table_A > 16704 | table_A > 16704 | table_B > (5 rows) > > for unclear reason, both table A and table B depends on the sequence. > When I check table_A I dont see any column that might use it.. Could you select all the columns from pg_depend for these rows, so that it is easier to see what is going on? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
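P.S. A sketch of a query that shows the complete dependency rows; "refobjsubid" tells you which column of the referenced table is involved, and "deptype" shows the kind of dependency:

SELECT classid::regclass, objid, objsubid,
       refclassid::regclass, refobjid::regclass, refobjsubid, deptype
FROM pg_depend
WHERE objid = 16704 OR refobjid = 16704;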