[GENERAL] Waiting on ExclusiveLock on extension
Hi,

After upgrading our database from 9.3.5 to 9.4.1 last night, the server has been suffering from high CPU spikes. During these spikes, there are a lot of these messages in the logs:

  process X still waiting for ExclusiveLock on extension of relation Y of database Z after 1036.234 ms

And:

  process X acquired ExclusiveLock on extension of relation Y of database Z after 2788.050 ms

What looks suspicious is that there are sometimes several "acquired" messages for the exact same relation number in the exact same millisecond.

a) I'm assuming "extension" means growing the physical disk space used by a table -- is that correct?
b) How can you find the name of the relation being extended, based on the relation number?
c) Why would Postgres grow a table twice in the same millisecond? Could it be an index with a high fill factor?

Any suggestions on how to approach this issue are welcome.

Thanks for your time,
Andomar

P.S. The upgrade was done with pg_dump, so the database was converted to SQL and then imported into 9.4.
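For question b), a minimal sketch of mapping the relation number to a name (16402 stands in for the relation number Y from the log line; run this while connected to database Z):

  SELECT 16402::regclass;                   -- quick lookup of the relation name

  SELECT n.nspname, c.relname, c.relkind
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.oid = 16402;                      -- relation number from the log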
Re: [GENERAL] Waiting on ExclusiveLock on extension
Thanks for your reply.

> This issue has been reported several times; here is the most recent one:
> http://www.postgresql.org/message-id/0ddfb621-7282-4a2b-8879-a47f7cecb...@simply.name

That post is about a server with huge shared_buffers, but ours is just 8 GB. Total memory is 48 GB on a dedicated server. Checkpoints write around 2% of the buffers.

> PG 9.4.1 should have alleviated this a lot by relaxing buffer pool
> related locks. PG 9.4.1 is actually better at relation extension
> handling - a possible explanation is that your 9.3.5 database had been
> used for a while, so there were holes in its pages and not many
> extensions were required.

The 9.3.5 version went live as an upgrade from 9.1.x in the same way, so it also started from an SQL dump. The load has not changed much since the 9.3.5 upgrade.

With holes in pages, I suppose you mean the fill factor? Is there a way to see the current fillfactor of a table and its indices?

Kind regards,
Andomar
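A small sketch for checking non-default fillfactor settings (table and index names are placeholders; reloptions is NULL when the defaults apply, which are 100 for tables and 90 for B-tree indexes):

  SELECT c.relname, c.relkind, c.reloptions
  FROM pg_class c
  WHERE c.relname IN ('your_table', 'your_table_pkey');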
Re: [GENERAL] Waiting on ExclusiveLock on extension
> Are you able to take some 'perf top' output during a high CPU spike and
> see what's burning CPU there? The issue is related to blocking, but
> high CPU spikes may hint at some spin-to-acquire behaviour.

Will do, although hopefully the spikes were only growing pains after the upgrade.

> If your previous relation size is smaller than after the upgrade,
> that's a signal that you do have holes in the relation, so extension
> can sometimes be avoided for new tuples.

The relation between high CPU and page splits is not immediately obvious to me. We run with synchronous_commit off, but there does seem to be a peak in I/O requests around the CPU spikes. Is a page split by nature a synchronous I/O activity? And do the other connections wait in some kind of CPU intensive form (like a spinlock)?

Kind regards,
Andomar
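A hedged sketch for seeing which backends are involved while a spike is happening; relation extension locks show up in pg_locks with locktype 'extend':

  SELECT l.pid, l.relation::regclass AS rel, l.granted, a.state, a.query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.pid = l.pid
  WHERE l.locktype = 'extend';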
Re: [GENERAL] Waiting on ExclusiveLock on extension
[truncated sar paging statistics from around 08:39:01 PM]

pgscand/s = Number of pages scanned directly per second.
pgsteal/s = Number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands.

Could the pgscand and pgsteal numbers provide a hint? They're sometimes zero for more than half an hour, so they don't seem related to checkpoints.

Kind regards,
Andomar
Re: [GENERAL] Waiting on ExclusiveLock on extension
To put the top question first: how can table extension locks explain a massive spike in CPU usage? I can imagine 400 connections waiting on disk I/O, but then, wouldn't they all be sleeping?

> Ok, that's a MAJOR hint, because relation 1249 is a system catalog;
> namely pg_attribute. So I think what's happening here is that your
> catalog has become horrifically bloated. I'm 99% certain that VACUUM ALL
> will not vacuum the catalog tables.
>
> Do you by chance have autovacuum turned off?
>
> A manual VACUUM VERBOSE pg_attribute might provide some immediate relief.

Autovacuum is turned on. In addition, we do a manual VACUUM ALL at night. VACUUM VERBOSE pg_attribute ran in 0 seconds and processed a few hundred rows.

> Are you using a connection pool? Establishing 50 new database
> connections per second won't do anything to help performance...

As I understand it, a pool reduces network and CPU load. We have never seen any issues with those, so the extra monitoring and maintenance cost of a pool seems hard to justify.

> I think what that means is that there was suddenly a big spike in memory
> demand at the OS level, so now the OS is frantically dumping cached
> pages. That in itself won't explain this, but it may be a clue.

We monitor memory usage with Cacti. It's a dedicated server and nearly all memory is used as cache. If a script runs and demands memory, that becomes visible as cache is cleared out. There is no change in the amount of memory used as cache around the outage.

> In order to extend a relation we need to ask the filesystem to actually
> extend the file (which presumably means at least writing some metadata
> to disk), and then I think we create a WAL record. Creating the WAL
> record won't by itself write to disk... *unless* the wal_buffers are all
> already full.

I have a question here: we have "synchronous_commit = off". So when Postgres extends a page, would it do that just in memory, or does part of the extend operation require synchronous I/O?

> So if you also see an I/O spike when this happens you could well
> just be starved from the I/O system (though obviously it'd be
> better if we handled that situation more elegantly than this).

The SAR data shows no increase in pgpgin/s and pgpgout/s, which, if I understand it correctly, means that there is no I/O spike. There is, however, an enormous increase in CPU usage.

> I do suspect your pgfree/s is very high though; putting 200k pages/s on
> the free list seems like something's broken.

The system has a constant and considerable load of small writes. The pg_activity tool shows 300 IOPS sustained (it claims a max of over 11000 IOPS). Postgres 9.3 had a comparable pgfree/s.

Would you know a good resource to get more knowledgeable about pgfree, pgpgin and pgsteal?

Kind regards,
Andomar
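A small sketch for checking whether pg_attribute really is bloated, using the standard statistics views:

  SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute')) AS total_size,
         last_vacuum, last_autovacuum
  FROM pg_stat_sys_tables
  WHERE relname = 'pg_attribute';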
Re: [GENERAL] Waiting on ExclusiveLock on extension
> Would you be able to get a stack trace of a backend that's holding an
> extension lock? Or maybe perf would provide some insight.

The outage occurred again, but more briefly. There were no ExclusiveLock messages, presumably because the timeout for logging locks was not exceeded. But all available connection slots were used and many incoming requests were denied.

Below you'll find the "perf report" and the "zoomed in" view (I suppose that's the call stack) of the top entry. The top entry's call stack has these four postmaster functions near the top:

  PinBuffer
  LockRelease
  hash_any
  HeapTupleSatisfiesMVCC

We'll be rolling back from 9.4.1 to 9.3.6 tonight, hoping that will resolve the issue.

= BELOW A 10 SECOND perf top CAPTURE DURING THE OUTAGE

# captured on: Mon Apr 20 20:34:43 2015
# hostname : db1a
# os release : 2.6.32-504.1.3.el6.x86_64
# perf version : 2.6.32-504.1.3.el6.x86_64.debug
# arch : x86_64
# nrcpus online : 24
# nrcpus avail : 24
# cpudesc : Intel(R) Xeon(R) CPU E5-2667 0 @ 2.90GHz
# cpuid : GenuineIntel,6,45,7
# total memory : 49373964 kB
# cmdline : /usr/bin/perf record -a -o 2015-04-20_20:34:28 sleep 10
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, attr_mmap2 = 0, attr_mmap = 1, attr_mmap_data = 0
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: uncore_r3qpi_1 = 23, uncore_r3qpi_0 = 22, uncore_r2pcie = 21, uncore_qpi_1 = 20, uncore_qpi_0 = 19, uncore_imc_3 = 18, uncore_imc_2 = 17, uncore_imc_1 = 16, uncore_imc_0 = 15, uncore_ha = 14, uncore_pcu = 13, uncore_cbox_5 = 12, uncore_c
#
# Samples: 960K of event 'cycles'
# Event count (approx.): 757057831613
#
# Overhead  Command     Shared Object      Symbol
# ........  ..........  .................  ..........................
#
    59.73%  postmaster  [kernel.kallsyms]  [k] compaction_alloc
     1.31%  postmaster  [kernel.kallsyms]  [k] _spin_lock
     0.94%  postmaster  [kernel.kallsyms]  [k] __reset_isolation_suitable
     0.78%  postmaster  [kernel.kallsyms]  [k] compact_zone
     0.67%  postmaster  [kernel.kallsyms]  [k] get_pageblock_flags_group
     0.64%  postmaster  [kernel.kallsyms]  [k] copy_page_c
     0.48%  :13410      [kernel.kallsyms]  [k] compaction_alloc
     0.45%  :13465      [kernel.kallsyms]  [k] compaction_alloc
     0.45%  postmaster  [kernel.kallsyms]  [k] clear_page_c
     0.44%  postmaster  postgres           [.] hash_search_with_hash_value
     0.41%  :13324      [kernel.kallsyms]  [k] compaction_alloc
     0.40%  :13561      [kernel.kallsyms]  [k] compaction_alloc
     0.38%  :13374      [kernel.kallsyms]  [k] compaction_alloc
     0.37%  :13272      [kernel.kallsyms]  [k] compaction_alloc
     0.37%  postmaster  [kernel.kallsyms]  [k] unmap_vmas
     0.36%  postmaster  [kernel.kallsyms]  [k] page_fault
     0.36%  :13380      [kernel.kallsyms]  [k] compaction_alloc
     0.35%  :13482      [kernel.kallsyms]  [k] compaction_alloc
     0.34%  :13555      [kernel.kallsyms]  [k] compaction_alloc
     0.34%  postmaster  [kernel.kallsyms]  [k] set_pageblock_flags_group
     0.34%  postmaster  [kernel.kallsyms]  [k] page_check_address
     0.33%  :13528      [kernel.kallsyms]  [k] compaction_alloc
     0.33%  :13464      [kernel.kallsyms]  [k] compaction_alloc
     0.31%  :13547      [kernel.kallsyms]  [k] compaction_alloc
     0.30%  postmaster  [kernel.kallsyms]  [k] _spin_lock_irqsave
     0.29%  :13395      [kernel.kallsyms]  [k] compaction_alloc
     0.29%  :13546      [kernel.kallsyms]  [k] compaction_alloc
     0.28%  postmaster  [kernel.kallsyms]  [k] remove_migration_pte
     0.28%  :13355      [kernel.kallsyms]  [k] compaction_alloc
     0.28%  postmaster  [kernel.kallsyms]  [k] list_del
     0.28%  :13432      [kernel.kallsyms]  [k] compaction_alloc
     0.27%  :13258      [kernel.kallsyms]  [k] compaction_alloc
     0.27%  :13328      [kernel.kallsyms]  [k] compaction_alloc
     0.26%  postmaster  [kernel.kallsyms]  [k] __wake_up_bit
     0.26%  :13361      [kernel.kallsyms]  [k] compaction_alloc
     0.25%  :13334      [kernel.kallsyms]  [k] compaction_alloc
     0.25%  :13366      [kernel.kallsyms]  [k] compaction_alloc
     0.25%  :13549      [kernel.kallsyms]  [k] compaction_alloc
     0.25%  :13530      [kernel.kallsyms]  [k] compaction_alloc
     0.24%  :13391      [kernel.kallsyms]  [k] compaction_alloc
     0.24%  :13387      [kernel.kallsyms]  [k]
Re: [GENERAL] Waiting on ExclusiveLock on extension
Another update (let me know if I'm mailing too much).

>    59.73%  postmaster  [kernel.kallsyms]  [k] compaction_alloc

The compaction_alloc function seemed to point to issues with transparent huge pages (THP). I tried to turn this off with:

  echo never > /sys/kernel/mm/transparent_hugepage/enabled

This change yielded immediate results. The "migration/xx" processes disappeared from "top", the function "compaction_alloc" disappeared from "perf top", and system load dropped from 5 to 0.3. We've called off the 9.4 -> 9.3 downgrade for now.

For anyone else finding this thread: we are running CentOS 6.5 with kernel 2.6.32 (which was released in Dec 2009).

Cheers,
Andomar
Re: [GENERAL] ERROR: could not open relation with OID
> This morning we got the following error from a daily script that
> produces a simple largest-table report:
>
>   ERROR: could not open relation with OID 597597503
>
> From a bit of Googling, it seems that Postgres was unable to open the
> physical file that contains the relation. Is it possible that there was
> an error accessing the disk?

How is the filesystem that contains your PGDATA directory mounted?

Kind regards,
Andomar
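It may also be worth checking whether the OID still exists at all; this error can appear when a table is dropped while a long report query over pg_class is still running, rather than because of a disk problem. A small sketch using the OID from the error message:

  SELECT relname, relkind
  FROM pg_class
  WHERE oid = 597597503;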
Re: [GENERAL] COALESCE woes
> SELECT COALESCE(ts1.user_id, ts2,user_id, ts3.user_id) AS user_id,

That should probably be ts2 DOT user_id:

  SELECT COALESCE(ts1.user_id, ts2.user_id, ts3.user_id) AS user_id,

Cheers,
Andomar
Re: [GENERAL] delete is getting hung when there is a huge data in table
> Now the issue is that when this script for the deletion of data is
> launched, it is taking more than 7 days and doing nothing, i.e. not a
> single row has been deleted.

Deleting a large number of rows can take a long time. Often it's quicker to delete smaller chunks. The LIMIT clause is not supported by DELETE, so you need some kind of subquery. We use something like:

  do $_$
  declare
      num_rows bigint;
  begin
      loop
          delete from YourTable
          where id in (select id from YourTable where id < 500 limit 100);
          get diagnostics num_rows = row_count;
          raise notice 'deleted % rows', num_rows;
          exit when num_rows = 0;
      end loop;
  end;
  $_$;

This deletes rows with an id smaller than 500 in chunks of 100.

Kind regards,
Andomar
Re: [GENERAL] count distinct and group by
> And this doesn't:
>
>   select count(distinct id) from bg order by id;
>   ERROR: column "bg.id" must appear in the GROUP BY clause or be used
>   in an aggregate function
>   LINE 1: select count(distinct id) from bg order by id;

Your result set will contain one row with the count of distinct ids. You can't really order 1 row.

The error message occurs because your result set has one unnamed column: count(distinct id). You could write the query like:

  select count(distinct id) as cnt from bg order by cnt;

That would be correct SQL, because the column "cnt" now does exist.

Kind regards,
Andomar
[GENERAL] Replicate over pgbouncer?
Hi,

Today I installed pgbouncer. I added a second installation as a hot standby. Before starting the standby, I configured "recovery.conf" to connect to pgbouncer. This results in an error message:

  Pooler Error: Unsupported startup parameter: replication

Is it possible to replicate over a connection through pgbouncer?

Kind regards,
Andomar
Re: [GENERAL] Replicate over pgbouncer?
> I'm doubtful. Why do you think you need such a capability?

For simplicity. If I can replicate through pgbouncer, I'll need only one open port on the machine. Postgres would just listen on localhost. If not, I'll have to make Postgres listen on an interface on a different port.

-Andomar
Re: [GENERAL] Replicate over pgbouncer?
> Since you did not specify your O/S and PostgreSQL version, I'll just
> point you to the documentation for 9.1.

We're using CentOS 6.5 with PostgreSQL 9.4.1. Like the original post mentioned, we use a hot standby for replication. The question is whether we can do the hot standby through pgbouncer, or if we need a separate open port on the database server just for replication.

Cheers,
Andomar
[GENERAL] Include.d and warnings
We've set up an include directory in postgresql.conf:

  include_dir = 'conf.d'

The include directory has a file that changes the value of "max_connections". Now when we reload Postgres, an error appears in the logs:

  parameter "max_connections" cannot be changed without restarting the server
  ...
  configuration file "/var/lib/pgsql/9.4/data/postgresql.conf" contains errors; unaffected changes were applied

But max_connections did not change before the reload. It is just overruled in an included configuration file.

Is this behavior "by design", or should Postgres not give an error under the above circumstances?

Kind Regards,
Andomar
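For what it's worth, pg_settings can show which file a value ultimately came from; a small sketch using its sourcefile and sourceline columns:

  SELECT name, setting, sourcefile, sourceline
  FROM pg_settings
  WHERE name = 'max_connections';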
Re: [GENERAL] Include.d and warnings
>> But max_connections did not change before the reload. It is just
>> overruled in an included configuration file.
>
> This is fixed in 9.4 and up. Since it's just cosmetic, we did not
> back-patch the change.

We're running 9.4.4. I've just tested this, and doing a "restart" followed by a "reload" causes the error message to appear. Is anyone else willing to try and reproduce the problem?

> Just for my edification. If an include file contains a different
> value for a setting that can only be changed on restart, Postgres is
> supposed to throw out the restart message?

To clarify, the server is started with the setting already present:

  postgresql.conf   : max_connections = 100
  conf.d/extra.conf : max_connections = 125

If you start the server and then reload, this gives the error message. Checking the live setting with "select * from pg_settings where name = 'max_connections'" shows that the setting already is 125. So the warning is confusing, because a restart would not change anything.

Kind regards,
Andomar
[GENERAL] Systemd vs logging collector
In Arch Linux, Postgres is started by systemd. If you configure logging_collector in postgresql.conf this has no effect. All of the logging output is sent to the journal, where you can read it with "journalctl -u postgresql".

The configuration for systemd contains a StandardError option. I've tried various settings, but none result in a regular Postgres log file. The Postgres settings for systemd can be found here:

  /usr/lib/systemd/system/postgresql.service

Is there a way to let Postgres do its own logging when it is started by systemd?

Kind regards,
Andomar

P.S. Trying to figure this out in order to answer this question:
http://dba.stackexchange.com/questions/105384/postgresql-doesnt-log/
Re: [GENERAL] Systemd vs logging collector
> are you just missing this parameter:
>
> #log_destination = 'stderr'          # Valid values are combinations of
>                                      # stderr, csvlog, syslog, and eventlog,
>                                      # depending on platform. csvlog
>                                      # requires logging_collector to be on.

Thanks for your reply, log_destination is set to "stderr" and logging_collector is "on". On CentOS, this results in regular Postgres log files. On Arch Linux, the results end up in the system journal. It looks like the Postgres logging collector somehow misses out on the stream of messages.

Kind regards,
Andomar
Re: [GENERAL] How to restore
> If I have taken a backup successfully through pg_dump, how can I
> restore this pg_dump (SQL file) without the use of pg_restore?

You can send the file to psql, the command line client:

  psql yourdb < yourbackup.sql

Or:

  cat yourbackup.sql | psql yourdb

Kind regards,
Andomar
Re: [GENERAL] Prepared Statements and Pooling
> I have considered opening a single connection to the server and letting
> all threads execute statements through it, and use prepared statements
> (serviced by postgresql, not JDBC or DBCP). Or simply opening 50
> connections and establishing prepared statements and then handing them
> off to the threads. Is there a reason I shouldn't do this? I read that
> the JDBC driver for postgres is thread safe.

Connection pooling reduces the overhead of creating a new connection, and allows Postgres to reuse execution plans for prepared statements. Both things will gain you a few milliseconds per query. For a small or medium sized project that is entirely irrelevant.

It is far more important to have stable and easy to understand code. To gain that, keep each connection open for as short a time as you can: run a single query and then close it.

When the time comes where the few milliseconds matter, it is easy to add connection pooling without changing a single line of code. The Postgres ecosystem offers pgbouncer and pgpool for that job, but if DBCP can do it, that's fine too.

Kind regards,
Andomar
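For reference, a server-side prepared statement looks like this (a minimal sketch; the statement and table names are made up). The plan is cached per session and reused by each EXECUTE:

  PREPARE get_user (int) AS
      SELECT user_id, user_name
      FROM users
      WHERE user_id = $1;

  EXECUTE get_user(42);
  EXECUTE get_user(43);

  DEALLOCATE get_user;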
Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?
> However, I know from experience that's not entirely true (although it's
> not always easy to measure all aspects of your I/O bandwidth). Am I
> missing something?

Two things I can think of:

Transaction log writes are entirely sequential. If you have disks assigned for just this purpose, then the heads will always be in the right spot, and the writes go through more quickly.

A database server process waits until the transaction log is written and then returns control to the client. The data writes can be done in the background while the client goes on to do other things. Splitting up data and logs means that there is less chance the disk controller will let data writes interfere with the log files.

Kind regards,
Andomar
[GENERAL] SO question about disappearing row
Saw the following question on Stack Overflow:

http://stackoverflow.com/questions/39550510/select-for-update-returns-zero-rows

One transaction updates a row with the same values, causing another transaction's SELECT ... FOR UPDATE to return an empty set. This is surprising, as the row passes the WHERE clause both before and after the update.

Is there a way to have Postgres explain why the outer query drops a row, even though the query plan does not show a filter?

Cheers,
Andomar
[GENERAL] Allow login on slave only
We run a master server and a hot standby server. Reporting users log in to the standby server to run long queries. However, their login is also valid on the master server.

Is it possible to prevent a user from logging in to the master server? A statement like:

  alter role newbie_business_analyst nologin;

is replicated, so it would block logins on both servers.

Thanks,
Andomar
[GENERAL] Crypt change in 9.4.5
Hi,

After upgrading to PostgreSQL 9.4.6, our test system gave error messages like:

  ERROR: invalid salt

The cause of these errors is statements like:

  WHERE password = crypt('secret', 'secret')

After reverting to Postgres 9.4.4, the test system worked properly again. This might be related to a security fix in 9.4.5:

  Fix contrib/pgcrypto to detect and report too-short crypt() salts
  (Josh Kupershmidt)

  Certain invalid salt arguments crashed the server or disclosed a few
  bytes of server memory. We have not ruled out the viability of attacks
  that arrange for presence of confidential information in the disclosed
  bytes, but they seem unlikely. (CVE-2015-5288)

The "crypt" call is hardcoded in legacy code that hasn't been recompiled in years. Are there ways to keep the old code running against a newer Postgres version?

Kind regards,
Andomar
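For reference, the usual pgcrypto pattern that avoids a fixed short salt looks like this (a hedged sketch; the table and column names are made up):

  -- storing a password: hash with a generated salt
  UPDATE accounts
  SET password = crypt('secret', gen_salt('bf'))
  WHERE user_name = 'someone';

  -- checking a password: pass the stored hash back in as the salt
  SELECT user_name
  FROM accounts
  WHERE password = crypt('secret', password);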
[GENERAL] Monitoring query plan cache
Hi,

We run a Postgres 9.1 system that handles a lot of identical queries (around 500 per second). The queries originate from a Python WSGI script running under Apache. Performance is an issue, and we suspect query planning might be an underlying cause.

I've rewritten the scripts from ad-hoc SQL to Postgres functions (CREATE FUNCTION) and we saw server load go down quite a bit. Now we'd like to monitor the Postgres query parser, to confirm that it's not analyzing the same SQL over and over, and to see if we missed other things.

a) Is it possible that Postgres caches the query plans for psql functions, but not for ad-hoc SQL?
b) Is there a way to see if a single query uses a cached execution plan?
c) Can you monitor the query parser as a whole, with stats like parses/sec or cache hits/sec?
d) Is there a way to monitor how many reads are done against the statistics table (pg_statistic)?

Thanks for any answers or pointers,
Andomar

P.S. I've asked this question on StackExchange:
http://dba.stackexchange.com/questions/86620/monitoring-the-postgres-query-parser/86626#86626
but I'm looking for more specific information.
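For anyone comparing the two approaches, a minimal sketch with made-up names: a PL/pgSQL function keeps its statement plans in the per-session plan cache, while ad-hoc SQL is planned each time unless it is explicitly prepared; pg_prepared_statements lists the explicitly prepared statements of the current session:

  CREATE FUNCTION get_col1(p_id int) RETURNS text
  LANGUAGE plpgsql STABLE AS $$
  BEGIN
      RETURN (SELECT col1 FROM table1 WHERE id = p_id);
  END;
  $$;

  SELECT get_col1(123);

  -- explicitly prepared statements, visible for the current session only
  SELECT name, statement, prepare_time FROM pg_prepared_statements;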
Re: [GENERAL] Monitoring query plan cache
Thanks for your reply, I have a follow-up question:

>> c) Can you monitor the query parser as a whole, with stats like
>> parses/sec or cache hits/sec?
>
> Possibly log_parser_stats/log_planner_stats/log_statement_stats would
> help you. They're pretty old-school though; you'd need to write some
> tool that scans the postmaster log and accumulates the stats to get
> anything very useful out of those features. (It could be that
> somebody's already done that, though --- take a look at things like
> pgFouine.)

Below is an example output from log_planner_stats:

  LOG:  PLANNER STATISTICS
  DETAIL:  ! system usage stats:
  !       0.000132 elapsed 0.00 user 0.00 system sec
  !       [0.181972 user 0.052991 sys total]
  !       0/0 [0/248] filesystem blocks in/out
  !       0/0 [0/2705] page faults/reclaims, 0 [0] swaps
  !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
  !       0/0 [1/4249] voluntary/involuntary context switches

How can we tell from this whether the query planner used a cached plan? Is the example above a hit or a miss?

>> a) Is it possible that Postgres caches the query plans for psql
>> functions, but not for ad-hoc SQL?
>
> plpgsql functions would cache query plans. Ad-hoc SQL doesn't, unless
> you explicitly make use of prepared queries.

It is not always easy to tell the query type (function, prepared or ad-hoc). We use Python in mod_wsgi with psycopg2. The code shows ad-hoc SQL, but who knows what the many layers between Python and the database do.

As suggested by Erwin Brandstetter on dba.stackexchange.com, I ran a test with:

  cursor.execute("""
      load 'auto_explain';
      set auto_explain.log_min_duration = 0;
      set auto_explain.log_nested_statements = on;
      select col1 from table1 where id = %(id)s
  """, {'id': 123})
  print(cursor.fetchone()[0])

The logging doesn't look like a cached plan: you can see the 123 value but not a parameter like $1. This suggests Postgres was previously compiling around 200 queries a second on our production machine. Is that even possible?

Cheers,
Andomar
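To see what a cached plan looks like from psql, an explicit prepared statement can be used (a minimal sketch reusing the table from the test above; depending on the server version, the generic plan with a $1 placeholder may only appear after a few executions):

  PREPARE get_col1_plan (int) AS
      SELECT col1 FROM table1 WHERE id = $1;

  EXPLAIN EXECUTE get_col1_plan(123);
  EXPLAIN EXECUTE get_col1_plan(123);   -- a generic plan shows "id = $1"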
Re: [GENERAL] Monitoring query plan cache
> psycopg2 sends the SQL you feed it straight to the DB. If you don't
> feed it a PREPARE statement [1] it'll be an ad-hoc query; the value
> placeholders will be interpolated prior to statement submission by
> psycopg2.

Thanks, that confirms what I saw during tests.

> If you're seeing that output then planning happened. The only way you
> get a cached plan for a client-issued SQL statement is if the client
> uses PREPARE/EXECUTE or the "extended query" protocol
> (parse/bind/execute). Neither of those cases would go through here.

So if there was a cached plan, there is no "LOG: PLANNER STATISTICS". That is helpful, thanks.

> Well, at 132 microseconds apiece, it does not seem from this example
> that repeat planning is a huge problem for you ... of course, some of
> your statements might take longer, but you've not demonstrated here
> that you have anything to worry about.

Well, that was just an example query. Checking a realistic query on a test machine shows 15 ms spent on query analysis. At 200 queries per second, that works out to roughly 3 seconds of planning time per second, i.e. about 3 CPUs kept busy, which is what we saw. I will try to verify that using the log_planner_stats option.

Cheers,
Andomar
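A small sketch of that verification step, assuming a superuser session (log_planner_stats can only be set by a superuser); each statement then writes a PLANNER STATISTICS block to the server log:

  SET log_planner_stats = on;
  SELECT col1 FROM table1 WHERE id = 123;   -- the hypothetical query from earlier
  SET log_planner_stats = off;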