Re: [PERFORM] Restricting Postgres
On Tue, Nov 02, 2004 at 11:52:12PM +, Martin Foster wrote:
> Is there a way to restrict how much load a PostgreSQL server can take
> before dropping queries in order to safeguard the server? I was

Well, you could limit the number of concurrent connections, and set the query timeout to a relatively low level. What that ought to mean is that, under heavy load, some queries will abort.

A

--
Andrew Sullivan | [EMAIL PROTECTED]
When my information changes, I alter my conclusions. What do you do, sir? --attr. John Maynard Keynes
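[For reference, a minimal sketch of the two knobs being suggested here; the values are illustrative only, not recommendations:]

  -- max_connections can only be changed in postgresql.conf (server restart
  -- required), e.g.:  max_connections = 50
  -- statement_timeout (in milliseconds) can be set per session or in
  -- postgresql.conf; statements running longer than this are aborted.
  SET statement_timeout = 5000;   -- abort anything still running after 5s

  -- Check the values currently in effect:
  SHOW max_connections;
  SHOW statement_timeout;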
[PERFORM] preloading indexes
I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?
Re: [PERFORM] preloading indexes
The best way to get all the stuff needed by a query into RAM is to run the query. Is it more that you want to 'pin' the data in RAM so it doesn't get overwritten by other queries?
[PERFORM] vacuum analyze slows sql query
Greetings pgsql-performance :)

Yesterday I posted to the pgsql-sql list about an issue with VACUUM while trying to track down an issue with performance of a SQL SELECT statement involving a stored function. It was suggested that I bring the discussion over to -performance. Instead of reposting the message, here is a link to the original message followed by a brief summary:

http://marc.theaimsgroup.com/?l=postgresql-sql&m=109945118928530&w=2

Summary: Our customer complains about web/php-based UI sluggishness accessing the data in the db. I created a "stripped down" version of the tables in question to be able to post to the pgsql-sql list asking for hints as to how I can improve the SQL query. While doing this I noticed that if I 'createdb' and populate it with the "sanitized" data, the query in question is quite fast; 618 rows returned in 864.522 ms. This was puzzling. Next I noticed that after a VACUUM the very same query would slow down to a crawl; 618 rows returned in 1080688.921 ms.

This was reproduced on PostgreSQL 7.4.2 running on an Intel PIII 700MHz, 512MB. This system is my /personal/ test system/sandbox, i.e. it isn't being stressed by any other processes.

Thanks for reading,
--patrick
Re: [PERFORM] preloading indexes
That’s correct – I’d like to be able to keep particular indexes in RAM available all the time.

> The best way to get all the stuff needed by a query into RAM is to run
> the query. Is it more that you want to 'pin' the data in RAM so it
> doesn't get overwritten by other queries?
Re: [PERFORM] vacuum analyze slows sql query
Given that the plan doesn't change after an analyze, my guess would be that the first query is hitting cached data, then you vacuum and that chews through all the cache with its own data, pushing the good data out of the cache so it has to be re-fetched from disk. If you run the select a 2nd time after the vacuum, what is the time?

Not sure what your pkk_offer_has_pending_purch function does; that might be something to look at as well. I could be wrong, but that's the only thing that makes sense to me. ARC is supposed to help with that type of behavior in 8.0.
Re: [PERFORM] Restricting Postgres
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
> Is there a way to restrict how much load a PostgreSQL server can take
> before dropping queries in order to safeguard the server? I was
> looking at the login.conf (5) man page and while it allows me to limit
> by processor time this seems to not fit my specific needs.
>
> Essentially, I am looking for a sort of functionality similar to what
> Sendmail and Apache have. Once the load of the system reaches a
> certain defined limit the daemon drops tasks until such a time that it
> can resume normal operation.

Sounds great... could you give more shape to the idea, so people can comment on it? What limit? Measured how? Normal operation is what? Drop what? How to tell?

> While not necessarily common on my servers I have witnessed some fairly
> high load averages which may have led to the machine dropping outright.
> Any help on this matter would be appreciated.

You can limit the number of connections overall?

--
Best Regards, Simon Riggs
Re: [PERFORM] preloading indexes
On Wed, Nov 03, 2004 at 12:12:43PM -0700, [EMAIL PROTECTED] wrote:
> That's correct - I'd like to be able to keep particular indexes in RAM
> available all the time

If these are queries that run frequently, then the relevant cache will probably remain populated[1]. If they _don't_ run frequently, why do you want to force the memory to be used to optimise something that is uncommon? But in any case, there's no mechanism to do this.

A

[1] there are in fact limits on the caching: if your data set is larger than memory, for instance, there's no way it will all stay cached. Also, VACUUM does nasty things to the cache. It is hoped that nastiness is fixed in 8.0.

--
Andrew Sullivan | [EMAIL PROTECTED]
The plural of anecdote is not data. --Roger Brinner
Re: [PERFORM] preloading indexes
Uh, you can always load a table in cache by doing a seq scan on it... like select count(1) from table or something... this doesn't work for indexes of course, but you can always look in the system catalogs, find the filename for the index, then just open() it from an external program and read it without caring for the data... it'll save you the seeks in the index... of course you'll have problems with file permissions etc, not mentioning security, locking, etc, etc, etc. Is that worth the trouble?
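[For illustration, a sketch of the "just read it" idea from a plain SQL session; the table and column names are hypothetical, and the second statement only helps if the planner actually answers it with an index scan (check with EXPLAIN):]

  -- Pull the table's heap pages into cache by forcing a full read.
  SELECT count(*) FROM orders;          -- 'orders' is a made-up table name

  -- There is no SQL-level way to read only an index, but a query that is
  -- answered via an index scan will fault the index pages in as a side effect.
  SET enable_seqscan TO off;            -- nudge the planner toward the index
  SELECT count(*) FROM orders WHERE order_date > '2004-01-01';
  SET enable_seqscan TO on;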
Re: [PERFORM] preloading indexes
<[EMAIL PROTECTED]> writes:
> I am working with some pretty convoluted queries that work very slowly the
> first time they're called but perform fine on the second call. I am fairly
> certain that these differences are due to the caching. Can someone point me
> in a direction that would allow me to pre-cache the critical indexes?

Buy more RAM. Also check your shared_buffers setting (but realize that more is not necessarily better).

regards, tom lane
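[If it helps, a quick way to see what the server is currently configured with; note that in 7.4 shared_buffers is a count of 8kB buffers rather than a byte size, and changing it means editing postgresql.conf and restarting:]

  -- Show the settings currently in effect for this server.
  SHOW shared_buffers;
  SHOW effective_cache_size;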
Re: [PERFORM] vacuum analyze slows sql query
patrick ~ <[EMAIL PROTECTED]> writes:
> that if I 'createdb' and populate it with the "sanitized" data the
> query in question is quite fast; 618 rows returned in 864.522 ms.
> This was puzzling. Next I noticed that after a VACUUM the very same
> query would slow down to a crawl; 618 rows returned in 1080688.921 ms.

The outer query is too simple to have more than one possible plan, so the issue is certainly a change in query plans inside the function. You need to be investigating what's happening inside that function. 7.1 doesn't have adequate tools for this, but in 7.4 you can use PREPARE and EXPLAIN ANALYZE EXECUTE to examine the query plans used for parameterized statements, which is what you've got here.

My bet is that with ANALYZE stats present, the planner guesses wrong about which index to use; but without looking at EXPLAIN ANALYZE output there's no way to be sure.

BTW, why the bizarrely complicated substitute for a NOT NULL test? ISTM you only need

  create function pkk_offer_has_pending_purch( integer )
  returns bool as '
    select p0.purchase_id is not null
      from pkk_purchase p0
     where p0.offer_id = $1
       and ( p0.pending = true
             or ( ( p0.expire_time > now() or p0.expire_time isnull )
                  and p0.cancel_date isnull ) )
     limit 1
  ' language 'sql' ;

(Actually, seeing that pkk_purchase.purchase_id is defined as NOT NULL, I wonder why the function exists at all ... but I suppose you've "stripped" the function to the point of being nonsense.)

regards, tom lane
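[For illustration, roughly what that PREPARE / EXPLAIN ANALYZE EXECUTE exercise looks like, using the body of the function above as the parameterized statement; the statement name and the parameter value 1 are arbitrary:]

  prepare pending_purch (integer) as
    select p0.purchase_id is not null
      from pkk_purchase p0
     where p0.offer_id = $1
       and ( p0.pending = true
             or ( ( p0.expire_time > now() or p0.expire_time isnull )
                  and p0.cancel_date isnull ) )
     limit 1;

  -- Shows the plan chosen for the parameterized form, which should
  -- approximate what the function's inner query is doing.
  explain analyze execute pending_purch (1);

  deallocate pending_purch;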
Re: [PERFORM] preloading indexes
The caching appears to disappear overnight. The environment is not in production yet so I'm the only one on it. Is there a time limit on the length of time in cache? I believe there is sufficient RAM, but maybe I need to look again.

s
Re: [PERFORM] preloading indexes
<[EMAIL PROTECTED]> writes:
> The caching appears to disappear overnight.

You've probably got cron jobs that run late at night and blow out your kernel disk cache by accessing a whole lot of non-Postgres stuff. (A nightly disk backup is one obvious candidate.) The most likely solution is to run some cron job a little later to exercise your database and thereby repopulate the cache with Postgres files before you get to work ;-)

regards, tom lane
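[The warm-up itself can just be an ordinary SQL script run through psql from cron once the nightly jobs are done; a sketch with hypothetical table and column names:]

  -- warmup.sql, run e.g. as:  psql mydb -f warmup.sql
  -- Re-read the hot tables so their pages are back in the kernel cache.
  SELECT count(*) FROM orders;
  SELECT count(*) FROM order_items;
  -- Plus a query or two that EXPLAIN shows using the critical indexes,
  -- so those index pages come back as well.
  SELECT count(*) FROM orders WHERE customer_id = 1;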
Re: [PERFORM] preloading indexes
On Wed, Nov 03, 2004 at 01:19:43PM -0700, [EMAIL PROTECTED] wrote:
> The caching appears to disappear overnight. The environment is not in
> production yet so I'm the only one on it.

Are you vacuuming at night? It grovels through the entire database, and may bust your query out of the cache. Also, we'd need some more info about how you've tuned this thing. Maybe check out the archives first for some tuning pointers to help you.

A

--
Andrew Sullivan | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-garde will probably become the textbook definition of Postmodernism. --Brad Holland
Re: [PERFORM] preloading indexes
Thanks - this is what I was afraid of, but I may have to do this. Is there a good way to monitor what's in the cache?

j
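[Nothing in 7.4/8.0 shows this out of the box, and the kernel's disk cache is invisible to Postgres either way, but the contrib module pg_buffercache (added in a later release) exposes what is sitting in PostgreSQL's own shared buffers. A sketch of the usual summary query, assuming the module is installed:]

  -- Which relations currently occupy the most shared buffers?
  SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON c.relfilenode = b.relfilenode
   GROUP BY c.relname
   ORDER BY buffers DESC
   LIMIT 10;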
Re: [PERFORM] Restricting Postgres
Simon Riggs wrote:
>> Is there a way to restrict how much load a PostgreSQL server can take
>> before dropping queries in order to safeguard the server? I was
>> looking at the login.conf (5) man page and while it allows me to limit
>> by processor time this seems to not fit my specific needs.
>>
>> Essentially, I am looking for a sort of functionality similar to what
>> Sendmail and Apache have. Once the load of the system reaches a
>> certain defined limit the daemon drops tasks until such a time that it
>> can resume normal operation.
>
> Sounds great... could you give more shape to the idea, so people can
> comment on it? What limit? Measured how? Normal operation is what?
> Drop what? How to tell?

Let's use the example of Apache: there is the Apache::LoadAvgLimit mod_perl module, which allows one to limit based on the system load averages. Here is an example of the configuration one would find:

  PerlInitHandler Apache::LoadAvgLimit
  PerlSetVar LoadAvgLimit_1 3.00
  PerlSetVar LoadAvgLimit_5 2.00
  PerlSetVar LoadAvgLimit_15 1.50
  PerlSetVar LoadAvgRetryAfter 120

The end state is simple: once the load average moves above 3.00 for the 1-minute average, the web server will not process the CGI scripts or mod_perl applications under that directory. Instead it will return a 503 error and save the system from being crushed by ever increasing load averages. Only once the load average is below the defined limits will the server process requests as normal.

This is not necessarily the nicest or cleanest way of doing things, but it does allow the Apache web server to prevent a collapse. There are ways of restricting the size of files, the number of concurrent processes and even the memory being used by a daemon. This can be done through ulimit or the login.conf file if your system supports it. However, there is no way to restrict based on load averages, only processor time, which is ineffective for a perpetually running daemon like PostgreSQL.

>> While not necessarily common on my servers I have witnessed some fairly
>> high load averages which may have led to the machine dropping outright.
>> Any help on this matter would be appreciated.
>
> You can limit the number of connections overall?

Limiting concurrent connections is not always the solution to the problem. Problems can occur when there is a major spike in activity that would be considered abnormal, due to outside conditions. For example, using Apache::DBI or pgpool the DBMS may be required to spawn a great deal of child processes in a short order of time. This in turn can cause a major spike in processor load, and if left unchecked while high-demand queries keep running, the system can literally increase in load until the server buckles.

I've seen this behavior before when restarting the web server during heavy loads. Apache goes from zero connections to a solid 120, causing PostgreSQL to spawn that many children in a short order of time just to keep up with the demand. PostgreSQL pays a penalty when spawning a new child and accepting a connection; it takes resources at every level to accomplish. Meanwhile clients on the web server are hitting the server at an accelerated rate because of the slowed response, leading to even more demand being placed on both machines. In most cases the processor will be taxed and the load average high enough to cause a noticeable delay even when using a console; however it will generally recover... slowly, or in rare cases crash outright.
In such a circumstance, having the database server refuse queries when the sanity of the system is at stake might come in handy. Of course, I am not blaming PostgreSQL; there are probably some instabilities in the AMD64 port of FreeBSD 5.2.1 for dual processor systems that lead to an increased chance of failure instead of recovery. However, if there were a way to prevent the process from reaching those limits, it might avoid the problem altogether.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] vacuum analyze slows sql query
Here is a fresh run with 'explain analyze' run before and after the VACUUM statement:

-- begin
% dropdb pkk
DROP DATABASE
% createdb pkk
CREATE DATABASE
% psql pkk < pkk_db.sql
ERROR: function pkk_offer_has_pending_purch(integer) does not exist
ERROR: function pkk_offer_has_pending_purch2(integer) does not exist
ERROR: table "pkk_billing" does not exist
ERROR: table "pkk_purchase" does not exist
ERROR: table "pkk_offer" does not exist
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pkk_offer_pkey" for table "pkk_offer"
CREATE TABLE
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pkk_purchase_pkey" for table "pkk_purchase"
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
% zcat pkk.20041028_00.sql.gz | psql pkk
SET
SET
SET
SET
% psql pkk
pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
(618 rows)
Time: 877.348 ms

pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
                          QUERY PLAN
----------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..22.50 rows=1000 width=4) (actual time=1.291..845.485 rows=618 loops=1)
 Total runtime: 849.475 ms
(2 rows)
Time: 866.613 ms

pkk=# vacuum analyze ;
VACUUM
Time: 99344.399 ms

pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
                          QUERY PLAN
----------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual time=3636.401..1047412.851 rows=618 loops=1)
 Total runtime: 1047415.525 ms
(2 rows)
Time: 1047489.477 ms
-- end

Tom,

The reason for the extra "case" part in the function is to ensure non-null fields in the result. I tried your version as well and I get similar performance results:

-- begin
pkk=# create function toms_pending_purch( integer ) returns bool as 'select p0.purchase_id is not null from pkk_purchase p0 where p0.offer_id = $1 and ( p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and p0.cancel_date isnull ) ) limit 1 ' language 'sql' ;
CREATE FUNCTION
Time: 2.496 ms

pkk=# select offer_id, toms_pending_purch( offer_id ) from pkk_offer ;
(618 rows)
Time: 1052339.506 ms
-- end

Right now, I'm studying the documentation section on PREPARE and will attempt to play around with it.

I was asked (in a prior post) whether running the statement a second time after the VACUUM improves performance. It does not. After the VACUUM the statement remains slow.

Thanks for your help,
--patrick
Re: [PERFORM] Restricting Postgres
Martin Foster wrote:
> Simon Riggs wrote:
>> On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
>> [...]
>
> I've seen this behavior before when restarting the web server during
> heavy loads. Apache goes from zero connections to a solid 120, causing
> PostgreSQL to spawn that many children in a short order of time just to
> keep up with the demand.

But wouldn't limiting the number of concurrent connections do this at the source? If you tell it that "You can have at most 20 connections" you would never have postgres spawn 120 children. I'm not sure what apache does if it can't get a DB connection, but it seems exactly like what you want.

Now, if you expected to have 50 clients that all like to just sit on open connections, you could leave the number of concurrent connections high. But if your only connection is from the webserver, where all of them are designed to be short connections, then leave the max low.

The other possibility is having the webserver use connection pooling, so it uses a few long-lived connections. But even then, you could limit it to something like 10-20, not 120.

John
=:->
Re: [PERFORM] Restricting Postgres
John A Meinel wrote:
> But wouldn't limiting the number of concurrent connections do this at
> the source? If you tell it that "You can have at most 20 connections"
> you would never have postgres spawn 120 children.

I have a dual processor system that can support over 150 concurrent connections handling normal traffic and load. Now suppose I set up Apache to spawn all of its children instantly: what will happen is that the PostgreSQL server will also receive 150 attempts at connection. This will spawn 150 children in a short order of time, and as this takes place clients can connect and start requesting information, not allowing the machine to settle down to normal traffic. That spike, when initiated, can cripple the machine or even the webserver if a deadlocked transaction is introduced.

This is because, on the webserver side, a slowdown in the database means that it will just get that many more connection attempts pooled from the clients. As they keep clicking and hitting reload over and over to get a page load, that server starts to buckle, hitting unbelievably high load averages. When the above happened once, I lost the ability to type on a console because of a 60+ (OpenBSD) load average on a single processor system. That is the reason why Apache now drops a 503 Service Unavailable when loads get too high.

It's that spike I worry about, and it can happen for whatever reason. It could just as easily be triggered by a massive concurrent request for processing of an expensive query done in DDoS fashion. This may not affect the webserver at all, at least immediately, but the same effect can come into play on the database. Limiting connections helps, but it's not a silver bullet, and it limits your ability to support more connections because of that initial spike. The penalty for forking a new child is hardly unexpected; even Apache will show the same effect when restarted at a high-traffic time.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] preloading indexes
If you're running Linux with a 2.6.x kernel, you can try playing with the /proc/sys/vm/swappiness setting. My understanding is that:

  echo "0" > /proc/sys/vm/swappiness

will try to keep all in-use application memory from being swapped out when other processes query the disk a lot. Although, since PostgreSQL utilizes the disk cache quite a bit, this may not help you.

On Wed, 2004-11-03 at 15:53 -0500, Tom Lane wrote:
> You've probably got cron jobs that run late at night and blow out your
> kernel disk cache by accessing a whole lot of non-Postgres stuff.

--
Mike Benoit <[EMAIL PROTECTED]>
[PERFORM] index not used if using IN or OR
Hello!

Sorry if this has been discussed before; it is just hard to find in the archives using the words "or" or "in" :-o

I use postgres-8.0 beta4 for windows. I broke down my problem to a very simple table - two columns, "primary_key" and "secondary_key". Creates and inserts you will find below.

If I query the _empty_ freshly created table I get the following explain result:

select * from tt where seckey = 1;

  Index Scan using seckey_key on tt  (cost=0.00..17.07 rows=5 width=12)
    Index Cond: (seckey = 1)

If I use "OR" (or IN) things get worse:

select * from tt where seckey = 1 or seckey = 2;

  Seq Scan on tt  (cost=0.00..0.00 rows=1 width=12)
    Filter: ((seckey = 1) OR (seckey = 2))

Note the "Seq Scan" instead of using the index.

After populating the table with 8920 records and "analyze", the scenario gets even worse:

select * from tt where seckey = 1;

  Seq Scan on tt  (cost=0.00..168.50 rows=1669 width=12) (actual time=0.000..15.000 rows=1784 loops=1)
    Filter: (seckey = 1)
  Total runtime: 31.000 ms

Now also this simple query uses a "Seq Scan".

Now the questions are:
a) Why is the index not used if I use "OR" or "IN"?
b) Why is the index not used after "analyze"?

Any help is very much appreciated!

Thanks, Mario

// The table and data
CREATE TABLE tt
(
  pkey int4 NOT NULL DEFAULT nextval('public."tt_PKEY_seq"'::text),
  seckey int8,
  CONSTRAINT pkey_key PRIMARY KEY (pkey)
) WITHOUT OIDS;

CREATE INDEX seckey_key ON tt USING btree (seckey);

// inserted many-many times
insert into tt values (default, 1);
insert into tt values (default, 2);
insert into tt values (default, 3);
insert into tt values (default, 4);
insert into tt values (default, 5);
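[Two things are worth checking here, sketched below as a starting point rather than an answer from the thread. First, after ANALYZE the planner knows that roughly 1784 of 8920 rows match seckey = 1, and for about 20% of a small table a sequential scan is usually the cheaper plan, so the "Seq Scan" may simply be the right choice. Second, seckey is int8 while the literal 1 is int4; on releases before 8.0 that type mismatch by itself prevented index use, and casting or quoting the constant is the classic workaround.]

  -- Cast (or quote) the constant so it matches the int8 column; required
  -- for index use on pre-8.0 servers, harmless on 8.0.
  explain analyze select * from tt where seckey = 1::int8;
  explain analyze select * from tt where seckey in (1::int8, 2::int8);

  -- Temporarily discourage sequential scans to compare the two plans and
  -- see whether an index scan would really be faster for this data.
  set enable_seqscan to off;
  explain analyze select * from tt where seckey = 1;
  set enable_seqscan to on;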