Re: [PERFORM] How to improve db performance with $7K?
> With hardware tuning, I am sure we can do better than 35Mb per sec.

Also WTF ?

My Laptop does 19 MB/s (reading <10 KB files, reiser4) !

A recent desktop 7200rpm IDE drive:

# hdparm -t /dev/hdc1
/dev/hdc1:
 Timing buffered disk reads:  148 MB in 3.02 seconds = 49.01 MB/sec

# ll "DragonBall 001.avi"
-r--r--r--  1 peufeu users 218M mar  9 20:07 DragonBall 001.avi

# time cat "DragonBall 001.avi" >/dev/null
real    0m4.162s
user    0m0.020s
sys     0m0.510s

(the file was not in the cache) => about 52 MB/s (reiser3.6)

So, you have a problem with your hardware...
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > Each value has 1/13th of the table, which is too many rows per value to
> > make an IndexScan an efficient way of deleting rows from the table.
>
> But, the original question was that the delete that was taking a long time
> was on a different table. I tried to delete 150 rows from a table with 750
> rows, which is FK referenced from this large table. If I understand
> correctly, Tom suggested that the length of time was due to a sequential
> scan being done on the large table for each value being deleted from the
> small one.
>
> For this FK check, there only need be one referring id to invalidate the
> delete. ISTM that for any delete with a FK reference, the index could
> always be used to search for a single value in the referring table
> (excepting very small tables). Why then must a sequential scan be
> performed in this case, and/or in general?

My understanding was that you were doing a DELETE on the smaller table
and that this was doing a DELETE on the measurement table because you
had the FK defined as ON DELETE CASCADE. You are right - only a single
row is sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE
action of CASCADE you will want to touch all rows referenced, so a
SeqScan is a perfectly valid consequence of such actions.

I think now that you are using the default action, rather than
specifically requesting CASCADE?

Stephan, Tom:
The SQL generated for RI checking by the RI triggers currently applies a
limit at execution time, not at prepare time. i.e. there is no LIMIT
clause in the SQL.

We know whether the check will be limit 1 or limit 0 at prepare time, so
why not add a LIMIT clause to the SQL so it changes the plan, not just
the number of rows returned when the check query executes?

(I note that PREPARE does allow you to supply a LIMIT 1 clause.)

That *ought* to have some effect on the plan used by the RI check
queries. In costsize.c:cost_index we would have tuples_fetched==1 and
it would be hard (but not impossible) for the index cost to ever be
more than the cost of a SeqScan.

...but, I see no way for OidFunctionCall8 to ever return an answer of
"always just 1 row, no matter how big the relation"...so tuples_fetched
is always proportional to the size of the relation. Are unique indexes
treated just as very-low-selectivity indexes? - they're a very similar
situation in terms of forcing an absolute, not relative, number of rows
returned.

Best Regards, Simon Riggs
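For readers following the thread, the check query under discussion is shaped
roughly like the sketch below; the table and column names are invented, and
the SQL the RI triggers actually generate is more heavily quoted and qualified:

    -- Today's NO ACTION / RESTRICT check: the trigger stops after the first
    -- row at execution time, but the statement has no LIMIT, so the planner
    -- costs it as though every matching row would be fetched.
    SELECT 1 FROM measurement WHERE station_id = $1;

    -- Simon's suggestion: state the limit in the SQL itself, so the planner
    -- knows only one row is wanted and an index scan looks much cheaper.
    SELECT 1 FROM measurement WHERE station_id = $1 LIMIT 1;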
Re: [PERFORM] How to improve db performance with $7K?
Yeah, 35Mb per sec is slow for a raid controller, the 3ware mirrored is
about 50Mb/sec, and striped is about 100

Dave

PFC wrote:
> > With hardware tuning, I am sure we can do better than 35Mb per sec.
>
> Also WTF ?
>
> My Laptop does 19 MB/s (reading <10 KB files, reiser4) !
>
> A recent desktop 7200rpm IDE drive:
>
> # hdparm -t /dev/hdc1
> /dev/hdc1:
>  Timing buffered disk reads:  148 MB in 3.02 seconds = 49.01 MB/sec
>
> [...]
>
> (the file was not in the cache) => about 52 MB/s (reiser3.6)
>
> So, you have a problem with your hardware...

-- 
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 29 Mar 2005, Simon Riggs wrote:
> [...]
>
> Stephan, Tom:
> The SQL generated for RI checking by the RI triggers currently applies a
> limit at execution time, not at prepare time. i.e. there is no LIMIT
> clause in the SQL.
>
> We know whether the check will be limit 1 or limit 0 at prepare time, so
> why not add a LIMIT clause to the SQL so it changes the plan, not just
> the number of rows returned when the check query executes?

Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
nicely together, so you could sometimes get a result where if the first
row was locked, the FOR UPDATE would wait on it, but if it was deleted by
the other transaction you could get 0 rows back in the trigger.
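A sketch of the failure mode Stephan describes, if the check were written with
both clauses (names and values here are hypothetical):

    -- Two rows in measurement reference station 42.  Session A has deleted
    -- one of them but not yet committed.  Session B's RI check then runs:
    SELECT 1 FROM measurement WHERE station_id = 42 LIMIT 1 FOR UPDATE;
    -- The LIMIT picks the locked row, FOR UPDATE blocks on it, and once A
    -- commits that row is gone, so B can come back with 0 rows and wrongly
    -- conclude that nothing references station 42, even though the second
    -- referencing row is still there.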
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 29 Mar 2005, Stephan Szabo wrote:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
> > [...]
> > We know whether the check will be limit 1 or limit 0 at prepare time, so
> > why not add a LIMIT clause to the SQL so it changes the plan, not just
> > the number of rows returned when the check query executes?
>
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

If there were some way to pass a "limit" into SPI_prepare that was treated
similarly to a LIMIT clause for planning purposes but didn't actually
change the output plan to only return that number of rows, we could use
that.
Re: [PERFORM] Delete query takes exorbitant amount of time
Stephan Szabo <[EMAIL PROTECTED]> writes:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
>> The SQL generated for RI checking by the RI triggers currently applies a
>> limit at execution time, not at prepare time. i.e. there is no LIMIT
>> clause in the SQL.
>>
>> We know whether the check will be limit 1 or limit 0 at prepare time, so
>> why not add a LIMIT clause to the SQL so it changes the plan, not just
>> the number of rows returned when the check query executes?

> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

Yeah, this is still true. It would probably be a good idea to change it
but I haven't looked into exactly what would be involved. The basic
problem is that the FOR UPDATE filter needs to execute before LIMIT
instead of after, so presumably the FOR UPDATE shenanigans in execMain.c
would need to be pushed into a separate plan node that could go
underneath the LIMIT node.

Originally this would have led to even more broken behavior --- locks
taken on rows that weren't returned --- because the original coding of
the LIMIT node tended to pull one more row from the lower plan than it
would actually return. But we fixed that.

I think having such a node might allow us to support FOR UPDATE in
subqueries, as well, but I haven't looked at the details. (Whether that
is a good idea is another question --- the problem of pulling rows that
aren't nominally necessary, and thereby locking them, would apply in
spades.)

			regards, tom lane
Re: [PERFORM] Delete query takes exorbitant amount of time
Simon Riggs <[EMAIL PROTECTED]> writes:
> ...but, I see no way for OidFunctionCall8 to ever return an answer of
> "always just 1 row, no matter how big the relation"...so tuples_fetched
> is always proportional to the size of the relation. Are unique indexes
> treated just as very-low-selectivity indexes?

Yeah. It is not the job of amcostestimate to estimate the number of
rows, only the index access cost. (IIRC there is someplace in the
planner that explicitly considers unique indexes as a part of developing
selectivity estimates ... but it's not that part.)

			regards, tom lane
Re: [PERFORM] How to improve db performance with $7K?
On Mon, 2005-03-28 at 17:36 +, Steve Poe wrote:
> I agree with you. Unfortunately, I am not the developer of the
> application. The vendor uses ProIV which connects via ODBC. The vendor
> could certain do some tuning and create more indexes where applicable. I
> am encouraging the vendor to take a more active role and we work
> together on this.

I've done a lot of browsing through pg_stat_activity, looking for queries
that either hang around for a while or show up very often, and using
explain to find out if they can use some assistance.

You may also find that a dump and restore with a reconfiguration to
mirrored drives speeds you up a lot - just from the dump and restore.

> With hardware tuning, I am sure we can do better than 35Mb per sec. Also
> moving the top 3 or 5 tables and indexes to their own slice of a RAID10
> and moving pg_xlog to its own drive will help too.

If your database activity involves a lot of random i/o, 35Mb per second
wouldn't be too bad.

While conventional wisdom is that pg_xlog on its own drives (I know you
meant plural :) ) is a big boost, in my particular case I could never get
a measurable boost that way. Obviously, YMMV.
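For anyone wanting to do the same kind of digging, a query along these lines
works; the column names are those of the pg_stat_activity view from memory of
this era (query text only shows up if stats_command_string is enabled):

    -- Currently active statements, oldest first; long-runners float to the top.
    SELECT procpid, usename, now() - query_start AS running_for, current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>'
    ORDER BY query_start;

The interesting ones can then be fed to EXPLAIN (or EXPLAIN ANALYZE) by hand.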
Re: [PERFORM] Delete query takes exorbitant amount of time
Stephan Szabo <[EMAIL PROTECTED]> writes:
> If there were some way to pass a "limit" into SPI_prepare that was treated
> similarly to a LIMIT clause for planning purposes but didn't actually
> change the output plan to only return that number of rows, we could use
> that.

Hmm ... the planner does have the ability to do that sort of thing (we
use it for cursors). SPI_prepare doesn't expose the capability.
Perhaps adding a SPI_prepare variant that does expose it would be the
quickest route to a solution.

I get a headache every time I look at the RI triggers ;-). Do they
always know at the time of preparing a plan which way it will be used?

			regards, tom lane
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 05:50 -0800, Stephan Szabo wrote:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
> > [...]
> > We know whether the check will be limit 1 or limit 0 at prepare time, so
> > why not add a LIMIT clause to the SQL so it changes the plan, not just
> > the number of rows returned when the check query executes?
>
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

Well, sorry to ask more...

...but surely we only need the FOR UPDATE clause if we are performing a
CASCADE action? Whereas we only want the LIMIT 1 clause if we are NOT
performing a CASCADE action? That way the two clauses are mutually
exclusive and the problem you outline should never (need to) occur.

The current code doesn't seem to vary the check query according to the
requested FK action...

Best Regards, Simon Riggs
Re: [PERFORM] Delete query takes exorbitant amount of time
Simon Riggs <[EMAIL PROTECTED]> writes:
> If action is NO ACTION or RESTRICT then
> we need to SELECT at most 1 row that matches the criteria
> which means we can use LIMIT 1

> If action is CASCADE, SET NULL, SET DEFAULT then
> we need to UPDATE or DELETE all rows that match the criteria
> which means we mustn't use LIMIT and need to use FOR UPDATE

Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
to break down the cases sufficiently. In particular it matters which
side of the RI constraint you are working from ...

			regards, tom lane
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 09:56 -0500, Tom Lane wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > If there were some way to pass a "limit" into SPI_prepare that was treated
> > similarly to a LIMIT clause for planning purposes but didn't actually
> > change the output plan to only return that number of rows, we could use
> > that.
>
> Hmm ... the planner does have the ability to do that sort of thing (we
> use it for cursors). SPI_prepare doesn't expose the capability.
> Perhaps adding a SPI_prepare variant that does expose it would be the
> quickest route to a solution.
>
> I get a headache every time I look at the RI triggers ;-). Do they
> always know at the time of preparing a plan which way it will be used?

If action is NO ACTION or RESTRICT then
we need to SELECT at most 1 row that matches the criteria,
which means we can use LIMIT 1.

If action is CASCADE, SET NULL, SET DEFAULT then
we need to UPDATE or DELETE all rows that match the criteria,
which means we mustn't use LIMIT and need to use FOR UPDATE.

We know that at CONSTRAINT creation time, which always occurs before
plan preparation time.

Best Regards, Simon Riggs
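Spelled out against a hypothetical referencing table, the two shapes Simon
means are roughly the following (as Tom notes above, the real case breakdown
is finer-grained than this; all names are invented):

    -- NO ACTION / RESTRICT on delete of a referenced row: one match is enough
    -- to raise the error, so a LIMIT 1 would suffice.
    SELECT 1 FROM measurement WHERE station_id = $1 LIMIT 1;

    -- ON DELETE CASCADE / SET NULL / SET DEFAULT: every referencing row must
    -- be touched, so no LIMIT is possible.
    DELETE FROM measurement WHERE station_id = $1;
    -- or: UPDATE measurement SET station_id = NULL WHERE station_id = $1;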
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > If action is NO ACTION or RESTRICT then
> > we need to SELECT at most 1 row that matches the criteria
> > which means we can use LIMIT 1
>
> > If action is CASCADE, SET NULL, SET DEFAULT then
> > we need to UPDATE or DELETE all rows that match the criteria
> > which means we mustn't use LIMIT and need to use FOR UPDATE
>
> Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
> to break down the cases sufficiently. In particular it matters which
> side of the RI constraint you are working from ...

OK... too quick, sorry. I'll hand over to Stephan for a better and more
exhaustive explanation/analysis... but AFAICS we *can* always know the
correct formulation of the query at prepare time, whether or not we do
currently.

Best Regards, Simon Riggs
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 29 Mar 2005, Simon Riggs wrote:
> On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> > [...]
> > Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
> > to break down the cases sufficiently. In particular it matters which
> > side of the RI constraint you are working from ...
>
> OK... too quick, sorry. I'll hand over to Stephan for a better and more
> exhaustive explanation/analysis... but AFAICS we *can* always know the
> correct formulation of the query at prepare time, whether or not we do
> currently.

We currently use FOR UPDATE on the NO ACTION check, because otherwise we
might get back a row that's already marked for deletion by a concurrent
transaction. I think that's intended to wait and succeed, not fail.
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 29 Mar 2005, Tom Lane wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > If there were some way to pass a "limit" into SPI_prepare that was treated
> > similarly to a LIMIT clause for planning purposes but didn't actually
> > change the output plan to only return that number of rows, we could use
> > that.
>
> Hmm ... the planner does have the ability to do that sort of thing (we
> use it for cursors). SPI_prepare doesn't expose the capability.
> Perhaps adding a SPI_prepare variant that does expose it would be the
> quickest route to a solution.
>
> I get a headache every time I look at the RI triggers ;-). Do they

Me too, honestly.

> always know at the time of preparing a plan which way it will be used?

I believe so. I think each saved plan pretty much lives for a single
trigger type/argument set and is basically used in only one place.
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 09:40 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > ...but, I see no way for OidFunctionCall8 to ever return an answer of
> > "always just 1 row, no matter how big the relation"...so tuples_fetched
> > is always proportional to the size of the relation. Are unique indexes
> > treated just as very-low-selectivity indexes?
>
> Yeah. It is not the job of amcostestimate to estimate the number of
> rows, only the index access cost. (IIRC there is someplace in the
> planner that explicitly considers unique indexes as a part of developing
> selectivity estimates ... but it's not that part.)

Well, I mention this because costsize.c:cost_index *does* calculate the
number of rows returned. If unique indexes are handled elsewhere then
this would not cause problems for them...but for LIMIT queries..?

cost_index gets the selectivity then multiplies that by the number of
tuples in the relation to calc tuples_fetched, so it can use that in the
Mackert & Lohman formula. There's no consideration of the query limits.

That implies to me that LIMIT queries are not considered correctly in
the M&L formula and thus we are more likely to calculate a too-high cost
for using an index in those circumstances and thus more likely to
SeqScan for medium sized relations?

Best Regards, Simon Riggs
Re: [PERFORM] Delete query takes exorbitant amount of time
Simon Riggs <[EMAIL PROTECTED]> writes:
> That implies to me that LIMIT queries are not considered correctly in
> the M&L formula and thus we are more likely to calculate a too-high cost
> for using an index in those circumstances and thus more likely to
> SeqScan for medium sized relations?

You misunderstand how LIMIT is handled. The plan structure is

	LIMIT ...
		regular plan ...

and so the strategy is to plan and cost the regular plan as though it
would be carried out in full, and then take an appropriate fraction
of that at the LIMIT stage.

			regards, tom lane
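This structure is visible in any EXPLAIN of a LIMIT query, for instance
(hypothetical table, cost numbers omitted):

    EXPLAIN SELECT * FROM measurement WHERE station_id = 10 LIMIT 1;
    -- Typical output shape:
    --   Limit
    --     ->  Seq Scan on measurement
    --           Filter: (station_id = 10)
    -- The lower node carries the cost of running the scan to completion; the
    -- Limit node on top then charges only a fraction of that.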
Re: [PERFORM] How to improve db performance with $7K?
Dave Cramer <[EMAIL PROTECTED]> writes:
> PFC wrote:
> >
> > My Laptop does 19 MB/s (reading <10 KB files, reiser4) !
>
> Yeah, 35Mb per sec is slow for a raid controller, the 3ware mirrored is
> about 50Mb/sec, and striped is about 100

Well you're comparing apples and oranges here. A modern 7200rpm drive
should be capable of doing 40-50MB/s depending on the location of the
data on the disk. But that's only doing sequential access of data using
something like dd and without other processes intervening and causing
seeks.

In practice it seems a busy database sees random_page_costs of about 4,
which for a drive with 10ms seek time translates to only about 3.2MB/s.

I think the first order of business is getting pg_xlog onto its own
device. That alone should remove a lot of the seeking. If it's an ext3
device I would also consider moving the journal to a dedicated drive as
well. (Or if they're scsi drives or you're sure the raid controller is
safe from write caching, then just switch file systems to something that
doesn't journal data.)

-- 
greg
[PERFORM] Million of rows
Hi everybody... I'm new here, and I will try to explain my problem; maybe I
can get some help...

I have been writing a piece of software for 3 years, and it stores the GPS
positions of vehicles, along with other information. My problem started when
I had to store all the data about the vehicles for about 1 or 2 months.

I have a table called DADO_LIDO, to which I write all the information; its
primary key is DATA (GPS DAY+HOUR) plus the VEHICLE IDENTIFIER. Each vehicle
transmits one position every 30 seconds, so I get something like 2000 rows per
vehicle per day. I already have 2 clients, one with 4000 vehicles and the
other with 500 vehicles.

My application was made in Delphi using ZEOS, which lets me test against both
MySQL and PostgreSQL; I already have both databases.

The problem starts when I have to select historical data for these vehicles
(I store only 2 months) - something like 40 or 50 million rows covering 500
vehicles. Using the keys VEHICLE_ID and GPS_TIME, the performance is very
low...

I need some ideas for better performance in this table when selecting by:

PERIOD / VEHICLE
PERIOD / VEHICLES
PERIOD / VEHICLE / (a bit test on 3 integer columns using logical operators)

Thanks for any help

Vinicius Marques De Bernardi
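Without seeing the real schema, a sketch of the kind of layout and indexing
that usually suits this access pattern; every name and type below is a guess,
not the poster's actual definition:

    -- One row per vehicle per GPS fix.
    CREATE TABLE dado_lido (
        vehicle_id  integer     NOT NULL,
        gps_time    timestamptz NOT NULL,
        flags       integer     NOT NULL DEFAULT 0,  -- the 3 status ints, simplified
        latitude    numeric(9,6),
        longitude   numeric(9,6),
        PRIMARY KEY (vehicle_id, gps_time)
    );

    -- "One vehicle over a period" can use the primary key index directly:
    SELECT * FROM dado_lido
    WHERE vehicle_id = 1234
      AND gps_time >= '2005-02-01' AND gps_time < '2005-03-01';

    -- "All vehicles over a period" benefits from an index led by the timestamp:
    CREATE INDEX dado_lido_gps_time_idx ON dado_lido (gps_time);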
Re: [PERFORM] Delete query takes exorbitant amount of time
On Tue, 2005-03-29 at 12:31 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > That implies to me that LIMIT queries are not considered correctly in
> > the M&L formula and thus we are more likely to calculate a too-high cost
> > for using an index in those circumstances and thus more likely to
> > SeqScan for medium sized relations?
>
> You misunderstand how LIMIT is handled.

Huh? Well, not this time. (Though my error rate is admittedly high.)

> The plan structure is
>
> 	LIMIT ...
> 		regular plan ...
>
> and so the strategy is to plan and cost the regular plan as though it
> would be carried out in full, and then take an appropriate fraction
> of that at the LIMIT stage.

To cost it as if it would be carried out in full and then not execute in
full is the same thing as saying it overestimates the actual execution
cost. Which can lead to selection of a SeqScan plan when the IndexScan
would have been cheaper, all things considered.

...it could work like this

	LIMIT
		regular plan (plan chosen knowing that LIMIT follows)

so that the LIMIT would be considered in the M&L formula.

Not that I am driven by how other systems work, but both DB2 and Oracle
allow this form of optimization. There's not a huge benefit in sending
LIMIT 1 through on the FK check queries unless they'd be taken into
account in the planning.

Anyway, I'm not saying I know how to do this yet/ever, just to say it is
possible to use the information available to better effect. This looks
like a TODO item to me? Thoughts?

Best Regards, Simon Riggs
Re: [PERFORM] Million of rows
On Tue, Mar 29, 2005 at 03:33:24PM -0300, Vinicius Bernardi wrote:
>
> But now the problem starts when I has to select data from this
> vehicles about the history ( I store only 2 months ) something like 40
> or 50 millions of data about 500 vehicles.
>
> Using the keys VEHICLE_ID and GPS_TIME, the perfomance is very low...

Please post an example query and the EXPLAIN ANALYZE output. The
table definition might be useful too.

> I need some ideas for a better perfomance in this table

Do you have indexes where you need them? Do you cluster on any of
the indexes? Do you VACUUM and ANALYZE the database regularly?
Have you investigated whether you need to increase the statistics
on any columns? Have you tuned postgresql.conf? What version of
PostgreSQL are you using?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
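For reference, the commands behind that checklist look roughly like this;
table, column, and index names follow the hypothetical dado_lido sketch
earlier and are only placeholders:

    -- Keep statistics and free-space information up to date:
    VACUUM ANALYZE dado_lido;

    -- Give the planner more detail on a column it misestimates, then re-analyze:
    ALTER TABLE dado_lido ALTER COLUMN vehicle_id SET STATISTICS 100;
    ANALYZE dado_lido;

    -- Physically order the table by the index the big range scans use
    -- (8.0-era CLUSTER syntax):
    CLUSTER dado_lido_gps_time_idx ON dado_lido;

    -- And the output Michael is asking for:
    EXPLAIN ANALYZE SELECT * FROM dado_lido
    WHERE vehicle_id = 1234
      AND gps_time >= '2005-02-01' AND gps_time < '2005-03-01';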
Re: [PERFORM] Million of rows
Right now I have this system running on MySQL, and all the examples I have
are in MySQL, but for the biggest client, which starts now, we will use
PostgreSQL, so I need to know how to do these things in Postgres...

Ideas like TABLESPACES or other things... Just looking for starting ideas...

Thanks

Vinicius Marques De Bernardi

On Tue, 29 Mar 2005 12:08:15 -0700, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> On Tue, Mar 29, 2005 at 03:33:24PM -0300, Vinicius Bernardi wrote:
> > [...]
> > Using the keys VEHICLE_ID and GPS_TIME, the perfomance is very low...
>
> Please post an example query and the EXPLAIN ANALYZE output. The
> table definition might be useful too.
>
> > I need some ideas for a better perfomance in this table
>
> Do you have indexes where you need them? Do you cluster on any of
> the indexes? Do you VACUUM and ANALYZE the database regularly?
> Have you investigated whether you need to increase the statistics
> on any columns? Have you tuned postgresql.conf? What version of
> PostgreSQL are you using?
>
> --
> Michael Fuhr
> http://www.fuhr.org/~mfuhr/
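On the tablespace idea specifically, a minimal sketch; the paths and names are
invented, and tablespaces need PostgreSQL 8.0 or later:

    -- A tablespace on a separate set of spindles:
    CREATE TABLESPACE history_space LOCATION '/mnt/disk2/pgdata';

    -- New tables or indexes can be placed there as they are created, e.g. an
    -- index on the big positions table:
    CREATE INDEX dado_lido_period_idx ON dado_lido (gps_time, vehicle_id)
        TABLESPACE history_space;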
[PERFORM] VACUUM on duplicate DB gives FSM and total pages discrepancies
I recently pg_dumpall'd my DB from a (used for testing) v20z install of
postgresql 8.0.1, and restored it to my production (but not yet in service)
v40z running the same version. The test DB has had multiple millions of rows
created/dropped during testing. The results of VACUUM VERBOSE are giving me
pause:

* v40z (hardly used after restore):

orfs=# vacuum full analyze verbose;
INFO:  free space map: 114 relations, 84 pages stored; 1824 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 30 pages = 1864 kB shared memory.
VACUUM

* v20z (after having undergone extensive tweaking, deletes, and inserts):

orfs=# vacuum full analyze verbose;
INFO:  free space map: 53 relations, 13502 pages stored; 9776 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 30 pages = 1864 kB shared memory.
VACUUM

1) Should I be concerned about the total pages needed? ISTM that using that
   many more pages can't help but degrade relative performance on my testing
   machine.

2) How is it that my FSM has different numbers of relations?

3) Are either of these effects normal for an oft-used (or not) DB?

FWIW:
                        v40z      v20z
maintenance_work_mem    262144    16384
shared_buffers          3         1000

Thanks,
-- 
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
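For readers comparing their own numbers: the figures VACUUM VERBOSE reports
are governed by the free space map settings, which can be inspected like this
(the values themselves live in postgresql.conf and, in this era of PostgreSQL,
changing them requires a server restart):

    -- Current free space map sizing, to compare against the
    -- "pages stored / total pages needed" lines above:
    SHOW max_fsm_pages;
    SHOW max_fsm_relations;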