Hi, My question was more of a fundamental nature as this count by scan seemed to contradict the theory about how to optimize it. I assume(d) the more expensive statistics (e.g., value distribution info) are updated only when outdated too much or on request (manual vacuum). Usually, other/cheap statistics can easily be maintained incrementally and thus reflect actual table state after each update. Of course, the MVCC principle seems to make things a bit more complicated I understand now. But tracking whether statistics are dirty has to be in the system anyway. How does it otherwise decide when to do its own statistics updates? So if explain can get the most recent count, why not use it in the count as well if you know the statistics are still acurate? By the way, a count(*) without any where does occur very frequently if you are dealing with an OLAP load, which is the case in my setting. So, I indeed already 'fixed' the performance problem by precomputing all counts I can predict to be of any use. Anyway, I understood this issue has been subject to discusion before I was on the list (searching the archive/website was/is not very effective, so I didn't know until someone told me so, sorry). So, I leave it to the developers what to do with this topic. Regards, Henk Ernst Tino Wildenhain wrote: hi, On Tue, 2004-10-26 at 10:16, Henk Ernst Blok wrote:Hi Posgres users/developers,Can anyone explain why PosgreSQL (version 7.4.5 on Linux) does a full table scan to compute a count(*) on a base table after a vacuum analyze has been done with no following updates that might have outdated any statistics. Strangly the explain command does give the correct number of tuples instantaniously from the catalog, as one would expect. Still the optimizer thinks it needs a full table scan to do count....The consequence of this seemingly odd count implementation is a very very slow count.How should the query planner know the vacuum was recent enough and there were no modifications to the table since? If you are interested in rough numbers you could read the system tables for the last vacuum statistics. If you need fast count and can spend some cycles on inserts, just make a buffer table with count results after insert. Unqualified count e.g. without a WHERE clause should not need to be used a lot. Regards Tino -- address: DB group, Computer Science, EEMCS Dept., University of Twente, PO Box 217, 7500 AE, ENSCHEDE, THE NETHERLANDS phone: ++31 (0)53 489 3754 (if no response: 3690) email: [EMAIL PROTECTED] WWW: http://www.cs.utwente.nl/~blokh |
- [GENERAL] Strange count(*) implementation? Henk Ernst Blok
- Re: [GENERAL] Strange count(*) implementation? Richard Huxton
- Re: [GENERAL] Strange count(*) implementation? Tino Wildenhain
- Re: [GENERAL] Strange count(*) implementation? Henk Ernst Blok
- Re: [GENERAL] Strange count(*) implementation... Neil Conway
- Re: [GENERAL] Strange count(*) implementation... Alvaro Herrera
- Re: [GENERAL] Strange count(*) implementa... Henk Ernst Blok
- Re: [GENERAL] Strange count(*) implementation... Tino Wildenhain
- Re: [GENERAL] Strange count(*) implementa... Henk Ernst Blok
- Re: [GENERAL] Strange count(*) imple... Tino Wildenhain
- Re: [GENERAL] Strange count(*) i... Henk Ernst Blok