2011/6/19 Robert Haas <robertmh...@gmail.com>:
> On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark <st...@mit.edu> wrote:
>> On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>> 1. ANALYZE happens far too infrequently to believe that any data taken at ANALYZE time will still be relevant at execution time.
>>> 2. Using data gathered by ANALYZE will make plans less stable, and our users complain not infrequently about the plan instability we already have, therefore we should not add more.
>>> 3. Even if the data were accurate and did not cause plan instability, we have no evidence that using it will improve real-world performance.
>>
>> I feel like this is all baseless FUD. ANALYZE isn't perfect, but it's our interface for telling postgres to gather stats, and we generally agree that having stats and modelling the system behaviour as accurately as practical is the right direction, so we need a specific reason why this stat and this bit of modelling is a bad idea before we dismiss it.
>>
>> I think the kernel of truth in these concerns is simply that everything else ANALYZE looks at mutates only on DML. If you load the same data into two databases and run ANALYZE, you'll get (modulo random sampling) the same stats. And if you never modify it and analyze it again a week later, you'll get the same stats again. So autovacuum can guess when to run analyze based on the number of DML operations, it can run it without regard to how busy the system is, and it can hold off on running it if the data hasn't changed.
>>
>> In the case of the filesystem buffer cache, the cached percentage will vary over time regardless of whether the data changes. Plain select queries will change it; even other activity outside the database will change it. There are a bunch of strategies for mitigating this problem: we might want to look at the cache situation more frequently, discount the results we see more aggressively, and possibly maintain a kind of running average over time.
>>
>> There's another problem which I haven't seen mentioned. Because the access method will affect the cache, there's the possibility of feedback loops. E.g. a freshly loaded system prefers sequential scans for a given table because, without the cache, the seeks of random reads are too expensive... causing it to never load that table into cache... causing that table to never be cached and never switch to an index method. It's possible there are mitigation strategies for this as well, such as keeping a running average over time and discounting the estimates with some heuristic values.
>
> *scratches head*
>
> Well, yeah. I completely agree with you that these are the things we need to worry about. Maybe I did a bad job explaining myself, because ISTM you said my concerns were FUD and then went on to restate them in different words.
>
> I'm not bent out of shape about using ANALYZE to try to gather the information. That's probably a reasonable approach if it turns out we actually need to do it at all. I am not sure we do. What I've argued for in the past is that we start by estimating the percentage of the relation that will be cached based on its size relative to effective_cache_size, and allow the administrator to override the percentage on a per-relation basis if it turns out to be wrong. That would avoid all of these concerns and allow us to focus on the issue of how the caching percentages impact the choice of plan, and whether the plans that pop out are in fact better when you provide information on caching as input. If we have that facility in core, then people can write scripts or plug-in modules to do ALTER TABLE .. SET (caching_percentage = XYZ) every hour or so based on the sorts of statistics that Cedric is gathering here, and users will be able to experiment with a variety of algorithms and determine which ones work the best.
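For concreteness, a rough sketch of the heuristic Robert describes above: derive a cached fraction from the relation's size relative to effective_cache_size, and let an explicit per-relation setting win when one is provided. Everything below (the function name, the "negative means not set" convention, the sample numbers) is illustrative, not taken from any patch.

/*
 * Illustrative only: estimate the fraction of a relation assumed to be
 * cached from its size relative to effective_cache_size, letting an
 * explicit per-relation override (the ALTER TABLE .. SET idea) win.
 */
#include <stdio.h>

static double
estimated_cache_fraction(double rel_pages, double effective_cache_pages,
                         double override_fraction)  /* < 0 means "not set" */
{
    double      frac;

    if (override_fraction >= 0.0)
        return override_fraction;       /* administrator override wins */

    if (rel_pages <= 0.0)
        return 1.0;                     /* empty relation: treat as cached */

    /*
     * The smaller the relation is relative to the cache, the larger the
     * fraction we assume stays resident; clamp to [0, 1].
     */
    frac = effective_cache_pages / rel_pages;
    return (frac > 1.0) ? 1.0 : frac;
}

int
main(void)
{
    /* 100000-page table, effective_cache_size of 25000 pages -> 0.25 */
    printf("%.2f\n", estimated_cache_fraction(100000, 25000, -1.0));
    /* same table, but the administrator forced 90% -> 0.90 */
    printf("%.2f\n", estimated_cache_fraction(100000, 25000, 0.90));
    return 0;
}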
Robert, I am very surprised; my patch does offer exactly that.

1st, I used ANALYZE simply because it was the way I found to update pg_class. You are suggesting ALTER TABLE instead; that is fine, but then give it to me lock-free :) Otherwise there is, ahem, Alvaro's pg_class_ng (which I find interesting because it would make it much easier to have different values on a standby server, if we find a way to make pg_class_ng 'updatable' per server). So, as long as the value can be changed without problems, I don't care where it resides.

2nd, I provided the patches in the last CF precisely so we could move on to the exciting part: the cost-estimate changes. (After all, we can work on the cost estimates now, and if we later find a way to use ALTER TABLE or pg_class_ng, we just switch to that instead of the ANALYZE magic.)

3rd, you can already write a plugin today to set the value of rel_oscache, exactly like the one you would write for an ALTER TABLE SET reloscache... (a minimal sketch of such a plugin follows at the end of this mail):

float4
RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
{
    float4  percent = 0;

    /* if a plugin is present, let it manage things */
    if (OSCache_hook)
        percent = (*OSCache_hook) (relation, forkNum);

    return percent;
}

It looks like the main fear comes from the fact that I used the word ANALYZE...

PS: ANALYZE OSCACHE does *not* run as part of ANALYZE; they are distinct operations. (ANALYZE will not do the job of ANALYZE OSCACHE. We can discuss the grammar; maybe ANALYZE ([OSCACHE], [DATA], ...) would be better.)

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
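A minimal sketch of the kind of plugin point 3 describes, wiring itself into the OSCache_hook shown above. Only the hook variable and its (Relation, ForkNumber) -> float4 shape come from the quoted patch; the typedef name, the header list and the fixed 42% figure are assumptions for illustration, and a real plugin would probe the OS cache (e.g. with mincore()) instead of returning a constant.

#include "postgres.h"
#include "fmgr.h"
#include "utils/rel.h"          /* Relation, ForkNumber (location varies by version) */

PG_MODULE_MAGIC;

/* Assumed to match the hook exposed by the patch; the typedef name is made up. */
typedef float4 (*oscache_hook_type) (Relation relation, ForkNumber forkNum);
extern oscache_hook_type OSCache_hook;

static float4
my_oscache_estimate(Relation relation, ForkNumber forkNum)
{
    /*
     * A real plugin would inspect the relation's segment files with
     * mincore()/fincore() here; return a fixed guess just to show the wiring.
     */
    return 42.0;
}

/* Module entry point: install the hook when the shared library is loaded. */
void
_PG_init(void)
{
    OSCache_hook = my_oscache_estimate;
}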