2011/5/16 Greg Smith <g...@2ndquadrant.com>: > Cédric Villemain wrote: >> >> >> http://git.postgresql.org/gitweb?p=users/c2main/postgres.git;a=shortlog;h=refs/heads/analyze_cache >> > > This rebases easily to make Cedric's changes move to the end; I just pushed > a version with that change to > https://github.com/greg2ndQuadrant/postgres/tree/analyze_cache if anyone > wants a cleaner one to browse. I've attached a patch too if that's more > your thing. > > I'd recommend not getting too stuck on the particular hook Cédric has added > here to compute the cache estimate, which uses mmap and mincore to figure it > out. It's possible to compute similar numbers, albeit less accurate, using > an approach similar to how pg_buffercache inspects things. And I even once > wrote a background writer extension that collected this sort of data as it > was running the LRU scan anyway. Discussions of this idea seem to focus on > how the "what's in the cache?" data is collected, which as far as I'm > concerned is the least important part. There are multiple options, some > work better than others, and there's no reason that can't be swapped out > later. The more important question is how to store the data collected and > then use it for optimizing queries.
Attached are updated patches without the plugin itself. I've also added the cache_page_cost GUC, this one is not per tablespace, like others page_cost. There are 6 patches: 0001-Add-reloscache-column-to-pg_class.patch 0002-Add-a-function-to-update-the-new-pg_class-cols.patch 0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch 0004-Add-a-Hook-to-handle-OSCache-stats.patch 0005-Add-reloscache-to-Index-Rel-OptInfo.patch 0006-Add-cache_page_cost-GUC.patch I have some comments on my own code: * I am not sure of the best datatype to use for 'reloscache' * I didn't include the catalog number change in the patch itself. * oscache_update_relstats() is very similar to vac_update_relstats(), maybe better to merge them but reloscache should not be updated at the same time than other stats. * There is probably too much work done in do_oscache_analyze_rel() because I kept vac_open_indexes() (not a big drama atm) * I don't know so much how gram.y works, so I am not sure my changes cover all cases. * No tests; similar columns and GUC does not have test either, but it lacks a test for ANALYZE OSCACHE -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
From d2fe7e85aea31cfe8cd6559a060f71c424fe03af Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Wed, 25 May 2011 23:17:36 +0200 Subject: [PATCH 1/7] Add reloscache column to pg_class 1 column reloscache to contain the percentage of pages in cache per relation. May be used by the planner and updated with ANALYZE OSCACHE; (not done yet, see next commits) --- doc/src/sgml/catalogs.sgml | 11 +++++++++ src/backend/catalog/heap.c | 1 + src/backend/utils/cache/relcache.c | 2 + src/include/catalog/pg_class.h | 44 ++++++++++++++++++----------------- 4 files changed, 37 insertions(+), 21 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml new file mode 100644 index 8504555..4cfad39 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *************** *** 1634,1639 **** --- 1634,1650 ---- </row> <row> + <entry><structfield>reloscache</structfield></entry> + <entry><type>float4</type></entry> + <entry></entry> + <entry> + Percentage of the files in OS cache. This is only an estimate used by + the planner. It is updated by <command>ANALYZE OSCACHE</command>. + By default, the value is not updated and an extension is required. + </entry> + </row> + + <row> <entry><structfield>reltoastrelid</structfield></entry> <entry><type>oid</type></entry> <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c new file mode 100644 index a6e541d..2043c40 *** a/src/backend/catalog/heap.c --- b/src/backend/catalog/heap.c *************** InsertPgClassTuple(Relation pg_class_des *** 764,769 **** --- 764,770 ---- values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace); values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages); values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples); + values[Anum_pg_class_reloscache - 1] = Float4GetDatum(rd_rel->reloscache); values[Anum_pg_class_reltoastrelid - 1] = ObjectIdGetDatum(rd_rel->reltoastrelid); values[Anum_pg_class_reltoastidxid - 1] = ObjectIdGetDatum(rd_rel->reltoastidxid); values[Anum_pg_class_relhasindex - 1] = BoolGetDatum(rd_rel->relhasindex); diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c new file mode 100644 index d7e94ff..ca09e3b *** a/src/backend/utils/cache/relcache.c --- b/src/backend/utils/cache/relcache.c *************** formrdesc(const char *relationName, Oid *** 1417,1422 **** --- 1417,1423 ---- relation->rd_rel->relpages = 1; relation->rd_rel->reltuples = 1; + relation->rd_rel->reloscache = 0; relation->rd_rel->relkind = RELKIND_RELATION; relation->rd_rel->relhasoids = hasoids; relation->rd_rel->relnatts = (int16) natts; *************** RelationSetNewRelfilenode(Relation relat *** 2661,2666 **** --- 2662,2668 ---- { classform->relpages = 0; /* it's empty until further notice */ classform->reltuples = 0; + classform->reloscache = 0; } classform->relfrozenxid = freezeXid; diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h new file mode 100644 index ffcce3c..93218ad *** a/src/include/catalog/pg_class.h --- b/src/include/catalog/pg_class.h *************** CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI *** 45,50 **** --- 45,51 ---- Oid reltablespace; /* identifier of table space for relation */ int4 relpages; /* # of blocks (not always up-to-date) */ float4 reltuples; /* # of tuples (not always up-to-date) */ + float4 reloscache; /* % of files in OS cache (not always up-to-date) */ Oid reltoastrelid; /* OID of toast table; 0 if none */ Oid reltoastidxid; /* if toast table, OID of chunk_id index */ bool relhasindex; /* T if has (or has had) any indexes */ *************** typedef FormData_pg_class *Form_pg_class *** 92,98 **** * ---------------- */ ! #define Natts_pg_class 26 #define Anum_pg_class_relname 1 #define Anum_pg_class_relnamespace 2 #define Anum_pg_class_reltype 3 --- 93,99 ---- * ---------------- */ ! #define Natts_pg_class 27 #define Anum_pg_class_relname 1 #define Anum_pg_class_relnamespace 2 #define Anum_pg_class_reltype 3 *************** typedef FormData_pg_class *Form_pg_class *** 103,124 **** #define Anum_pg_class_reltablespace 8 #define Anum_pg_class_relpages 9 #define Anum_pg_class_reltuples 10 ! #define Anum_pg_class_reltoastrelid 11 ! #define Anum_pg_class_reltoastidxid 12 ! #define Anum_pg_class_relhasindex 13 ! #define Anum_pg_class_relisshared 14 ! #define Anum_pg_class_relpersistence 15 ! #define Anum_pg_class_relkind 16 ! #define Anum_pg_class_relnatts 17 ! #define Anum_pg_class_relchecks 18 ! #define Anum_pg_class_relhasoids 19 ! #define Anum_pg_class_relhaspkey 20 ! #define Anum_pg_class_relhasrules 21 ! #define Anum_pg_class_relhastriggers 22 ! #define Anum_pg_class_relhassubclass 23 ! #define Anum_pg_class_relfrozenxid 24 ! #define Anum_pg_class_relacl 25 ! #define Anum_pg_class_reloptions 26 /* ---------------- * initial contents of pg_class --- 104,126 ---- #define Anum_pg_class_reltablespace 8 #define Anum_pg_class_relpages 9 #define Anum_pg_class_reltuples 10 ! #define Anum_pg_class_reloscache 11 ! #define Anum_pg_class_reltoastrelid 12 ! #define Anum_pg_class_reltoastidxid 13 ! #define Anum_pg_class_relhasindex 14 ! #define Anum_pg_class_relisshared 15 ! #define Anum_pg_class_relpersistence 16 ! #define Anum_pg_class_relkind 17 ! #define Anum_pg_class_relnatts 18 ! #define Anum_pg_class_relchecks 19 ! #define Anum_pg_class_relhasoids 20 ! #define Anum_pg_class_relhaspkey 21 ! #define Anum_pg_class_relhasrules 22 ! #define Anum_pg_class_relhastriggers 23 ! #define Anum_pg_class_relhassubclass 24 ! #define Anum_pg_class_relfrozenxid 25 ! #define Anum_pg_class_relacl 26 ! #define Anum_pg_class_reloptions 27 /* ---------------- * initial contents of pg_class *************** typedef FormData_pg_class *Form_pg_class *** 130,142 **** */ /* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */ ! DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 20 0 f f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 f f p r 25 0 t f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1259 ( pg_class PGNSP 83 0 PGUID 0 0 0 0 0 0 0 f f p r 26 0 t f f f f 3 _null_ _null_ )); DESCR(""); --- 132,144 ---- */ /* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */ ! DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 20 0 f f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 25 0 t f f f f 3 _null_ _null_ )); DESCR(""); ! DATA(insert OID = 1259 ( pg_class PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f 3 _null_ _null_ )); DESCR(""); -- 1.7.5.3
From 1e2dffd1007d488b5848ee7b5fa0016914cc1b60 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Wed, 25 May 2011 23:43:27 +0200 Subject: [PATCH 2/7] Add a function to update the new pg_class cols function is a copy of vac_update_relstats() but just updating the oscache_percent column --- src/backend/commands/vacuum.c | 42 +++++++++++++++++++++++++++++++++++++++++ src/include/commands/vacuum.h | 4 +++ 2 files changed, 46 insertions(+), 0 deletions(-) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c new file mode 100644 index 5cbf3a0..ba1f7bb *** a/src/backend/commands/vacuum.c --- b/src/backend/commands/vacuum.c *************** vacuum_delay_point(void) *** 1162,1164 **** --- 1162,1206 ---- CHECK_FOR_INTERRUPTS(); } } + + + /* + * oscache_update_relstats() -- update oscache statistics for one relation + * + * /!\ Same comment as function vac_update_relstats() + */ + void + oscache_update_relstats(Relation relation, + float4 per_oscache, + TransactionId frozenxid) + { + Oid relid = RelationGetRelid(relation); + Relation rd; + HeapTuple ctup; + Form_pg_class pgcform; + bool dirty; + + rd = heap_open(RelationRelationId, RowExclusiveLock); + + /* Fetch a copy of the tuple to scribble on */ + ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(ctup)) + elog(ERROR, "pg_class entry for relid %u vanished during cache analyze", + relid); + pgcform = (Form_pg_class) GETSTRUCT(ctup); + + /* Apply required updates, if any, to copied tuple */ + + dirty = false; + if (pgcform->reloscache != (float4) per_oscache) + { + pgcform->reloscache = (float4) per_oscache; + dirty = true; + } + + /* If anything changed, write out the tuple. */ + if (dirty) + heap_inplace_update(rd, ctup); + + heap_close(rd, RowExclusiveLock); + } diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h new file mode 100644 index cfbe0c4..5bceec0 *** a/src/include/commands/vacuum.h --- b/src/include/commands/vacuum.h *************** extern void vacuum_set_xid_limits(int fr *** 159,164 **** --- 159,168 ---- extern void vac_update_datfrozenxid(void); extern void vacuum_delay_point(void); + extern void oscache_update_relstats(Relation relation, + float4 per_oscache, + TransactionId frozenxid); + /* in commands/vacuumlazy.c */ extern void lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy); -- 1.7.5.3
From 76ed77241453adfb10894eed6b03ea799a36923b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Thu, 26 May 2011 00:35:13 +0200 Subject: [PATCH 3/7] Add "ANALYZE OSCACHE [VERBOSE] [relation];" it updates the oscache column in pg_class with, currently, dummy functions --- src/backend/commands/analyze.c | 124 ++++++++++++++++++++++++++++++++++- src/backend/parser/gram.y | 22 +++++-- src/backend/storage/buffer/bufmgr.c | 13 ++++ src/include/nodes/parsenodes.h | 3 +- src/include/parser/kwlist.h | 1 + src/include/storage/bufmgr.h | 2 + 6 files changed, 156 insertions(+), 9 deletions(-) diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c new file mode 100644 index 57188bc..a78f46c *** a/src/backend/commands/analyze.c --- b/src/backend/commands/analyze.c *************** static BufferAccessStrategy vac_strategy *** 85,90 **** --- 85,91 ---- static void do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh); + static void do_oscache_analyze_rel(Relation onerel, bool inh); static void BlockSampler_Init(BlockSampler bs, BlockNumber nblocks, int samplesize); static bool BlockSampler_HasMore(BlockSampler bs); *************** analyze_rel(Oid relid, VacuumStmt *vacst *** 228,240 **** /* * Do the normal non-recursive ANALYZE. */ ! do_analyze_rel(onerel, vacstmt, false); /* * If there are child tables, do recursive ANALYZE. */ if (onerel->rd_rel->relhassubclass) ! do_analyze_rel(onerel, vacstmt, true); /* * Close source relation now, but keep lock so that no one deletes it --- 229,249 ---- /* * Do the normal non-recursive ANALYZE. */ ! if (vacstmt->options & (VACOPT_OSCACHE)) ! do_oscache_analyze_rel(onerel, false); ! else ! do_analyze_rel(onerel, vacstmt, false); /* * If there are child tables, do recursive ANALYZE. */ if (onerel->rd_rel->relhassubclass) ! { ! if (vacstmt->options & (VACOPT_OSCACHE)) ! do_oscache_analyze_rel(onerel, true); ! else ! do_analyze_rel(onerel, vacstmt, true); ! } /* * Close source relation now, but keep lock so that no one deletes it *************** cleanup: *** 623,628 **** --- 632,748 ---- /* Restore current context and release memory */ MemoryContextSwitchTo(caller_context); + MemoryContextDelete(anl_context); + anl_context = NULL; + } + + /* + * do_analyze_rel() -- analyze one relation, recursively or not + */ + static void + do_oscache_analyze_rel(Relation onerel, bool inh) + { + int ind; + Relation *Irel; + int nindexes; + bool hasindex; + AnlIndexData *indexdata; + PGRUsage ru0; + TimestampTz starttime = 0; + MemoryContext caller_context; + int save_nestlevel; + + if (inh) + ereport(elevel, + (errmsg("cache analyzing \"%s.%s\" inheritance tree", + get_namespace_name(RelationGetNamespace(onerel)), + RelationGetRelationName(onerel)))); + else + ereport(elevel, + (errmsg("cache analyzing \"%s.%s\"", + get_namespace_name(RelationGetNamespace(onerel)), + RelationGetRelationName(onerel)))); + + /* + * Set up a working context so that we can easily free whatever junk gets + * created. + */ + anl_context = AllocSetContextCreate(CurrentMemoryContext, + "Analyze", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + caller_context = MemoryContextSwitchTo(anl_context); + + /* + * Arrange to make GUC variable changes local to this command. + */ + save_nestlevel = NewGUCNestLevel(); + + /* measure elapsed time iff autovacuum logging requires it */ + if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0) + { + pg_rusage_init(&ru0); + if (Log_autovacuum_min_duration > 0) + starttime = GetCurrentTimestamp(); + } + + /* + * Open all indexes of the relation, and see if there are any analyzable + * columns in the indexes. We do not analyze index columns if there was + * an explicit column list in the ANALYZE command, however. If we are + * doing a recursive scan, we don't want to touch the parent's indexes at + * all. + */ + if (!inh) + vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel); + else + { + Irel = NULL; + nindexes = 0; + } + hasindex = (nindexes > 0); + indexdata = NULL; + + /* + * Update cache stats in pg_class. + */ + oscache_update_relstats(onerel, + RelationGetRelationOSCacheInFork(onerel, MAIN_FORKNUM), + InvalidTransactionId); + + /* + * Same for indexes. + */ + for (ind = 0; ind < nindexes; ind++) + { + oscache_update_relstats(Irel[ind], + RelationGetRelationOSCacheInFork(Irel[ind], MAIN_FORKNUM), + InvalidTransactionId); + } + + /* Done with indexes */ + vac_close_indexes(nindexes, Irel, NoLock); + + /* Log the action if appropriate */ + if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0) + { + if (Log_autovacuum_min_duration == 0 || + TimestampDifferenceExceeds(starttime, GetCurrentTimestamp(), + Log_autovacuum_min_duration)) + ereport(LOG, + (errmsg("automatic cache analyze of table \"%s.%s.%s\" system usage: %s", + get_database_name(MyDatabaseId), + get_namespace_name(RelationGetNamespace(onerel)), + RelationGetRelationName(onerel), + pg_rusage_show(&ru0)))); + } + + /* Roll back any GUC changes executed by index functions */ + AtEOXact_GUC(false, save_nestlevel); + + /* Restore current context and release memory */ + MemoryContextSwitchTo(caller_context); MemoryContextDelete(anl_context); anl_context = NULL; } diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y new file mode 100644 index 1d39674..c312227 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *************** static void SplitColQualList(List *qualL *** 342,348 **** %type <boolean> opt_instead %type <boolean> opt_unique opt_concurrently opt_verbose opt_full ! %type <boolean> opt_freeze opt_default opt_recheck %type <defelt> opt_binary opt_oids copy_delimiter %type <boolean> copy_from --- 342,348 ---- %type <boolean> opt_instead %type <boolean> opt_unique opt_concurrently opt_verbose opt_full ! %type <boolean> opt_freeze opt_oscache opt_default opt_recheck %type <defelt> opt_binary opt_oids copy_delimiter %type <boolean> copy_from *************** static void SplitColQualList(List *qualL *** 529,535 **** NULLS_P NUMERIC OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ! ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY --- 529,535 ---- NULLS_P NUMERIC OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ! ORDER OSCACHE OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY *************** vacuum_option_elem: *** 7801,7811 **** ; AnalyzeStmt: ! analyze_keyword opt_verbose { VacuumStmt *n = makeNode(VacuumStmt); n->options = VACOPT_ANALYZE; if ($2) n->options |= VACOPT_VERBOSE; n->freeze_min_age = -1; n->freeze_table_age = -1; --- 7801,7813 ---- ; AnalyzeStmt: ! analyze_keyword opt_oscache opt_verbose { VacuumStmt *n = makeNode(VacuumStmt); n->options = VACOPT_ANALYZE; if ($2) + n->options |= VACOPT_OSCACHE; + if ($3) n->options |= VACOPT_VERBOSE; n->freeze_min_age = -1; n->freeze_table_age = -1; *************** AnalyzeStmt: *** 7813,7828 **** n->va_cols = NIL; $$ = (Node *)n; } ! | analyze_keyword opt_verbose qualified_name opt_name_list { VacuumStmt *n = makeNode(VacuumStmt); n->options = VACOPT_ANALYZE; if ($2) n->options |= VACOPT_VERBOSE; n->freeze_min_age = -1; n->freeze_table_age = -1; ! n->relation = $3; ! n->va_cols = $4; $$ = (Node *)n; } ; --- 7815,7832 ---- n->va_cols = NIL; $$ = (Node *)n; } ! | analyze_keyword opt_oscache opt_verbose qualified_name opt_name_list { VacuumStmt *n = makeNode(VacuumStmt); n->options = VACOPT_ANALYZE; if ($2) + n->options |= VACOPT_OSCACHE; + if ($3) n->options |= VACOPT_VERBOSE; n->freeze_min_age = -1; n->freeze_table_age = -1; ! n->relation = $4; ! n->va_cols = $5; $$ = (Node *)n; } ; *************** opt_freeze: FREEZE { $$ = TRUE; *** 7845,7850 **** --- 7849,7859 ---- | /*EMPTY*/ { $$ = FALSE; } ; + opt_oscache: + OSCACHE { $$ = TRUE; } + | /*EMPTY*/ { $$ = FALSE; } + ; + opt_name_list: '(' name_list ')' { $$ = $2; } | /*EMPTY*/ { $$ = NIL; } *************** type_func_name_keyword: *** 12158,12163 **** --- 12167,12173 ---- | LIKE | NATURAL | NOTNULL + | OSCACHE | OUTER_P | OVER | OVERLAPS diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c new file mode 100644 index b12348b..670dbdf *** a/src/backend/storage/buffer/bufmgr.c --- b/src/backend/storage/buffer/bufmgr.c *************** RelationGetNumberOfBlocksInFork(Relation *** 1928,1933 **** --- 1928,1946 ---- return smgrnblocks(relation->rd_smgr, forkNum); } + /* + * RelationGetRelationOSCacheInFork + * Determines the current percentage of pages in OS cache for the + * relation. + */ + float4 + RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum) + { + float4 percent = 0; + + return percent; + } + /* --------------------------------------------------------------------- * DropRelFileNodeBuffers * diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h new file mode 100644 index 14937d4..bc191c2 *** a/src/include/nodes/parsenodes.h --- b/src/include/nodes/parsenodes.h *************** typedef enum VacuumOption *** 2417,2423 **** VACOPT_VERBOSE = 1 << 2, /* print progress info */ VACOPT_FREEZE = 1 << 3, /* FREEZE option */ VACOPT_FULL = 1 << 4, /* FULL (non-concurrent) vacuum */ ! VACOPT_NOWAIT = 1 << 5 } VacuumOption; typedef struct VacuumStmt --- 2417,2424 ---- VACOPT_VERBOSE = 1 << 2, /* print progress info */ VACOPT_FREEZE = 1 << 3, /* FREEZE option */ VACOPT_FULL = 1 << 4, /* FULL (non-concurrent) vacuum */ ! VACOPT_NOWAIT = 1 << 5, ! VACOPT_OSCACHE = 1 << 6 /* do OSCACHE stats analyze */ } VacuumOption; typedef struct VacuumStmt diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h new file mode 100644 index 12c2faf..95a7e3d *** a/src/include/parser/kwlist.h --- b/src/include/parser/kwlist.h *************** PG_KEYWORD("option", OPTION, UNRESERVED_ *** 264,269 **** --- 264,270 ---- PG_KEYWORD("options", OPTIONS, UNRESERVED_KEYWORD) PG_KEYWORD("or", OR, RESERVED_KEYWORD) PG_KEYWORD("order", ORDER, RESERVED_KEYWORD) + PG_KEYWORD("oscache", OSCACHE, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("out", OUT_P, COL_NAME_KEYWORD) PG_KEYWORD("outer", OUTER_P, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("over", OVER, TYPE_FUNC_NAME_KEYWORD) diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h new file mode 100644 index b8fc87e..7d22b5a *** a/src/include/storage/bufmgr.h --- b/src/include/storage/bufmgr.h *************** extern void CheckPointBuffers(int flags) *** 179,184 **** --- 179,186 ---- extern BlockNumber BufferGetBlockNumber(Buffer buffer); extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation, ForkNumber forkNum); + extern float4 RelationGetRelationOSCacheInFork(Relation relation, + ForkNumber forkNum); extern void FlushRelationBuffers(Relation rel); extern void FlushDatabaseBuffers(Oid dbid); extern void DropRelFileNodeBuffers(RelFileNodeBackend rnode, -- 1.7.5.3
From 5a024cbbc3261454c49d8a77afd121605abc3390 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Thu, 26 May 2011 00:46:03 +0200 Subject: [PATCH 4/7] Add a Hook to handle OSCache stats The hook is in RelationGetRelationOSCacheInFork --- src/backend/storage/buffer/bufmgr.c | 9 +++++++++ src/include/storage/bufmgr.h | 7 +++++++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 17 insertions(+), 0 deletions(-) diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c new file mode 100644 index 670dbdf..9bfa78c *** a/src/backend/storage/buffer/bufmgr.c --- b/src/backend/storage/buffer/bufmgr.c *************** static volatile BufferDesc *BufferAlloc( *** 106,111 **** --- 106,116 ---- static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln); static void AtProcExit_Buffers(int code, Datum arg); + /* + * Hooks for plugin to get control in + * RelationGetRelationOSCacheInFork + */ + oscache_hook_type OSCache_hook = NULL; /* * PrefetchBuffer -- initiate asynchronous read of a block of a relation *************** RelationGetRelationOSCacheInFork(Relatio *** 1938,1943 **** --- 1943,1952 ---- { float4 percent = 0; + /* if a plugin is present, let it manage things */ + if (OSCache_hook) + percent = (*OSCache_hook) (relation, forkNum); + return percent; } diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h new file mode 100644 index 7d22b5a..3098275 *** a/src/include/storage/bufmgr.h --- b/src/include/storage/bufmgr.h *************** extern void AtProcExit_LocalBuffers(void *** 217,220 **** --- 217,227 ---- extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype); extern void FreeAccessStrategy(BufferAccessStrategy strategy); + /* + * Hooks for plugin to get control in + * RelationGetRelationOSCacheInFork + */ + typedef float4 (*oscache_hook_type) (Relation relation, ForkNumber forkNum); + extern PGDLLIMPORT oscache_hook_type OSCache_hook; + #endif diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list new file mode 100644 index f2982cd..a1bf8fe *** a/src/tools/pgindent/typedefs.list --- b/src/tools/pgindent/typedefs.list *************** OprCacheKey *** 963,968 **** --- 963,969 ---- OprInfo OprProofCacheEntry OprProofCacheKey + OSCache_hook_type OutputContext OverrideSearchPath OverrideStackEntry -- 1.7.5.3
From d742c0f1c16672c1632cea9936cc846c8b039187 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Thu, 2 Jun 2011 02:27:35 +0200 Subject: [PATCH 5/7] Add reloscache to (Index|Rel)OptInfo So that costsize.c can use the values --- src/backend/optimizer/util/plancat.c | 7 +++++++ src/include/nodes/relation.h | 2 ++ 2 files changed, 9 insertions(+), 0 deletions(-) diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c new file mode 100644 index b286816..a9aa4b1 *** a/src/backend/optimizer/util/plancat.c --- b/src/backend/optimizer/util/plancat.c *************** static List *get_relation_constraints(Pl *** 64,69 **** --- 64,70 ---- * indexlist list of IndexOptInfos for relation's indexes * pages number of pages * tuples number of tuples + * oscache density of page in cache * * Also, initialize the attr_needed[] and attr_widths[] arrays. In most * cases these are left as zeroes, but sometimes we need to compute attr *************** get_relation_info(PlannerInfo *root, Oid *** 111,120 **** --- 112,125 ---- * case the size will be computed later in set_append_rel_pathlist, and we * must leave it zero for now to avoid bollixing the total_table_pages * calculation. + * XXX: we need to handle oscache for inherited tables....check that */ if (!inhparent) + { + rel->oscache = (float4) relation->rd_rel->reloscache; estimate_rel_size(relation, rel->attr_widths - rel->min_attr, &rel->pages, &rel->tuples); + } /* * Make list of indexes. Ignore indexes on system catalogs if told to. *************** get_relation_info(PlannerInfo *root, Oid *** 327,337 **** --- 332,344 ---- */ if (info->indpred == NIL) { + info->oscache = (float4) indexRelation->rd_rel->reloscache; info->pages = RelationGetNumberOfBlocks(indexRelation); info->tuples = rel->tuples; } else { + info->oscache = (float4) indexRelation->rd_rel->reloscache; estimate_rel_size(indexRelation, NULL, &info->pages, &info->tuples); if (info->tuples > rel->tuples) diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h new file mode 100644 index f659269..e94b48b *** a/src/include/nodes/relation.h --- b/src/include/nodes/relation.h *************** typedef struct RelOptInfo *** 408,413 **** --- 408,414 ---- List *indexlist; /* list of IndexOptInfo */ BlockNumber pages; double tuples; + float4 oscache; /* density of the relation in cache */ struct Plan *subplan; /* if subquery */ List *subrtable; /* if subquery */ List *subrowmark; /* if subquery */ *************** typedef struct IndexOptInfo *** 466,471 **** --- 467,473 ---- /* statistics from pg_class */ BlockNumber pages; /* number of disk pages in index */ double tuples; /* number of index tuples in index */ + float4 oscache; /* density of the relation in cache */ /* index descriptor information */ int ncolumns; /* number of columns in index */ -- 1.7.5.3
From 4bea6a4d150b694f3d16546ec865e4be948745ed Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr> Date: Thu, 2 Jun 2011 02:28:01 +0200 Subject: [PATCH 6/7] Add cache_page_cost GUC this is used to improve cache estimate and impact in IO cost estimations --- doc/src/sgml/config.sgml | 13 +++++++++++++ src/backend/optimizer/path/costsize.c | 1 + src/backend/utils/misc/guc.c | 10 ++++++++++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/optimizer/cost.h | 2 ++ 5 files changed, 27 insertions(+), 0 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml new file mode 100644 index e835e4b..8bc5560 *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *************** SET ENABLE_SEQSCAN TO OFF; *** 2514,2519 **** --- 2514,2532 ---- </listitem> </varlistentry> + <varlistentry id="guc-cache-page-cost" xreflabel="cache_page_cost"> + <term><varname>cache_page_cost</varname> (<type>floating point</type>)</term> + <indexterm> + <primary><varname>cache_page_cost</> configuration parameter</primary> + </indexterm> + <listitem> + <para> + Sets the planner's estimate of the cost of a + fetched page from cache. The default is 0.1. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-cpu-tuple-cost" xreflabel="cpu_tuple_cost"> <term><varname>cpu_tuple_cost</varname> (<type>floating point</type>)</term> <indexterm> diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c new file mode 100644 index bb38768..a80dc0c *** a/src/backend/optimizer/path/costsize.c --- b/src/backend/optimizer/path/costsize.c *************** *** 100,105 **** --- 100,106 ---- double seq_page_cost = DEFAULT_SEQ_PAGE_COST; double random_page_cost = DEFAULT_RANDOM_PAGE_COST; + double cache_page_cost = DEFAULT_CACHE_PAGE_COST; double cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST; double cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST; double cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c new file mode 100644 index 92391ed..7465c08 *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *************** static struct config_real ConfigureNames *** 2392,2397 **** --- 2392,2407 ---- NULL, NULL, NULL }, { + {"cache_page_cost", PGC_USERSET, QUERY_TUNING_COST, + gettext_noop("Sets the planner's estimate of the cost of a " + "fetched page from cache."), + NULL + }, + &cache_page_cost, + DEFAULT_CACHE_PAGE_COST, 0, DBL_MAX, + NULL, NULL, NULL + }, + { {"cpu_tuple_cost", PGC_USERSET, QUERY_TUNING_COST, gettext_noop("Sets the planner's estimate of the cost of " "processing each tuple (row)."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample new file mode 100644 index 655dad4..82159e1 *** a/src/backend/utils/misc/postgresql.conf.sample --- b/src/backend/utils/misc/postgresql.conf.sample *************** *** 233,238 **** --- 233,239 ---- #seq_page_cost = 1.0 # measured on an arbitrary scale #random_page_cost = 4.0 # same scale as above + #cache_page_cost = 0.1 # same scale as above #cpu_tuple_cost = 0.01 # same scale as above #cpu_index_tuple_cost = 0.005 # same scale as above #cpu_operator_cost = 0.0025 # same scale as above diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h new file mode 100644 index 2763863..f4babd0 *** a/src/include/optimizer/cost.h --- b/src/include/optimizer/cost.h *************** *** 23,28 **** --- 23,29 ---- /* If you change these, update backend/utils/misc/postgresql.sample.conf */ #define DEFAULT_SEQ_PAGE_COST 1.0 #define DEFAULT_RANDOM_PAGE_COST 4.0 + #define DEFAULT_CACHE_PAGE_COST 0.1 #define DEFAULT_CPU_TUPLE_COST 0.01 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.005 #define DEFAULT_CPU_OPERATOR_COST 0.0025 *************** typedef enum *** 45,50 **** --- 46,52 ---- /* parameter variables and flags */ extern PGDLLIMPORT double seq_page_cost; extern PGDLLIMPORT double random_page_cost; + extern PGDLLIMPORT double cache_page_cost; extern PGDLLIMPORT double cpu_tuple_cost; extern PGDLLIMPORT double cpu_index_tuple_cost; extern PGDLLIMPORT double cpu_operator_cost; -- 1.7.5.3
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers