2011/5/16 Greg Smith <g...@2ndquadrant.com>:
> Cédric Villemain wrote:
>>
>>
>> http://git.postgresql.org/gitweb?p=users/c2main/postgres.git;a=shortlog;h=refs/heads/analyze_cache
>>
>
> This rebases easily to make Cedric's changes move to the end; I just pushed
> a version with that change to
> https://github.com/greg2ndQuadrant/postgres/tree/analyze_cache if anyone
> wants a cleaner one to browse.  I've attached a patch too if that's more
> your thing.
>
> I'd recommend not getting too stuck on the particular hook Cédric has added
> here to compute the cache estimate, which uses mmap and mincore to figure it
> out.  It's possible to compute similar numbers, albeit less accurate, using
> an approach similar to how pg_buffercache inspects things.  And I even once
> wrote a background writer extension that collected this sort of data as it
> was running the LRU scan anyway.  Discussions of this idea seem to focus on
> how the "what's in the cache?" data is collected, which as far as I'm
> concerned is the least important part.  There are multiple options, some
> work better than others, and there's no reason that can't be swapped out
> later.  The more important question is how to store the data collected and
> then use it for optimizing queries.

Attached are updated patches without the plugin itself. I've also
added the cache_page_cost GUC, this one is not per tablespace, like
others page_cost.

There are 6 patches:

0001-Add-reloscache-column-to-pg_class.patch
0002-Add-a-function-to-update-the-new-pg_class-cols.patch
0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch
0004-Add-a-Hook-to-handle-OSCache-stats.patch
0005-Add-reloscache-to-Index-Rel-OptInfo.patch
0006-Add-cache_page_cost-GUC.patch

I have some comments on my own code:

* I am not sure of the best datatype to use for 'reloscache'
* I didn't include the catalog number change in the patch itself.
* oscache_update_relstats() is very similar to vac_update_relstats(),
maybe better to merge them but reloscache should not be updated at the
same time than other stats.
* There is probably too much work done in do_oscache_analyze_rel()
because I kept vac_open_indexes() (not a big drama atm)
* I don't know so much how gram.y works, so I am not sure my changes
cover all cases.
* No tests; similar columns and GUC does not have test either, but it
lacks a test for ANALYZE OSCACHE

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
From d2fe7e85aea31cfe8cd6559a060f71c424fe03af Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Wed, 25 May 2011 23:17:36 +0200
Subject: [PATCH 1/7] Add reloscache column to pg_class

1 column reloscache to contain the percentage of pages in cache per relation.
May be used by the planner and updated with ANALYZE OSCACHE; (not done yet,
see next commits)
---
 doc/src/sgml/catalogs.sgml         |   11 +++++++++
 src/backend/catalog/heap.c         |    1 +
 src/backend/utils/cache/relcache.c |    2 +
 src/include/catalog/pg_class.h     |   44 ++++++++++++++++++-----------------
 4 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
new file mode 100644
index 8504555..4cfad39
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 1634,1639 ****
--- 1634,1650 ----
       </row>
  
       <row>
+       <entry><structfield>reloscache</structfield></entry>
+       <entry><type>float4</type></entry>
+       <entry></entry>
+       <entry>
+        Percentage of the files in OS cache.  This is only an estimate used by
+        the planner.  It is updated by <command>ANALYZE OSCACHE</command>.
+        By default, the value is not updated and an extension is required.
+       </entry>
+      </row>
+ 
+      <row>
        <entry><structfield>reltoastrelid</structfield></entry>
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index a6e541d..2043c40
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
*************** InsertPgClassTuple(Relation pg_class_des
*** 764,769 ****
--- 764,770 ----
  	values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
  	values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
  	values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
+ 	values[Anum_pg_class_reloscache - 1] = Float4GetDatum(rd_rel->reloscache);
  	values[Anum_pg_class_reltoastrelid - 1] = ObjectIdGetDatum(rd_rel->reltoastrelid);
  	values[Anum_pg_class_reltoastidxid - 1] = ObjectIdGetDatum(rd_rel->reltoastidxid);
  	values[Anum_pg_class_relhasindex - 1] = BoolGetDatum(rd_rel->relhasindex);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
new file mode 100644
index d7e94ff..ca09e3b
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** formrdesc(const char *relationName, Oid
*** 1417,1422 ****
--- 1417,1423 ----
  
  	relation->rd_rel->relpages = 1;
  	relation->rd_rel->reltuples = 1;
+ 	relation->rd_rel->reloscache = 0;
  	relation->rd_rel->relkind = RELKIND_RELATION;
  	relation->rd_rel->relhasoids = hasoids;
  	relation->rd_rel->relnatts = (int16) natts;
*************** RelationSetNewRelfilenode(Relation relat
*** 2661,2666 ****
--- 2662,2668 ----
  	{
  		classform->relpages = 0;	/* it's empty until further notice */
  		classform->reltuples = 0;
+ 		classform->reloscache = 0;
  	}
  	classform->relfrozenxid = freezeXid;
  
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
new file mode 100644
index ffcce3c..93218ad
*** a/src/include/catalog/pg_class.h
--- b/src/include/catalog/pg_class.h
*************** CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI
*** 45,50 ****
--- 45,51 ----
  	Oid			reltablespace;	/* identifier of table space for relation */
  	int4		relpages;		/* # of blocks (not always up-to-date) */
  	float4		reltuples;		/* # of tuples (not always up-to-date) */
+ 	float4		reloscache;		/* % of files in OS cache (not always up-to-date) */
  	Oid			reltoastrelid;	/* OID of toast table; 0 if none */
  	Oid			reltoastidxid;	/* if toast table, OID of chunk_id index */
  	bool		relhasindex;	/* T if has (or has had) any indexes */
*************** typedef FormData_pg_class *Form_pg_class
*** 92,98 ****
   * ----------------
   */
  
! #define Natts_pg_class					26
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
--- 93,99 ----
   * ----------------
   */
  
! #define Natts_pg_class					27
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
*************** typedef FormData_pg_class *Form_pg_class
*** 103,124 ****
  #define Anum_pg_class_reltablespace		8
  #define Anum_pg_class_relpages			9
  #define Anum_pg_class_reltuples			10
! #define Anum_pg_class_reltoastrelid		11
! #define Anum_pg_class_reltoastidxid		12
! #define Anum_pg_class_relhasindex		13
! #define Anum_pg_class_relisshared		14
! #define Anum_pg_class_relpersistence	15
! #define Anum_pg_class_relkind			16
! #define Anum_pg_class_relnatts			17
! #define Anum_pg_class_relchecks			18
! #define Anum_pg_class_relhasoids		19
! #define Anum_pg_class_relhaspkey		20
! #define Anum_pg_class_relhasrules		21
! #define Anum_pg_class_relhastriggers	22
! #define Anum_pg_class_relhassubclass	23
! #define Anum_pg_class_relfrozenxid		24
! #define Anum_pg_class_relacl			25
! #define Anum_pg_class_reloptions		26
  
  /* ----------------
   *		initial contents of pg_class
--- 104,126 ----
  #define Anum_pg_class_reltablespace		8
  #define Anum_pg_class_relpages			9
  #define Anum_pg_class_reltuples			10
! #define Anum_pg_class_reloscache		11
! #define Anum_pg_class_reltoastrelid		12
! #define Anum_pg_class_reltoastidxid		13
! #define Anum_pg_class_relhasindex		14
! #define Anum_pg_class_relisshared		15
! #define Anum_pg_class_relpersistence	16
! #define Anum_pg_class_relkind			17
! #define Anum_pg_class_relnatts			18
! #define Anum_pg_class_relchecks			19
! #define Anum_pg_class_relhasoids		20
! #define Anum_pg_class_relhaspkey		21
! #define Anum_pg_class_relhasrules		22
! #define Anum_pg_class_relhastriggers	23
! #define Anum_pg_class_relhassubclass	24
! #define Anum_pg_class_relfrozenxid		25
! #define Anum_pg_class_relacl			26
! #define Anum_pg_class_reloptions		27
  
  /* ----------------
   *		initial contents of pg_class
*************** typedef FormData_pg_class *Form_pg_class
*** 130,142 ****
   */
  
  /* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 20 0 f f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 f f p r 25 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 f f p r 26 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
  
  
--- 132,144 ----
   */
  
  /* Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 20 0 f f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 25 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f 3 _null_ _null_ ));
  DESCR("");
  
  
-- 
1.7.5.3

From 1e2dffd1007d488b5848ee7b5fa0016914cc1b60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Wed, 25 May 2011 23:43:27 +0200
Subject: [PATCH 2/7] Add a function to update the new pg_class cols

function is a copy of vac_update_relstats() but
just updating the oscache_percent column
---
 src/backend/commands/vacuum.c |   42 +++++++++++++++++++++++++++++++++++++++++
 src/include/commands/vacuum.h |    4 +++
 2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
new file mode 100644
index 5cbf3a0..ba1f7bb
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vacuum_delay_point(void)
*** 1162,1164 ****
--- 1162,1206 ----
  		CHECK_FOR_INTERRUPTS();
  	}
  }
+ 
+ 
+ /*
+  *	oscache_update_relstats() -- update oscache statistics for one relation
+  *
+  *  /!\ Same comment as function vac_update_relstats()
+  */
+ void
+ oscache_update_relstats(Relation relation,
+ 						float4 per_oscache,
+ 						TransactionId frozenxid)
+ {
+ 	Oid			relid = RelationGetRelid(relation);
+ 	Relation	rd;
+ 	HeapTuple	ctup;
+ 	Form_pg_class pgcform;
+ 	bool		dirty;
+ 
+ 	rd = heap_open(RelationRelationId, RowExclusiveLock);
+ 
+ 	/* Fetch a copy of the tuple to scribble on */
+ 	ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
+ 	if (!HeapTupleIsValid(ctup))
+ 		elog(ERROR, "pg_class entry for relid %u vanished during cache analyze",
+ 			 relid);
+ 	pgcform = (Form_pg_class) GETSTRUCT(ctup);
+ 
+ 	/* Apply required updates, if any, to copied tuple */
+ 
+ 	dirty = false;
+ 	if (pgcform->reloscache != (float4) per_oscache)
+ 	{
+ 		pgcform->reloscache = (float4) per_oscache;
+ 		dirty = true;
+ 	}
+ 
+ 	/* If anything changed, write out the tuple. */
+ 	if (dirty)
+ 		heap_inplace_update(rd, ctup);
+ 
+ 	heap_close(rd, RowExclusiveLock);
+ }
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
new file mode 100644
index cfbe0c4..5bceec0
*** a/src/include/commands/vacuum.h
--- b/src/include/commands/vacuum.h
*************** extern void vacuum_set_xid_limits(int fr
*** 159,164 ****
--- 159,168 ----
  extern void vac_update_datfrozenxid(void);
  extern void vacuum_delay_point(void);
  
+ extern void oscache_update_relstats(Relation relation,
+ 									float4 per_oscache,
+ 									TransactionId frozenxid);
+ 
  /* in commands/vacuumlazy.c */
  extern void lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
  				BufferAccessStrategy bstrategy);
-- 
1.7.5.3

From 76ed77241453adfb10894eed6b03ea799a36923b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Thu, 26 May 2011 00:35:13 +0200
Subject: [PATCH 3/7] Add "ANALYZE OSCACHE [VERBOSE] [relation];"

it updates the oscache column in pg_class
with, currently, dummy functions
---
 src/backend/commands/analyze.c      |  124 ++++++++++++++++++++++++++++++++++-
 src/backend/parser/gram.y           |   22 +++++--
 src/backend/storage/buffer/bufmgr.c |   13 ++++
 src/include/nodes/parsenodes.h      |    3 +-
 src/include/parser/kwlist.h         |    1 +
 src/include/storage/bufmgr.h        |    2 +
 6 files changed, 156 insertions(+), 9 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
new file mode 100644
index 57188bc..a78f46c
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
*************** static BufferAccessStrategy vac_strategy
*** 85,90 ****
--- 85,91 ----
  
  
  static void do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh);
+ static void do_oscache_analyze_rel(Relation onerel, bool inh);
  static void BlockSampler_Init(BlockSampler bs, BlockNumber nblocks,
  				  int samplesize);
  static bool BlockSampler_HasMore(BlockSampler bs);
*************** analyze_rel(Oid relid, VacuumStmt *vacst
*** 228,240 ****
  	/*
  	 * Do the normal non-recursive ANALYZE.
  	 */
! 	do_analyze_rel(onerel, vacstmt, false);
  
  	/*
  	 * If there are child tables, do recursive ANALYZE.
  	 */
  	if (onerel->rd_rel->relhassubclass)
! 		do_analyze_rel(onerel, vacstmt, true);
  
  	/*
  	 * Close source relation now, but keep lock so that no one deletes it
--- 229,249 ----
  	/*
  	 * Do the normal non-recursive ANALYZE.
  	 */
! 	if (vacstmt->options & (VACOPT_OSCACHE))
! 		do_oscache_analyze_rel(onerel, false);
! 	else
! 		do_analyze_rel(onerel, vacstmt, false);
  
  	/*
  	 * If there are child tables, do recursive ANALYZE.
  	 */
  	if (onerel->rd_rel->relhassubclass)
! 	{
! 		if (vacstmt->options & (VACOPT_OSCACHE))
! 			do_oscache_analyze_rel(onerel, true);
! 		else
! 			do_analyze_rel(onerel, vacstmt, true);
! 	}
  
  	/*
  	 * Close source relation now, but keep lock so that no one deletes it
*************** cleanup:
*** 623,628 ****
--- 632,748 ----
  
  	/* Restore current context and release memory */
  	MemoryContextSwitchTo(caller_context);
+ 	MemoryContextDelete(anl_context);
+ 	anl_context = NULL;
+ }
+ 
+ /*
+  *	do_analyze_rel() -- analyze one relation, recursively or not
+  */
+ static void
+ do_oscache_analyze_rel(Relation onerel, bool inh)
+ {
+ 	int			ind;
+ 	Relation   *Irel;
+ 	int			nindexes;
+ 	bool		hasindex;
+ 	AnlIndexData *indexdata;
+ 	PGRUsage	ru0;
+ 	TimestampTz starttime = 0;
+ 	MemoryContext caller_context;
+ 	int			save_nestlevel;
+ 
+ 	if (inh)
+ 		ereport(elevel,
+ 				(errmsg("cache analyzing \"%s.%s\" inheritance tree",
+ 						get_namespace_name(RelationGetNamespace(onerel)),
+ 						RelationGetRelationName(onerel))));
+ 	else
+ 		ereport(elevel,
+ 				(errmsg("cache analyzing \"%s.%s\"",
+ 						get_namespace_name(RelationGetNamespace(onerel)),
+ 						RelationGetRelationName(onerel))));
+ 
+ 	/*
+ 	 * Set up a working context so that we can easily free whatever junk gets
+ 	 * created.
+ 	 */
+ 	anl_context = AllocSetContextCreate(CurrentMemoryContext,
+ 										"Analyze",
+ 										ALLOCSET_DEFAULT_MINSIZE,
+ 										ALLOCSET_DEFAULT_INITSIZE,
+ 										ALLOCSET_DEFAULT_MAXSIZE);
+ 	caller_context = MemoryContextSwitchTo(anl_context);
+ 
+ 	/*
+ 	 * Arrange to make GUC variable changes local to this command.
+ 	 */
+ 	save_nestlevel = NewGUCNestLevel();
+ 
+ 	/* measure elapsed time iff autovacuum logging requires it */
+ 	if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0)
+ 	{
+ 		pg_rusage_init(&ru0);
+ 		if (Log_autovacuum_min_duration > 0)
+ 			starttime = GetCurrentTimestamp();
+ 	}
+ 
+ 	/*
+ 	 * Open all indexes of the relation, and see if there are any analyzable
+ 	 * columns in the indexes.	We do not analyze index columns if there was
+ 	 * an explicit column list in the ANALYZE command, however.  If we are
+ 	 * doing a recursive scan, we don't want to touch the parent's indexes at
+ 	 * all.
+ 	 */
+ 	if (!inh)
+ 		vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
+ 	else
+ 	{
+ 		Irel = NULL;
+ 		nindexes = 0;
+ 	}
+ 	hasindex = (nindexes > 0);
+ 	indexdata = NULL;
+ 
+ 	/*
+ 	 * Update cache stats in pg_class.
+ 	 */
+ 	oscache_update_relstats(onerel,
+ 							RelationGetRelationOSCacheInFork(onerel, MAIN_FORKNUM),
+ 							InvalidTransactionId);
+ 
+ 	/*
+ 	 * Same for indexes.
+ 	 */
+ 	for (ind = 0; ind < nindexes; ind++)
+ 	{
+ 		oscache_update_relstats(Irel[ind],
+ 								RelationGetRelationOSCacheInFork(Irel[ind], MAIN_FORKNUM),
+ 								InvalidTransactionId);
+ 	}
+ 
+ 	/* Done with indexes */
+ 	vac_close_indexes(nindexes, Irel, NoLock);
+ 
+ 	/* Log the action if appropriate */
+ 	if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0)
+ 	{
+ 		if (Log_autovacuum_min_duration == 0 ||
+ 			TimestampDifferenceExceeds(starttime, GetCurrentTimestamp(),
+ 									   Log_autovacuum_min_duration))
+ 			ereport(LOG,
+ 					(errmsg("automatic cache analyze of table \"%s.%s.%s\" system usage: %s",
+ 							get_database_name(MyDatabaseId),
+ 							get_namespace_name(RelationGetNamespace(onerel)),
+ 							RelationGetRelationName(onerel),
+ 							pg_rusage_show(&ru0))));
+ 	}
+ 
+ 	/* Roll back any GUC changes executed by index functions */
+ 	AtEOXact_GUC(false, save_nestlevel);
+ 
+ 	/* Restore current context and release memory */
+ 	MemoryContextSwitchTo(caller_context);
  	MemoryContextDelete(anl_context);
  	anl_context = NULL;
  }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
new file mode 100644
index 1d39674..c312227
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static void SplitColQualList(List *qualL
*** 342,348 ****
  
  %type <boolean> opt_instead
  %type <boolean> opt_unique opt_concurrently opt_verbose opt_full
! %type <boolean> opt_freeze opt_default opt_recheck
  %type <defelt>	opt_binary opt_oids copy_delimiter
  
  %type <boolean> copy_from
--- 342,348 ----
  
  %type <boolean> opt_instead
  %type <boolean> opt_unique opt_concurrently opt_verbose opt_full
! %type <boolean> opt_freeze opt_oscache opt_default opt_recheck
  %type <defelt>	opt_binary opt_oids copy_delimiter
  
  %type <boolean> copy_from
*************** static void SplitColQualList(List *qualL
*** 529,535 ****
  	NULLS_P NUMERIC
  
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
! 	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
  	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
--- 529,535 ----
  	NULLS_P NUMERIC
  
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
! 	ORDER OSCACHE OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
  	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
*************** vacuum_option_elem:
*** 7801,7811 ****
  		;
  
  AnalyzeStmt:
! 			analyze_keyword opt_verbose
  				{
  					VacuumStmt *n = makeNode(VacuumStmt);
  					n->options = VACOPT_ANALYZE;
  					if ($2)
  						n->options |= VACOPT_VERBOSE;
  					n->freeze_min_age = -1;
  					n->freeze_table_age = -1;
--- 7801,7813 ----
  		;
  
  AnalyzeStmt:
! 			analyze_keyword opt_oscache opt_verbose
  				{
  					VacuumStmt *n = makeNode(VacuumStmt);
  					n->options = VACOPT_ANALYZE;
  					if ($2)
+ 						n->options |= VACOPT_OSCACHE;
+ 					if ($3)
  						n->options |= VACOPT_VERBOSE;
  					n->freeze_min_age = -1;
  					n->freeze_table_age = -1;
*************** AnalyzeStmt:
*** 7813,7828 ****
  					n->va_cols = NIL;
  					$$ = (Node *)n;
  				}
! 			| analyze_keyword opt_verbose qualified_name opt_name_list
  				{
  					VacuumStmt *n = makeNode(VacuumStmt);
  					n->options = VACOPT_ANALYZE;
  					if ($2)
  						n->options |= VACOPT_VERBOSE;
  					n->freeze_min_age = -1;
  					n->freeze_table_age = -1;
! 					n->relation = $3;
! 					n->va_cols = $4;
  					$$ = (Node *)n;
  				}
  		;
--- 7815,7832 ----
  					n->va_cols = NIL;
  					$$ = (Node *)n;
  				}
! 			| analyze_keyword opt_oscache opt_verbose qualified_name opt_name_list
  				{
  					VacuumStmt *n = makeNode(VacuumStmt);
  					n->options = VACOPT_ANALYZE;
  					if ($2)
+ 						n->options |= VACOPT_OSCACHE;
+ 					if ($3)
  						n->options |= VACOPT_VERBOSE;
  					n->freeze_min_age = -1;
  					n->freeze_table_age = -1;
! 					n->relation = $4;
! 					n->va_cols = $5;
  					$$ = (Node *)n;
  				}
  		;
*************** opt_freeze: FREEZE									{ $$ = TRUE;
*** 7845,7850 ****
--- 7849,7859 ----
  			| /*EMPTY*/								{ $$ = FALSE; }
  		;
  
+ opt_oscache:
+ 			OSCACHE									{ $$ = TRUE; }
+ 			| /*EMPTY*/                             { $$ = FALSE; }
+ 		;
+ 
  opt_name_list:
  			'(' name_list ')'						{ $$ = $2; }
  			| /*EMPTY*/								{ $$ = NIL; }
*************** type_func_name_keyword:
*** 12158,12163 ****
--- 12167,12173 ----
  			| LIKE
  			| NATURAL
  			| NOTNULL
+ 			| OSCACHE
  			| OUTER_P
  			| OVER
  			| OVERLAPS
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index b12348b..670dbdf
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*************** RelationGetNumberOfBlocksInFork(Relation
*** 1928,1933 ****
--- 1928,1946 ----
  	return smgrnblocks(relation->rd_smgr, forkNum);
  }
  
+ /*
+  * RelationGetRelationOSCacheInFork
+  *		Determines the current percentage of pages in OS cache for the
+  *		relation.
+  */
+ float4
+ RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
+ {
+ 	float4 percent = 0;
+ 
+ 	return percent;
+ }
+ 
  /* ---------------------------------------------------------------------
   *		DropRelFileNodeBuffers
   *
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index 14937d4..bc191c2
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef enum VacuumOption
*** 2417,2423 ****
  	VACOPT_VERBOSE = 1 << 2,	/* print progress info */
  	VACOPT_FREEZE = 1 << 3,		/* FREEZE option */
  	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
! 	VACOPT_NOWAIT = 1 << 5
  } VacuumOption;
  
  typedef struct VacuumStmt
--- 2417,2424 ----
  	VACOPT_VERBOSE = 1 << 2,	/* print progress info */
  	VACOPT_FREEZE = 1 << 3,		/* FREEZE option */
  	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
! 	VACOPT_NOWAIT = 1 << 5,
! 	VACOPT_OSCACHE = 1 << 6		/* do OSCACHE stats analyze */
  } VacuumOption;
  
  typedef struct VacuumStmt
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
new file mode 100644
index 12c2faf..95a7e3d
*** a/src/include/parser/kwlist.h
--- b/src/include/parser/kwlist.h
*************** PG_KEYWORD("option", OPTION, UNRESERVED_
*** 264,269 ****
--- 264,270 ----
  PG_KEYWORD("options", OPTIONS, UNRESERVED_KEYWORD)
  PG_KEYWORD("or", OR, RESERVED_KEYWORD)
  PG_KEYWORD("order", ORDER, RESERVED_KEYWORD)
+ PG_KEYWORD("oscache", OSCACHE, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("out", OUT_P, COL_NAME_KEYWORD)
  PG_KEYWORD("outer", OUTER_P, TYPE_FUNC_NAME_KEYWORD)
  PG_KEYWORD("over", OVER, TYPE_FUNC_NAME_KEYWORD)
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
new file mode 100644
index b8fc87e..7d22b5a
*** a/src/include/storage/bufmgr.h
--- b/src/include/storage/bufmgr.h
*************** extern void CheckPointBuffers(int flags)
*** 179,184 ****
--- 179,186 ----
  extern BlockNumber BufferGetBlockNumber(Buffer buffer);
  extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
  								ForkNumber forkNum);
+ extern float4 RelationGetRelationOSCacheInFork(Relation relation,
+ 								ForkNumber forkNum);
  extern void FlushRelationBuffers(Relation rel);
  extern void FlushDatabaseBuffers(Oid dbid);
  extern void DropRelFileNodeBuffers(RelFileNodeBackend rnode,
-- 
1.7.5.3

From 5a024cbbc3261454c49d8a77afd121605abc3390 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Thu, 26 May 2011 00:46:03 +0200
Subject: [PATCH 4/7] Add a Hook to handle OSCache stats

The hook is in RelationGetRelationOSCacheInFork
---
 src/backend/storage/buffer/bufmgr.c |    9 +++++++++
 src/include/storage/bufmgr.h        |    7 +++++++
 src/tools/pgindent/typedefs.list    |    1 +
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index 670dbdf..9bfa78c
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*************** static volatile BufferDesc *BufferAlloc(
*** 106,111 ****
--- 106,116 ----
  static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln);
  static void AtProcExit_Buffers(int code, Datum arg);
  
+ /*
+  * Hooks for plugin to get control in
+  * RelationGetRelationOSCacheInFork
+  */
+ oscache_hook_type OSCache_hook = NULL;
  
  /*
   * PrefetchBuffer -- initiate asynchronous read of a block of a relation
*************** RelationGetRelationOSCacheInFork(Relatio
*** 1938,1943 ****
--- 1943,1952 ----
  {
  	float4 percent = 0;
  
+ 	/* if a plugin is present, let it manage things */
+ 	if (OSCache_hook)
+ 		percent = (*OSCache_hook) (relation, forkNum);
+ 
  	return percent;
  }
  
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
new file mode 100644
index 7d22b5a..3098275
*** a/src/include/storage/bufmgr.h
--- b/src/include/storage/bufmgr.h
*************** extern void AtProcExit_LocalBuffers(void
*** 217,220 ****
--- 217,227 ----
  extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype);
  extern void FreeAccessStrategy(BufferAccessStrategy strategy);
  
+ /*
+ * Hooks for plugin to get control in
+ * RelationGetRelationOSCacheInFork
+ */
+ typedef float4 (*oscache_hook_type) (Relation relation, ForkNumber forkNum);
+ extern PGDLLIMPORT oscache_hook_type OSCache_hook;
+ 
  #endif
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
new file mode 100644
index f2982cd..a1bf8fe
*** a/src/tools/pgindent/typedefs.list
--- b/src/tools/pgindent/typedefs.list
*************** OprCacheKey
*** 963,968 ****
--- 963,969 ----
  OprInfo
  OprProofCacheEntry
  OprProofCacheKey
+ OSCache_hook_type
  OutputContext
  OverrideSearchPath
  OverrideStackEntry
-- 
1.7.5.3

From d742c0f1c16672c1632cea9936cc846c8b039187 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Thu, 2 Jun 2011 02:27:35 +0200
Subject: [PATCH 5/7] Add reloscache to (Index|Rel)OptInfo

So that costsize.c can use the values
---
 src/backend/optimizer/util/plancat.c |    7 +++++++
 src/include/nodes/relation.h         |    2 ++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
new file mode 100644
index b286816..a9aa4b1
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
*************** static List *get_relation_constraints(Pl
*** 64,69 ****
--- 64,70 ----
   *	indexlist	list of IndexOptInfos for relation's indexes
   *	pages		number of pages
   *	tuples		number of tuples
+  *	oscache		density of page in cache
   *
   * Also, initialize the attr_needed[] and attr_widths[] arrays.  In most
   * cases these are left as zeroes, but sometimes we need to compute attr
*************** get_relation_info(PlannerInfo *root, Oid
*** 111,120 ****
--- 112,125 ----
  	 * case the size will be computed later in set_append_rel_pathlist, and we
  	 * must leave it zero for now to avoid bollixing the total_table_pages
  	 * calculation.
+ 	 * XXX: we need to handle oscache for inherited tables....check that
  	 */
  	if (!inhparent)
+ 	{
+ 		rel->oscache = (float4) relation->rd_rel->reloscache;
  		estimate_rel_size(relation, rel->attr_widths - rel->min_attr,
  						  &rel->pages, &rel->tuples);
+ 	}
  
  	/*
  	 * Make list of indexes.  Ignore indexes on system catalogs if told to.
*************** get_relation_info(PlannerInfo *root, Oid
*** 327,337 ****
--- 332,344 ----
  			 */
  			if (info->indpred == NIL)
  			{
+ 				info->oscache = (float4) indexRelation->rd_rel->reloscache;
  				info->pages = RelationGetNumberOfBlocks(indexRelation);
  				info->tuples = rel->tuples;
  			}
  			else
  			{
+ 				info->oscache = (float4) indexRelation->rd_rel->reloscache;
  				estimate_rel_size(indexRelation, NULL,
  								  &info->pages, &info->tuples);
  				if (info->tuples > rel->tuples)
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
new file mode 100644
index f659269..e94b48b
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct RelOptInfo
*** 408,413 ****
--- 408,414 ----
  	List	   *indexlist;		/* list of IndexOptInfo */
  	BlockNumber pages;
  	double		tuples;
+ 	float4		oscache;		/* density of the relation in cache */
  	struct Plan *subplan;		/* if subquery */
  	List	   *subrtable;		/* if subquery */
  	List	   *subrowmark;		/* if subquery */
*************** typedef struct IndexOptInfo
*** 466,471 ****
--- 467,473 ----
  	/* statistics from pg_class */
  	BlockNumber pages;			/* number of disk pages in index */
  	double		tuples;			/* number of index tuples in index */
+ 	float4		oscache;		/* density of the relation in cache */
  
  	/* index descriptor information */
  	int			ncolumns;		/* number of columns in index */
-- 
1.7.5.3

From 4bea6a4d150b694f3d16546ec865e4be948745ed Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Villemain?= <ced...@2ndquadrant.fr>
Date: Thu, 2 Jun 2011 02:28:01 +0200
Subject: [PATCH 6/7] Add cache_page_cost GUC

this is used to improve cache estimate and impact in IO cost estimations
---
 doc/src/sgml/config.sgml                      |   13 +++++++++++++
 src/backend/optimizer/path/costsize.c         |    1 +
 src/backend/utils/misc/guc.c                  |   10 ++++++++++
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/optimizer/cost.h                  |    2 ++
 5 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index e835e4b..8bc5560
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SET ENABLE_SEQSCAN TO OFF;
*** 2514,2519 ****
--- 2514,2532 ----
        </listitem>
       </varlistentry>
  
+      <varlistentry id="guc-cache-page-cost" xreflabel="cache_page_cost">
+       <term><varname>cache_page_cost</varname> (<type>floating point</type>)</term>
+       <indexterm>
+        <primary><varname>cache_page_cost</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Sets the planner's estimate of the cost of a
+         fetched page from cache.  The default is 0.1.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
       <varlistentry id="guc-cpu-tuple-cost" xreflabel="cpu_tuple_cost">
        <term><varname>cpu_tuple_cost</varname> (<type>floating point</type>)</term>
        <indexterm>
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index bb38768..a80dc0c
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 100,105 ****
--- 100,106 ----
  
  double		seq_page_cost = DEFAULT_SEQ_PAGE_COST;
  double		random_page_cost = DEFAULT_RANDOM_PAGE_COST;
+ double		cache_page_cost = DEFAULT_CACHE_PAGE_COST;
  double		cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
  double		cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
  double		cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 92391ed..7465c08
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_real ConfigureNames
*** 2392,2397 ****
--- 2392,2407 ----
  		NULL, NULL, NULL
  	},
  	{
+ 		{"cache_page_cost", PGC_USERSET, QUERY_TUNING_COST,
+ 			gettext_noop("Sets the planner's estimate of the cost of a "
+ 			"fetched page from cache."),
+ 			NULL
+ 		},
+ 		&cache_page_cost,
+ 		DEFAULT_CACHE_PAGE_COST, 0, DBL_MAX,
+ 		NULL, NULL, NULL
+ 	},
+ 	{
  		{"cpu_tuple_cost", PGC_USERSET, QUERY_TUNING_COST,
  			gettext_noop("Sets the planner's estimate of the cost of "
  						 "processing each tuple (row)."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
new file mode 100644
index 655dad4..82159e1
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 233,238 ****
--- 233,239 ----
  
  #seq_page_cost = 1.0			# measured on an arbitrary scale
  #random_page_cost = 4.0			# same scale as above
+ #cache_page_cost = 0.1			# same scale as above
  #cpu_tuple_cost = 0.01			# same scale as above
  #cpu_index_tuple_cost = 0.005		# same scale as above
  #cpu_operator_cost = 0.0025		# same scale as above
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index 2763863..f4babd0
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
***************
*** 23,28 ****
--- 23,29 ----
  /* If you change these, update backend/utils/misc/postgresql.sample.conf */
  #define DEFAULT_SEQ_PAGE_COST  1.0
  #define DEFAULT_RANDOM_PAGE_COST  4.0
+ #define DEFAULT_CACHE_PAGE_COST  0.1
  #define DEFAULT_CPU_TUPLE_COST	0.01
  #define DEFAULT_CPU_INDEX_TUPLE_COST 0.005
  #define DEFAULT_CPU_OPERATOR_COST  0.0025
*************** typedef enum
*** 45,50 ****
--- 46,52 ----
  /* parameter variables and flags */
  extern PGDLLIMPORT double seq_page_cost;
  extern PGDLLIMPORT double random_page_cost;
+ extern PGDLLIMPORT double cache_page_cost;
  extern PGDLLIMPORT double cpu_tuple_cost;
  extern PGDLLIMPORT double cpu_index_tuple_cost;
  extern PGDLLIMPORT double cpu_operator_cost;
-- 
1.7.5.3

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to