Hello. Thank you for looking at this.

At Wed, 12 Sep 2018 05:16:52 +0000, "Ideriha, Takeshi" 
<ideriha.take...@jp.fujitsu.com> wrote in 
<4E72940DA2BF16479384A86D54D0988A6F197012@G01JPEXMBKW04>
> Hi, 
> 
> >Subject: Re: Protect syscache from bloating with negative cache entries
> >
> >Hello. The previous v4 patchset was just broken.
> 
> >Somehow the 0004 was merged into the 0003 and applying 0004 results in 
> >failure. I
> >removed 0004 part from the 0003 and rebased and repost it.
> 
> I have some questions about syscache and relcache pruning
> though they may be discussed at upper thread or out of point.
> 
> Can I confirm about catcache pruning?
> syscache_memory_target is the max figure per CatCache.
> (Any CatCache has the same max value.)
> So the total max size of catalog caches is estimated around or 
> slightly more than # of SysCache array times syscache_memory_target.

Right.

> If correct, I'm thinking writing down the above estimation to the document 
> would help db administrators with estimation of memory usage.
> Current description might lead misunderstanding that syscache_memory_target
> is the total size of catalog cache in my impression.

Honestly, I'm not sure that is the right design. However, I don't
think providing such a formula helps users, since they don't know
exactly how many CatCaches and their brethren live in their
server, it is a soft limit, and finally only a few (or just one)
catalogs can reach the limit.

The current design is based on the assumption that we would have
only one extremely-growable cache in one use case.

> Related to the above I just thought changing sysycache_memory_target per 
> CatCache
> would make memory usage more efficient.

We could easily have per-cache settings in CatCache, but how do
we provide the knobs for them? I can think of only too many
possible solutions for that.

> Though I haven't checked if there's a case that each system catalog cache 
> memory usage varies largely,
> pg_class cache might need more memory than others and others might need less.
> But it would be difficult for users to check each CatCache memory usage and 
> tune it
> because right now postgresql hasn't provided a handy way to check them.

I assumed that this would be used without such a means. Someone
suffering from syscache bloat can just set this GUC to avoid the
bloat. That's it.
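For reference, with the GUC names as they appear in the attached patch, that would be a one-or-two-line change in postgresql.conf (the values here are illustrative, not recommendations):

```
cache_memory_target = 512kB	# don't consider pruning below this size per cache
cache_prune_min_age = 300s	# evict entries left unused for 5 minutes
```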

Apart from that, in the current patch syscache_memory_target is
not exact at all in the first place, in order to avoid the
overhead of counting the correct size. The major difference comes
from the size of the cache tuple itself. But I have come to think
that is too much to omit.

As a *PoC*, in the attached patch (which applies to current
master), the sizes of CTups are counted as part of the catcache size.

It also provides a pg_syscache_sizes system view just to give a
rough idea of how such a view might look. I'll consider this more,
but do you have any opinion on it?

=# select relid::regclass, indid::regclass, size from pg_syscache_sizes order 
by size desc;
          relid          |                   indid                   |  size  
-------------------------+-------------------------------------------+--------
 pg_class                | pg_class_oid_index                        | 131072
 pg_class                | pg_class_relname_nsp_index                | 131072
 pg_cast                 | pg_cast_source_target_index               |   5504
 pg_operator             | pg_operator_oprname_l_r_n_index           |   4096
 pg_statistic            | pg_statistic_relid_att_inh_index          |   2048
 pg_proc                 | pg_proc_proname_args_nsp_index            |   2048
..


> Another option is that users only specify the total memory target size and 
> postgres 
> dynamically change each CatCache memory target size according to a certain 
> metric.
> (, which still seems difficult and expensive to develop per benefit)
> What do you think about this?

Given that few caches bloat at once, its effect is not so
different from the current design.

> As you commented here, guc variable syscache_memory_target and
> syscache_prune_min_age are used for both syscache and relcache (HTAB), right?

Right, simply to avoid adding knobs for unclear reasons. Since ...

> Do syscache and relcache have the similar amount of memory usage?

They may be different, but that would not make much difference in
the case of cache bloat.

> If not, I'm thinking that introducing separate guc variable would be fine.
> So as syscache_prune_min_age.

I implemented it that way so that they are easily separable in
case of need, but I'm not sure separating them makes a
significant difference.

Thanks for the opinion; I'll give this more consideration.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bee4afbe4e..6a00141fc9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1617,6 +1617,44 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-syscache-memory-target" xreflabel="syscache_memory_target">
+      <term><varname>syscache_memory_target</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>syscache_memory_target</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the maximum amount of memory to which syscache is expanded
+        without pruning. The value defaults to 0, indicating that pruning is
+        always considered. After exceeding this size, syscache pruning is
+        considered according to
+        <xref linkend="guc-syscache-prune-min-age"/>. If you need to keep a
+        certain number of syscache entries with intermittent usage, try
+        increasing this setting.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-syscache-prune-min-age" xreflabel="syscache_prune_min_age">
+      <term><varname>syscache_prune_min_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>syscache_prune_min_age</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the minimum time in seconds for which a syscache entry must
+        remain unused before it is considered for removal. -1 disables
+        syscache pruning entirely. The value defaults to 600 seconds
+        (<literal>10 minutes</literal>). Syscache entries that are not
+        used for this duration can be removed to prevent syscache bloat. This
+        behavior is suppressed until the size of syscache exceeds
+        <xref linkend="guc-syscache-memory-target"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 875be180fe..df4256466c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -713,6 +713,9 @@ void
 SetCurrentStatementStartTimestamp(void)
 {
 	stmtStartTimestamp = GetCurrentTimestamp();
+
+	/* Set this timestamp as the approximated current time */
+	SetCatCacheClock(stmtStartTimestamp);
 }
 
 /*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7251552419..1a1acd9bc7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -938,6 +938,11 @@ REVOKE ALL ON pg_subscription FROM public;
 GRANT SELECT (subdbid, subname, subowner, subenabled, subslotname, subpublications)
     ON pg_subscription TO public;
 
+-- XXXXXXXXXXXXXXXXXXXXXX
+CREATE VIEW pg_syscache_sizes AS
+  SELECT *
+  FROM pg_get_syscache_sizes();
+
 
 --
 -- We have a few function definitions in here, too.
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 5ddbf6eab1..aafdc4f8f2 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -71,9 +71,24 @@
 #define CACHE6_elog(a,b,c,d,e,f,g)
 #endif
 
+/*
+ * GUC variable to define the minimum size of hash to consider entry eviction.
+ * This variable is shared among various cache mechanisms.
+ */
+int cache_memory_target = 0;
+
+/* GUC variable to define the minimum age, in seconds, of entries that will be
+ * considered for eviction. This variable is shared among various cache
+ * mechanisms.
+ */
+int cache_prune_min_age = 600;
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Timestamp used for any operation on caches. */
+TimestampTz	catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
 					   int nkeys,
 					   Datum v1, Datum v2,
@@ -498,6 +513,7 @@ CatCacheRemoveCTup(CatCache *cache, CatCTup *ct)
 		CatCacheFreeKeys(cache->cc_tupdesc, cache->cc_nkeys,
 						 cache->cc_keyno, ct->keys);
 
+	cache->cc_tupsize -= ct->size;
 	pfree(ct);
 
 	--cache->cc_ntup;
@@ -849,6 +865,7 @@ InitCatCache(int id,
 	cp->cc_nkeys = nkeys;
 	for (i = 0; i < nkeys; ++i)
 		cp->cc_keyno[i] = key[i];
+	cp->cc_tupsize = 0;
 
 	/*
 	 * new cache is initialized as far as we can go for now. print some
@@ -866,9 +883,129 @@ InitCatCache(int id,
 	 */
 	MemoryContextSwitchTo(oldcxt);
 
+	/* initialize catcache reference clock if not done yet */
+	if (catcacheclock == 0)
+		catcacheclock = GetCurrentTimestamp();
+
 	return cp;
 }
 
+/*
+ * CatCacheCleanupOldEntries - Remove infrequently-used entries
+ *
+ * Catcache entries can be left alone for several reasons. We remove them if
+ * they are not accessed for a certain time to prevent catcache from
+ * bloating. The eviction is performed with the similar algorithm with buffer
+ * eviction using access counter. Entries that are accessed several times can
+ * live longer than those that have had no access in the same duration.
+ */
+static bool
+CatCacheCleanupOldEntries(CatCache *cp)
+{
+	int			i;
+	int			nremoved = 0;
+	size_t		hash_size;
+#ifdef CATCACHE_STATS
+	/* These variables are only for debugging purpose */
+	int			ntotal = 0;
+	/*
+	 * The nth element of nentries stores the number of cache entries that
+	 * have lived unaccessed for the corresponding multiple (ageclass) of
+	 * cache_prune_min_age. The index into nremoved_entry is the value of the
+	 * clock-sweep counter, which ranges from 0 to 2.
+	 */
+	double		ageclass[] = {0.05, 0.1, 1.0, 2.0, 3.0, 0.0};
+	int			nentries[] = {0, 0, 0, 0, 0, 0};
+	int			nremoved_entry[3] = {0, 0, 0};
+	int			j;
+#endif
+
+	/* Return immediately if no pruning is wanted */
+	if (cache_prune_min_age < 0)
+		return false;
+
+	/*
+	 * Return without pruning if the size of the hash is below the target.
+	 */
+	hash_size = cp->cc_nbuckets * sizeof(dlist_head);
+	if (hash_size + cp->cc_tupsize < (Size) cache_memory_target * 1024L)
+		return false;
+	
+	/* Search the whole hash for entries to remove */
+	for (i = 0; i < cp->cc_nbuckets; i++)
+	{
+		dlist_mutable_iter iter;
+
+		dlist_foreach_modify(iter, &cp->cc_bucket[i])
+		{
+			CatCTup    *ct = dlist_container(CatCTup, cache_elem, iter.cur);
+			long entry_age;
+			int us;
+
+
+			/*
+			 * Calculate the duration from the time of the last access to the
+			 * "current" time. Since catcacheclock is not advanced within a
+			 * transaction, the entries that are accessed within the current
+			 * transaction won't be pruned.
+			 */
+			TimestampDifference(ct->lastaccess, catcacheclock, &entry_age, &us);
+
+#ifdef CATCACHE_STATS
+			/* count catcache entries for each age class */
+			ntotal++;
+			for (j = 0 ;
+				 ageclass[j] != 0.0 &&
+					 entry_age > cache_prune_min_age * ageclass[j] ;
+				 j++);
+			if (ageclass[j] == 0.0) j--;
+			nentries[j]++;
+#endif
+
+			/*
+			 * Try to remove entries older than cache_prune_min_age seconds.
+			 * Entries not accessed since the last pruning are removed after
+			 * that duration, while entries accessed several times are removed
+			 * only after being left alone for up to three times that
+			 * duration. We don't try to shrink the buckets, since pruning
+			 * effectively caps catcache expansion in the long term.
+			 */
+			if (entry_age > cache_prune_min_age)
+			{
+#ifdef CATCACHE_STATS
+				Assert (ct->naccess >= 0 && ct->naccess <= 2);
+				nremoved_entry[ct->naccess]++;
+#endif
+				if (ct->naccess > 0)
+					ct->naccess--;
+				else
+				{
+					if (!ct->c_list || ct->c_list->refcount == 0)
+					{
+						CatCacheRemoveCTup(cp, ct);
+						nremoved++;
+					}
+				}
+			}
+		}
+	}
+
+#ifdef CATCACHE_STATS
+	ereport(DEBUG1,
+			(errmsg ("removed %d/%d, age(-%.0fs:%d, -%.0fs:%d, *-%.0fs:%d, -%.0fs:%d, -%.0fs:%d) naccessed(0:%d, 1:%d, 2:%d)",
+					 nremoved, ntotal,
+					 ageclass[0] * cache_prune_min_age, nentries[0],
+					 ageclass[1] * cache_prune_min_age, nentries[1],
+					 ageclass[2] * cache_prune_min_age, nentries[2],
+					 ageclass[3] * cache_prune_min_age, nentries[3],
+					 ageclass[4] * cache_prune_min_age, nentries[4],
+					 nremoved_entry[0], nremoved_entry[1], nremoved_entry[2]),
+			 errhidestmt(true)));
+#endif
+
+	return nremoved > 0;
+}
+
 /*
  * Enlarge a catcache, doubling the number of buckets.
  */
@@ -1282,6 +1419,11 @@ SearchCatCacheInternal(CatCache *cache,
 		 */
 		dlist_move_head(bucket, &ct->cache_elem);
 
+		/* Update access information for pruning */
+		if (ct->naccess < 2)
+			ct->naccess++;
+		ct->lastaccess = catcacheclock;
+
 		/*
 		 * If it's a positive entry, bump its refcount and return it. If it's
 		 * negative, we can report failure to the caller.
@@ -1813,7 +1955,6 @@ ReleaseCatCacheList(CatCList *list)
 		CatCacheRemoveCList(list->my_cache, list);
 }
 
-
 /*
  * CatalogCacheCreateEntry
  *		Create a new CatCTup entry, copying the given HeapTuple and other
@@ -1827,11 +1968,12 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 	CatCTup    *ct;
 	HeapTuple	dtp;
 	MemoryContext oldcxt;
+	int			tupsize = 0;
 
 	/* negative entries have no tuple associated */
 	if (ntp)
 	{
 		int			i;
 
 		Assert(!negative);
 
@@ -1850,13 +1993,14 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 		/* Allocate memory for CatCTup and the cached tuple in one go */
 		oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
 
-		ct = (CatCTup *) palloc(sizeof(CatCTup) +
-								MAXIMUM_ALIGNOF + dtp->t_len);
+		tupsize = sizeof(CatCTup) +	MAXIMUM_ALIGNOF + dtp->t_len;
+		ct = (CatCTup *) palloc(tupsize);
 		ct->tuple.t_len = dtp->t_len;
 		ct->tuple.t_self = dtp->t_self;
 		ct->tuple.t_tableOid = dtp->t_tableOid;
 		ct->tuple.t_data = (HeapTupleHeader)
 			MAXALIGN(((char *) ct) + sizeof(CatCTup));
+		ct->size = tupsize;
 		/* copy tuple contents */
 		memcpy((char *) ct->tuple.t_data,
 			   (const char *) dtp->t_data,
@@ -1884,8 +2028,8 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 	{
 		Assert(negative);
 		oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
-		ct = (CatCTup *) palloc(sizeof(CatCTup));
-
+		tupsize = sizeof(CatCTup);
+		ct = (CatCTup *) palloc(tupsize);
 		/*
 		 * Store keys - they'll point into separately allocated memory if not
 		 * by-value.
@@ -1906,17 +2050,24 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 	ct->dead = false;
 	ct->negative = negative;
 	ct->hash_value = hashValue;
+	ct->naccess = 0;
+	ct->lastaccess = catcacheclock;
+	ct->size = tupsize;
 
 	dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
 
 	cache->cc_ntup++;
 	CacheHdr->ch_ntup++;
+	cache->cc_tupsize += tupsize;
 
 	/*
-	 * If the hash table has become too full, enlarge the buckets array. Quite
-	 * arbitrarily, we enlarge when fill factor > 2.
+	 * If the hash table has become too full, try cleanup by removing
+	 * infrequently used entries to make a room for the new entry. If it
+	 * failed, enlarge the bucket array instead.  Quite arbitrarily, we try
+	 * this when fill factor > 2.
 	 */
-	if (cache->cc_ntup > cache->cc_nbuckets * 2)
+	if (cache->cc_ntup > cache->cc_nbuckets * 2 &&
+		!CatCacheCleanupOldEntries(cache))
 		RehashCatCache(cache);
 
 	return ct;
@@ -2118,3 +2269,9 @@ PrintCatCacheListLeakWarning(CatCList *list)
 		 list->my_cache->cc_relname, list->my_cache->id,
 		 list, list->refcount);
 }
+
+int
+CatCacheGetSize(CatCache *cache)
+{
+	return cache->cc_tupsize + cache->cc_nbuckets * sizeof(dlist_head);
+}
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 7271b5880b..490cb8ec8a 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -63,12 +63,14 @@
 #include "storage/lmgr.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
+#include "utils/catcache.h"
 #include "utils/inval.h"
 #include "utils/memutils.h"
 #include "utils/resowner_private.h"
 #include "utils/rls.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
+#include "utils/timestamp.h"
 
 
 /*
@@ -86,6 +88,12 @@
  * guarantee to save a CachedPlanSource without error.
  */
 static CachedPlanSource *first_saved_plan = NULL;
+static CachedPlanSource *last_saved_plan = NULL;
+static int				 num_saved_plans = 0;
+static TimestampTz		 oldest_saved_plan = 0;
+
+/* GUC variables */
+int						 min_cached_plans = 1000;
 
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
@@ -105,6 +113,7 @@ static TupleDesc PlanCacheComputeResultDesc(List *stmt_list);
 static void PlanCacheRelCallback(Datum arg, Oid relid);
 static void PlanCacheFuncCallback(Datum arg, int cacheid, uint32 hashvalue);
 static void PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue);
+static void PruneCachedPlan(void);
 
 /* GUC parameter */
 int	plan_cache_mode;
@@ -210,6 +219,8 @@ CreateCachedPlan(RawStmt *raw_parse_tree,
 	plansource->generic_cost = -1;
 	plansource->total_custom_cost = 0;
 	plansource->num_custom_plans = 0;
+	plansource->last_access = GetCatCacheClock();
+	
 
 	MemoryContextSwitchTo(oldcxt);
 
@@ -425,6 +436,28 @@ CompleteCachedPlan(CachedPlanSource *plansource,
 	plansource->is_valid = true;
 }
 
+/* Move the plansource to the head of the saved list */
+static inline void
+MovePlansourceToFirst(CachedPlanSource *plansource)
+{
+	if (first_saved_plan != plansource)
+	{
+		/* delink this element */
+		if (plansource->next_saved)
+			plansource->next_saved->prev_saved = plansource->prev_saved;
+		if (plansource->prev_saved)
+			plansource->prev_saved->next_saved = plansource->next_saved;
+		if (last_saved_plan == plansource)
+			last_saved_plan = plansource->prev_saved;
+
+		/* insert at the beginning */
+		first_saved_plan->prev_saved = plansource;
+		plansource->next_saved = first_saved_plan;
+		plansource->prev_saved = NULL;
+		first_saved_plan = plansource;
+	}
+}
+
 /*
  * SaveCachedPlan: save a cached plan permanently
  *
@@ -472,6 +505,11 @@ SaveCachedPlan(CachedPlanSource *plansource)
 	 * Add the entry to the global list of cached plans.
 	 */
 	plansource->next_saved = first_saved_plan;
+	if (first_saved_plan)
+		first_saved_plan->prev_saved = plansource;
+	else
+		last_saved_plan = plansource;
+	plansource->prev_saved = NULL;
 	first_saved_plan = plansource;
 
 	plansource->is_saved = true;
@@ -494,7 +532,11 @@ DropCachedPlan(CachedPlanSource *plansource)
 	if (plansource->is_saved)
 	{
 		if (first_saved_plan == plansource)
+		{
 			first_saved_plan = plansource->next_saved;
+			if (first_saved_plan)
+				first_saved_plan->prev_saved = NULL;
+		}
 		else
 		{
 			CachedPlanSource *psrc;
@@ -504,10 +546,19 @@ DropCachedPlan(CachedPlanSource *plansource)
 				if (psrc->next_saved == plansource)
 				{
 					psrc->next_saved = plansource->next_saved;
+					if (psrc->next_saved)
+						psrc->next_saved->prev_saved = psrc;
 					break;
 				}
 			}
 		}
+
+		if (last_saved_plan == plansource)
+		{
+			last_saved_plan = plansource->prev_saved;
+			if (last_saved_plan)
+				last_saved_plan->next_saved = NULL;
+		}
 		plansource->is_saved = false;
 	}
 
@@ -539,6 +590,13 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
 		Assert(plan->magic == CACHEDPLAN_MAGIC);
 		plansource->gplan = NULL;
 		ReleaseCachedPlan(plan, false);
+
+		/* decrement "saved plans" counter */
+		if (plansource->is_saved)
+		{
+			Assert (num_saved_plans > 0);
+			num_saved_plans--;
+		}
 	}
 }
 
@@ -1156,6 +1214,17 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (useResOwner && !plansource->is_saved)
 		elog(ERROR, "cannot apply ResourceOwner to non-saved cached plan");
 
+	/*
+	 * set last-accessed timestamp and move this plan to the first of the list
+	 */
+	if (plansource->is_saved)
+	{
+		plansource->last_access = GetCatCacheClock();
+
+		/* move this plan to the first of the list */
+		MovePlansourceToFirst(plansource);
+	}
+
 	/* Make sure the querytree list is valid and we have parse-time locks */
 	qlist = RevalidateCachedQuery(plansource, queryEnv);
 
@@ -1164,6 +1233,11 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
+		/* Prune cached plans if needed */
+		if (plansource->is_saved &&
+			min_cached_plans >= 0 && num_saved_plans > min_cached_plans)
+				PruneCachedPlan();
+
 		if (CheckCachedPlan(plansource))
 		{
 			/* We want a generic plan, and we already have a valid one */
@@ -1176,6 +1250,11 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
+
+			/* count this new saved plan */
+			if (plansource->is_saved)
+				num_saved_plans++;
+
 			/* Link the new generic plan into the plansource */
 			plansource->gplan = plan;
 			plan->refcount++;
@@ -1864,6 +1943,90 @@ PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue)
 	ResetPlanCache();
 }
 
+/*
+ * PruneCachedPlan: removes the generic plans of "old" saved plans.
+ */
+static void
+PruneCachedPlan(void)
+{
+	CachedPlanSource *plansource;
+	TimestampTz		  currclock = GetCatCacheClock();
+	long			  age;
+	int				  us;
+	int				  nremoved = 0;
+
+	/* do nothing if not wanted */
+	if (cache_prune_min_age < 0 || num_saved_plans <= min_cached_plans)
+		return;
+
+	/* Fast check for oldest cache */
+	if (oldest_saved_plan > 0)
+	{
+		TimestampDifference(oldest_saved_plan, currclock, &age, &us);
+		if (age < cache_prune_min_age)
+			return;
+	}		
+
+	/* last plan is the oldest. */
+	for (plansource = last_saved_plan; plansource; plansource = plansource->prev_saved)
+	{
+		long	plan_age;
+		int		us;
+
+		Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
+
+		/* we want to prune no more plans */
+		if (num_saved_plans <= min_cached_plans)
+			break;
+
+		/*
+		 * No work if it doesn't have a gplan; just skip it. It will be
+		 * moved to the list head along with the other scanned entries.
+		 */
+		if (!plansource->gplan)
+			continue;
+
+		/*
+		 * Check age for pruning. Can exit immediately when finding a
+		 * not-older element.
+		 */
+		TimestampDifference(plansource->last_access, currclock, &plan_age, &us);
+		if (plan_age <= cache_prune_min_age)
+		{
+			/* this entry is the next oldest */
+			oldest_saved_plan = plansource->last_access;
+			break;
+		}
+
+		/*
+		 * Here, remove the generic plan of this plansource if it is not
+		 * actually in use, and move it to the beginning of the list. If the
+		 * plan is in use, just update last_access and move it to the head.
+		 */
+		if (plansource->gplan->refcount <= 1)
+		{
+			ReleaseGenericPlan(plansource);
+			nremoved++;
+		}
+
+		plansource->last_access = currclock;
+	}
+
+	/* move the "removed" plansources all together to the beginning of the list */
+	if (plansource != last_saved_plan && plansource)
+	{
+		plansource->next_saved->prev_saved = NULL;
+		first_saved_plan->prev_saved = last_saved_plan;
+		last_saved_plan->next_saved = first_saved_plan;
+		first_saved_plan = plansource->next_saved;
+		plansource->next_saved = NULL;
+		last_saved_plan = plansource;
+	}
+
+	if (nremoved > 0)
+		elog(DEBUG1, "plancache removed %d/%d", nremoved, num_saved_plans);
+}
+
 /*
  * ResetPlanCache: invalidate all cached plans.
  */
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 2b381782a3..9cdb75afb8 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -73,9 +73,14 @@
 #include "catalog/pg_ts_template.h"
 #include "catalog/pg_type.h"
 #include "catalog/pg_user_mapping.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "nodes/execnodes.h"
 #include "utils/rel.h"
 #include "utils/catcache.h"
 #include "utils/syscache.h"
+#include "utils/tuplestore.h"
+#include "utils/fmgrprotos.h"
 
 
 /*---------------------------------------------------------------------------
@@ -1530,6 +1535,64 @@ RelationSupportsSysCache(Oid relid)
 }
 
 
+/*
+ * Report rough sizes of the syscaches
+ */
+Datum
+pg_get_syscache_sizes(PG_FUNCTION_ARGS)
+{
+#define PG_GET_SYSCACHE_SIZE 3
+	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	TupleDesc	tupdesc;
+	Tuplestorestate *tupstore;
+	MemoryContext per_query_ctx;
+	MemoryContext oldcontext;
+	int	cacheId;
+
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not " \
+						"allowed in this context")));
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+	
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+
+	MemoryContextSwitchTo(oldcontext);
+
+	for (cacheId = 0 ; cacheId < SysCacheSize ; cacheId++)
+	{
+		Datum values[PG_GET_SYSCACHE_SIZE];
+		bool nulls[PG_GET_SYSCACHE_SIZE];
+		int i;
+
+		memset(nulls, 0, sizeof(nulls));
+
+		i = 0;
+		values[i++] = cacheinfo[cacheId].reloid;
+		values[i++] = cacheinfo[cacheId].indoid;
+		values[i++] = Int64GetDatum(CatCacheGetSize(SysCache[cacheId]));
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+	}
+
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
+}
+
 /*
  * OID comparator for pg_qsort
  */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0625eff219..3154574f62 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -79,6 +79,7 @@
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
+#include "utils/catcache.h"
 #include "utils/guc_tables.h"
 #include "utils/float.h"
 #include "utils/memutils.h"
@@ -2113,6 +2114,38 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"cache_memory_target", PGC_USERSET, RESOURCES_MEM,
+			gettext_noop("Sets the minimum syscache size to keep."),
+			gettext_noop("Cache is not pruned before exceeding this size."),
+			GUC_UNIT_KB
+		},
+		&cache_memory_target,
+		0, 0, MAX_KILOBYTES,
+		NULL, NULL, NULL
+	},
+
+	{
+		{"cache_prune_min_age", PGC_USERSET, RESOURCES_MEM,
+			gettext_noop("Sets the minimum unused duration of cache entries before removal."),
			gettext_noop("Cache entries that stay unused for longer than this many seconds are considered for removal."),
+			GUC_UNIT_S
+		},
+		&cache_prune_min_age,
+		600, -1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
+	{
+		{"min_cached_plans", PGC_USERSET, RESOURCES_MEM,
+			gettext_noop("Sets the minimum number of cached plans kept in memory."),
+			gettext_noop("Timeout invalidation of the plan cache is not activated until the number of cached plans reaches this value. -1 means timeout invalidation is always active.")
+		},
+		&min_cached_plans,
+		1000, -1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	/*
 	 * We use the hopefully-safely-small value of 100kB as the compiled-in
 	 * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 7486d20a34..917d7cb5cf 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -126,6 +126,8 @@
 #work_mem = 4MB				# min 64kB
 #maintenance_work_mem = 64MB		# min 1MB
 #autovacuum_work_mem = -1		# min 1MB, or -1 to use maintenance_work_mem
+#cache_memory_target = 0kB	# in kB
+#cache_prune_min_age = 600s	# -1 disables pruning
 #max_stack_depth = 2MB			# min 100kB
 #dynamic_shared_memory_type = posix	# the default is the first option
 					# supported by the operating system:
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 860571440a..c0bfcc9f70 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -9800,6 +9800,15 @@
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o}',
   proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn}',
   prosrc => 'pg_get_replication_slots' },
+{ oid => '3423',
+  descr => 'syscache size',
+  proname => 'pg_get_syscache_sizes', prorows => '100', proisstrict => 'f',
+  proretset => 't', provolatile => 'v', prorettype => 'record',
+  proargtypes => '',
+  proallargtypes => '{oid,oid,int8}',
+  proargmodes => '{o,o,o}',
+  proargnames => '{relid,indid,size}',
+  prosrc => 'pg_get_syscache_sizes' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
   proparallel => 'u', prorettype => 'record', proargtypes => 'name name bool',
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 7b22f9c7bc..9c326d6af6 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -22,6 +22,7 @@
 
 #include "access/htup.h"
 #include "access/skey.h"
+#include "datatype/timestamp.h"
 #include "lib/ilist.h"
 #include "utils/relcache.h"
 
@@ -61,6 +62,7 @@ typedef struct catcache
 	slist_node	cc_next;		/* list link */
 	ScanKeyData cc_skey[CATCACHE_MAXKEYS];	/* precomputed key info for heap
 											 * scans */
+	int			cc_tupsize;		/* total size of catcache tuples in bytes */
 
 	/*
 	 * Keep these at the end, so that compiling catcache.c with CATCACHE_STATS
@@ -119,7 +121,9 @@ typedef struct catctup
 	bool		dead;			/* dead but not yet removed? */
 	bool		negative;		/* negative cache entry? */
 	HeapTupleData tuple;		/* tuple management header */
-
+	int			naccess;		/* # of accesses to this entry, up to 2 */
+	TimestampTz	lastaccess;		/* approx. timestamp of the last usage */
+	int			size;			/* palloc'ed size of this tuple */
 	/*
 	 * The tuple may also be a member of at most one CatCList.  (If a single
 	 * catcache is list-searched with varying numbers of keys, we may have to
@@ -189,6 +193,28 @@ typedef struct catcacheheader
 /* this extern duplicates utils/memutils.h... */
 extern PGDLLIMPORT MemoryContext CacheMemoryContext;
 
+/* for guc.c, not PGDLLIMPORT'ed */
+extern int cache_prune_min_age;
+extern int cache_memory_target;
+
+/* to use as access timestamp of catcache entries */
+extern TimestampTz catcacheclock;
+
+/*
+ * SetCatCacheClock - set timestamp for catcache access record
+ */
+static inline void
+SetCatCacheClock(TimestampTz ts)
+{
+	catcacheclock = ts;
+}
+
+static inline TimestampTz
+GetCatCacheClock(void)
+{
+	return catcacheclock;
+}
+
 extern void CreateCacheMemoryContext(void);
 
 extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid,
@@ -227,5 +253,6 @@ extern void PrepareToInvalidateCacheTuple(Relation relation,
 
 extern void PrintCatCacheLeakWarning(HeapTuple tuple);
 extern void PrintCatCacheListLeakWarning(CatCList *list);
+extern int CatCacheGetSize(CatCache *cache);
 
 #endif							/* CATCACHE_H */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 5fc7903a06..338b3470b7 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -110,11 +110,13 @@ typedef struct CachedPlanSource
 	bool		is_valid;		/* is the query_list currently valid? */
 	int			generation;		/* increments each time we create a plan */
 	/* If CachedPlanSource has been saved, it is a member of a global list */
-	struct CachedPlanSource *next_saved;	/* list link, if so */
+	struct CachedPlanSource *prev_saved;	/* list prev link, if so */
+	struct CachedPlanSource *next_saved;	/* list next link, if so */
 	/* State kept to help decide whether to use custom or generic plans: */
 	double		generic_cost;	/* cost of generic plan, or -1 if not known */
 	double		total_custom_cost;	/* total cost of custom plans so far */
 	int			num_custom_plans;	/* number of plans included in total */
+	TimestampTz	last_access;	/* timestamp of the last usage */
 } CachedPlanSource;
 
 /*
@@ -143,6 +145,9 @@ typedef struct CachedPlan
 	MemoryContext context;		/* context containing this CachedPlan */
 } CachedPlan;
 
+/* GUC variables */
+extern int min_cached_plans;
+extern int plancache_prune_min_age;
 
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
