[ starting a new thread for this ]

Andres Freund <and...@anarazel.de> writes:
> I wonder if it'd be worth starting to explicitly annotate all the places
> that do allocations and are fine with leaking them.  E.g. by introducing
> malloc_permanently() or such.  Right now it's hard to use valgrind et al
> to detect leaks because of all the false positives due to such "ok to
> leak" allocations.
Out of curiosity I poked at this for a little while.  It doesn't appear
to me that we leak much at all, at least not if you are willing to take
"still reachable" blocks as not-leaked.  Most of the problem is more
subtle than that.

I found just a couple of things that really seem like leaks of permanent
data structures to valgrind:

* Where ps_status.c copies the original "environ" array (on
PS_USE_CLOBBER_ARGV platforms), valgrind thinks that's all leaked,
implying that it doesn't count the "environ" global as a valid reference
to leakable data.  I was able to shut that up by also saving the pointer
into an otherwise-unused static variable.  (This is sort of a poor man's
implementation of your "malloc_permanently" idea; but I doubt it's worth
working harder, given the small number of use-cases.)

* The postmaster's sock_paths and lock_files lists appear to be leaked,
but we're doing that to ourselves by throwing away the pointers to them
without physically freeing the lists.  We can just not do that.

What I found out is that we have a lot of issues that seem to devolve to
valgrind not being sure that a block is referenced.  I identified two
main causes of that:

(1) We have a pointer, but it doesn't actually point right at the start
of the block.  A primary culprit here is lists of thingies that use the
slist and dlist infrastructure.  As an experiment, I moved the
dlist_node fields of some popular structs to the beginning, and verified
that that silences associated complaints.  I'm not sure that we want to
insist on put-the-link-first as policy (although if we did, it could
provide some notational savings perhaps).  However, unless someone knows
of a way to teach valgrind about this situation, there may be no other
way to silence those leakage complaints.  A secondary culprit is people
randomly applying CACHELINEALIGN or the like to a palloc'd address, so
that the address we have isn't pointing right at the block start.
(2) The only pointer to the start of a block is actually somewhere
within the block.  This is common in dynahash tables, where we allocate
a slab of entries in a single palloc and then thread them together.
Each entry should have exactly one referencing pointer, but that pointer
is more likely to be elsewhere within the same palloc block than in the
external hash bucket array.  AFAICT, all cases except where the slab's
first entry is pointed to by a hash bucket pointer confuse valgrind to
some extent.  I was able to hack around this by preventing dynahash from
allocating more than one hash entry per palloc, but I wonder if there's
a better way.

Attached is a very crude hack, not meant for commit, that hacks things
up enough to greatly reduce the number of complaints with
"--leak-check=full".  One thing I've failed to silence so far is a
bunch of entries like

==00:00:03:56.088 3467702== 1,861 bytes in 67 blocks are definitely lost in loss record 1,290 of 1,418
==00:00:03:56.088 3467702==    at 0x950650: MemoryContextAlloc (mcxt.c:827)
==00:00:03:56.088 3467702==    by 0x951710: MemoryContextStrdup (mcxt.c:1179)
==00:00:03:56.088 3467702==    by 0x91C86E: RelationInitIndexAccessInfo (relcache.c:1444)
==00:00:03:56.088 3467702==    by 0x91DA9C: RelationBuildDesc (relcache.c:1200)

which is complaining about the memory context identifiers for system
indexes' rd_indexcxt contexts.  Those are surely not being leaked in any
real sense.  I suspect that this has something to do with valgrind not
counting the context->ident fields as live pointers, but I don't have
enough valgrind-fu to fix that.

Anyway, the bottom line is that I do not think that we have all that
many uses of the pattern you postulated originally.  It's more that
we've designed some valgrind-unfriendly data structures.  We need to
improve that situation to make much progress here.

			regards, tom lane
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 4c7b1e7bfd..cd984929a6 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -888,7 +888,6 @@ RemoveSocketFiles(void)
 		(void) unlink(sock_path);
 	}
 
 	/* Since we're about to exit, no need to reclaim storage */
-	sock_paths = NIL;
 }
 
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 55c9445898..2abdd44190 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -814,7 +814,7 @@ InitCatCache(int id,
 	 * Note: we rely on zeroing to initialize all the dlist headers correctly
 	 */
 	sz = sizeof(CatCache) + PG_CACHE_LINE_SIZE;
-	cp = (CatCache *) CACHELINEALIGN(palloc0(sz));
+	cp = (CatCache *) palloc0(sz);
 	cp->cc_bucket = palloc0(nbuckets * sizeof(dlist_head));
 
 	/*
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 6546e3c7c7..df39bc77df 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -1713,6 +1713,10 @@ element_alloc(HTAB *hashp, int nelem, int freelist_idx)
 	if (hashp->isfixed)
 		return false;
 
+	/* Force separate allocations to de-confuse valgrind */
+	if (!hashp->isshared)
+		nelem = 1;
+
 	/* Each element has a HASHELEMENT header plus user data. */
 	elementSize = MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(hctl->entrysize);
 
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 8b73850d0d..be37e8b312 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -944,7 +944,6 @@ UnlinkLockFiles(int status, Datum arg)
 		/* Should we complain if the unlink fails? */
 	}
 	/* Since we're about to exit, no need to reclaim storage */
-	lock_files = NIL;
 
 	/*
 	 * Lock file removal should always be the last externally visible action
diff --git a/src/backend/utils/misc/ps_status.c b/src/backend/utils/misc/ps_status.c
index 5819faaf2d..2d0ef37f7f 100644
--- a/src/backend/utils/misc/ps_status.c
+++ b/src/backend/utils/misc/ps_status.c
@@ -113,6 +113,11 @@
 static size_t ps_buffer_fixed_size; /* size of the constant prefix */
 static int	save_argc;
 static char **save_argv;
 
+/* We use this to convince Valgrind that replacement environ is referenced */
+#ifdef PS_USE_CLOBBER_ARGV
+static char ** volatile fake_environ;
+#endif
+
 /*
  * Call this early in startup to save the original argc/argv values.
@@ -192,6 +197,8 @@ save_ps_display_args(int argc, char **argv)
 		}
 		new_environ[i] = NULL;
 		environ = new_environ;
+		/* Valgrind tends to think this memory is leaked, so fool it */
+		fake_environ = new_environ;
 	}
 #endif							/* PS_USE_CLOBBER_ARGV */
 
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index fc7706314b..f32a13f786 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -32,6 +32,7 @@
 */
 typedef struct RegisteredBgWorker
 {
+	slist_node	rw_lnode;		/* list link (first to placate valgrind) */
 	BackgroundWorker rw_worker; /* its registry entry */
 	struct bkend *rw_backend;	/* its BackendList entry, or NULL */
 	pid_t		rw_pid;			/* 0 if not running */
@@ -39,7 +40,6 @@ typedef struct RegisteredBgWorker
 	TimestampTz rw_crashed_at;	/* if not 0, time it last crashed */
 	int			rw_shmem_slot;
 	bool		rw_terminate;
-	slist_node	rw_lnode;		/* list link */
 } RegisteredBgWorker;
 
 extern slist_head BackgroundWorkerList;
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index ddc2762eb3..4f2e631562 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -85,6 +85,13 @@ typedef struct catcache
 
 typedef struct catctup
 {
+	/*
+	 * Each tuple in a cache is a member of a dlist that stores the elements
+	 * of its hash bucket.  We keep each dlist in LRU order to speed repeated
+	 * lookups.
+	 */
+	dlist_node	cache_elem;		/* list member of per-bucket list */
+
 	int			ct_magic;		/* for identifying CatCTup entries */
 #define CT_MAGIC	0x57261502
@@ -96,13 +103,6 @@ typedef struct catctup
 	 */
 	Datum		keys[CATCACHE_MAXKEYS];
 
-	/*
-	 * Each tuple in a cache is a member of a dlist that stores the elements
-	 * of its hash bucket.  We keep each dlist in LRU order to speed repeated
-	 * lookups.
-	 */
-	dlist_node	cache_elem;		/* list member of per-bucket list */
-
 	/*
 	 * A tuple marked "dead" must not be returned by subsequent searches.
 	 * However, it won't be physically deleted from the cache until its
@@ -156,13 +156,13 @@ typedef struct catctup
 	 */
 typedef struct catclist
 {
+	dlist_node	cache_elem;		/* list member of per-catcache list */
+
 	int			cl_magic;		/* for identifying CatCList entries */
#define CL_MAGIC	0x52765103
 
 	uint32		hash_value;		/* hash value for lookup keys */
 
-	dlist_node	cache_elem;		/* list member of per-catcache list */
-
 	/*
 	 * Lookup keys for the entry, with the first nkeys elements being valid.
 	 * All by-reference are separately allocated.