Re: Why our Valgrind reports suck

2025-05-23 Thread Tom Lane
Here's a v3 patchset that's rebased up to HEAD (on top of the extracted fixes I already pushed) and responds to your review comments. I adopted your suggestions except for a couple: Andres Freund writes: > 0007: > + /* Run the rest in xact context to avoid Valgrind leak complaints */ > +

Re: Why our Valgrind reports suck

2025-05-23 Thread Tom Lane
Peter Geoghegan writes: > On Fri, May 23, 2025 at 11:42 AM Andres Freund wrote: >>> I'm envisioning this patch series as v19 work, were you >>> thinking we should be more aggressive? >> Mostly agreed - but I am wondering if the AV fix should be backpatched? > I think that it probably should be.

Re: Why our Valgrind reports suck

2025-05-23 Thread Peter Geoghegan
On Fri, May 23, 2025 at 11:42 AM Andres Freund wrote: > > I'm envisioning this patch series as v19 work, were you > > thinking we should be more aggressive? > > Mostly agreed - but I am wondering if the AV fix should be backpatched? I think that it probably should be. -- Peter Geoghegan

Re: Why our Valgrind reports suck

2025-05-23 Thread Andres Freund
Hi, On 2025-05-22 21:48:24 -0400, Tom Lane wrote: > Andres Freund writes: > >> But this is the last step to get to zero reported leaks in a run of the > >> core > >> regression tests, so let's do it. > > > I assume that's just about the core tests, not more? I.e. I can't make skink > > enable l

Re: Why our Valgrind reports suck

2025-05-22 Thread Tom Lane
Andres Freund writes: > [ review ] Thanks for the comments! I'll go through them and post an updated version tomorrow. The cfbot is already nagging me for a rebase now that 0013 is moot. >> But this is the last step to get to zero reported leaks in a run of the core >> regression tests, so let

Re: Why our Valgrind reports suck

2025-05-22 Thread Andres Freund
Hi, 0001: This is rather wild, I really have only the dimmest memory of that other thread even though I apparently resorted to reading valgrind's source code... I think the vchunk/vpool naming, while not necessarily elegant, is helpful. 0002: Makes sense. 0003: 0004: 0005: Ugh, that's rathe

Re: Why our Valgrind reports suck

2025-05-22 Thread Yasir
On Wed, May 21, 2025 at 10:14 PM Tom Lane wrote: > Here's a v2 patchset that reaches the goal of zero reported leaks > in the core regression tests, with some caveats: > > * Rather than completely fixing the function-cache and > TS-dictionary-cache issues, I just added suppression rules to > hide

Re: Why our Valgrind reports suck

2025-05-21 Thread Tom Lane
I wrote: > Here's a v2 patchset that reaches the goal of zero reported leaks > in the core regression tests, with some caveats: Oh, another caveat is that I ran this with a fairly minimalistic set of build options. In a more complete build, I observed a small leak in xml.c, which I posted a fix f

Re: Why our Valgrind reports suck

2025-05-21 Thread Tom Lane
Here's a v2 patchset that reaches the goal of zero reported leaks in the core regression tests, with some caveats: * Rather than completely fixing the function-cache and TS-dictionary-cache issues, I just added suppression rules to hide them. I am not convinced it is worth working harder than tha

Re: Why our Valgrind reports suck

2025-05-12 Thread Yasir
On Mon, May 12, 2025 at 12:11 AM Tom Lane wrote: > I wrote: > > And, since there's nothing new under the sun around here, > > we already had a discussion about that back in 2021: > > > https://www.postgresql.org/message-id/flat/3471359.1615937770%40sss.pgh.pa.us > > That thread seems to have led

Re: Why our Valgrind reports suck

2025-05-12 Thread Tom Lane
=?utf-8?Q?=C3=81lvaro?= Herrera writes: > On 2025-May-11, Tom Lane wrote: >> In particular, I had not realized that autovacuum >> leaks a nontrivial amount of memory per relation processed (cf 0009), >> and apparently has done for a few releases now. This is horrid in >> databases with many table

Re: Why our Valgrind reports suck

2025-05-12 Thread Álvaro Herrera
On 2025-May-11, Tom Lane wrote: > Okay, here is a patch series that updates the > 0001-Make-memory-contexts-themselves-more-visible-to-valg.patch > patch you posted in that thread, and makes various follow-up > fixes that either fix or paper over various leaks. Wow, that's a lot of extra fixes.

Re: Why our Valgrind reports suck

2025-05-11 Thread Tom Lane
I wrote: > Okay, here is a patch series that updates the > 0001-Make-memory-contexts-themselves-more-visible-to-valg.patch > patch you posted in that thread, I forgot to mention that I did try to implement the "two-level pool" scheme that the Valgrind documentation talks about, and could not make

Re: Why our Valgrind reports suck

2025-05-11 Thread Tom Lane
I wrote: > And, since there's nothing new under the sun around here, > we already had a discussion about that back in 2021: > https://www.postgresql.org/message-id/flat/3471359.1615937770%40sss.pgh.pa.us > That thread seems to have led to fixing some specific bugs, > but we never committed any of t

Re: Why our Valgrind reports suck

2025-05-09 Thread Tom Lane
I wrote: > One thing I noticed while reading the Valgrind manual is that > they describe a facility for "two level" tracking of custom > allocators such as ours. And, since there's nothing new under the sun around here, we already had a discussion about that back in 2021: https://www.postgresql.o

Re: Why our Valgrind reports suck

2025-05-09 Thread Tom Lane
Andres Freund writes: > On 2025-05-08 22:04:06 -0400, Tom Lane wrote: >> A nearby thread [1] reminded me to wonder why we seem to have >> so many false-positive leaks reported by Valgrind these days. > Huh. We use the memory pool client requests to inform valgrind about memory > contexts. I seem

Re: Why our Valgrind reports suck

2025-05-09 Thread Tom Lane
Andres Freund writes: > Briefly looking through the leaks indeed quickly found a real seeming leak, > albeit of limited size: > ProcessStartupPacket() does > buf = palloc(len + 1); > in TopMemoryContext() without ever freeing it. Yeah, I saw that too. Didn't seem worth doing anything about

Re: Why our Valgrind reports suck

2025-05-09 Thread Andres Freund
Hi, On 2025-05-09 11:29:43 -0400, Andres Freund wrote: > We currently don't reset TopMemoryContext at exit, which, obviously, does > massively increase the number of leaks. But OTOH, without that there's not a > whole lot of value in the leak check... Briefly looking through the leaks indeed quic

Re: Why our Valgrind reports suck

2025-05-09 Thread Andres Freund
Hi, On 2025-05-08 22:04:06 -0400, Tom Lane wrote: > A nearby thread [1] reminded me to wonder why we seem to have > so many false-positive leaks reported by Valgrind these days. > For example, at exit of a backend that's executed a couple of > trivial queries, I see > > ==00:00:00:25.515 260013==

Re: Why our Valgrind reports suck

2025-05-09 Thread Yasir
On Fri, May 9, 2025 at 7:04 AM Tom Lane wrote: > A nearby thread [1] reminded me to wonder why we seem to have > so many false-positive leaks reported by Valgrind these days. > For example, at exit of a backend that's executed a couple of > trivial queries, I see > > ==00:00:00:25.515 260013== LE

Why our Valgrind reports suck

2025-05-08 Thread Tom Lane
A nearby thread [1] reminded me to wonder why we seem to have so many false-positive leaks reported by Valgrind these days. For example, at exit of a backend that's executed a couple of trivial queries, I see ==00:00:00:25.515 260013== LEAK SUMMARY: ==00:00:00:25.515 260013==definitely lost: 3