On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote:
> Hi,
> 
> On 2023-01-05 13:44:20 -0500, Reid Thompson wrote:
> > This new field displays the current bytes of memory allocated to the
> > backend process. It is updated as memory for the process is
> > malloc'd/free'd. Memory allocated to items on the freelist is included in
> > the displayed value.
> 
> It doesn't actually malloc/free. It tracks palloc/pfree.

I will update the message

> 
> > Dynamic shared memory allocations are included only in the value displayed
> > for the backend that created them, they are not included in the value for
> > backends that are attached to them to avoid double counting.
> 
> As mentioned before, I don't think accounting DSM this way makes sense.

Understood, previously you noted 'There are a few uses of DSMs that track
shared resources, with the biggest likely being the stats for relations
etc'.  I'd like to come up with a solution to address this; identifying the
long term allocations to shared state and accounting for them such that they
don't get 'lost' when the allocating backend exits.  Any guidance or
direction would be appreciated. 

> > --- a/src/backend/postmaster/autovacuum.c
> > +++ b/src/backend/postmaster/autovacuum.c
> > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void)
> >  
> >  #ifndef EXEC_BACKEND
> >             case 0:
> > +                   /* Zero allocated bytes to avoid double counting parent 
> > allocation */
> > +                   pgstat_zero_my_allocated_bytes();
> > +
> >                     /* in postmaster child ... */
> >                     InitPostmasterChild();
> 
> 
> 
> > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void)
> >  
> >  #ifndef EXEC_BACKEND
> >             case 0:
> > +                   /* Zero allocated bytes to avoid double counting parent 
> > allocation */
> > +                   pgstat_zero_my_allocated_bytes();
> > +
> >                     /* in postmaster child ... */
> >                     InitPostmasterChild();
> >  
> > diff --git a/src/backend/postmaster/postmaster.c 
> > b/src/backend/postmaster/postmaster.c
> > index eac3450774..24278e5c18 100644
> > --- a/src/backend/postmaster/postmaster.c
> > +++ b/src/backend/postmaster/postmaster.c
> > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port)
> >     {
> >             free(bn);
> >  
> > +           /* Zero allocated bytes to avoid double counting parent 
> > allocation */
> > +           pgstat_zero_my_allocated_bytes();
> > +
> >             /* Detangle from postmaster */
> >             InitPostmasterChild();
> 
> 
> It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here,
> before even InitPostmasterChild() is called. Nor does it seem right to add the
> call to so many places.
> 
> Note that this is before we even delete postmaster's memory, see e.g.:
>       /*
>        * If the PostmasterContext is still around, recycle the space; we don't
>        * need it anymore after InitPostgres completes.  Note this does not 
> trash
>        * *MyProcPort, because ConnCreate() allocated that space with malloc()
>        * ... else we'd need to copy the Port data first.  Also, subsidiary 
> data
>        * such as the username isn't lost either; see ProcessStartupPacket().
>        */
>       if (PostmasterContext)
>       {
>               MemoryContextDelete(PostmasterContext);
>               PostmasterContext = NULL;
>       }
> 
> calling pgstat_zero_my_allocated_bytes() before we do this will lead to
> undercounting memory usage, afaict.
> 

OK - I'll trace back through these and see if I can better locate and reduce the
number of invocations.

> > +/* Enum helper for reporting memory allocated bytes */
> > +enum allocation_direction
> > +{
> > +   PG_ALLOC_DECREASE = -1,
> > +   PG_ALLOC_IGNORE,
> > +   PG_ALLOC_INCREASE,
> > +};
> 
> What's the point of this?
> 
> 
> > +/* ----------
> > + * pgstat_report_allocated_bytes() -
> > + *
> > + *  Called to report change in memory allocated for this backend.
> > + *
> > + * my_allocated_bytes initially points to local memory, making it safe to 
> > call
> > + * this before pgstats has been initialized. allocation_direction is a
> > + * positive/negative multiplier enum defined above.
> > + * ----------
> > + */
> > +static inline void
> > +pgstat_report_allocated_bytes(int64 allocated_bytes, int 
> > allocation_direction)
> 
> I don't think this should take allocation_direction as a parameter, I'd make
> it two different functions.

Originally it was two functions, a suggestion was made in the thread to
maybe consolidate them to a single function with a direction indicator,
hence the above.  I'm fine with converting it back to separate functions.

> 
> > +   if (allocation_direction == PG_ALLOC_DECREASE &&
> > +           pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, 
> > &temp))
> > +   {
> > +           *my_allocated_bytes = 0;
> > +           ereport(LOG,
> > +                           errmsg("Backend %d deallocated %lld bytes, 
> > exceeding the %llu bytes it is currently reporting allocated. Setting 
> > reported to 0.",
> > +                                      MyProcPid, (long long) 
> > allocated_bytes,
> > +                                      (unsigned long long) 
> > *my_allocated_bytes));
> 
> We certainly shouldn't have an ereport in here. This stuff really needs to be
> cheap.

I will remove the ereport.

> 
> > +           *my_allocated_bytes += (allocated_bytes) * allocation_direction;
> 
> Superfluous parens? 

I will remove these.

> 
> 
> > +/* ----------
> > + * pgstat_get_all_memory_allocated() -
> > + *
> > + * Return a uint64 representing the current shared memory allocated to all
> > + * backends.  This looks directly at the BackendStatusArray, and so will
> > + * provide current information regardless of the age of our transaction's
> > + * snapshot of the status array.
> > + * In the future we will likely utilize additional values - perhaps limit
> > + * backend allocation by user/role, etc.
> > + * ----------
> 
> I think it's completely unfeasible to execute something as expensive as
> pgstat_get_all_backend_memory_allocated() on every allocation. Like,
> seriously, no.

Ok. Do we check every nth allocation/try to implement a scheme of checking
more often as we we get closer to the declared max_total_bkend_mem?

> 
> And we absolutely definitely shouldn't just add CHECK_FOR_INTERRUPT() calls
> into the middle of allocator code.

I'm open to guidance/suggestions/pointers to remedying these.

> Greetings,
> 
> Andres Freund
> 


Thanks,
Reid





Reply via email to