Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Tom Lane
Matt Magoffin  writes:
> So far, I have been working on average support via the vec_to_mean() 
> aggregate, and my aggregate's [2] transition function sets up a 
> FunctionCallInfo for the numeric_avg_accum() [3] function and then loops over 
> the input array elements, calling numeric_avg_accum() and saving its result 
> state object in my aggregate’s state. Before looping, I switch the memory 
> context to the aggregate’s context, i.e. there is stuff like

> MemoryContext aggContext;
> AggCheckCallContext(fcinfo, &aggContext);
> old = MemoryContextSwitchTo(aggContext);
> for (i = 0; i < arrayLength; i++) {
>   // invoke numeric_avg_accum() for each array element, store result in my state
> }
> MemoryContextSwitchTo(old);

Calling numeric_avg_accum in the agg_context is unnecessary, and possibly
counterproductive (it might leak memory in that context, since like all
other aggregates it assumes it's called in a short-lived context).
That doesn't seem to explain your crash though.

Are you testing in an --enable-cassert build?  If not, do that;
it might make the cause of the crashes more apparent, thanks to
CLOBBER_FREED_MEMORY and other debug support.

regards, tom lane




Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Matt Magoffin
On 5/12/2021, at 5:16 AM, Tom Lane  wrote:

> Calling numeric_avg_accum in the agg_context is unnecessary, and possibly
> counterproductive (it might leak memory in that context, since like all
> other aggregates it assumes it's called in a short-lived context).


OK, thanks for that, I’ll remove the context switch before calling 
numeric_avg_accum and test more. 

> Are you testing in an --enable-cassert build?  If not, do that;
> it might make the cause of the crashes more apparent, thanks to
> CLOBBER_FREED_MEMORY and other debug support.

I did build with --enable-cassert, and I did see the state argument pointer 
passed to numeric_avg_accum come through as 0x7f7f7f7f7f. I now understand 
why, after reading what that pattern means in the developer FAQ; thanks for 
that.

Since you didn’t say I shouldn’t be invoking these aggregate functions the 
way I am, does that mean that, in theory, there is nothing inappropriate 
about doing so as far as you know?

Cheers,
Matt



Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Tom Lane
Matt Magoffin  writes:
> On 5/12/2021, at 5:16 AM, Tom Lane  wrote:
>> Are you testing in an --enable-cassert build?  If not, do that;
>> it might make the cause of the crashes more apparent, thanks to
>> CLOBBER_FREED_MEMORY and other debug support.

> I did build with --enable-cassert, and I did see the state argument pointer
> passed to numeric_avg_accum come through as 0x7f7f7f7f7f. I now understand
> why, after reading what that pattern means in the developer FAQ; thanks for
> that.

So that probably means that you weren't careful about allocating your
own state data in the long-lived context (agg_context), and so it
got freed between calls.

regards, tom lane




Re: libpq: Which functions may hang due to network issues?

2021-12-04 Thread Laurenz Albe
On Fri, 2021-12-03 at 21:33 +0100, Daniel Frey wrote:
> But the real issue, at least for me, is PQfinish(). Considering that my 
> application is not
> allowed to hang (or crash, leak, ...), what should I do in case of a timeout?

I am tempted to say that you shouldn't use TCP if your requirement is that it 
must never hang.

> I have existing
> connections and at some point the network connections stop working (e.g. due 
> to a firewall
> issue/reboot), etc. If I don't want a resource leak, I *must* call 
> PQfinish(), correct?
> But I have no idea whether it might hang. If you don't want to guarantee that 
> PQfinish()
> will not hang, then please advise how to use libpq properly in this 
> situation. Is there some asynchronous version of PQfinish()? Or should I 
> handle hanging connections differently?

You could start a separate process that has your PostgreSQL connection and kill 
it if it
times out.  But then you'd have a similar problem communicating with that 
process.

A normal thing to do when your database call times out or misbehaves in other 
ways is
to give up, report an error and die (after some retries perhaps).

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Matt Magoffin
On 5/12/2021, at 9:04 AM, Tom Lane  wrote:
> So that probably means that you weren't careful about allocating your
> own state data in the long-lived context (agg_context), and so it
> got freed between calls.

It turns out I wasn’t careful about setting isnull on the passed-in state 
argument. After I fixed that [1], everything appears to be running smoothly!

— m@

[1] https://github.com/SolarNetwork/aggs_for_vecs/commit/a734ab211f9660f8a5a3d13870f089b46df56e7c

Re: Max connections reached without max connections reached

2021-12-04 Thread Dilip Kumar
On Fri, Dec 3, 2021 at 9:02 PM Tom Lane  wrote:
>
> Dilip Kumar  writes:
> > On Thu, Dec 2, 2021 at 9:35 AM Dilip Kumar  wrote:
> >> I think there is no view or anything that tells which backend or
> >> transaction has more than 64 subtransactions.  But if we are ready to
> >> modify the code then we can LOG that information in
> >> GetNewTransactionId(), the first time we mark it overflowed.
>
> > I have prepared a small patch to log this information.
>
> Putting an elog call into GetNewTransactionId seems like a completely
> horrid idea from a performance standpoint.  Especially if you put it
> inside the XidGenLock hold, where it can block the entire system not just
> the one process.  But even without that, this seems like a performance
> penalty with basically no real-world benefit.  People who have issues
> like this are not going to want to trawl the postmaster log for such
> messages.

Agreed on both points.  What if we add the subxid count and overflow
status to LocalPgBackendStatus and expose them through pg_stat_activity?
That way we never have to log it; whenever the user queries
pg_stat_activity, they can fetch it directly from "proc->subxidStatus",
along with proc.xid and proc.xmin.  Does this make sense?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com