> On 23 Sep 2020, at 21:33, Tom Lane <t...@sss.pgh.pa.us> wrote: > > I wrote: >> So the attached adds a pstrdup/pfree to ensure that "curline" >> has its own storage, putting us right back at two palloc/pfree >> cycles per line. I don't think there's a lot of choice though; >> in fact, I'm leaning to the idea that we need to back-patch >> that part of this. The odds of trouble in a production build >> probably aren't high, but still... > > So I did that, but while looking at the main patch I realized that > a few things were still being left on the table:
> 2. In the case where encoding conversion is performed, we still have > to pstrdup the result to have a safe copy for "curline". But I > realized that we're making a poor choice of which copy to return to > the caller. The output of encoding conversion is likely to be a good > bit bigger than necessary, cf. pg_do_encoding_conversion. So if the > caller is one that saves the output string directly into a long-lived > dictionary structure, this wastes space. We should return the pstrdup > result instead, and keep the conversion result as "curline", where > we'll free it next time through. I wonder if we have other callsites of pg_any_to_server which could benefit from lowering the returned allocation, a quick look didn't spot any but today has become yesterday here and tiredness might interfere. > So those considerations lead me to the attached. Eyeing memory used by callers of tsearch_readline validates your observations, I don't see any case where this patch isn't safe and nothing sticks out. +1 on this, nice one! cheers ./daniel