On 21 December 2017 at 15:24, Andres Freund <and...@anarazel.de> wrote:
> Hi,
>
> On 2017-12-21 15:13:13 +0800, Craig Ringer wrote:
> > There tons of callers to enlargeStringInfo, so a 'noerror' parameter
> > would be viable.
>
> Not sure what you mean with that sentence?

Mangled in editing and sent prematurely; disregard. There are NOT tons of
callers to enlargeStringInfo, so adding a parameter that allowed it to
return a failure result rather than ERROR on OOM seems like a reasonable
option.

But it relies on repalloc(), which will ERROR on OOM. AFAICS we don't have
"no error" variants of the memory allocation routines, and I'm not eager to
add them, especially since we can't trust that we're not on misconfigured
Linux where the OOM killer will go wild instead of giving us a nice NULL
result.

So I guess that means we should probably just do individual elog(...)s and
live with the ugliness of scraping the resulting mess out of the logs.
After all, with a log destination that could copy and modify the string
being logged a couple of times, it's not a good idea to try to drop the
whole thing into the logs in one blob. And we can't trust things like
syslog with large messages.

I'll follow up with a revised patch that uses individual elog()s soon. BTW,
I also pgindented it in my tree, so it'll have formatting fixed up.

pgindent's handling of

    static void printwrapper_stringinfo(void *extra, const char *fmt,...) pg_attribute_printf(2, 3);

upsets me a little, since if I break the line it mangles it into

    static void printwrapper_stringinfo(void *extra, const char *fmt,...)
    pg_attribute_printf(2, 3);

so it'll break line-length rules a little, but otherwise conform.

> > But I'm not convinced it's worth it personally. If we OOM in response
> > to a ProcSignal request for memory context output, we're having pretty
> > bad luck. The output is 8k in my test. But even if it were a couple of
> > hundred kb, happening to hit OOM just then isn't great luck on modern
> > systems with many gigabytes of RAM.
> I've seen plenty memory dumps in the dozens to hundreds of
> megabytes. And imo such cases are more likely to invite use of this
> facility.

OK, that's more concerning then. Also impressively huge.

As written it's using

    MemoryContextStatsDetail(TopMemoryContext, 100)

so it shouldn't really be *getting* multi-megabyte dumps, but I'm not
thrilled by hardcoding that. It doesn't merit a GUC though, so I'm not sure
what to do about it and figured I'd make it a "later" problem.

> > If that *does* happen, repalloc(...) will call
> > MemoryContextStats(TopMemoryContext) before returning NULL. So we'll
> > get our memory context dump anyway, albeit to stderr.
>
> That would still abort the query that might otherwise continue to work,
> so that seems no excuse.

Eh. It'd probably die soon enough anyway. But yeah, I agree it's not safe
to hope we can sling around up to a couple of hundred MB under presumed
memory pressure.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services