On Sun, Apr 14, 2019 at 3:51 PM Gunther <r...@gusw.net> wrote:
>
> For weeks now, I am banging my head at an "out of memory" situation. There is 
> only one query I am running on an 8 GB system, whatever I try, I get knocked 
> out on this out of memory. It is extremely impenetrable to understand and fix 
> this error. I guess I could add a swap file, and then I would have to take 
> the penalty of swapping. But how can I actually address an out of memory 
> condition if the system doesn't tell me where it is happening?
> We can't really see anything too worrisome. There is always lots of memory 
> used by cache, which could have been mobilized. The only possible explanation 
> I can think of is that in that moment of the crash the memory utilization 
> suddenly skyrocketed in less than a second, so that the 2 second vmstat 
> interval wouldn't show it??? Nah.
>
> I have already much reduced work_mem, which has helped in some other cases 
> before. Now I am going to reduce the shared_buffers now, but that seems 
> counter-intuitive because we are sitting on all that cache memory unused!
>
> Might this be a bug? It feels like a bug. It feels like those out of memory 
> issues should be handled more gracefully (garbage collection attempt?) and 
> that somehow there should be more information so the person can do anything 
> about it.

I kind of agree that nothing according to vmstat suggests you have a
problem.  One thing you left out is the precise mechanics of the
failure; is the database getting nuked by the oom killer?  Do you have
the logs?

*) what are values of shared_buffers and work_mem and maintenance_work_mem?

*) Is this a 32 bit build? (I'm guessing no, but worth asking)

*) I see that you've disabled swap.  Maybe it should be enabled?

*) Can you get the query to run through?  an 'explain analyze' might
point to gross misses in plan; say, sort memory overuse

*) If you're still getting failures, maybe we need to look at sampling
frequency of memory usage.

*) iowait is super high.

*) I see optimization potential in this query; explain analyze would
help here too.

merlin


Reply via email to