Re: [HACKERS] Linux kernel impact on PostgreSQL performance

Josh Berkus Mon, 13 Jan 2014 15:25:14 -0800

On 01/13/2014 02:26 PM, Mel Gorman wrote:
> Really?
> 
> zone_reclaim_mode is often a complete disaster unless the workload is
> partitioned to fit within NUMA nodes. On older kernels enabling it would
> sometimes cause massive stalls. I'm actually very surprised to hear it
> fixes anything and would be interested in hearing more about what sort
> of circumstances would convince you to enable that thing.


So the problem with the default setting is that it pretty much isolates
all FS cache for PostgreSQL to whichever socket the postmaster is
running on, and makes the memory on the other sockets unavailable for
FS cache.  This means that, for example, if you have two memory banks,
then only one of them is available for PostgreSQL filesystem caching ...
essentially cutting your available cache in half.
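
One rough way to see whether the page cache really is piling up on a
single node is to compare the per-node FilePages counters.  The sketch
below is only an illustration: it assumes a Linux system that exposes
/sys/devices/system/node/node*/meminfo with a "Node N FilePages: X kB"
line, and the helper names file_pages_kb and page_cache_by_node are
made up for this example.

```python
import glob
import re


def file_pages_kb(meminfo_text):
    """Return the FilePages value in kB from one node's meminfo text.

    Returns None if no FilePages line is found (assumed line format:
    'Node 0 FilePages:  123456 kB').
    """
    match = re.search(
        r"^Node\s+\d+\s+FilePages:\s+(\d+)\s+kB",
        meminfo_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def page_cache_by_node(pattern="/sys/devices/system/node/node*/meminfo"):
    """Map each node's meminfo path to its FilePages value in kB."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            result[path] = file_pages_kb(fh.read())
    return result


if __name__ == "__main__":
    # On a non-NUMA or non-Linux machine this simply prints nothing.
    for path, kb in page_cache_by_node().items():
        print(path, kb, "kB")
```

If one node shows nearly all of the file pages while the others sit
mostly empty, that would be consistent with the behaviour described
above, though it does not by itself prove the cause.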

And however slow moving cached pages between memory banks is, it's an
order of magnitude faster than moving them from disk.  But this isn't
how the NUMA stuff is configured; it seems to assume that it's less
expensive to get pages from disk than to move them between banks, so
whatever you've got cached on the other bank, it flushes it to disk as
fast as possible.  I understand the goal was to keep memory usage local
to the processor each process is running on, but that includes an
implicit assumption that no individual process will ever want more than
one memory bank worth of cache.

So disabling all of the NUMA optimizations is the way to go for any
workload I personally deal with.
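
For anyone who wants to check what their own machine is doing, here is
a minimal sketch that reads the current zone_reclaim_mode value.  It
assumes the usual /proc/sys/vm/zone_reclaim_mode path and the bit
meanings given in the kernel's vm sysctl documentation (1 = zone
reclaim on, 2 = write out dirty pages, 4 = swap pages); the function
names are hypothetical.

```python
def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the zone_reclaim_mode sysctl and return it as an integer."""
    with open(path) as fh:
        return int(fh.read().strip())


def describe_zone_reclaim_mode(mode):
    """Turn the bitmask into a short human-readable description."""
    if mode == 0:
        return "zone reclaim disabled"
    flags = []
    if mode & 1:
        flags.append("zone reclaim on")
    if mode & 2:
        flags.append("writes dirty pages out")
    if mode & 4:
        flags.append("swaps pages")
    return ", ".join(flags)


if __name__ == "__main__":
    try:
        mode = read_zone_reclaim_mode()
    except OSError:
        # The file is absent on non-Linux or non-NUMA kernels.
        print("zone_reclaim_mode is not available on this system")
    else:
        print(mode, "=", describe_zone_reclaim_mode(mode))
```

A value of 0 means zone reclaim is off.  The setting is normally
changed as root with sysctl (vm.zone_reclaim_mode = 0), which this
read-only sketch deliberately does not attempt.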

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

