Hi Ivar,

I agree with Magnus. As addition, also i would change:

 effective_cache_size

and

 maintenance_work_mem

Then disable OOM killer, change logging in order to log files on disk to see if 
your work_mem is too low and spills on disk, rethink your data structure and, 
overall, I think that you can have a look to this page, which I find a good 
starting point for a proper tuning:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server


regards,

fabio pardi


On 26/06/18 12:13, Magnus Hagander wrote:
>
>
> On Tue, Jun 26, 2018 at 11:40 AM, Ivar Fredriksen <ivar.fredrik...@gmail.com 
> <mailto:ivar.fredrik...@gmail.com>> wrote:
>
>     A single large query is able to spend all the system memory (as reported 
> by top), and the oom_reaper kills postgres. See bottom of email for an 
> example query and logs.
>
>      
>
>     Expected behavior would be that postgres is not killed and the query is 
> limited to the shared_buffer memory, potentially failing, but not killing the 
> postgres process.
>
>
> Then your expectations are completely wrong. shared_buffers have nothing to 
> do with limiting the memory of a query. shared_buffers set the size of the 
> PostgreSQL cache, not the working set. That's controlled by work_mem (see 
> below).
>
>  
>
>     Do I have some fundamental flaw in my understanding of this, or is there 
> some sort of configuration that should be in place that is missing? The 
> offending sample query has been
>
>     rewritten to not use joins, and will no longer crash postgres. I am not 
> asking anyone to spend a lot of time analyzing the query itself, it is just 
> provided as an example for when the problem will occur.
>
>     My question is more in a broader sense why one query is eating through 
> all system memory, and is there anything I can do to prevent this from 
> happening? 
>
>     We have set shared_buffers to 256MB on a test-system that has 1GB memory. 
> Production machines have more resources, but the errors are happening in 
> exactly the same way, so I assume (perhaps wrongly) that using the test
>
>     system specs and logs might give me the answers I need to figure out what 
> is happening.
>
>      
>
>     Technical details are provided below, a big thanks to anyone who can 
> provide any help or insight to this.
>
>      
>
>     Regards,
>
>     Ivar Fredriksen
>
>      
>
>     PostgreSQL version number:
>
>     PostgreSQL 10.4 (Debian 10.4-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled 
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
>     Installed with the debian packages for postgres found at: deb 
> http://apt.postgresql.org/pub/repos/apt/ 
> <http://apt.postgresql.org/pub/repos/apt/>
>
>      
>
>     Changes made to the settings in the postgresql.conf file:
>
>                  name             |             current_setting             | 
>        source
>
>     
> ------------------------------+-----------------------------------------+----------------------
>
>      application_name             | psql                                    | 
> client
>
>      checkpoint_completion_target | 0.9                                     | 
> configuration file
>
>      client_encoding              | UTF8                                    | 
> client
>
>      cluster_name                 | 10/main                                 | 
> configuration file
>
>      DateStyle                    | ISO, MDY                                | 
> configuration file
>
>      default_statistics_target    | 100                                     | 
> configuration file
>
>      default_text_search_config   | pg_catalog.english                      | 
> configuration file
>
>      dynamic_shared_memory_type   | posix                                   | 
> configuration file
>
>      effective_cache_size         | 1536MB                                  | 
> configuration file
>
>      external_pid_file            | /var/run/postgresql/10-main.pid         | 
> configuration file
>
>      lc_messages                  | C                                       | 
> configuration file
>
>      lc_monetary                  | C                                       | 
> configuration file
>
>      lc_numeric                   | en_US.UTF-8                             | 
> configuration file
>
>      lc_time                      | C                                       | 
> configuration file
>
>      listen_addresses             | *                                       | 
> configuration file
>
>      log_line_prefix              | %m [%p] %q%u@%d                         | 
> configuration file
>
>      log_timezone                 | UTC                                     | 
> configuration file
>
>      maintenance_work_mem         | 128MB                                   | 
> configuration file
>
>      max_connections              | 100                                     | 
> configuration file
>
>      max_stack_depth              | 2MB                                     | 
> environment variable
>
>      max_wal_size                 | 2GB                                     | 
> configuration file
>
>      min_wal_size                 | 1GB                                     | 
> configuration file
>
>      port                         | 5432                                    | 
> configuration file
>
>      search_path                  | "$user", public, pg_catalog             | 
> configuration file
>
>      shared_buffers               | 256MB                                   | 
> configuration file
>
>      ssl                          | on                                      | 
> configuration file
>
>      ssl_cert_file                | /etc/ssl/certs/ssl-cert-snakeoil.pem    | 
> configuration file
>
>      ssl_key_file                 | /etc/ssl/private/ssl-cert-snakeoil.key  | 
> configuration file
>
>      standard_conforming_strings  | off                                     | 
> configuration file
>
>      stats_temp_directory         | /var/run/postgresql/10-main.pg_stat_tmp | 
> configuration file
>
>      tcp_keepalives_idle          | 0                                       | 
> configuration file
>
>      TimeZone                     | UTC                                     | 
> configuration file
>
>      unix_socket_directories      | /var/run/postgresql                     | 
> configuration file
>
>      wal_buffers                  | 16MB                                    | 
> configuration file
>
>      work_mem                     | 1310kB                                  | 
> configuration file
>
>     (35 rows)
>
>      
>
>     Operating system and version:
>
>     Debian GNU/Linux 9.4 (stretch)
>
>     Linux ip-172-31-30-104 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1 (2018-04-29) 
> x86_64 GNU/Linux
>
>      
>
>     Machine specs:
>
>     CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
>
>     Memory: 1GB
>
>     HDD: General purpose SSD from amazon ec2, only one disk
>
>
> With only 1GB of memory, your value for work_mem is probably too high for a 
> query like yours.
>
> Even with 1MB, that's 1MB per join. You seem to have about 350 joins? That 
> would be 350Mb right there. Plus a lot of more overhead.
>
> In general, what's up with 350 joins?! That seems to be, ahem, quite insane.
>
> For a query like that you probably have to reduce everything to bare minimums 
> -- I'd try with shared_buffers at maybe 64MB and work_mem at 100kB and see if 
> it passes. (it'll probably still be very slow, but that's to be expected from 
> a query like that)
>
> But I'd sum it up as (1) don't do 350 joins. Just don't. And (2) if you have 
> to, then a machine with just 1Gb is simply too small.
>
>
> //Magnus
>

Reply via email to