On Fri, Jan 10, 2014 at 11:09 AM, Steve Atkins <st...@blighty.com> wrote:

>
> On Jan 10, 2014, at 8:35 AM, Preston Hagar <prest...@gmail.com> wrote:
>
> > tl;dr: Moved from 8.3 to 9.3 and are now getting out of memory errors
> despite the server now having 32 GB instead of 4 GB of RAM and the workload
> and number of clients remaining the same.
> >
> >
> > Details:
> >
> > We have been using Postgresql for some time internally with much
> success.  Recently, we completed a migration off of an older server running
> 8.3 to a new server running 9.3.  The older server had 4GB of RAM, the new
> server has 32 GB.
> >
> > For some reason, since migrating we are getting lots of "out of memory"
> and "cannot allocate memory" errors on the new server when the server gets
> under a decent load.  We have upped shmmax to 17179869184 and shmall to
> 4194304.
>
> What are the exact error messages you’re getting, and where are you seeing
> them?
>
>
Thanks for the reply.  I'm seeing them in the main postgresql log
(/var/log/postgresql/postgresql-9.3-main.log)

Here are a couple of examples from the incident we had this morning:

2014-01-10 06:14:40 CST  30176    LOG:  could not fork new process for
connection: Cannot allocate memory
2014-01-10 06:14:40 CST  30176    LOG:  could not fork new process for
connection: Cannot allocate memory


TopMemoryContext: 160408 total in 19 blocks; 12984 free (41 chunks); 147424
used
  TopTransactionContext: 8192 total in 1 blocks; 7392 free (0 chunks); 800
used
  Btree proof lookup cache: 8192 total in 1 blocks; 1680 free (0 chunks);
6512 used
  TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used
  Type information cache: 24240 total in 2 blocks; 3744 free (0 chunks);
20496 used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  MessageContext: 524288 total in 7 blocks; 225240 free (3 chunks); 299048
used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512
used
  smgr relation table: 57344 total in 3 blocks; 34320 free (10 chunks);
23024 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks);
32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
    PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used
      ExecutorState: 329080 total in 8 blocks; 105944 free (4 chunks);
223136 used
        TupleSort: 32816 total in 2 blocks; 176 free (2 chunks); 32640 used
        printtup: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 8128 free (3 chunks); 64 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 7896 free (0 chunks); 296 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  Relcache by OID: 24576 total in 2 blocks; 11792 free (3 chunks); 12784
used
  CacheMemoryContext: 9815680 total in 25 blocks; 8143416 free (1 chunks);
1672264 used
    pg_toast_3598032_index: 1024 total in 1 blocks; 16 free (0 chunks);
1008 used
    mls_office_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    staff_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    staff_desk_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_workphone_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    person_person_mobilephone_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    person_person_lastname_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    person_person_homephone_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    person_person_homeofficephone_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    person_person_firstname_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    agent_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_membertype_id: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    agent_floydtraining_id: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    agent_agent_sync_realtorid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    agent_agent_sync_oagentid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    agent_agent_personid_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    agent_agent_license_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_toast_2619_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008
used
    mls_agent_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_uidprp_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_streetnum_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_streetname: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_sqfttotal_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_ownername_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_officelist_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    mls_res_modified: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_mlsnumber_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_liststatus_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    mls_res_listprice_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_res_city_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_beds_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_bathsfull_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_subdivision: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_zipcode: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_uidprp: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_priority: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_officelist: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_modified: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_mlsnum: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_longitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_liststatus: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_listprice: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_latitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_ishud_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_city_upcase_idx: 3072 total in 2 blocks; 1848 free (0 chunks);
1224 used
    mls_prp_city: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_agentlist2: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_prp_agentlist: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_tableuid: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    mls_media_mediasource_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    mls_media_custompic_idx: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    listing_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_sync_ilresid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_streetnumber_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_streetname_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_mlsnumber_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_longitude_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_listingid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_latitude_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_brokerageid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listing_agentid_idx: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    listing_listdate_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    code_primary: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    class_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_index_indrelid_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 16 free (0
chunks); 1008 used
    pg_db_role_setting_databaseid_rol_index: 1024 total in 1 blocks; 64
free (0 chunks); 960 used
    pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1736 free (2
chunks); 1336 used
    pg_foreign_data_wrapper_name_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_enum_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_class_relname_nsp_index: 1024 total in 1 blocks; 16 free (0 chunks);
1008 used
    pg_foreign_server_oid_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_statistic_relid_att_inh_index: 3072 total in 2 blocks; 1736 free (2
chunks); 1336 used
    pg_cast_source_target_index: 1024 total in 1 blocks; 16 free (0
chunks); 1008 used
    pg_language_name_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_collation_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1736 free (2 chunks);
1336 used
    pg_index_indexrelid_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_ts_template_tmplname_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_ts_config_map_index: 3072 total in 2 blocks; 1784 free (2 chunks);
1288 used
    pg_opclass_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    pg_foreign_data_wrapper_oid_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_event_trigger_evtname_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_ts_dict_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824
used
    pg_event_trigger_oid_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_conversion_default_index: 3072 total in 2 blocks; 1784 free (2
chunks); 1288 used
    pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1736 free (2
chunks); 1336 used
    pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 16 free (0
chunks); 1008 used
    pg_enum_typid_label_index: 1024 total in 1 blocks; 64 free (0 chunks);
960 used
    pg_ts_config_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_user_mapping_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1784 free (2
chunks); 1288 used
    pg_foreign_table_relid_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_type_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_aggregate_fnoid_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_constraint_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_ts_parser_prsname_index: 1024 total in 1 blocks; 64 free (0 chunks);
960 used
    pg_ts_config_cfgname_index: 1024 total in 1 blocks; 64 free (0 chunks);
960 used
    pg_ts_parser_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_operator_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    pg_namespace_nspname_index: 1024 total in 1 blocks; 152 free (0
chunks); 872 used
    pg_ts_template_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_amop_opr_fam_index: 3072 total in 2 blocks; 1736 free (2 chunks);
1336 used
    pg_default_acl_role_nsp_obj_index: 3072 total in 2 blocks; 1784 free (2
chunks); 1288 used
    pg_collation_name_enc_nsp_index: 3072 total in 2 blocks; 1784 free (2
chunks); 1288 used
    pg_range_rngtypid_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_ts_dict_dictname_index: 1024 total in 1 blocks; 64 free (0 chunks);
960 used
    pg_type_typname_nsp_index: 1024 total in 1 blocks; 16 free (0 chunks);
1008 used
    pg_opfamily_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824
used
    pg_class_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1736 free (2
chunks); 1336 used
    pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 16 free (0
chunks); 1008 used
    pg_proc_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_language_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824
used
    pg_namespace_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1736 free (2 chunks);
1336 used
    pg_foreign_server_name_index: 1024 total in 1 blocks; 200 free (0
chunks); 824 used
    pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_conversion_oid_index: 1024 total in 1 blocks; 200 free (0 chunks);
824 used
    pg_user_mapping_user_server_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_conversion_name_nsp_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_authid_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    pg_auth_members_member_role_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_tablespace_oid_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_database_datname_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
    pg_auth_members_role_member_index: 1024 total in 1 blocks; 64 free (0
chunks); 960 used
    pg_database_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872
used
    pg_authid_rolname_index: 1024 total in 1 blocks; 152 free (0 chunks);
872 used
  MdSmgr: 8192 total in 1 blocks; 5440 free (0 chunks); 2752 used
  ident parser context: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  hba parser context: 15360 total in 4 blocks; 6544 free (2 chunks); 8816
used
  LOCALLOCK hash: 24576 total in 2 blocks; 13920 free (4 chunks); 10656 used
  Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production
 10.1.1.6(36680)ERROR:  out of memory
2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production
 10.1.1.6(36680)DETAIL:  Failed on request of size 500.



After some of the connections cleared up, (we load a lot of data around
that time), things got "back to normal".  Our Nagios monitoring showed a 30
minute window or so when no new connections could be made and the log is
filled with more or less the same lines.




> >
> > We had originally copied our shared_buffers, work_mem, wal_buffers and
> other similar settings from our old config, but after getting the memory
> errors have tweaked them to the following:
> >
> > shared_buffers            = 7680MB
> > temp_buffers              = 12MB
> > max_prepared_transactions = 0
> > work_mem                  = 80MB
> > maintenance_work_mem      = 1GB
> > wal_buffers = 8MB
> > max_connections = 350
> >
> > The current settings seem to have helped, but we are still occasionally
> getting the errors.
> >
> > The weird thing is that our old server had 1/8th the RAM, was set to
> max_connections = 600 and had the same clients connecting in the same way
> to the same databases and we never saw any errors like this in the several
> years we have been using it.
> >
> > One issue I could see is that one of our main applications that connects
> to the database, opens a connection on startup, holds it open the entire
> time it is running, and doesn't close it until the app is closed.  In daily
> usage, for much of our staff it is opened first thing in the morning and
> left open all day (meaning the connection is held open for 8+ hours).  This
> was never an issue with 8.3, but I know it isn't a "best practice" in
> general.
>
> That’s probably not related to the problems you’re seeing - I have apps
> that hold a connection to the database open for years. As long as it
> doesn’t keep a transaction open for a long time, you’re fine.
>
>
Good to know, thanks.



> >
> > We are working to update our application to be able to use pgbouncer
> with transaction connections to try to alleviate the long held connections,
> but it will take some time.
>
> Using pgbouncer is probably a good idea - to reduce the number of
> concurrent connections, rather than the length of connections, though.
>
>
I'm testing a updated version of our application against pgbouncer now so
hopefully that will help some.



> >
> > In the meantime, is there some other major difference or setting in 9.3
> that we should look out for that could be causing this?  Like I said, the
> same database with the same load and number of clients has been running on
> a 8.3 install for years (pretty much since 2008 when 8.3 was released) with
> lesser hardware with no issues.
> >
> > Let me know if any other information would help out or if anyone has
> suggestions of things to check.
>
> Cheers,
>   Steve
>
>
>
Thanks for your help.

Preston

Reply via email to