We have not experienced any downsides to this approach in terms of performance or stability. You are welcome to experiment with the values if you prefer, but I see no real advantage in doing so.
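If you want to sanity-check how close you actually are to the limit before or after raising it, something along these lines should do. This is only a rough sketch: the /etc/sysctl.d/90-ceph.conf path and the per-OSD thread figures are illustrative, not official numbers, so adjust for your distro.

# show the current limit and a rough count of threads in use
sysctl kernel.pid_max
ps -eLf | wc -l    # each ceph-osd can spawn several hundred threads, more during recovery/backfill

# make the setting persistent (file name and path are distro-dependent)
echo "kernel.pid_max = 4194303" > /etc/sysctl.d/90-ceph.conf
sysctl -p /etc/sysctl.d/90-ceph.conf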
Regards,

Maciej Bonin
Systems Engineer | M247 Limited

-----Original Message-----
From: Cao, Buddy [mailto:buddy....@intel.com]
Sent: 11 June 2014 17:00
To: Maciej Bonin; ceph-users@lists.ceph.com
Subject: RE: pid_max value?

Thanks Bonin. Do you have 48 OSDs in total, or 48 OSDs on each storage node? Do you think "kernel.pid_max = 4194303" is reasonable, given that it is a large increase over the default OS setting?

Wei Cao (Buddy)

-----Original Message-----
From: Maciej Bonin [mailto:maciej.bo...@m247.com]
Sent: Wednesday, June 11, 2014 10:07 PM
To: Cao, Buddy; ceph-users@lists.ceph.com
Subject: RE: pid_max value?

Hello,

The values we use are as follows:

# sysctl -p
net.ipv4.ip_local_port_range = 1024 65535
net.core.netdev_max_backlog = 30000
net.core.somaxconn = 16384
net.ipv4.tcp_max_syn_backlog = 252144
net.ipv4.tcp_max_tw_buckets = 360000
net.ipv4.tcp_fin_timeout = 3
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_mem = 8388608 8388608 8388608
net.ipv4.route.flush = 1
kernel.pid_max = 4194303

The timeout settings don't really make sense without TIME_WAIT reuse/recycling, but we found that increasing the maximums and letting the old connections hang gives better performance. somaxconn was the most important value we had to increase: with 3 mons, 3 storage nodes, 3 VM hypervisors, 16 VMs and 48 OSDs we started running into major problems with servers dying left and right. Most of these values are lifted from an OpenStack Python script, IIRC. Please let us know if you find a more efficient or stable configuration; however, we're quite happy with this one.

Regards,

Maciej Bonin
Systems Engineer | M247 Limited

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Cao, Buddy
Sent: 11 June 2014 15:00
To: ceph-users@lists.ceph.com
Subject: [ceph-users] pid_max value?

Hi, what is the recommended value for /proc/sys/kernel/pid_max? Is 32768 enough for a Ceph cluster with 4 nodes (40 1 TB OSDs on each node)? My Ceph node has already run into a "create thread fail" problem in the OSD log, whose root cause is pid_max.

Wei Cao (Buddy)