Hi all,
our script is available on GitHub

https://github.com/prozeta/pincpus

I haven’t had much time to write a proper README, but I hope the configuration is 
self-explanatory enough for now.
The script pins each OSD into the emptiest (i.e. least loaded) cgroup assigned to a 
NUMA node.
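
Roughly, the idea looks like this (a minimal cgroup-v1 cpuset sketch of the concept, 
not the actual script - paths and group names are illustrative):

  # one cpuset per NUMA node, restricted to that node's CPUs and memory
  mkdir -p /sys/fs/cgroup/cpuset/ceph_node0
  cat /sys/devices/system/node/node0/cpulist > /sys/fs/cgroup/cpuset/ceph_node0/cpuset.cpus
  echo 0 > /sys/fs/cgroup/cpuset/ceph_node0/cpuset.mems
  # move the whole OSD process (all of its threads) into the least loaded group
  echo <osd pid> > /sys/fs/cgroup/cpuset/ceph_node0/cgroup.procs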

Let me know how it works for you!

Jan


> On 30 Jun 2015, at 10:50, Huang Zhiteng <winsto...@gmail.com> wrote:
> 
> 
> 
> On Tue, Jun 30, 2015 at 4:25 PM, Jan Schermer <j...@schermer.cz> wrote:
> Not having OSDs and KVMs compete against each other is one thing.
> But there are more reasons to do this:
> 
> 1) not moving processes and threads between cores as much (better cache 
> utilization)
> 2) aligning processes with their memory on NUMA systems (which means all modern 
> dual-socket systems) - you don’t want your OSD running on CPU1 with its memory 
> allocated on CPU2’s node
> 3) the same goes for other resources like NICs or storage controllers - but 
> that’s less important and not always practical to do
> 4) you can limit the scheduling domain on Linux if you limit the cpuset for 
> your OSDs (I’m not sure how important this is, just best practice)
> 5) you can easily limit memory or CPU usage and set priorities, with much greater 
> granularity than without cgroups
> 6) if you have HyperThreading enabled you get the most gain when the workloads 
> on the two sibling threads are dissimilar - so to get the highest throughput you 
> would pin the OSD to thread 1 and KVM to thread 2 of the same core. We’re not 
> doing that, because the latency and performance of a core can vary depending on 
> what the other thread is doing. But it might be useful to someone - the sysfs 
> commands below show how to find the sibling threads.
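> 
> For reference, the topology for this kind of pinning can be read straight from 
> sysfs and lscpu (illustrative commands - node and CPU numbers will differ on your hardware):
> 
>   lscpu | grep -i numa                                             # NUMA node count and per-node CPU lists
>   cat /sys/devices/system/node/node0/cpulist                       # CPUs belonging to NUMA node 0
>   cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list   # HT sibling threads sharing core 0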
> 
> Some workloads exhibit a >100% performance gain when everything aligns in a 
> NUMA system, compared to SMP mode on the same hardware. You likely won’t 
> notice it on light workloads, as the interconnects (QPI) are very fast and 
> there’s a lot of bandwidth, but for stuff like big OLAP databases or other 
> data-manipulation workloads there’s a huge difference. And with Ceph being 
> CPU hungry and memory intensive, we’re seeing some big gains here just by 
> co-locating the memory with the processes….
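> 
> Whether the memory really ends up on the local node can be checked with numastat 
> from the numactl package (illustrative - substitute the actual OSD PID):
> 
>   numastat -p <osd pid>   # per-NUMA-node memory breakdown for that process
> 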
> Could you elaborate a bit on this? I’m interested to learn in which situations 
> memory locality helps Ceph, and to what extent.
> 
> 
> Jan
> 
>  
>> On 30 Jun 2015, at 08:12, Ray Sun <xiaoq...@gmail.com> wrote:
>> 
>> Sounds great - please let me know of any updates.
>> 
>> Best Regards
>> -- Ray
>> 
>> On Tue, Jun 30, 2015 at 1:46 AM, Jan Schermer <j...@schermer.cz> wrote:
>> I promised you all our scripts for automatic cgroup assignment - they are already 
>> in production here and I just need to put them on GitHub. Stay tuned tomorrow :-)
>> 
>> Jan
>> 
>> 
>>> On 29 Jun 2015, at 19:41, Somnath Roy <somnath....@sandisk.com> wrote:
>>> 
>>> Presently, you have to do it with a tool like ‘taskset’ or ‘numactl’…
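>>>  
>>> For example (illustrative only - adjust the CPU list and NUMA node to your topology, 
>>> and keep the rest of the OSD command line as in the ps output below):
>>>  
>>>   taskset -cp 0-5 <osd pid>                                        # restrict an already-running OSD to cores 0-5
>>>   numactl --cpunodebind=0 --membind=0 /usr/bin/ceph-osd -i 0 ...   # start an OSD bound to NUMA node 0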
>>>  
>>> Thanks & Regards
>>> Somnath
>>>  
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ray Sun
>>> Sent: Monday, June 29, 2015 9:19 AM
>>> To: ceph-users@lists.ceph.com
>>> Subject: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu 
>>> core?
>>>  
>>> Cephers,
>>> I want to bind each of my ceph-osd processes to a specific CPU core, but I didn’t 
>>> find any documentation explaining how to do that - could anyone provide me with 
>>> some detailed information? Thanks.
>>>  
>>> Currently, my ceph is running like this:
>>>  
>>> root      28692      1  0 Jun23 ?        00:37:26 /usr/bin/ceph-mon -i 
>>> seed.econe.com --pid-file /var/run/ceph/mon.seed.econe.com.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      40063      1  1 Jun23 ?        02:13:31 /usr/bin/ceph-osd -i 0 
>>> --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      42096      1  0 Jun23 ?        01:33:42 /usr/bin/ceph-osd -i 1 
>>> --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      43263      1  0 Jun23 ?        01:22:59 /usr/bin/ceph-osd -i 2 
>>> --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      44527      1  0 Jun23 ?        01:16:53 /usr/bin/ceph-osd -i 3 
>>> --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      45863      1  0 Jun23 ?        01:25:18 /usr/bin/ceph-osd -i 4 
>>> --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root      47462      1  0 Jun23 ?        01:20:36 /usr/bin/ceph-osd -i 5 
>>> --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph
>>>  
>>> Best Regards
>>> -- Ray
>>> 
>>> 
>> 
> 
> 
> 
> 
> 
> 
> -- 
> Regards
> Huang Zhiteng

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
