On a single node, where network transfers are cheaper, a good load generator running a small-object, request-rate-oriented workload should be able to reach CPU limits given enough concurrency. If you're targeting a disk-saturating, throughput-oriented workload, larger object sizes (1-10 MB) are the way to go.
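As a rough illustration of why object size decides which limit you hit first, the sketch below computes the request rate needed to sustain a given aggregate write bandwidth. The 800 MiB/s target is a hypothetical figure picked for the arithmetic, not a measurement from this thread, and the function name is made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB


def requests_per_second_needed(target_bytes_per_sec, object_size_bytes):
    """Request rate required to sustain a target write bandwidth."""
    return target_bytes_per_sec / object_size_bytes


# Hypothetical aggregate disk bandwidth, assumed only for illustration.
target = 800 * MIB

# Small objects: thousands of requests per second, so per-request CPU dominates.
print(requests_per_second_needed(target, 256 * KIB))  # 3200.0

# Large objects: the same bandwidth needs far fewer requests.
print(requests_per_second_needed(target, 10 * MIB))   # 80.0
```

With the same bandwidth target, the small-object case needs 40 times as many requests, each carrying its own fixed per-request overhead.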
Is the load generator also running on the same box? You should try to validate your observations with a well-known Swift benchmarking tool like ssbench. What's your total requests per second?

My profiling in the past has revealed that the md5 checksumming in the object server(s) is the largest (but by far not the only) consumer of CPU. All of the other things you mentioned take CPU cycles too: tanstaafl. On a single node the problem is exacerbated per replica. What are your goals?

Are you sure you're saturating all the cores evenly? What does it look like in something like `htop`? Have you tried tuning your worker counts or any other config settings?

-Clay

On Thu, Apr 2, 2015 at 10:12 PM, Shrinand Javadekar <shrin...@maginatics.com> wrote:

> Top shows the CPUs pegged at ~100%. Writes are done by a tool built
> in-house which is similar in functionality to other object store
> benchmarking tools. As I mentioned, there are 256 parallel object
> writes (PUTS), each of 256K bytes.
>
> On Thu, Apr 2, 2015 at 9:21 PM, Yogesh Girikumar <yogeshg1...@gmail.com> wrote:
> > Also how are you doing the object writes to benchmark it? Are you using dd?
> >
> > On 3 April 2015 at 09:50, Yogesh Girikumar <yogeshg1...@gmail.com> wrote:
> >>
> >> What does top say?
> >>
> >> On 3 April 2015 at 02:34, Shrinand Javadekar <shrin...@maginatics.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a single node Swift instance. It has 16 cpus, 8 disks and 64GB
> >>> memory. As part of testing, I am doing 256 object writes in parallel
> >>> for ~10 mins. Each object is also 256K bytes in size.
> >>>
> >>> While my experiment is running, I see that the CPU utilization of the
> >>> box is always ~100%. I am trying to understand what is causing this
> >>> high CPU utilization. Some of this could be attributed to:
> >>>
> >>> 1. MD5 checksum calculation done to verify every PUT.
> >>> 2. MD5 checksum calculation by the auditor (if it runs during this
> >>> interval).
> >>> 3. Hash calculation of the path to decide which partition the object
> >>> goes to.
> >>>
> >>> Are there any other CPU intensive operations happening on the system
> >>> that I should be aware of?
> >>>
> >>> I see that the proxy-server has a "PUT" queue. Is there some
> >>> processing of the data in this queue? Would simply putting data in and
> >>> out of the queue, streaming the data between the proxy and object
> >>> server use considerable CPU?
> >>>
> >>> Thanks in advance.
> >>> -Shri
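To get a rough feel for how much of the CPU budget the MD5 checksumming discussed in this thread could account for, a micro-benchmark along these lines may help. It is only a sketch: it measures single-core `hashlib.md5` throughput in Python on 256 KiB buffers and scales that to a PUT rate. The 1000 PUTs/s rate and the replica count of 3 are assumptions chosen for illustration (the thread states neither), the function names are made up for the example, and it ignores every other per-request cost.

```python
import hashlib
import time


def md5_throughput(chunk_size=256 * 1024, rounds=200):
    """Measure single-core MD5 throughput in bytes per second."""
    data = b"\0" * chunk_size
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.md5(data).hexdigest()
    elapsed = time.perf_counter() - start
    return chunk_size * rounds / elapsed


def md5_cores_needed(puts_per_sec, object_size, replicas, bytes_per_sec):
    """Cores consumed by MD5 alone if each replica write hashes the body once."""
    return puts_per_sec * object_size * replicas / bytes_per_sec


rate = md5_throughput()
# Assumed workload: 1000 PUTs/s of 256 KiB objects, 3 replicas on one node.
cores = md5_cores_needed(1000, 256 * 1024, 3, rate)
print(f"MD5 throughput: {rate / 1e6:.0f} MB/s per core")
print(f"Estimated cores spent on MD5: {cores:.2f}")
```

Comparing the estimate against the 16 cores on the box gives a first-order idea of whether checksumming alone could explain the load, or whether the remaining cycles are going elsewhere.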
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack