On 2017-06-26 15:34, Willem Jan Withagen wrote:

> On 26-6-2017 09:01, Christian Wuerdig wrote: 
> 
>> Well, preferring faster clock CPUs for SSD scenarios has been floated
>> several times over the last few months on this list. And realistic or
>> not, Nick's and Kostas' setup are similar enough (testing single disk)
>> that it's a distinct possibility.
>> Anyway, as mentioned measuring the performance counters would probably
>> provide more insight.
> 
> I read the advice as:
> prefer GHz over cores.
> 
> And especially since there is a trade-off between GHz and cores, that
> can be an expensive choice. Getting both means you have to pay
> substantially more money.
> 
> And for an average Ceph server with plenty of OSDs, I personally just
> don't buy that. There you'd have to look at the total throughput of the
> system, and latency is only one of many factors.
> 
> Let alone in a cluster with several hosts (and/or racks). There the
> latency is dictated by the network. So a bad choice of network card or
> switch will outdo any extra cycles that your CPU can burn.
> 
> I think that just testing 1 OSD is testing artifacts, and has very
> little to do with running an actual Ceph cluster.
> 
> So if one would like to test this, the test setup should be something
> like: 3 hosts with something like 3 disks per host, min_size=2, and a
> nice workload.
> Then turn the GHz-knob and see what happens with client latency and
> throughput.
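The GHz-knob sweep proposed above could be scripted roughly as follows. This is only a sketch: the frequency steps, the rbd device path (`/dev/rbd/rbd/test-img`), and the exact fio options are illustrative assumptions, not taken from the thread.

```python
import shlex

def build_cpufreq_cmd(mhz):
    # Pin all cores to a fixed frequency (needs root and a governor that
    # honours fixed frequencies, e.g. the userspace governor).
    return ["cpupower", "frequency-set", "-f", f"{mhz}MHz"]

def build_fio_cmd(device, queue_depth, runtime=60):
    # 4K random-read job against a mapped rbd device (path is hypothetical).
    return ["fio", "--name=ghz-sweep", f"--filename={device}",
            "--rw=randread", "--bs=4k", "--direct=1", "--ioengine=libaio",
            f"--iodepth={queue_depth}", f"--runtime={runtime}",
            "--time_based", "--output-format=json"]

# Print the command matrix; in a real run these would go to subprocess.run()
# and the JSON output would be collected for latency/IOPS at each step.
for mhz in (1200, 1800, 2400, 3000):
    print(shlex.join(build_cpufreq_cmd(mhz)))
    print(shlex.join(build_fio_cmd("/dev/rbd/rbd/test-img", queue_depth=1)))
```

Repeating the fio job at a higher `queue_depth` at each frequency step would separate the latency effect from the throughput effect.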
> 
> --WjW 
> 
> In a high-concurrency/high-queue-depth situation, which is probably the
> most common workload, there is no question that adding more cores will
> increase IOPS almost linearly, provided you have enough disk and network
> bandwidth, i.e. your disk and network utilization is low and your CPU is
> near 100%. Adding more cores is also a more economical way to increase
> IOPS than increasing frequency.
> But adding more cores will not lower latency below the value you get from
> the QD=1 test. To achieve lower latency you need a faster CPU frequency.
> Yes, it is expensive, and as you said you need lower-latency switches and
> so on, but you just have to pay more to achieve this.
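The arithmetic behind that point is Little's Law: sustained IOPS equals outstanding operations divided by per-operation latency. A quick sketch, with illustrative numbers (not measurements from this thread):

```python
def iops(queue_depth, latency_s):
    # Little's Law: sustained IOPS = ops in flight / per-op latency.
    return queue_depth / latency_s

# QD=1 with 200 us per-op latency: the latency floor caps IOPS at ~5000.
single = iops(1, 200e-6)

# More cores let you keep more ops in flight: QD=32 at the *same* latency
# scales throughput ~32x, to ~160K IOPS.
parallel = iops(32, 200e-6)

# But each individual op still takes 200 us; only lower per-op latency
# (e.g. a faster clock) moves the QD=1 number, here to ~10K IOPS.
faster_clock = iops(1, 100e-6)
```

This is why cores and frequency answer different questions: cores scale the concurrent case, frequency (and the rest of the latency path) sets the QD=1 floor.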
> 
> /Maged 
> 
> On Sun, Jun 25, 2017 at 4:53 AM, Willem Jan Withagen
> <w...@digiware.nl> wrote:
> 
> On 24 Jun 2017, at 14:17, Maged Mokhtar <mmokh...@petasan.org>
> wrote:
> 
> My understanding was this test is targeting latency more than
> IOPS. This is probably why it was run using QD=1. It also makes
> sense that CPU frequency will be more important than cores.
> 
> But then it is not generic enough to be used as advice!
> It is just a line in 3D-space.
> As there are so many
> 
> --WjW
> 
> On 2017-06-24 12:52, Willem Jan Withagen wrote:
> 
> On 24-6-2017 05:30, Christian Wuerdig wrote:
> 
> The general advice floating around is that you want CPUs with high
> clock speeds rather than more cores to reduce latency and increase
> IOPS for SSD setups (see also
> http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/).
> So something like an E5-2667V4 might bring better results in that
> situation.
> Also there was some talk about disabling the processor C-states in
> order to bring latency down (something like this should be easy to
> test: https://stackoverflow.com/a/22482722/220986)
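Besides BIOS settings or `intel_idle.max_cstate` boot parameters, Linux exposes the PM-QoS device `/dev/cpu_dma_latency`: while a process holds it open with a latency bound written to it, the kernel keeps CPUs out of C-states deeper than that bound. A minimal sketch (the interface is real; the exact usage below is an illustration and needs root):

```python
import struct

def cstate_request_bytes(max_latency_us=0):
    # Encode a PM-QoS latency request: a 32-bit little-endian value in
    # microseconds. Writing 0 asks the kernel to avoid all deep C-states
    # for as long as the file descriptor stays open.
    return struct.pack("<I", max_latency_us)

# Real usage (root only) -- keep the fd open for the whole benchmark:
#   f = open("/dev/cpu_dma_latency", "wb", buffering=0)
#   f.write(cstate_request_bytes(0))
#   ... run fio here ...
#   f.close()   # closing the fd restores normal C-state behaviour
```

The request only lasts while the file is open, which makes it convenient for A/B-testing a benchmark with and without deep C-states.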
> I would be very careful calling this general advice...
> 
> Although the article is interesting, it is rather one-sided.
> 
> The only thing it shows is that there is a linear relation between
> clock speed and write or read speeds.
> The article is rather vague on how and what is actually tested.
> 
> By just running a single OSD with no replication, a lot of the
> functionality is left out of the equation.
> Nobody runs just 1 OSD on a box in a normal cluster host.
> 
> Not using a serious SSD is another source of noise in the conclusions.
> A higher queue depth can/will certainly have an impact on concurrency.
> 
> I would call this an observation, and nothing more.
> 
> --WjW 
> On Sat, Jun 24, 2017 at 1:28 AM, Kostas Paraskevopoulos
> <reverend...@gmail.com> wrote:
> 
> Hello,
> 
> We are in the process of evaluating the performance of a testing
> cluster (3 nodes) with Ceph Jewel. Our setup consists of:
> 3 monitors (VMs)
> 2 physical servers, each connected to 1 JBOD, running Ubuntu Server
> 16.04
> 
> Each server has 32 threads @2.1GHz and 128GB RAM.
> The disk distribution per server is:
> 38 * HUS726020ALS210 (SAS rotational)
> 2 * HUSMH8010BSS200 (SAS SSD for journals)
> 2 * ST1920FM0043 (SAS SSD for data)
> 1 * INTEL SSDPEDME012T4 (NVMe, measured with fio at ~300K IOPS)
> 
> Since we don't currently have a 10Gbit switch, we test the
> performance with the cluster in a degraded state, the noout flag
> set, and rbd images mounted on the powered-on OSD node. We
> confirmed that the network is not saturated during the tests.
> 
> We ran tests on the NVMe disk and the pool created on it, where
> we hoped to get the most performance without being limited by the
> hardware specs, since we have more disks than CPU threads.
> 
> The NVMe disk was at first set up with one data partition and the
> journal on the same disk. The performance on random 4K reads topped
> out at 50K IOPS. We then removed the OSD and repartitioned with 4
> data partitions and 4 journals on the same disk. The performance
> didn't increase significantly. Also, since we run read tests, the
> journals shouldn't cause performance issues.
> 
> We then ran 4 fio processes in parallel on the same mounted rbd
> image, and the total reached 100K IOPS. More parallel fio processes
> didn't increase the measured IOPS.
> 
> Our ceph.conf is pretty basic (debug is set to 0/0 for everything)
> and the crushmap just defines the different buckets/rules for the
> disk separation (rotational, SSD, NVMe) in order to create the
> required pools.
> 
> Is the performance of 100,000 IOPS for random 4K reads normal for a
> disk that, on the same benchmark, runs at more than 300K IOPS on the
> same hardware, or are we missing something?
> 
> Best regards,
> Kostas
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
