> On 02 Sep 2015, at 17:50, Robert LeBlanc <rob...@leblancnet.us> wrote:
>
> Thanks for the responses.
>
> I forgot to include the fio test for completeness:
>
> 8 job QD=8
> [ext4-test]
> runtime=150
> name=ext4-test
> readwrite=randrw
> size=15G
> blocksize=4k
> ioengine=sync
> iodepth=8
> numjobs=8
> thread
> group_reporting
> time_based
> direct=1
>
> 1 job QD=1
> [ext4-test]
> runtime=150
> name=ext4-test
> readwrite=randrw
> size=15G
> blocksize=4k
> ioengine=sync
> iodepth=1
> numjobs=1
> thread
> group_reporting
> time_based
> direct=1
>
> I have not disabled all of the power management, I've only prevented the CPU
> from going to an idle state below C1. I'll have to check on Jan's suggestion
> of swapping out the intel_idle driver to see what difference it makes. I did
> not run powertop as I did the testing because it (or cpupower monitor)
> impacted performance and would have thrown off the results. I'll do some runs
> with lower clocks and make sure that it is staying at the lower speeds. Here
> is some additional output:

AFAIK TurboBoost doesn't kick in unless some cores are in C2, someone should go and take a look at the specs :-)

>
> # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> userspace
> # cpupower monitor
>     |Nehalem                    || Mperf              || Idle_Stats
> CPU | C3   | C6   | PC3  | PC6  || C0   | Cx   | Freq || POLL | C1-A | C6-A
>    0|  0.00| 94.19|  0.00|  0.00||  5.70| 94.30|  1299||  0.00|  0.00| 94.32
>    1|  0.00| 99.39|  0.00|  0.00||  0.53| 99.47|  1298||  0.00|  0.00| 99.48
>    2|  0.00| 99.60|  0.00|  0.00||  0.38| 99.62|  1299||  0.00|  0.00| 99.61
>    3|  0.00| 99.63|  0.00|  0.00||  0.36| 99.64|  1299||  0.00|  0.00| 99.64
>    4|  0.00| 99.84|  0.00|  0.00||  0.11| 99.89|  1301||  0.00|  0.00| 99.97
>    5|  0.00| 99.57|  0.00|  0.00||  0.40| 99.60|  1299||  0.00|  0.00| 99.61
>    6|  0.00| 99.72|  0.00|  0.00||  0.27| 99.73|  1299||  0.00|  0.00| 99.73
>    7|  0.00| 99.98|  0.00|  0.00||  0.01| 99.99|  1321||  0.00|  0.00| 99.99
> # cat /sys/devices/system/cpu/cpuidle/current_driver
> intel_idle
>
> I then echo "1" into /dev/cpu_dma_latency. We can see that the idle time
> moves from C6 to C1
>

This should not work. You need to leave the file descriptor open after writing the value, it's not a sysfs/proc-type tunable.

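To illustrate what I mean by keeping the descriptor open, here is a minimal Python sketch. It assumes the PM QoS behaviour as I understand it from the kernel documentation: the latency request only lasts while /dev/cpu_dma_latency is held open, and the value can be written as a native signed 32-bit integer. The helper name hold_cpu_dma_latency and its path parameter are made up for this example and do not come from any library.

```python
import contextlib
import os
import struct


@contextlib.contextmanager
def hold_cpu_dma_latency(max_latency_us, path="/dev/cpu_dma_latency"):
    """Keep a latency request active for the duration of the with-block.

    Hypothetical helper: the request is dropped as soon as the descriptor
    is closed, so the benchmark has to run inside the with-block.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # Assumption: the value is passed as a native signed 32-bit integer.
        os.write(fd, struct.pack("i", max_latency_us))
        yield fd
    finally:
        # Closing the descriptor releases the latency request.
        os.close(fd)
```

Run as root, something like `with hold_cpu_dma_latency(1): subprocess.call(["fio", "ext4-test.fio"])` should keep the request in place for as long as fio runs. The job file name is only an example.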
> # cpupower monitor
>     |Nehalem                    || Mperf              || Idle_Stats
> CPU | C3   | C6   | PC3  | PC6  || C0   | Cx   | Freq || POLL | C1-A | C6-A
>    0|  0.00|  0.00|  0.00|  0.00||  0.37| 99.63|  1299||  0.00| 99.63|  0.00
>    1|  0.00|  0.00|  0.00|  0.00||  0.16| 99.84|  1299||  0.00| 99.84|  0.00
>    2|  0.00|  0.00|  0.00|  0.00||  0.47| 99.53|  1299||  0.00| 99.53|  0.00
>    3|  0.00|  0.00|  0.00|  0.00||  0.43| 99.57|  1299||  0.00| 99.57|  0.00
>    4|  0.00|  0.00|  0.00|  0.00||  0.09| 99.91|  1300||  0.00| 99.91|  0.00
>    5|  0.00|  0.00|  0.00|  0.00||  0.06| 99.94|  1298||  0.00| 99.94|  0.00
>    6|  0.00|  0.00|  0.00|  0.00||  0.09| 99.91|  1300||  0.00| 99.91|  0.00
>    7|  0.00|  0.00|  0.00|  0.00||  0.28| 99.72|  1299||  0.00| 99.72|  0.00
> # cat /sys/devices/system/cpu/cpu0/cpuidle/state*/latency
> 0
> 2
> 15
> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_{min,max,cur}_freq
> 1200000
> 1200000
> 1200000
> 1200000
> 1200000
> 1200000
> 1200000
> 1200000
> 1200000
> 2401000
> 2401000
> 2401000
> 2401000
> 2401000
> 2401000
> 2401000
> 1200000
> 1200000
> 1200000
> 1600000
> 1200000
> 1200000
> 1200000
> 1200000
>
> Thanks for taking the time to collaborate with me on this.
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
> On Wed, Sep 2, 2015 at 3:21 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> I think this may be related to what I had to do, it rings a bell at least.
>
> http://unix.stackexchange.com/questions/153693/cant-use-userspace-cpufreq-governor-and-set-cpu-frequency
>
> The P-state driver doesn't support userspace, so you need to disable it and
> make Linux use the old acpi driver instead.
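
A quick way to check which situation a box is in is to read the cpufreq sysfs attributes scaling_driver and scaling_available_governors. The sketch below does that in Python; the function names and the cpufreq_dir parameter are only illustrative, and it assumes the usual sysfs layout under /sys/devices/system/cpu/cpu0/cpufreq.

```python
import os


def read_sysfs(path):
    """Return the stripped text content of a sysfs-style file."""
    with open(path) as handle:
        return handle.read().strip()


def userspace_governor_status(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return (driver name, True if the 'userspace' governor is offered).

    Hypothetical helper: it only inspects the first CPU's policy directory
    unless another directory is passed in.
    """
    driver = read_sysfs(os.path.join(cpufreq_dir, "scaling_driver"))
    governors = read_sysfs(
        os.path.join(cpufreq_dir, "scaling_available_governors")
    ).split()
    return driver, "userspace" in governors
```

If this reports intel_pstate and no userspace governor, then as far as I know booting with intel_pstate=disable on the kernel command line is the usual way to fall back to the ACPI driver, which is what the linked question describes.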
>
> > -----Original Message-----
> > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > Sent: 01 September 2015 22:21
> > To: 'Robert LeBlanc' <rob...@leblancnet.us>
> > Cc: ceph-users@lists.ceph.com
> > Subject: RE: [ceph-users] Ceph SSD CPU Frequency Benchmarks
> >
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > > Of Robert LeBlanc
> > > Sent: 01 September 2015 21:48
> > > To: Nick Fisk <n...@fisk.me.uk>
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks
> > >
> > > Nick,
> > >
> > > I've been trying to replicate your results without success. Can you
> > > help me understand what I'm doing that is not the same as your test?
> > >
> > > My setup is two boxes, one is a client and the other is a server. The
> > > server has Intel(R) Atom(TM) CPU C2750 @ 2.40GHz, 32 GB RAM and 2
> > > Intel S3500 240 GB SSD drives. The boxes have Infiniband FDR cards
> > > connected to a QDR switch using IPoIB. I set up OSDs on the 2 SSDs and
> > > set pool size=1. I mapped a 200GB RBD using the kernel module and ran
> > > fio on the RBD. I adjusted the number of cores, clock speed and
> > > C-states of the server and here are my results:
> > >
> > > Adjusted core number and set the processor to a set frequency using
> > > the userspace governor.
> > >
> > > 8 jobs 8 depth
> > >                           Cores
> > >                  1     2     3     4     5     6     7     8
> > > Frequency  2.4   387   762  1121  1432  1657  1900  2092  2260
> > > GHz        2     386   758  1126  1428  1657  1890  2090  2232
> > >            1.6   382   756  1127  1428  1656  1894  2083  2201
> > >            1.2   385   756  1125  1431  1656  1885  2093  2244
> > >
> >
> > I tested at QD=1 as this tends to highlight the difference in clock speed,
> > whereas a higher queue depth will probably scale with both frequency and
> > cores. I'm not sure this is your problem, but to make sure your environment
> > is doing what you want I would suggest QD=1 and 1 job to start with.
> >
> > But thank you for sharing these results regardless of your current frequency
> > scaling issues. Information like this is really useful for people trying to
> > decide on hardware purchases. Those Atom boards look like they could support
> > 12x normal HDDs quite happily, assuming 80 IOPS x 12.
> >
> > I wonder if we can get enough data from various people to generate an
> > IOPs/CPU Freq for various CPU architectures?
> >
> > > I then adjusted the processor to not go in a deeper sleep state than
> > > C1 and also tested setting the highest CPU frequency with the ondemand
> > > governor.
> > >
> > > 1 job 1 depth
> > > Cores 1
> > >                  <=C1,        C0-C6,       C0-C6,        <=C1,
> > >                  freq range   freq range   static freq   static freq
> > > Frequency  2.4   381          381          379           381
> > > GHz        2     382          380          381           381
> > >            1.6   380          381          379           382
> > >            1.2   383          378          379           383
> > > Cores 8
> > >                  <=C1,        C0-C6,       C0-C6,        <=C1,
> > >                  freq range   freq range   static freq   static freq
> > > Frequency  2.4   629          580          584           629
> > > GHz        2     630          579          584           634
> > >            1.6   630          579          584           634
> > >            1.2   632          581          582           634
> > >
> > > Here I'm seeing a correlation between # cores and C-states, but not
> > > frequency.
> > >
> > > Frequency was controlled with:
> > > cpupower frequency-set -d 1.2GHz -u 1.2GHz -g userspace
> > > and
> > > cpupower frequency-set -d 1.2GHz -u 2.0GHz -g ondemand
> > >
> > > Core count adjusted by:
> > > for i in {1..7}; do echo 0 > /sys/devices/system/cpu/cpu$i/online; done
> > >
> > > C-states controlled by:
> > > # python
> > > Python 2.7.5 (default, Jun 24 2015, 00:41:19)
> > > [GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
> > > Type "help", "copyright", "credits" or "license" for more information.
> > > >>> fd = open('/dev/cpu_dma_latency','wb')
> > > >>> fd.write('1')
> > > >>> fd.flush()
> > > >>> fd.close() # Don't run this until the tests are completed (the handle has to stay open).
> > > >>>
> > >
> > > I'd like to replicate your results. I'd also like if you can verify
> > > some of mine in your set-up around C-States and cores.
> >
> > I can't remember exactly, but I think I had to do something to get the
> > userspace governor to behave as I expected it to. I tend to recall setting
> > the frequency low and yet still seeing it bursting up to max. I will have a
> > look through my notes tomorrow and see if I can recall anything. One thing I
> > do remember though is that the Intel powertop utility was very useful in
> > confirming what the actual CPU frequency was. It might be worth installing
> > and running this and seeing what the CPU cores are doing.
> >
> > >
> > > Thanks,
> > >
> > > ----------------
> > > Robert LeBlanc
> > > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
> > >
> > > On Sat, Jun 13, 2015 at 8:58 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> > > Hi All,
> > >
> > > I know there have been lots of discussions around needing fast CPUs to
> > > get the most out of SSDs. However, I have never really seen any solid
> > > numbers to make a comparison about how much difference a faster CPU
> > > makes and whether Ceph scales linearly with clock speed. So I did a
> > > little experiment today.
> > >
> > > I set up a 1 OSD Ceph instance on a desktop PC. The desktop has an i5
> > > Sandy Bridge CPU with the CPU turbo overclocked to 4.3GHz. By using
> > > the userspace governor in Linux, I was able to set static clock speeds
> > > to see the possible performance effects on Ceph.
> > > My PC only has an old X25M-G2 SSD, so I had to limit the IO testing to
> > > 4kb QD=1, as otherwise the SSD ran out of puff when I got to the higher
> > > clock speeds.
> > >
> > > CPU MHz  4Kb Write IO  Min Latency (us)  Avg Latency (us)  CPU usr  CPU sys
> > > 1600      797          886               1250              10.14    2.35
> > > 2000      815          746               1222               8.45    1.82
> > > 2400     1161          630                857               9.5     1.6
> > > 2800     1227          549                812               8.74    1.24
> > > 3300     1320          482                755               7.87    1.08
> > > 4300     1548          437                644               7.72    0.9
> > >
> > > The figures show a fairly linear trend right through the clock range
> > > and clearly show the importance of having fast CPUs (GHz, not cores)
> > > if you want to achieve high IO, especially at low queue depths.
> > >
> > > Things to Note
> > > - These figures are from a desktop CPU, no doubt Xeons will be slightly
> > >   faster at the same clock speed
> > > - I am assuming that using the userspace governor in this way is a
> > >   realistic way to simulate different CPU clock speeds?
> > > - My old SSD is probably skewing the figures slightly
> > > - I have complete control over the turbo settings and big cooling, many
> > >   server CPUs will limit the max turbo if multiple cores are under load
> > >   or get too hot
> > > - Ceph SSD OSD nodes are probably best with high end E3 CPUs as they
> > >   have the highest clock speeds
> > > - HDDs with Journals will probably benefit slightly from higher clock
> > >   speeds, if the disk isn't the bottleneck (i.e. small block sequential
> > >   writes)
> > > - These numbers are for Replica=1, at 2 or 3 these numbers will be at
> > >   least half, I would imagine
> > >
> > > I hope someone finds this useful
> > >
> > > Nick
> > >
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
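
On the "fairly linear trend" in Nick's original table quoted above, a rough way to look at it is to divide the write figure by the clock speed for each row and compare the first and last rows. The Python sketch below does only that arithmetic on the numbers from the table. It assumes the "4Kb Write IO" column is IOPS, and the function names are made up for this example.

```python
# (CPU MHz, 4Kb write IO) pairs copied from the quoted table.
# Assumption: the second value is IOPS.
RESULTS = [
    (1600, 797),
    (2000, 815),
    (2400, 1161),
    (2800, 1227),
    (3300, 1320),
    (4300, 1548),
]


def iops_per_mhz(results):
    """Return (MHz, IOPS per MHz) for every row."""
    return [(mhz, iops / float(mhz)) for mhz, iops in results]


def relative_gain(results):
    """Return (clock gain, IOPS gain) between the first and last row."""
    (low_mhz, low_iops), (high_mhz, high_iops) = results[0], results[-1]
    return high_mhz / float(low_mhz), high_iops / float(low_iops)


for mhz, ratio in iops_per_mhz(RESULTS):
    print("%4d MHz: %.3f IOPS per MHz" % (mhz, ratio))

clock_gain, iops_gain = relative_gain(RESULTS)
print("clock x%.2f, IOPS x%.2f" % (clock_gain, iops_gain))
```

On these numbers the clock goes up by roughly 2.7x between 1600 and 4300 MHz while the write figure goes up by roughly 1.9x, so the trend is clearly upwards but less than proportional. The old SSD mentioned in the notes may well account for part of that.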
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com