It's all about the disk accesses. What's the slow part when you dump historic and in-progress ops?
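(For example, via the OSD admin socket; the osd id below is only an example, pick one of the OSDs backing the cache pool:

    # osd.0 is just an illustration; run this on the host that carries the OSD
    ceph daemon osd.0 dump_ops_in_flight
    ceph daemon osd.0 dump_historic_ops

The historic ops output carries a per-op event timeline, which should show where the time is going.)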
On Sat, Nov 8, 2014 at 2:30 PM Loic Dachary <l...@dachary.org> wrote:
> Hi Greg,
>
> On 08/11/2014 20:19, Gregory Farnum wrote:
> > When acting as a cache pool it needs to go do a lookup on the base pool
> > for every object it hasn't encountered before. I assume that's why it's
> > slower.
> >
> > (The penalty should not be nearly as high as you're seeing here, but
> > based on the low numbers I imagine you're running everything on an
> > overloaded laptop or something.)
>
> It's running on a small cluster that is busy, but not to the point that I
> would expect such a difference:
>
> # dsh --concurrent-shell --show-machine-names --remoteshellopt=-p2222 -m g1 -m g2 -m g3 -m n7 -m stri dstat -c 10 3
> g1:   ----total-cpu-usage----
> g1:   usr sys idl wai hiq siq
> g1:     6   1  88   6   0   0
> g2:   ----total-cpu-usage----
> g2:   usr sys idl wai hiq siq
> g2:     4   1  88   7   0   0
> n7:   ----total-cpu-usage----
> n7:   usr sys idl wai hiq siq
> n7:    18   3  58  20   0   1
> stri: ----total-cpu-usage----
> stri: usr sys idl wai hiq siq
> stri:   6   1  86   6   0   0
> g3:   ----total-cpu-usage----
> g3:   usr sys idl wai hiq siq
> g3:    37   2  55   5   0   1
> g1:     2   0  93   4   0   0
> g2:     2   0  92   6   0   0
> n7:    13   2  65  20   0   1
> stri:   4   1  92   3   0   0
> g3:    32   2  62   4   0   1
> g1:     3   0  94   3   0   0
> g2:     3   1  94   3   0   0
> n7:    13   3  61  22   0   1
> stri:   4   1  90   5   0   0
> g3:    31   2  61   4   0   1
> g1:     3   0  89   7   0   0
> g2:     2   1  89   8   0   0
> n7:    20   3  50  25   0   1
> stri:   6   1  87   5   0   0
> g3:    57   2  36   3   0   1
>
> # ceph tell osd.\* version
> osd.0: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.1: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.2: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.3: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.4: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.5: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.6: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.7: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.8: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.9: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.10: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.11: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.12: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.13: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.14: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
> osd.15: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
>
> Cheers
>
> > -Greg
> >
> > On Sat, Nov 8, 2014 at 11:14 AM Loic Dachary <l...@dachary.org> wrote:
> > > Hi,
> > >
> > > This is a first attempt; it is entirely possible that the solution is
> > > simple or RTFM ;-)
> > >
> > > Here is the problem observed:
> > >
> > > rados --pool ec4p1 bench 120 write   # the erasure coded pool
> > > Total time run:         147.207804
> > > Total writes made:      458
> > > Write size:             4194304
> > > Bandwidth (MB/sec):     12.445
> > >
> > > rados --pool disks bench 120 write   # same crush ruleset as the cache tier
> > > Total time run:         126.312601
> > > Total writes made:      1092
> > > Write size:             4194304
> > > Bandwidth (MB/sec):     34.581
> > >
> > > There must be something wrong in how the cache tier is set up: one would
> > > expect the same write speed, since the total size written (a few GB) is
> > > lower than the size of the cache pool. Instead the write speed is
> > > consistently at least twice as slow (12.445 * 2 < 34.581).
> > >
> > > root@g1:~# ceph osd dump | grep disks
> > > pool 58 'disks' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 15110 lfor 12228 flags hashpspool stripe_width 0
> > > root@g1:~# ceph osd dump | grep ec4
> > > pool 74 'ec4p1' erasure size 5 min_size 4 crush_ruleset 2 object_hash rjenkins pg_num 32 pgp_num 32 last_change 15604 lfor 15604 flags hashpspool tiers 75 read_tier 75 write_tier 75 stripe_width 4096
> > > pool 75 'ec4p1c' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 12 pgp_num 12 last_change 15613 flags hashpspool,incomplete_clones tier_of 74 cache_mode writeback target_bytes 1000000000 target_objects 1000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
> > >
> > > root@g1:~# ceph df
> > > GLOBAL:
> > >     SIZE       AVAIL      RAW USED     %RAW USED
> > >     26955G     18850G     6735G        24.99
> > > POOLS:
> > >     NAME       ID     USED       %USED     MAX AVAIL     OBJECTS
> > >     ..
> > >     disks      58     1823G      6.76      5305G         471080
> > >     ..
> > >     ec4p1      74     589G       2.19      12732G        153623
> > >     ec4p1c     75     57501k     0         5305G         491
> > >
> > > Cheers
> > > --
> > > Loïc Dachary, Artisan Logiciel Libre
>
> --
> Loïc Dachary, Artisan Logiciel Libre
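For reference, a writeback tier like ec4p1c is normally wired up with something along these lines; the pool names and thresholds below are copied from the osd dump above, and this is only a sketch of the usual commands, not necessarily what was actually run:

    ceph osd tier add ec4p1 ec4p1c                          # attach ec4p1c as tier_of 74
    ceph osd tier cache-mode ec4p1c writeback               # cache_mode writeback
    ceph osd tier set-overlay ec4p1 ec4p1c                  # read_tier/write_tier 75
    ceph osd pool set ec4p1c hit_set_type bloom             # hit_set bloom{...}
    ceph osd pool set ec4p1c hit_set_count 1                # the "x1" in the dump
    ceph osd pool set ec4p1c hit_set_period 3600            # the "3600s" in the dump
    ceph osd pool set ec4p1c target_max_bytes 1000000000    # target_bytes, in bytes
    ceph osd pool set ec4p1c target_max_objects 1000000000  # target_objects

Note that target_max_bytes and target_max_objects are absolute limits, expressed in bytes and objects respectively.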