Maybe it is possible if done via a gateway NFS export? Do the gateway settings allow read OSD selection?
On Sun, Nov 11, 2018 at 1:01 AM Martin Verges <martin.ver...@croit.io> wrote:

Hello Vlad,

If you want to read from the same data, then it is not possible (as far as I know).

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Sat, Nov 10, 2018 at 03:47 Vlad Kopylov <vladk...@gmail.com> wrote:

Maybe I missed something, but the FS explicitly selects the pools to put files and metadata in, as I did below.
So if I create new pools, the data in them will be different. If I apply the rule dc1_primary to the cfs_data pool and a client from dc3 connects to fs t01, it will start using dc1 hosts.

ceph osd pool create cfs_data 100
ceph osd pool create cfs_meta 100
ceph fs new t01 cfs_data cfs_meta
sudo mount -t ceph ceph1:6789:/ /mnt/t01 -o name=admin,secretfile=/home/mciadmin/admin.secret

rule dc1_primary {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take dc1
        step chooseleaf firstn 1 type host
        step emit
        step take dc2
        step chooseleaf firstn -2 type host
        step emit
        step take dc3
        step chooseleaf firstn -2 type host
        step emit
}

On Fri, Nov 9, 2018 at 9:32 PM Vlad Kopylov <vladk...@gmail.com> wrote:

Just to confirm - it will still populate 3 copies across the datacenters?
I thought this map was to select where to write to; I guess it does the write replication on the back end.

I thought pools are completely separate and clients would not see each other's data?

Thank you Martin!

On Fri, Nov 9, 2018 at 2:10 PM Martin Verges <martin.ver...@croit.io> wrote:

Hello Vlad,

you can generate something like this:

rule dc1_primary_dc2_secondary {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take dc1
        step chooseleaf firstn 1 type host
        step emit
        step take dc2
        step chooseleaf firstn 1 type host
        step emit
        step take dc3
        step chooseleaf firstn -2 type host
        step emit
}

rule dc2_primary_dc1_secondary {
        id 2
        type replicated
        min_size 1
        max_size 10
        step take dc2
        step chooseleaf firstn 1 type host
        step emit
        step take dc1
        step chooseleaf firstn 1 type host
        step emit
        step take dc3
        step chooseleaf firstn -2 type host
        step emit
}

After you have added such crush rules, you can configure the pools:

~ $ ceph osd pool set <pool_for_dc1> crush_ruleset 1
~ $ ceph osd pool set <pool_for_dc2> crush_ruleset 2

Now you place the workload from dc1 in the dc1 pool, and the workload from dc2 in the dc2 pool. You could also use HDD with SSD journal (if your workload isn't that write intensive) and save some money in dc3, as your client would always read from an SSD and write to hybrid.

Btw. all this could be done with a few simple clicks through our web frontend. Even if you want to export it via CephFS / NFS / .. it is possible to set it on a per-folder level. Feel free to take a look at https://www.youtube.com/watch?v=V33f7ipw9d4 to see how easy it could be.
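A note on wiring this into the CephFS question above: the thread does not spell it out, but one plausible way to serve clients in different datacenters from a single filesystem is to add one data pool per datacenter and pin directories to them with file layouts. A rough sketch only, assuming a Luminous-or-later cluster (where the pool option is crush_rule and takes the rule name; older releases used crush_ruleset with the rule id), hypothetical pools cephfs_dc1 and cephfs_dc2 that were already created, and hypothetical directories /mnt/t01/dc1 and /mnt/t01/dc2:

# point each pool at the matching datacenter-primary rule (Luminous+ syntax)
ceph osd pool set cephfs_dc1 crush_rule dc1_primary_dc2_secondary
ceph osd pool set cephfs_dc2 crush_rule dc2_primary_dc1_secondary
# make both pools usable as data pools of the existing filesystem "t01"
ceph fs add_data_pool t01 cephfs_dc1
ceph fs add_data_pool t01 cephfs_dc2
# pin each datacenter's directory to the pool whose primaries sit in that DC
setfattr -n ceph.dir.layout.pool -v cephfs_dc1 /mnt/t01/dc1
setfattr -n ceph.dir.layout.pool -v cephfs_dc2 /mnt/t01/dc2

Directory layouts only affect files created after the attribute is set; existing files stay in their original pool.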
2018-11-09 17:35 GMT+01:00 Vlad Kopylov <vladk...@gmail.com>:

Please disregard the pg status, one of the test VMs was down for some time and it is healing.
The question is only how to make it read from the proper datacenter.

If you have an example.

Thanks

On Fri, Nov 9, 2018 at 11:28 AM Vlad Kopylov <vladk...@gmail.com> wrote:

Martin, thank you for the tip.
Googling ceph crush rule examples doesn't give much on rules, just static placement of buckets. This all seems to be about placing data, not about giving a client in a specific datacenter the proper read OSD.

Maybe something is wrong with the placement groups?

I added the datacenters dc1, dc2, dc3. The current replicated_rule is:

rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# buckets
host ceph1 {
        id -3           # do not change unnecessarily
        id -2 class ssd # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 1.000
}
datacenter dc1 {
        id -9           # do not change unnecessarily
        id -4 class ssd # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item ceph1 weight 1.000
}
host ceph2 {
        id -5           # do not change unnecessarily
        id -6 class ssd # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item osd.1 weight 1.000
}
datacenter dc2 {
        id -10          # do not change unnecessarily
        id -8 class ssd # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item ceph2 weight 1.000
}
host ceph3 {
        id -7           # do not change unnecessarily
        id -12 class ssd        # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item osd.2 weight 1.000
}
datacenter dc3 {
        id -11          # do not change unnecessarily
        id -13 class ssd        # do not change unnecessarily
        # weight 1.000
        alg straw2
        hash 0  # rjenkins1
        item ceph3 weight 1.000
}
root default {
        id -1           # do not change unnecessarily
        id -14 class ssd        # do not change unnecessarily
        # weight 3.000
        alg straw2
        hash 0  # rjenkins1
        item dc1 weight 1.000
        item dc2 weight 1.000
        item dc3 weight 1.000
}

#ceph pg dump
dumped all
version 29433
stamp 2018-11-09 11:23:44.510872
last_osdmap_epoch 0
last_pg_scan 0
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
1.5f 0 0 0 0 0 0 0 0 active+clean 2018-11-09 04:35:32.320607 0'0 544:1317 [0,2,1] 0 [0,2,1] 0 0'0 2018-11-09 04:35:32.320561 0'0 2018-11-04 11:55:54.756115 0
2.5c 143 0 143 0 0 19490267 461 461 active+undersized+degraded 2018-11-08 19:02:03.873218 508'461 544:2100 [2,1] 2 [2,1] 2 290'380 2018-11-07 18:58:43.043719 64'120 2018-11-05 14:21:49.256324 0
.....
sum 15239 0 2053 2659 0 2157615019 58286 58286
OSD_STAT USED    AVAIL  TOTAL  HB_PEERS PG_SUM PRIMARY_PG_SUM
2        3.7 GiB 28 GiB 32 GiB [0,1]    200    73
1        3.7 GiB 28 GiB 32 GiB [0,2]    200    58
0        3.7 GiB 28 GiB 32 GiB [1,2]    173    69
sum      11 GiB  85 GiB 96 GiB

#ceph pg map 2.5c
osdmap e545 pg 2.5c (2.5c) -> up [2,1] acting [2,1]

#ceph pg map 1.5f
osdmap e547 pg 1.5f (1.5f) -> up [0,2,1] acting [0,2,1]
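Before relying on a datacenter-aware rule, it can be worth checking which OSD CRUSH would actually return first, since for a replicated pool the first OSD in the acting set is normally the primary that serves the reads (absent primary-affinity or pg_temp overrides). A rough sketch of an offline test, assuming the new rule was compiled into the map with id 1 as in the examples above:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt: add the dc-aware rule, then recompile
crushtool -c crushmap.txt -o crushmap.new
# show which OSDs rule id 1 would pick for 3 replicas; the first OSD per mapping is the future primary
crushtool -i crushmap.new --test --rule 1 --num-rep 3 --show-mappings
# if the mappings look right, inject the new map
ceph osd setcrushmap -i crushmap.new

Each mapping line should start with an OSD from the intended primary datacenter.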
On Fri, Nov 9, 2018 at 2:21 AM Martin Verges <martin.ver...@croit.io> wrote:

Hello Vlad,

Ceph clients connect to the primary OSD of each PG. If you create a crush rule for building1 and one for building2 that takes an OSD from the same building as the first one, your reads from the pool will always stay in the same building (if the cluster is healthy) and only write requests get replicated to the other building.

2018-11-09 4:54 GMT+01:00 Vlad Kopylov <vladk...@gmail.com>:

I am trying to test replicated ceph with servers in different buildings, and I have a read problem.
Reads from one building go to an osd in another building and vice versa, making reads slower than writes! Making reads as slow as the slowest node.

Is there a way to
- disable parallel read (so it reads only from the same osd node where the mon is);
- or give each client a read restriction per osd?
- or strictly specify the read osd on mount;
- or have a node read delay cap (for example, if a node's response time is larger than 2 ms, do not use that node for reads while other replicas are available);
- or the ability to place clients on the crush map, so it understands that an osd in, for example, the same datacenter as the client has preference, and pulls data from it/them.

Mounting with the kernel client, latest mimic.

Thank you!

Vlad
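Martin's point above (clients read from the PG's primary OSD) can be spot-checked per file: map a file's first data object to its PG and see which OSD comes up as primary. A sketch with hypothetical names (the cephfs_dc1 pool from the sketch earlier, a file /mnt/t01/dc1/test.bin), assuming CephFS's usual object naming of <inode-hex>.<block-number>:

# the first RADOS object of a CephFS file is named <inode in hex>.00000000
obj=$(printf '%x.00000000' "$(stat -c %i /mnt/t01/dc1/test.bin)")
# print the PG, the up/acting set and the primary OSD for that object
ceph osd map cephfs_dc1 "$obj"

If the primary it reports lives in the client's datacenter, the rule is doing its job.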
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com