2017-10-08 2:02 GMT+05:00 Peter Linder <peter.lin...@fiberdirekt.se>:
> Then, I believe, the next best configuration would be to set size for this
> pool to 4. It would choose an NVMe as the primary OSD, and then choose an
> HDD from each DC for the secondary copies. This will guarantee that a copy
> of the data goes into each DC and you will have 2 copies in other DCs away
> from the primary NVMe copy. It wastes a copy of all of the data in the
> pool, but that's on the much cheaper HDD storage and can probably be
> considered acceptable losses for the sake of having the primary OSD on NVMe
> drives.
>
> I have considered this, and it should of course work when it works, so to
> say, but what if 1 datacenter is isolated while running? We would be left
> with 2 running copies on each side for all PGs, with no way of knowing what
> gets written where. In the end, data would be destroyed due to the split
> brain. Even being able to enforce quorum where the SSD is would mean a
> single point of failure.

In case you have one mon per DC, all operations in the isolated DC will be
frozen, so I believe you would not lose data.

> On Sat, Oct 7, 2017 at 3:36 PM Peter Linder <peter.lin...@fiberdirekt.se>
> wrote:
>
>> On 10/7/2017 8:08 PM, David Turner wrote:
>>
>> Just to make sure you understand that the reads will happen on the
>> primary osd for the PG and not the nearest osd, meaning that reads will go
>> between the datacenters. Also that each write will not ack until all 3
>> writes happen, adding latency to both the writes and the reads.
>>
>> Yes, I understand this. It is actually fine; the datacenters have been
>> selected so that they are about 10-20km apart. This yields around a
>> 0.1 - 0.2ms round trip time due to the speed of light being too low.
>> Nevertheless, latency due to the network shouldn't be a problem and it's
>> all 40G (dedicated) TRILL network for the moment.
>>
>> I just want to be able to select 1 SSD and 2 HDDs, all spread out. I can
>> do that, but one of the HDDs ends up in the same datacenter, probably
>> because I'm using the "take" command 2 times (resets the selected buckets?).
>>
>> On Sat, Oct 7, 2017, 1:48 PM Peter Linder <peter.lin...@fiberdirekt.se>
>> wrote:
>>
>>> On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote:
>>>
>>> Hello!
>>>
>>> 2017-10-07 19:12 GMT+05:00 Peter Linder <peter.lin...@fiberdirekt.se>:
>>>
>>>> The idea is to select an nvme osd, and
>>>> then select the rest from hdd osds in different datacenters (see crush
>>>> map below for hierarchy).
>>>
>>> It's a little bit aside of the question, but why do you want to mix
>>> SSDs and HDDs in the same pool? Do you have a read-intensive workload and
>>> are you going to use primary-affinity to get all reads from nvme?
>>>
>>> Yes, this is pretty much the idea, getting the performance from NVMe
>>> reads, while still maintaining triple redundancy and a reasonable cost.
>>>
>>> --
>>> Regards,
>>> Vladimir

--
Regards,
Vladimir
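For context, a CRUSH rule along the lines Peter describes would look roughly
like the sketch below. This is only an illustrative guess, not the actual rule
from the thread: the rule name, id and the root bucket "default" are
placeholders, and it assumes Luminous-style device classes (nvme/hdd) and a
"datacenter" bucket type as in the crush map being discussed. The point it
illustrates is that the second "step take" starts a fresh selection with no
memory of which datacenter the NVMe primary came from, so one of the HDD
replicas can land in the primary's datacenter.

    # Hypothetical rule sketch: primary on NVMe, remaining replicas on HDD
    rule nvme_primary_hdd_rest {
        id 5                # placeholder id
        type replicated
        min_size 3
        max_size 4
        # pick one NVMe leaf in one datacenter; the first OSD emitted
        # becomes the primary
        step take default class nvme
        step chooseleaf firstn 1 type datacenter
        step emit
        # fresh selection: CRUSH no longer knows which DC the NVMe came
        # from, so the HDD datacenters chosen here may include the
        # primary's DC
        step take default class hdd
        step chooseleaf firstn -1 type datacenter
        step emit
    }

With pool size 3 this gives 1 NVMe plus 2 HDDs but cannot exclude the
primary's datacenter from the HDD picks; with pool size 4, as suggested
above, firstn -1 expands to 3, so (with 3 datacenters) each DC gets one HDD
copy and two of them are guaranteed to be away from the primary.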
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com