[ceph-users] Why newly added OSD need to get all historical OSDMAPs in pre-boot

2020-08-01 Thread Xiaoxi Chen
Hi List, We see a newly added OSD take a long time to become active; it is fetching old osdmaps from the monitors. Curious why it needs that much history? The cluster is stable, no OSD down/out except the ongoing capacity add (one OSD at a time). { "cluster_fsid": "e959f744-64be-4f4e-9606-103c
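(For reference, a quick way to see how far behind a booting OSD is and how wide a map range the monitors still hold; the osd id is illustrative, and the report field names are what I recall from Nautilus, so treat them as an assumption:)

  ceph daemon osd.<id> status        # prints oldest_map / newest_map held by that OSD
  ceph report | jq '.osdmap_first_committed, .osdmap_last_committed'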

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-27 Thread Xiaoxi Chen
RBD is never a workable solution unless you want to pay the cost of double replication in both HDFS and Ceph. I think the right approach is to look at other implementations of the Hadoop FileSystem interface, like s3a and localfs. s3a is straightforward: Ceph RGW provides an S3 interface, and s3a is stable
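(A minimal sketch of pointing s3a at an RGW endpoint; the endpoint, bucket and keys are placeholders, hadoop-aws must be on the classpath, and in practice these properties would live in core-site.xml:)

  hadoop fs \
    -D fs.s3a.endpoint=http://rgw.example.com:7480 \
    -D fs.s3a.path.style.access=true \
    -D fs.s3a.access.key=ACCESSKEY \
    -D fs.s3a.secret.key=SECRETKEY \
    -ls s3a://mybucket/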

[ceph-users] Re: [Octopus] OSD overloading

2020-04-13 Thread Xiaoxi Chen
I am not sure whether any change in Octopus makes this worse, but on Nautilus we also see that the RocksDB overhead during snaptrim is huge. We work around it by throttling the snaptrim speed to a minimum as well as throttling deep-scrub; see https://www.spinics.net/lists/dev-ceph/msg01277.html for details
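(A sketch of the kind of throttling meant; the numbers below are only illustrative, not the exact settings from the linked thread:)

  ceph config set osd osd_snap_trim_sleep 2
  ceph config set osd osd_pg_max_concurrent_snap_trims 1
  ceph config set osd osd_scrub_sleep 0.5
  ceph config set osd osd_max_scrubs 1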

[ceph-users] Re: Multisite RGW data corruption (not 14.2.1 curl issue)

2019-08-19 Thread Xiaoxi Chen
Yes, there is no checksum check in RadosSync at this stage... we discussed it a bit when handling the curl issue. The challenge is that for a multipart object, the ETag is not the checksum of the object itself; instead, it is the checksum of the manifest. A special (internal) API is needed to expose
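(To illustrate why the ETag cannot simply be compared against the object data, a sketch with a hypothetical object.bin and 8 MiB parts; an S3-style multipart ETag is the MD5 of the concatenated binary part MD5s with "-<part count>" appended:)

  split -b 8M object.bin part_
  for p in part_*; do openssl md5 -binary "$p"; done > md5s.bin
  echo "multipart etag: $(openssl md5 -hex md5s.bin | awk '{print $2}')-$(ls part_* | wc -l)"
  echo "whole-object md5: $(openssl md5 -hex object.bin | awk '{print $2}')"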