[ceph-users] Rolling out radosgw-admin4j v2.0.2

2019-11-14 Thread hrchu
radosgw-admin4j is an admin client in Java that allows provisioning and control of a Ceph object store. In version 2.0.2, Java 11 and Ceph Nautilus are supported. See https://github.com/twonote/radosgw-admin4j for more details.

[ceph-users] Can't Add Zone at Remote Multisite Cluster

2019-11-14 Thread Mac Wynkoop
Hi All, So, I am trying to create a site-specific zonegroup at my 2nd site's Ceph cluster. Upon creating the zonegroup and a placeholder master zone at my master site, I go to do a period update and commit, and this is what it returns to me: (hostname) ~ $ radosgw-admin period commit 2019-11-14 22
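
For reference, a minimal sketch of the usual flow for adding a zonegroup/zone on a second cluster, assuming the realm already exists on the master site (the hostnames, realm/zonegroup/zone names and system keys below are placeholders, not taken from this thread):

  # Pull the realm and current period from the master site first.
  radosgw-admin realm pull --url=http://master-rgw:8080 \
      --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY
  radosgw-admin period pull --url=http://master-rgw:8080 \
      --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY
  # Create the site-specific zonegroup and its zone on the second cluster.
  radosgw-admin zonegroup create --rgw-zonegroup=zg-site2 \
      --endpoints=http://site2-rgw:8080
  radosgw-admin zone create --rgw-zonegroup=zg-site2 --rgw-zone=zone-site2 \
      --endpoints=http://site2-rgw:8080 \
      --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY
  # Update and commit the period once the zone exists.
  radosgw-admin period update --commit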

[ceph-users] Re: Bad links on ceph.io for mailing lists

2019-11-14 Thread Ilya Dryomov
On Thu, Nov 14, 2019 at 8:09 PM Gregory Farnum wrote: > > On Thu, Nov 14, 2019 at 9:21 AM Bryan Stillwell > wrote: > > > > There are some bad links to the mailing list subscribe/unsubscribe/archives > > on this page that should get updated: > > > > https://ceph.io/resources/ > > > > The subscri

[ceph-users] Re: osdmaps not trimmed until ceph-mon's restarted (if cluster has a down osd)

2019-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2019 at 8:14 AM Dan van der Ster wrote: > > Hi Joao, > > I might have found the reason why several of our clusters (and maybe > Bryan's too) are getting stuck not trimming osdmaps. > It seems that when an osd fails, the min_last_epoch_clean gets stuck > forever (even long after HEA

[ceph-users] Re: osdmaps not trimmed until ceph-mon's restarted (if cluster has a down osd)

2019-11-14 Thread Dan van der Ster
On Thursday, November 14, 2019, Nathan Cutler wrote: > > Hi Dan: > > > I might have found the reason why several of our clusters (and maybe > > Bryan's too) are getting stuck not trimming osdmaps. > > It seems that when an osd fails, the min_last_epoch_clean gets stuck > > forever (even long after

[ceph-users] Re: Bad links on ceph.io for mailing lists

2019-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2019 at 9:21 AM Bryan Stillwell wrote: > > There are some bad links to the mailing list subscribe/unsubscribe/archives > on this page that should get updated: > > https://ceph.io/resources/ > > The subscribe/unsubscribe/archives links point to the old lists vger and > lists.ceph.

[ceph-users] Bad links on ceph.io for mailing lists

2019-11-14 Thread Bryan Stillwell
There are some bad links to the mailing list subscribe/unsubscribe/archives on this page that should get updated: https://ceph.io/resources/ The subscribe/unsubscribe/archives links point to the old lists (vger and lists.ceph.com) rather than the new lists on lists.ceph.io: ceph-devel subscribe

[ceph-users] Re: increasing PG count - limiting disruption

2019-11-14 Thread David Turner
There are a few factors to consider. I've gone from 16k pgs to 32k pgs before and learned some lessons. The first and most immediate is the peering that happens when you increase the PG count. I like to increase the pg_num and pgp_num values slowly to mitigate this. Something like [1] this should d
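
(Since the actual "[1]" snippet isn't shown here, the following is only a hypothetical sketch of that kind of gradual ramp, with the pool name, target and step size as placeholders.)

  pool=mypool
  target=4096
  step=128
  current=$(ceph osd pool get "$pool" pg_num -f json | jq -r .pg_num)
  while [ "$current" -lt "$target" ]; do
      next=$(( current + step > target ? target : current + step ))
      ceph osd pool set "$pool" pg_num "$next"
      ceph osd pool set "$pool" pgp_num "$next"
      # crude wait: let peering/backfill settle before the next bump
      while ceph health | grep -qE 'peering|activating|backfill'; do
          sleep 60
      done
      current="$next"
  done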

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-11-14 Thread Mark Nelson
Great job to everyone involved in tracking this down! Mark On 11/14/19 10:10 AM, Sage Weil wrote: Hi everyone, We've identified a data corruption bug[1], first introduced[2] (by yours truly) in 14.2.3 and affecting both 14.2.3 and 14.2.4. The corruption appears as a rocksdb checksum error or as

[ceph-users] increasing PG count - limiting disruption

2019-11-14 Thread Frank R
Hi all, When increasing the number of placement groups for a pool by a large amount (say 2048 to 4096) is it better to go in small steps or all at once? This is a filestore cluster. Thanks, Frank

[ceph-users] osdmaps not trimmed until ceph-mon's restarted (if cluster has a down osd)

2019-11-14 Thread Dan van der Ster
Hi Joao, I might have found the reason why several of our clusters (and maybe Bryan's too) are getting stuck not trimming osdmaps. It seems that when an osd fails, the min_last_epoch_clean gets stuck forever (even long after HEALTH_OK), until the ceph-mons are restarted. I've updated the ticket:
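
For anyone comparing notes, a rough way to see whether osdmap trimming is stuck, plus the workaround this thread describes (the mon unit name is a placeholder; restart monitors one at a time):

  # A large gap between these two epochs that never shrinks, even well after
  # HEALTH_OK, is the symptom described here.
  ceph report 2>/dev/null | jq '.osdmap_first_committed, .osdmap_last_committed'
  # Workaround reported in the thread: restart the ceph-mon daemons.
  systemctl restart ceph-mon@mon-host-1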

[ceph-users] Possible data corruption with 14.2.3 and 14.2.4

2019-11-14 Thread Sage Weil
Hi everyone, We've identified a data corruption bug[1], first introduced[2] (by yours truly) in 14.2.3 and affecting both 14.2.3 and 14.2.4. The corruption appears as a rocksdb checksum error or assertion that looks like os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(available
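
As a hedged sketch only (confirm the exact option, value and affected releases against the full advisory and the tracker before touching a production cluster), the workaround discussed at the time was along these lines:

  # Stop bluefs WAL pre-extension on the OSDs until the fixed release is deployed.
  ceph config set osd bluefs_preextend_wal_files false
  # Verify a running OSD sees the override (osd.0 is a placeholder).
  ceph config show osd.0 | grep bluefs_preextend_wal_files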

[ceph-users] mds crash loop - cephfs disaster recovery

2019-11-14 Thread Karsten Nielsen
I have a problem with my mds, which is in a crash loop. With the help of Yan, Zheng I have made a few attempts to save it, but it seems that it is not going the way it should. I am reading through this documentation: https://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/ If I use the last step t
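
For context, the journal-recovery sequence from the linked disaster-recovery page looks roughly like the following; it is destructive, the filesystem name "cephfs" and rank 0 are placeholders, and the thread is about whether to go beyond these steps, so follow the documentation and Zheng's advice rather than this summary:

  cephfs-journal-tool --rank=cephfs:0 journal export backup.bin   # back up first
  cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
  cephfs-journal-tool --rank=cephfs:0 journal reset
  cephfs-table-tool all reset session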

[ceph-users] Re: Adding new non-containerised hosts to current contanerised environment and moving away from containers forward

2019-11-14 Thread Thomas Bennett
Hey Jeremi, I'm not sure how ceph-ansible will handle a hybrid system. You'll need to make sure that you have the same info in the ceph-ansible "fetch" directory or it will create a separate cluster. I'm not sure if you can somehow force this without causing some issues. I'm also not sure what o