[ceph-users] Is cephfs multi-volume support stable?

2020-10-09 Thread Alexander E. Patrakov
Hello, I found that the documentation on the Internet is inconsistent on the question of whether I can safely have two instances of CephFS in my cluster. For the record, I don't use snapshots. FOSDEM 19 presentation by Sage Weil: https://archive.fosdem.org/2019/schedule/event/ceph_project_status_update/
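
For reference, a minimal sketch (not from the thread) of how a second filesystem is typically created; the filesystem name, pool names and PG counts are placeholders:

    # allow more than one CephFS filesystem in the cluster
    ceph fs flag set enable_multiple true --yes-i-really-mean-it
    # create dedicated pools and the second filesystem
    ceph osd pool create cephfs2_metadata 32
    ceph osd pool create cephfs2_data 128
    ceph fs new cephfs2 cephfs2_metadata cephfs2_data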

[ceph-users] librados documentation has gone

2020-10-09 Thread Daniel Mezentsev
Hi Ceph users, has the librados (C interface) documentation disappeared? All Google-provided links are broken and now show the docs 404 page ("SORRY / This page does not exist yet.")

[ceph-users] Monitor recovery

2020-10-09 Thread Brian Topping
Hello experts, I have accidentally created a situation where the only monitor in a cluster has been moved to a new node without its /var/lib/ceph contents. Not realizing what I had done, I decommissioned the original node, but still have the contents of its /var/lib/ceph. Can I shut down th
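
For reference, a hedged sketch (not from the thread) of restoring a monitor from a saved /var/lib/ceph copy; the mon id "a", the backup path and the IP address are placeholders, and the monmap steps are only needed if the monitor's name or address changed:

    systemctl stop ceph-mon@a
    # restore the saved monitor store onto the new node
    rsync -a /backup/var/lib/ceph/mon/ceph-a/ /var/lib/ceph/mon/ceph-a/
    # if the address changed: extract, edit and re-inject the monmap
    ceph-mon -i a --extract-monmap /tmp/monmap
    monmaptool --print /tmp/monmap
    monmaptool --rm a --add a 192.0.2.10:6789 /tmp/monmap
    ceph-mon -i a --inject-monmap /tmp/monmap
    systemctl start ceph-mon@a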

[ceph-users] Re: How to clear Health Warning status?

2020-10-09 Thread Anthony D'Atri
* Monitors now have a config option ``mon_osd_warn_num_repaired``, 10 by default. If any OSD has repaired more than this many I/O errors in stored data, an ``OSD_TOO_MANY_REPAIRS`` health warning is generated. Look at `dmesg` and the underlying drive’s SMART counters. You almost certainly hav
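
For reference, a hedged example of inspecting and raising that threshold with the centralized config commands available since Nautilus (the value 20 is arbitrary):

    ceph config help mon_osd_warn_num_repaired
    ceph config set mon mon_osd_warn_num_repaired 20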

[ceph-users] Re: Ceph User Survey 2020 - Working Group Invite

2020-10-09 Thread Stefan Kooman
On 2020-10-09 19:12, anantha.ad...@intel.com wrote: > Hello all, > > This is an invite to all interested to join a working group being formed > for 2020 Ceph User Survey planning. I'm interested. How and when will this working group come together? Gr. Stefan

[ceph-users] How to clear Health Warning status?

2020-10-09 Thread Tecnología CHARNE . NET
Hello! Today, I started the morning with a WARNING STATUS on our Ceph cluster. # ceph health detail HEALTH_WARN Too many repaired reads on 1 OSDs [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs     osd.67 had 399911 reads repaired I ran "ceph osd out 67" and PGs were migrat
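
For reference, a hedged sketch of checking the drive behind osd.67; /dev/sdX is a placeholder for whatever device the first command reports:

    ceph device ls-by-daemon osd.67
    dmesg | grep -iE 'sdX|I/O error'
    smartctl -a /dev/sdX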

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-09 Thread Michael Thomas
Hi Frank, That was a good tip. I was able to move the broken files out of the way and restore them for users. However, after 2 weeks I'm still left with unfound objects. Even more annoying, I now have 82k objects degraded (up from 74), which hasn't changed in over a week. I'm ready to cla
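
For reference, a hedged sketch of inspecting and, as a last resort, giving up on unfound objects; the pgid 42.1a is a placeholder:

    ceph health detail
    ceph pg 42.1a list_unfound
    # roll back to a previous version of each unfound object, if one exists
    ceph pg 42.1a mark_unfound_lost revert
    # or forget the objects entirely
    ceph pg 42.1a mark_unfound_lost delete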

[ceph-users] Ceph User Survey 2020 - Working Group Invite

2020-10-09 Thread anantha . adiga
Hello all, This is an invite to all interested to join a working group being formed for 2020 Ceph User Survey planning. The focus is to augment the questionnaire coverage, explore survey delivery formats, and expand the survey's reach to audiences across the world. The popularity and adoptio

[ceph-users] Re: another osd_pglog memory usage incident

2020-10-09 Thread Dan van der Ster
On Fri, Oct 9, 2020 at 3:12 PM Marc Roos wrote: > >1. The pg log contains 3000 entries by default (on nautilus). These > >3000 entries can legitimately consume gigabytes of ram for some > >use-cases. (I haven't determined exactly which ops triggered this > >today). > > How can I chec
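
For reference, a hedged way to check per-OSD pg log memory from the mempool accounting; the OSD id 0 is a placeholder and the command must run on the host carrying that OSD:

    ceph daemon osd.0 dump_mempools
    # look at the "osd_pglog" entry ("items" and "bytes") in the JSON output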

[ceph-users] Re: Multisite replication speed

2020-10-09 Thread Nicolas Moal
Hi Matt, Thanks for replying and for the clarification on the load-sharing mechanism. Do you know which version is targeted to include that feature? Thanks again! Nicolas From: Matt Benjamin Sent: Friday, October 9, 2020 14:27 To: Nicolas Moal Cc: Paul Me

[ceph-users] Re: another osd_pglog memory usage incident

2020-10-09 Thread Marc Roos
>1. The pg log contains 3000 entries by default (on nautilus). These >3000 entries can legitimately consume gigabytes of ram for some >use-cases. (I haven't determined exactly which ops triggered this >today). How can I check how much ram my pg_logs are using? -Original Message

[ceph-users] Re: Bucket sharding

2020-10-09 Thread Szabo, Istvan (Agoda)
What I've found is the following method: radosgw-admin reshard add --bucket dsfdsfsf --num-shards 200 radosgw-admin reshard process Could this cause any issue in a 10-million-object bucket if I increase it to 200? From: Szabo, Istvan (Agoda) Sent:
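
For reference, a hedged sketch of queuing and monitoring a manual reshard (bucket name taken from the post); note that writes to the bucket are blocked while the reshard itself runs:

    radosgw-admin reshard add --bucket dsfdsfsf --num-shards 200
    radosgw-admin reshard list
    radosgw-admin reshard process
    radosgw-admin reshard status --bucket dsfdsfsf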

[ceph-users] Nautilus RGW fails to open Jewel buckets (400 Bad Request)

2020-10-09 Thread Wido den Hollander
Hi, Most of it is described here: https://tracker.ceph.com/issues/22928 Buckets created under Jewel don't always have the *placement_rule* set in their bucket metadata and this causes Nautilus RGWs to not serve requests for them. Snippet from the metadata: { "key": "bucket.instance:pbx:
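
For reference, a hedged sketch of inspecting and patching the bucket instance metadata; the instance key is truncated in the snippet above, and "default-placement" is only an assumption for the correct placement rule:

    # list the full instance keys, then dump the affected one
    radosgw-admin metadata list bucket.instance
    radosgw-admin metadata get bucket.instance:pbx:... > bucket.json
    # edit bucket.json and set "placement_rule" (e.g. "default-placement"), then:
    radosgw-admin metadata put bucket.instance:pbx:... < bucket.json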

[ceph-users] Re: another osd_pglog memory usage incident

2020-10-09 Thread Harald Staub
On 09.10.20 13:55, Dan van der Ster wrote: [...] I also noticed a possible relationship with scrubbing -- One week ago we increased to osd_max_scrubs=5 to clear out a scrubbing backlog; I wonder if the increased read/write ratio somehow led to an exploding buffer_anon. Do things stabilize on your

[ceph-users] Bucket sharding

2020-10-09 Thread Szabo, Istvan (Agoda)
Hello, I have a bucket which is close to 10 million objects (9.1 million), and we have: rgw_dynamic_resharding = false rgw_override_bucket_index_max_shards = 100 rgw_max_objs_per_shard = 10 Do I need to increase the numbers soon, or is that not possible, so they need to start using a new bucket?
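
For reference, a hedged way to see how full a bucket's index shards are (the bucket name is a placeholder):

    radosgw-admin bucket limit check
    radosgw-admin bucket stats --bucket=mybucket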

[ceph-users] Re: Multisite replication speed

2020-10-09 Thread Matt Benjamin
Hi Nicolas, This is expected behavior currently, but a sync fairness mechanism that will permit load-sharing across gateways during replication is being worked on. regards, Matt On Fri, Oct 9, 2020 at 6:30 AM Nicolas Moal wrote: > > Hello Paul, > > Thank you very much for pointing us at BBR !

[ceph-users] Re: another osd_pglog memory usage incident

2020-10-09 Thread Dan van der Ster
On Fri, Oct 9, 2020 at 1:42 PM Harald Staub wrote: > > On 07.10.20 21:00, Wido den Hollander wrote: > > > > > > On 07/10/2020 16:00, Dan van der Ster wrote: > >> On Wed, Oct 7, 2020 at 3:29 PM Wido den Hollander wrote: > >>> > >>> > >>> > >>> On 07/10/2020 14:08, Dan van der Ster wrote: > Hi

[ceph-users] Re: another osd_pglog memory usage incident

2020-10-09 Thread Harald Staub
On 07.10.20 21:00, Wido den Hollander wrote: On 07/10/2020 16:00, Dan van der Ster wrote: On Wed, Oct 7, 2020 at 3:29 PM Wido den Hollander wrote: On 07/10/2020 14:08, Dan van der Ster wrote: Hi all, This morning some osds in our S3 cluster started going OOM, after restarting them I not

[ceph-users] Re: Multisite replication speed

2020-10-09 Thread Nicolas Moal
Hello Paul, Thank you very much for pointing us at BBR! We will definitely run some tests before and after applying the change to see if it increases our transfer speed a bit. One additional question, if you don't mind. As of today, our zonegroup configuration consists of two zones, a master z
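
For reference, a hedged sketch of enabling BBR with the fq qdisc on the gateway hosts (requires a kernel with BBR, roughly 4.9 or newer; the sysctl file path is arbitrary):

    echo 'net.core.default_qdisc = fq' >> /etc/sysctl.d/90-bbr.conf
    echo 'net.ipv4.tcp_congestion_control = bbr' >> /etc/sysctl.d/90-bbr.conf
    sysctl --system
    sysctl net.ipv4.tcp_congestion_control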

[ceph-users] Re: Bluestore migration: per-osd device copy

2020-10-09 Thread Eugen Block
Hi, I think the "copy function" would be the "bluefs-bdev-migrate" command from ceph-bluestore-tool; this is an excerpt from the man page: ---snip--- bluefs-bdev-migrate --dev-target new-device --devs-source device1 [--devs-source device2] Moves BlueFS data from source device
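
For reference, a hedged usage sketch; the OSD id and device paths are placeholders, the OSD must be stopped first, and exact flags can vary between releases (check ceph-bluestore-tool --help):

    systemctl stop ceph-osd@0
    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0 \
        bluefs-bdev-migrate \
        --devs-source /var/lib/ceph/osd/ceph-0/block.db \
        --dev-target /dev/nvme0n1p1
    systemctl start ceph-osd@0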