[ceph-users] Problem with UID starting with underscores

2018-03-06 Thread Arvydas Opulskis
Hi all, because one of our scripts misbehaved, a new user with a bad UID was created via the API, and now we can't remove, view or modify it. I believe it's because it has three underscores at the beginning: [root@rgw001 /]# radosgw-admin metadata list user | grep "___pro_" "___pro_", [root@rgw001 /]#
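
A minimal sketch of poking at such a record through the metadata interface instead of "radosgw-admin user rm"; whether the literal UID survives API/shell escaping here is an assumption:

    # Inspect the broken entry directly via the metadata layer, quoting the UID
    # so the shell passes the leading underscores through untouched.
    radosgw-admin metadata get user:'___pro_'
    # If the record looks like the bogus user, remove it the same way.
    radosgw-admin metadata rm user:'___pro_'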

Re: [ceph-users] Cache tier

2018-03-06 Thread Захаров Алексей
Hi, We use a write-around cache tier with libradosstriper-based clients. We hit a bug which causes performance degradation: http://tracker.ceph.com/issues/22528 . It is especially bad with lots of small objects the size of one striper chunk: such objects get promoted on every read/write lock :). And i
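
For reference, cache-tier promotion can be tuned per pool; a small sketch with a placeholder pool name "cache" (whether this helps with the striper-lock promotions described above is untested):

    # Require an object to appear in several recent HitSets before it is
    # promoted, instead of promoting it on the first read or write.
    ceph osd pool set cache min_read_recency_for_promote 2
    ceph osd pool set cache min_write_recency_for_promote 2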

Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard
On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <mbald...@hsamiata.it> wrote: > Hi > > I monitor dmesg on each of the 3 nodes, no hardware issue reported. And > the problem happens with various different OSDs in different nodes, so for me > it is clear it's not a hardware problem. > If

Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard
debug_osd that is... :) On Tue, Mar 6, 2018 at 7:10 PM, Brad Hubbard wrote: > > > On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata < > mbald...@hsamiata.it> wrote: > >> Hi >> >> I monitor dmesg in each of the 3 nodes, no hardware issue reported. And >> the problem happens with various
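
A sketch of raising that debug level on one OSD while reproducing the scrub errors (osd.12 is a placeholder id):

    # Turn up OSD logging temporarily...
    ceph tell osd.12 injectargs '--debug_osd 20'
    # ...and put it back to the default afterwards.
    ceph tell osd.12 injectargs '--debug_osd 1/5'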

Re: [ceph-users] Delete a Pool - how hard should be?

2018-03-06 Thread Max Cuttins
What about using the at command: ceph osd pool rm --yes-i-really-really-mean-it | at now + 30 days Regards, Alex How do you know that this command is scheduled? How do you delete the scheduled command once it has been issued? This is weird. We need something within CEPH that makes you see the "status
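
For what it is worth, the standard at(1) tooling does answer both questions; a sketch with a placeholder pool name (note the real command needs the pool name twice plus the confirmation flag):

    # Schedule the deletion 30 days out.
    echo "ceph osd pool rm mypool mypool --yes-i-really-really-mean-it" | at now + 30 days
    # See that it is scheduled, and cancel it again if needed.
    atq
    atrm <jobid>   # <jobid> as printed by atq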

Re: [ceph-users] Delete a Pool - how hard should be?

2018-03-06 Thread Max Cuttins
On 05/03/2018 20:17, Gregory Farnum wrote: You're not wrong, and indeed that's why I pushed back on the latest attempt to make deleting pools even more cumbersome. But having a "trash" concept is also pretty weird. If admins can override it to just immediately delete the data (if they n

[ceph-users] Packages for Debian 8 "Jessie" missing from download.ceph.com APT repository

2018-03-06 Thread Simon Fredsted
Hi, I'm trying to install "ceph-common" on Debian 8 "Jessie", but it seems the packages aren't available for it. Searching for "jessie" on https://download.ceph.com/debian-luminous/pool/main/c/ceph/ yields no results. I've tried to install it as documented here: http://docs.ceph.com/do
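
One way to see which distributions a download.ceph.com repository actually publishes, assuming directory listings are enabled on the server:

    # If "jessie" is missing from dists/, there are simply no Jessie builds
    # of that release.
    curl -s https://download.ceph.com/debian-luminous/dists/ | grep -oE 'href="[^"]+"'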

Re: [ceph-users] Delete a Pool - how hard should be?

2018-03-06 Thread Ronny Aasen
On 06 March 2018 10:26, Max Cuttins wrote: On 05/03/2018 20:17, Gregory Farnum wrote: You're not wrong, and indeed that's why I pushed back on the latest attempt to make deleting pools even more cumbersome. But having a "trash" concept is also pretty weird. If admins can override it to

Re: [ceph-users] Ceph iSCSI is a prank?

2018-03-06 Thread Konstantin Shalygin
Dear all, I wonder how we could support VM systems with ceph storage (block device)? My colleagues are waiting for my answer for vmware (vSphere 5) and I myself use oVirt (RHEV). The default protocol is iSCSI. I know that openstack/cinder works well with ceph, and proxmox (just heard) too. But

Re: [ceph-users] Delete a Pool - how hard should be?

2018-03-06 Thread Max Cuttins
On 06/03/2018 11:13, Ronny Aasen wrote: On 06 March 2018 10:26, Max Cuttins wrote: On 05/03/2018 20:17, Gregory Farnum wrote: You're not wrong, and indeed that's why I pushed back on the latest attempt to make deleting pools even more cumbersome. But having a "trash" concept is

Re: [ceph-users] Deep Scrub distribution

2018-03-06 Thread David Turner
I'm pretty sure I put up one of those scripts in the past. Basically, what we did was set our scrub cycle to something like 40 days; we then sort all PGs by the last time they were deep scrubbed, grab the oldest 1/30 of those PGs and tell them to deep-scrub manually, and the next day we do it ag
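
A rough sketch of that rotation (not the original script); it assumes DEEP_SCRUB_STAMP is the final date/time pair in the plain-text "ceph pg dump pgs" output, which varies between releases:

    # Sort PGs by last deep-scrub timestamp and deep-scrub the oldest slice;
    # adjust the head count to roughly total_pgs/30 for a 30-day rotation.
    ceph pg dump pgs 2>/dev/null \
      | awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ {print $(NF-1), $NF, $1}' \
      | sort \
      | head -n 100 \
      | awk '{print $3}' \
      | while read -r pg; do ceph pg deep-scrub "$pg"; done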

Re: [ceph-users] Delete a Pool - how hard should be?

2018-03-06 Thread Max Cuttins
On 06/03/2018 16:15, David Turner wrote: I've never deleted a bucket, pool, etc. at the request of a user that they then wanted back, because I force them to go through a process to have their data deleted. They have to prove to me, and I have to agree, that they don't need it before I'll d

Re: [ceph-users] Deep Scrub distribution

2018-03-06 Thread Jonathan Proulx
On Tue, Mar 06, 2018 at 03:48:30PM +, David Turner wrote: :I'm pretty sure I put up one of those scripts in the past. Basically what :we did was we set our scrub cycle to something like 40 days, we then sort :all PGs by the last time they were deep scrubbed. We grab the oldest 1/30 :of those

Re: [ceph-users] Ceph iSCSI is a prank?

2018-03-06 Thread Martin Emrich
Hi! On 02.03.18 at 13:27, Federico Lucifredi wrote: We do speak to the Xen team every once in a while, but while there is interest in adding Ceph support on their side, I think we are somewhat down the list of their priorities. Maybe things change with XCP-ng (https://xcp-ng.github.io). N

Re: [ceph-users] Why one crippled osd can slow down or block all request to the whole ceph cluster?

2018-03-06 Thread David Turner
There are multiple settings that affect this. osd_heartbeat_grace is probably the most apt. If an OSD is not getting a response from another OSD for more than the heartbeat_grace period, then it will tell the mons that the OSD is down. Once mon_osd_min_down_reporters have told the mons that an O
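
The two knobs mentioned there can be checked on a running daemon through its admin socket; osd.0 and mon.a are placeholder names:

    # Seconds without a heartbeat reply before a peer OSD is reported down.
    ceph daemon osd.0 config get osd_heartbeat_grace
    # How many distinct reporters the mons want before marking an OSD down.
    ceph daemon mon.a config get mon_osd_min_down_reporters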

Re: [ceph-users] When all Mons are down, does existing RBD volume continue to work

2018-03-06 Thread Mayank Kumar
Thanks Gregory. This is basically just trying to understand the behavior of the system in a failure scenario. Ideally we would track and fix mons going down promptly. In an ideal world where nothing else fails and cephx is not in use, but the mons are down, what happens if the osd pings to mon

Re: [ceph-users] When all Mons are down, does existing RBD volume continue to work

2018-03-06 Thread Gregory Farnum
I think things would keep running, but I'm really not sure. This is just not a realistic concern as there are lots of little housekeeping things that can be deferred for a little while but eventually will stop forward progress if you can't talk to the monitors to persist cluster state updates. On

[ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-06 Thread Lazuardi Nasution
Hi, I want to do load-balanced multipathing (multiple iSCSI gateway/exporter nodes) of iSCSI backed by RBD images. Should I disable the exclusive-lock feature? What if I don't disable that feature? I'm using TGT (the manual way) since I got so many CPU stuck error messages when I was using LIO. Best re
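
For reference only, not a recommendation either way: checking and disabling the feature on an image looks roughly like this (image name is a placeholder, and object-map/fast-diff depend on exclusive-lock, so they have to be disabled first if enabled):

    rbd info rbd/myimage | grep features          # see which features are enabled
    rbd feature disable rbd/myimage object-map fast-diff
    rbd feature disable rbd/myimage exclusive-lock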

[ceph-users] change radosgw object owner

2018-03-06 Thread Ryan Leimenstoll
Hi all, We are trying to move a bucket in radosgw from one user to another, in an effort to both change ownership and attribute the storage usage of the data to the receiving user's quota. I have unlinked the bucket and linked it to the new user using: radosgw-admin bucket unlink --bucket=$MYBUC
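
The unlink/link sequence referred to above, spelled out with placeholder names (some releases also want --bucket-id on the link step); note that relinking changes who owns the bucket, not the ownership of the objects already inside it:

    radosgw-admin bucket unlink --bucket=mybucket --uid=olduser
    radosgw-admin bucket link --bucket=mybucket --uid=newuser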

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-06 Thread Mike Christie
On 03/06/2018 01:17 PM, Lazuardi Nasution wrote: > Hi, > > I want to do load balanced multipathing (multiple iSCSI gateway/exporter > nodes) of iSCSI backed with RBD images. Should I disable exclusive lock > feature? What if I don't disable that feature? I'm using TGT (manual > way) since I get so

Re: [ceph-users] change radosgw object owner

2018-03-06 Thread Robin H. Johnson
On Tue, Mar 06, 2018 at 02:40:11PM -0500, Ryan Leimenstoll wrote: > Hi all, > > We are trying to move a bucket in radosgw from one user to another in an > effort both change ownership and attribute the storage usage of the data to > the receiving user’s quota. > > I have unlinked the bucket a

[ceph-users] Civetweb log format

2018-03-06 Thread Aaron Bassett
Hey all, I'm trying to get something of an audit log out of radosgw. To that end I was wondering if there's a mechanism to customize the log format of civetweb. It's already writing IP, HTTP verb, path, response and time, but I'm hoping to get it to print the Authorization header of the request,
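
One hedged avenue besides the civetweb access log is rgw's own ops log plus request-header logging; the option names and section name below are assumptions to verify against your release:

    # ceph.conf sketch for the rgw instance (section name is a placeholder;
    # restart the gateway after editing):
    [client.rgw.gateway]
    rgw enable ops log = true
    rgw log http headers = http_authorization, http_x_forwarded_for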

Re: [ceph-users] OSD crash during pg repair - recovery_info.ss.clone_snaps.end and other problems

2018-03-06 Thread Gregory Farnum
On Sat, Mar 3, 2018 at 2:28 AM Jan Pekař - Imatic wrote: > Hi all, > > I have a few problems on my cluster that are maybe linked together and > now caused an OSD to go down during pg repair. > > First, a few notes about my cluster: > > 4 nodes, 15 OSDs installed on Luminous (no upgrade). > Replicated pools wi

Re: [ceph-users] change radosgw object owner

2018-03-06 Thread Yehuda Sadeh-Weinraub
On Tue, Mar 6, 2018 at 11:40 AM, Ryan Leimenstoll wrote: > Hi all, > > We are trying to move a bucket in radosgw from one user to another in an > effort both change ownership and attribute the storage usage of the data to > the receiving user’s quota. > > I have unlinked the bucket and linked it

Re: [ceph-users] Memory leak in Ceph OSD?

2018-03-06 Thread Kjetil Joergensen
Hi, so.. +1 We don't run compression as far as I know, so that wouldn't be it. We do actually run a mix of bluestore & filestore - due to the rest of the cluster predating a stable bluestore by some amount. The interesting part is - the behavior seems to be specific to our bluestore nodes. Belo

Re: [ceph-users] Memory leak in Ceph OSD?

2018-03-06 Thread Kjetil Joergensen
Hi, addendum: We're running 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b). The workload is a mix of 3x-replicated & erasure-coded pools (rbd, cephfs, rgw). -KJ On Tue, Mar 6, 2018 at 3:53 PM, Kjetil Joergensen wrote: > Hi, > > so.. +1 > > We don't run compression as far as I know, so that wouldn't be

Re: [ceph-users] Memory leak in Ceph OSD?

2018-03-06 Thread Alexandre DERUMIER
Hi, I'm also seeing slow memory increase over time with my bluestore NVMe osds (3.2 TB each), with default ceph.conf settings (ceph 12.2.2). Each osd starts around 5 GB of memory and goes up to 8 GB. Currently I'm restarting them about once a month to free memory. Here is a dump of osd.0 after 1 week runn
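
For anyone wanting to pull the same kind of dump, the usual per-daemon commands (osd.0 as the example id):

    ceph daemon osd.0 dump_mempools   # per-subsystem memory accounting
    ceph tell osd.0 heap stats        # tcmalloc heap usage
    ceph tell osd.0 heap release      # ask tcmalloc to return freed pages to the OS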

Re: [ceph-users] Why one crippled osd can slow down or block all request to the whole ceph cluster?

2018-03-06 Thread shadow_lin
Hi Turner, Thanks for your insight. I am wondering: if the mon can detect slow/blocked requests from a certain osd, why can't the mon mark an osd with blocked requests down if the requests have been blocked for a certain time? 2018-03-07 shadow_lin From: David Turner Sent: 2018-03-06 23:56 Subject: Re: [ceph-users]

Re: [ceph-users] Why one crippled osd can slow down or block all request to the whole ceph cluster?

2018-03-06 Thread David Turner
Marking osds down is not without risks. You are taking away one of the copies of data for every PG on that osd, and you are causing every PG on that osd to peer. If that osd comes back up, every PG on it needs to peer again and then recover. That is a lot of load and risks to automat