> -Original Message-
> From: Igor Fedotov [mailto:ifedo...@suse.de]
> Sent: 19 October 2018 01:03
> To: n...@fisk.me.uk; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of
> space free in BlockDB on SSD?
>
>
>
> On 10/18/2018 7:49 P
For some time now we have been experiencing service outages in our Ceph
cluster whenever there is any change to the HEALTH status, e.g. swapping
storage devices, adding storage devices, rebooting Ceph hosts, during
backfills etc.
Just now I had a situation where several VMs hung after I
rebooted one C
On 10/19/18 7:51 AM, xiang@iluvatar.ai wrote:
> Hi!
>
> I use ceph 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> (stable), and find that:
>
> When expanding the whole cluster, I updated pg_num and everything
> succeeded, but the status is as below:
>   cluster:
>     id: 41ef913c-2351-4794-b9ac
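For reference, a minimal sketch of that kind of change, assuming a
hypothetical pool called "rbd"; in Mimic, pgp_num usually has to follow
pg_num before the new PGs are actually placed:
# Hypothetical pool name; run once the new OSDs are in:
ceph osd pool set rbd pg_num 256
ceph osd pool set rbd pgp_num 256
# Then watch the cluster settle:
ceph -s
ceph osd pool get rbd pg_num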
Hello,
I found that the metadata of an LDAP user and a normal radosgw user differ in
the "type" field. Could that be the cause of the bucket policy not working?
# Normal radosgw user
{
    "user_id": "ceph-dashboard",
    "display_name": "Ceph Dashboard",
    "email": "",
    "suspended": 0,
    "max_bucke
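To compare the two records, including the "type" field, both can be dumped
with radosgw-admin; a minimal sketch, where "ldap-user" is a hypothetical
uid for the LDAP-backed user:
# Full metadata record (contains the "type" field) for each user:
radosgw-admin metadata get user:ceph-dashboard
radosgw-admin metadata get user:ldap-user
# Shorter view of the same user:
radosgw-admin user info --uid=ceph-dashboard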
> -Original Message-
> From: Nick Fisk [mailto:n...@fisk.me.uk]
> Sent: 19 October 2018 08:15
> To: 'Igor Fedotov' ; ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] slow_used_bytes - SlowDB being used despite lots of
> space free in BlockDB on SSD?
>
> > -Original Message-
>
Thanks for the feedback everyone. Based on the TBW figures, it sounds like
these drives are terrible for us, as the idea is to NOT use them simply for
archive. This will be a high read/write workload, so that's a total show stopper.
I'm interested in the Seagate Nytro myself.
I was recommended c
Hi Denny,
the recommendation for Ceph maintenance is to set three flags if you
need to shut down a node (or the entire cluster):
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
Although the 'noout' flag seems to be enough for many maintenance
tasks, it doesn't prevent the c
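A minimal sketch of the full sequence around a node shutdown, using exactly
these flags (clear them again once the node is back and healthy):
# Before shutting the node down:
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
# ... do the maintenance and bring the node back up ...
# Afterwards, clear the flags again:
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout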
No, you do not need to set nobackfill and norecover if you only shut
down one server. The guide you are referencing is about shutting down
everything.
It will not recover degraded PGs if you shut down one server with noout.
Paul
On Fri, 19 Oct 2018 at 11:37, Eugen Block wrote:
>
> Hi Denn
Hi,
Our Ceph cluster is a 6-node cluster, each node having 8 disks. The
cluster is used for object storage only (right now). We do use EC 3+2 on
the buckets.data pool.
We had a problem with RadosGW segfaulting (12.2.5) until we upgraded to
12.2.8. We had almost 30,000 radosgw crashes leading
> No, you do not need to set nobackfill and norecover if you only shut
> down one server. The guide you are referencing is about shutting down
> everything.
> It will not recover degraded PGs if you shut down one server with noout.
You are right, I must have confused something in my memory with the
r
On Mon, Oct 8, 2018 at 2:57 PM Dylan McCulloch wrote:
>
> Hi all,
>
>
> We have identified some unexpected blocking behaviour by the ceph-fs kernel
> client.
>
>
> When performing 'rm' on large files (100+GB), there appears to be a
> significant delay of 10 seconds or more before a 'stat' opera
Hi,
upon failover or restart, our MDS complains that something is wrong with
one of the stray directories:
2018-10-19 12:56:06.442151 7fc908e2d700 -1 log_channel(cluster) log
[ERR] : bad/negative dir size on 0x607 f(v133 m2018-10-19
12:51:12.016360 -4=-5+1)
2018-10-19 12:56:06.442182 7fc908
Try to run a scrub on the mds:
ceph daemon mds.<name> scrub_path / recursive
That might yield additional information. You can then add "repair" to
the command to try to fix it.
Paul
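A minimal sketch of the two steps, assuming the active MDS daemon is called
mds.a (the name is hypothetical, substitute your own):
# First pass: recursive scrub from the root to gather information.
ceph daemon mds.a scrub_path / recursive
# Second pass: add "repair" to attempt to fix what was found.
ceph daemon mds.a scrub_path / recursive repair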
On Fri, 19 Oct 2018 at 12:59, Burkhard Linke wrote:
>
> Hi,
>
>
> upon failover or restart, our MDS complains tha
Hi David,
sorry for the slow response, we had a hell of a week at work.
OK, so I had compression mode set to aggressive on some pools, but the global
option was not changed, because I interpreted the documentation as "pool
settings take precedence". To check your advice, I executed
ceph tell
Hi,
I’m currently running into a similar problem. My goal is to ensure all S3 users
are able to list any buckets/objects that are available within ceph.
I haven't found a way around that yet. I did find that linking buckets
to users allows them to list anything, but only for the user the
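For reference, the linking mentioned above is done per bucket; a sketch
with hypothetical bucket and user names:
# Attach an existing bucket to a user so it appears in that user's listing:
radosgw-admin bucket link --bucket=mybucket --uid=someuser
# And to detach it again:
radosgw-admin bucket unlink --bucket=mybucket --uid=someuser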
Hi there,
While upgrading from Jewel to Luminous, all packages were upgraded, but
adding the MGR with cluster name CEPHDR fails. It works with the default
cluster name CEPH.
root@vtier-P-node1:~# sudo su - ceph-deploy
ceph-deploy@vtier-P-node1:~$ ceph-deploy --ceph-conf /etc/ceph/cephdr.conf
mgr cr
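For context, a rough sketch of what an invocation with the cluster name
passed explicitly could look like; this is an assumption on my side, not a
confirmed fix, and custom cluster names are only partially supported:
# Hypothetical invocation with ceph-deploy's --cluster argument:
ceph-deploy --cluster cephdr --ceph-conf /etc/ceph/cephdr.conf mgr create vtier-P-node1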
1) I don't really know about the documentation. You can always put
together a PR for an update to the docs. I only know what I've tested
trying to get compression working.
2) If you have permissive in both places, no compression will happen. If
you have aggressive globally for the OSDs and none
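To make the two levels concrete, a sketch of setting both, staying with the
'ceph tell' approach mentioned in the thread (the pool name is hypothetical):
# Per-pool setting:
ceph osd pool set mypool compression_mode aggressive
# OSD-wide setting, injected at runtime into all OSDs:
ceph tell osd.* injectargs '--bluestore_compression_mode=aggressive'
# Check what a given OSD is actually running with (on that OSD's host):
ceph daemon osd.0 config get bluestore_compression_mode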
Hi,
we were able to solve these issues. We switched bcache OSDs from ssd to
hdd in the ceph osd tree and lowered max recover from 3 to 1.
Thanks for your help!
Greets,
Stefan
Am 18.10.2018 um 15:42 schrieb David Turner:
> What are your OSD node stats? CPU, RAM, quantity and size of OSD disks.
>
Hi, your question is more about the MAX AVAIL value, I think; see how Ceph
calculates it:
http://docs.ceph.com/docs/luminous/rados/operations/monitoring/#checking-a-cluster-s-usage-stats
One OSD getting full makes the pool full as well, so keep reweighting
nearfull OSDs.
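A sketch of what that reweighting can look like; the threshold and OSD id
below are only examples:
# Show per-OSD utilisation to find the nearfull ones:
ceph osd df
# Automatic: lower the reweight of OSDs above 120% of the average use:
ceph osd reweight-by-utilization 120
# Or manually, for a single OSD:
ceph osd reweight 12 0.9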
Jakub
19 Oct 2018 16:34 "
Hi Nick
On 10/19/2018 10:14 AM, Nick Fisk wrote:
> -Original Message-
> From: Igor Fedotov [mailto:ifedo...@suse.de]
> Sent: 19 October 2018 01:03
> To: n...@fisk.me.uk; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of
> space free in BlockD
On Thu, Oct 18, 2018 at 2:28 PM Graham Allan wrote:
> Thanks Greg,
>
> This did get resolved, though I'm not 100% certain why!
>
> For one of the suspect shards which caused crash on backfill, I
> attempted to delete the associated via s3, late last week. I then
> examined the filestore OSDs and t
Hi Frank,
On 10/19/2018 2:19 PM, Frank Schilder wrote:
> Hi David,
> sorry for the slow response, we had a hell of a week at work.
> OK, so I had compression mode set to aggressive on some pools, but the global option was
> not changed, because I interpreted the documentation as "pool settings take
No action is required; the MDS fixes this type of error automatically.
On Fri, Oct 19, 2018 at 6:59 PM Burkhard Linke
wrote:
>
> Hi,
>
>
> upon failover or restart, our MDS complains that something is wrong with
> one of the stray directories:
>
>
> 2018-10-19 12:56:06.442151 7fc908e2d700 -1 log_channel(c
On Fri, Oct 19, 2018 at 10:06:06AM +0200, Wido den Hollander wrote:
>
>
> On 10/19/18 7:51 AM, xiang@iluvatar.ai wrote:
> > Hi!
> >
> > I use ceph 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> > (stable), and find that:
> >
> > When expanding the whole cluster, I updated pg_num and everything suc
Hi folks,
I have a rookie question. Does the number of buckets chosen as the
failure domain have to be equal to or greater than the number of replicas
(or k+m for erasure coding)?
E.g., for an erasure code profile with k=4, m=2 and failure domain=rack,
does it only work when there are 6 or more racks
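For context, a profile like the one described would be created roughly as
below (profile and pool names are hypothetical). With k+m=6 chunks and
crush-failure-domain=rack, CRUSH tries to place every chunk in a different
rack, so PGs can stay undersized if fewer than 6 racks exist:
# Hypothetical 4+2 profile with rack as the failure domain:
ceph osd erasure-code-profile set ec42-rack k=4 m=2 crush-failure-domain=rack
ceph osd erasure-code-profile get ec42-rack
# A pool created from it needs 6 distinct racks to place all chunks:
ceph osd pool create ecpool 128 128 erasure ec42-rack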