[ceph-users] How you handle failing/slow disks?

2018-11-21 Thread Arvydas Opulskis
Hi all, it's not the first time we have this kind of problem, usually with HP RAID controllers: 1. One disk is failing, bringing the whole controller into a slow state where its performance degrades dramatically. 2. Some OSDs are reported as down by other OSDs and marked as down. 3. At the same time other OSDs on
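A quick first check in this situation, assuming shell access to the OSD host (the device name and OSD id below are placeholders), is to compare per-OSD latencies and the suspect drive's SMART data, and then take the slow OSD out before it drags the rest of the node down:
# ceph osd perf
# smartctl -a /dev/sdX
# ceph osd out <osd-id>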

Re: [ceph-users] Jewel to Luminous RGW upgrade issues

2018-10-18 Thread Arvydas Opulskis
Yes, we know it now :) But it was a surprise at the moment we started the RGW upgrade, because it was not mentioned in the release notes, or I missed it somehow. On Fri, Oct 19, 2018 at 9:41 AM Konstantin Shalygin wrote: > On 10/19/18 1:37 PM, Arvydas Opulskis wrote: > > Yes, that's under

Re: [ceph-users] What is rgw.none

2018-10-18 Thread Arvydas Opulskis
Hi, we have the same question when trying to understand the output of bucket stats. Maybe you have found an explanation somewhere else? Thanks, Arvydas On Mon, Aug 6, 2018 at 10:28 AM Tomasz Płaza wrote: > Hi all, > > I have a bucket with a very big num_objects in rgw.none: > > { > "bucket": "dyna", >

Re: [ceph-users] Disabling RGW Encryption support in Luminous

2018-10-18 Thread Arvydas Opulskis
Hi, yes we did it two days ago too. There is a PR for this, but it's not committed yet. Thanks anyway! Arvydas On Fri, Oct 19, 2018 at 7:15 AM Konstantin Shalygin wrote: > After RGW upgrade from Jewel to Luminous, one S3 user started to receive > errors from his Postgres wal-e solution. Error is l

Re: [ceph-users] Jewel to Luminous RGW upgrade issues

2018-10-18 Thread Arvydas Opulskis
Yes, that's understandable, but the question was about the "transition period" when at some point we had part of the RGWs upgraded and some of them still on Jewel. At that time we had a lot of complaints from S3 users, who randomly couldn't access their buckets. We did several upgrades in the last years and it w

Re: [ceph-users] Disabling RGW Encryption support in Luminous

2018-10-17 Thread Arvydas Opulskis
Thank you! Arvydas Opulskis, IT Systems Engineer. Email: arvydas.opuls...@adform.com. Mobile: +370 614 19604. Rotušės a. 17, LT-44279 Kaunas, Lithuania

[ceph-users] Disabling RGW Encryption support in Luminous

2018-10-16 Thread Arvydas Opulskis
Hi, I got no success on IRC, maybe someone will help me here. After the RGW upgrade from Jewel to Luminous, one S3 user started to receive errors from his Postgres wal-e setup. The error is like this: "Server Side Encryption with KMS managed key requires HTTP header x-amz-server-side-encryption : aws:kms

[ceph-users] Jewel to Luminous RGW upgrade issues

2018-10-11 Thread Arvydas Opulskis
Hi all. I want to ask whether you have had a similar experience upgrading RGW from Jewel to Luminous. After upgrading monitors and OSDs, I started two new Luminous RGWs and put them into the LB together with the Jewel ones. And then interesting things started to happen. Some of our jobs started to fail with "fatal err

Re: [ceph-users] Inconsistent PG could not be repaired

2018-08-16 Thread Arvydas Opulskis
it? > Kind Regards, > Tom > From: ceph-users On Behalf Of Arvydas Opulskis > Sent: 14 August 2018 12:33 > To: Brent Kennedy > Cc: Ceph Users > Subject: Re: [ceph-users] Inconsistent PG could not be repaired

Re: [ceph-users] Inconsistent PG could not be repaired

2018-08-14 Thread Arvydas Opulskis
> the data (meaning it failed to write there). > If you want to go that route, let me know, I wrote a how-to on it. Should be the last resort though. I also don't know your setup, so I would hate to recommend something so drastic. > -Brent

Re: [ceph-users] PG went to Down state on OSD failure

2018-08-06 Thread Arvydas Opulskis
Hi, what is "min_size" on that pool? How many osd nodes you have in cluster and do you use any custom crushmap? On Wed, Aug 1, 2018 at 1:57 PM, shrey chauhan wrote: > Hi, > > I am trying to understand what happens when an OSD fails. > > Few days back I wanted to check what happens when an OSD g

Re: [ceph-users] Inconsistent PG could not be repaired

2018-08-06 Thread Arvydas Opulskis
quot;, "data_digest": "0x6b102e59" } ] } ] } # rados -p .rgw.buckets get default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7 test_2pg.file error getting .rgw.buckets/default.122888368.52__shadow_.3ubGZwLcz0oQ55-

[ceph-users] Inconsistent PG could not be repaired

2018-07-24 Thread Arvydas Opulskis
Hello, Cephers, after trying different repair approaches I am out of ideas on how to repair an inconsistent PG. I hope someone's sharp eye will notice what I overlooked. Some info about the cluster: CentOS 7.4, Jewel 10.2.10, pool size 2 (yes, I know it's a very bad choice). Pool with inconsistent PG: .rgw.bu
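For context, a minimal sketch of the standard first steps for an inconsistent PG (the PG id is a placeholder), before any of the more drastic options discussed in this thread:
# ceph health detail
# rados list-inconsistent-obj <pg-id> --format=json-pretty
# ceph pg repair <pg-id>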

[ceph-users] Problem with UID starting with underscores

2018-03-06 Thread Arvydas Opulskis
Hi all, because one of our scripts misbehaved, a new user with a bad UID was created via the API, and now we can't remove, view or modify it. I believe it's because it has three underscores at the beginning: [root@rgw001 /]# radosgw-admin metadata list user | grep "___pro_" "___pro_", [root@rgw001 /]#
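One possible workaround, sketched here and untested, is to remove the broken user through the metadata interface instead of the usual "radosgw-admin user rm":
[root@rgw001 /]# radosgw-admin metadata rm user:___pro_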

Re: [ceph-users] Unbalanced OSD's

2016-12-31 Thread Arvydas Opulskis
We have similar problems in our clusters and sometimes we do a manual reweight. Also we noticed that smaller PGs (more of them in a pool) help with balancing too. Arvydas On Dec 30, 2016 21:01, "Shinobu Kinjo" wrote: > The best practice to reweight OSDs is to run > test-reweight-by-utilization which is
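For illustration, a dry run followed by the actual reweight (the 110 threshold is just an example value) would look like:
# ceph osd test-reweight-by-utilization 110
# ceph osd reweight-by-utilization 110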

Re: [ceph-users] Same pg scrubbed over and over (Jewel)

2016-09-28 Thread Arvydas Opulskis
Hi, we have the same situation with one PG on a different cluster. Scrubs and deep-scrubs are running over and over for the same PG (38.34). I've logged a period with the deep-scrub and some scrubs repeating. The OSD log from the primary OSD can be found here: https://www.dropbox.com/s/njmixbgzkfo1wws/ceph-osd.

Re: [ceph-users] Scrub and deep-scrub repeating over and over

2016-09-14 Thread Arvydas Opulskis
n problem. Br, Arvydas On Thu, Sep 8, 2016 at 10:26 AM, Arvydas Opulskis wrote: > Hi Goncalo, here it is: > > # ceph pg 11.34a query > { > "state": "active+clean+scrubbing", > "snap_trimq": "[]", >

Re: [ceph-users] experiences in upgrading Infernalis to Jewel

2016-09-08 Thread Arvydas Opulskis
Hi, if you are using RGW, you may run into similar problems to ours when creating a bucket. You'll find what went wrong and how we solved it in my older email; the subject of the topic is "Can't create bucket (ERROR: endpoints not configured for upstream zone)". Cheers, Arvydas On Thu, Sep 8, 2016 at 1

Re: [ceph-users] Scrub and deep-scrub repeating over and over

2016-09-08 Thread Arvydas Opulskis
d": 4190, "num_bytes_recovered": 10727412780, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set

[ceph-users] Scrub and deep-scrub repeating over and over

2016-09-08 Thread Arvydas Opulskis
Hi all, we have several PGs with repeating scrub tasks. As soon as a scrub is complete, it starts again. You can get an idea from the log below: $ ceph -w | grep -i "11.34a" 2016-09-08 08:28:33.346798 osd.24 [INF] 11.34a scrub ok 2016-09-08 08:28:37.319018 osd.24 [INF] 11.34a scrub starts 2016-09
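To check whether the scrub stamps actually advance between runs, something like this (using the same PG id as an example) can help:
$ ceph pg 11.34a query | grep -i scrub_stamp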

Re: [ceph-users] radosgw error in its log rgw_bucket_sync_user_stats()

2016-09-07 Thread Arvydas Opulskis
16 at 6:10 PM, Arvydas Opulskis wrote: > It is not over yet. Now if the user recreates the problematic bucket, it appears, > but with the same "Access denied" error. Looks like there is still some > corrupted data left about this bucket in Ceph. No problems if the user creates > a new buc

Re: [ceph-users] radosgw error in its log rgw_bucket_sync_user_stats()

2016-09-06 Thread Arvydas Opulskis
ation were noticed. Any ideas? :) On Tue, Sep 6, 2016 at 4:05 PM, Arvydas Opulskis wrote: > Hi, > > from time to time we have the same problem on our Jewel cluster (10.2.2, upgraded > from Infernalis). I checked the last few occurrences and noticed it happened > when a user tried to delete a b

Re: [ceph-users] radosgw error in its log rgw_bucket_sync_user_stats()

2016-09-06 Thread Arvydas Opulskis
Hi, from time to time we have the same problem on our Jewel cluster (10.2.2, upgraded from Infernalis). I checked the last few occurrences and noticed it happened when a user tried to delete a bucket from S3 while the Ceph cluster was under heavy load (deep-scrub or PG backfill operations running). Seems like some kind o
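If stale or corrupted bucket metadata is suspected, a cautious check-and-repair sketch (the bucket name is a placeholder; run the --fix step only after reviewing the plain check output) could be:
# radosgw-admin bucket check --bucket=<bucket-name>
# radosgw-admin bucket check --bucket=<bucket-name> --fix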

Re: [ceph-users] Can't create bucket (ERROR: endpoints not configured for upstream zone)

2016-07-28 Thread Arvydas Opulskis
Hi, We solved it by running Micha's scripts, plus we needed to run the period update and commit commands (for some reason we had to do it as separate commands): radosgw-admin period update radosgw-admin period commit Btw, we added endpoints to the json file, but I am not sure these are needed. And I agr
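A rough sketch of the kind of sequence involved (the file name is arbitrary, and the actual endpoint edits come from Micha's scripts, not shown here):
# radosgw-admin zonegroup get > zonegroup.json
# (edit zonegroup.json, e.g. add the endpoints)
# radosgw-admin zonegroup set < zonegroup.json
# radosgw-admin period update
# radosgw-admin period commit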

Re: [ceph-users] Cannot create bucket via the S3 (s3cmd)

2016-02-17 Thread Arvydas Opulskis
Hi, Are you using the rgw_dns_name parameter in the config? Sometimes it's needed (when the S3 client sends the bucket name as a subdomain). Arvydas From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandr Porunov Sent: Wednesday, February 17, 2016 10:37 PM To: Василий Ангапов ; ceph-co
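For illustration, the relevant ceph.conf snippet (the section name and domain are placeholders) would look something like:
[client.rgw.gateway]
rgw dns name = s3.example.com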

Re: [ceph-users] Can't fix down+incomplete PG

2016-02-09 Thread Arvydas Opulskis
Hi, What is min_size for this pool? Maybe you need to decrease it for the cluster to start recovering. Arvydas From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Scott Laird Sent: Wednesday, February 10, 2016 7:22 AM To: 'ceph-users@lists.ceph.com' (ceph-users@lists.ceph.com)
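For example (the pool name and value are placeholders, and the setting should be restored once recovery finishes):
# ceph osd pool get <pool-name> min_size
# ceph osd pool set <pool-name> min_size 1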