[ceph-users] Migration to ceph BlueStore

2017-08-02 Thread Andrei Mikhailovsky
Hello everyone, with the release of Kraken, I was thinking of migrating our existing ceph cluster to BlueStore and using the existing journal ssd disks in a cache tier. The cluster that I have is pretty small, 3 servers with 10 osds each + 2 Intel 3710 SSDs for journals. Each server is also a
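
For what it's worth, on later releases (Luminous and newer, with ceph-volume) a single OSD can be rebuilt as BlueStore roughly like this. A minimal sketch only; osd.N and /dev/sdX are placeholders, and the cluster should be healthy before and after each OSD is converted:

$ ceph osd out N                                   # let the cluster drain data off the OSD
$ systemctl stop ceph-osd@N                        # once backfill has finished
$ ceph osd destroy N --yes-i-really-mean-it        # keeps the OSD id and CRUSH position
$ ceph-volume lvm zap /dev/sdX --destroy           # wipe the old filestore partitions
$ ceph-volume lvm create --bluestore --data /dev/sdX --osd-id N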

[ceph-users] decreasing number of PGs

2017-10-02 Thread Andrei Mikhailovsky
Hello everyone, what is the safest way to decrease the number of PGs in the cluster? Currently, I have too many per osd. Thanks
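
As the replies below note, pg_num could not be reduced on releases of that era; the usual workaround was to copy the data into a new pool with fewer PGs. A rough sketch, assuming the pool can be taken offline and has no snapshots (rados cppool does not carry snapshots over); pool names and PG counts are illustrative:

$ ceph osd pool create rbd-small 256 256
$ rados cppool rbd rbd-small
$ ceph osd pool rename rbd rbd-old
$ ceph osd pool rename rbd-small rbd
$ ceph osd pool delete rbd-old rbd-old --yes-i-really-really-mean-it   # only once verified

Nautilus and later can simply lower pg_num on an existing pool, which makes this dance unnecessary.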

Re: [ceph-users] decreasing number of PGs

2017-10-03 Thread Andrei Mikhailovsky
in your cluster. > On Mon, Oct 2, 2017 at 4:02 PM Jack < [ mailto:c...@jack.fr.eu.org | > c...@jack.fr.eu.org ] > wrote: >> You cannot; >> On 02/10/2017 21:43, Andrei Mikhailovsky wrote: >> > Hello everyone, >>> what is the safes

[ceph-users] Jewel - frequent ceph-osd crashes

2016-08-30 Thread Andrei Mikhailovsky
Hello I've got a small cluster of 3 osd servers and 30 osds between them running Jewel 10.2.2 on Ubuntu 16.04 LTS with stock kernel version 4.4.0-34-generic. I am experiencing rather frequent osd crashes, which tend to happen a few times a month on random osds. The latest one gave me the foll

[ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-05 Thread Andrei Mikhailovsky
Hello everyone, I've just updated my ceph to version 10.2.3 from 10.2.2 and I am no longer able to start the radosgw service. When executing I get the following error: 2016-10-05 22:14:10.735883 7f1852d26a00 0 ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b), process radosgw, pi

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-06 Thread Andrei Mikhailovsky
g the config migration > alluded to in that thread? I'm reluctant to do anything to the > still-working 0.94.9 gateway until I can get the 10.2.3 gateways working! > > Graham > > On 10/05/2016 04:23 PM, Andrei Mikhailovsky wrote: >> Hello everyone, >> >

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-07 Thread Andrei Mikhailovsky
ft", "user_uid_pool": ".users.uid", "system_key": { "access_key": "", "secret_key": "" }, "placement_pools": [ { "key": "default-placement",

[ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
Hello I am having issues with creating buckets in radosgw. It started with an upgrade to version 10.2.x. When I am creating a bucket I get the following error on the client side: boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request InvalidArgument my-new-bucket-31337 tx000

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
_log_pool": ".usage", "user_keys_pool": ".users", "user_email_pool": ".users.email", "user_swift_pool": ".users.swift", "user_uid_pool": ".users.uid", "system_key": { &

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
uot;name": "default-placement", "tags": [] } ], "default_placement": "default-placement", "realm_id": "5b41b1b2-0f92-463d-b582-07552f83e66c" } As you can see, the master_zone is now set to defa

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
- Original Message - > From: "Yehuda Sadeh-Weinraub" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Wednesday, 9 November, 2016 01:13:48 > Subject: Re: [ceph-users] radosgw - http status 400 while creating a bucket > On Tue, No

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
[] } ], "default_placement": "default-placement", "realm_id": "" } The strange thing as you can see, following the "radosgw-admin period update --commit" command, the master_zone and the realm_id values reset to blank

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
{ >> "id": "default", >> "name": "default", >> "endpoints": [], >> "log_meta": "false", >> "log_data": "false", &

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
***bump*** this is pretty broken and urgent. thanks - Original Message - > From: "Andrei Mikhailovsky" > To: "Yoann Moulin" > Cc: "ceph-users" > Sent: Wednesday, 9 November, 2016 23:27:17 > Subject: Re: [ceph-users] radosgw - http s

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
"hostnames": [], >>> "hostnames_s3website": [], >>> "master_zone": "", >>> "zones": [ >>> { >>> "id": "default", >>> "name": "d

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
both services before running the script. I will run it again to make sure. Andrei - Original Message - > From: "Orit Wasserman" > To: "Andrei Mikhailovsky" > Cc: "Yoann Moulin" , "ceph-users" > > Sent: Thursday, 10 November, 2

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
"id": "default", "name": "default", "domain_root": ".rgw", "control_pool": ".rgw.control", "gc_pool": ".rgw.gc", "log_pool": ".log", "intent_log_pool"

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
ks Andrei - Original Message - > From: "Orit Wasserman" > To: "Andrei Mikhailovsky" > Cc: "Yoann Moulin" , "ceph-users" > > Sent: Thursday, 10 November, 2016 13:58:32 > Subject: Re: [ceph-users] radosgw - http status 400 while creat

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
- > From: "Orit Wasserman" > To: "Andrei Mikhailovsky" > Cc: "Yoann Moulin" , "ceph-users" > > Sent: Thursday, 10 November, 2016 15:22:16 > Subject: Re: [ceph-users] radosgw - http status 400 while creating a bucket > On Thu, Nov 10,

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
while creating a bucket > Your RGW doesn't think it's the master, and cannot connect to the > master, thus the create fails. > > Daniel > > On 11/08/2016 06:36 PM, Andrei Mikhailovsky wrote: >> Hello >> >> I am having issues with creating buckets in rad

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-11 Thread Andrei Mikhailovsky
oup set --rgw-zonegroup=default --default < default-zg.json > 7. radosgw-admin zone set --rgw-zone=default --default < default-zone.json > 8. radosgw-admin period update --commit > > Good luck, > Orit > > On Thu, Nov 10, 2016 at 7:08 PM, Andrei Mikhailovsky > wr
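
Pieced together from the quoted steps, the workaround looks roughly like this. This is a hedged reconstruction, not the verbatim procedure: it assumes a single default zone/zonegroup, and that default-zg.json / default-zone.json were exported with the matching get commands and edited so that master_zone points at the default zone:

$ radosgw-admin zonegroup get --rgw-zonegroup=default > default-zg.json
$ radosgw-admin zone get --rgw-zone=default > default-zone.json
#   edit both files: set "master_zone": "default" (and a matching realm_id if required)
$ radosgw-admin zonegroup set --rgw-zonegroup=default --default < default-zg.json
$ radosgw-admin zone set --rgw-zone=default --default < default-zone.json
$ radosgw-admin period update --commit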

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-12 Thread Andrei Mikhailovsky
22:02 > Subject: Re: [ceph-users] radosgw - http status 400 while creating a bucket > On Fri, Nov 11, 2016 at 11:27 PM, Andrei Mikhailovsky > wrote: >> Hi Orit, >> >> Many thanks. I will try that over the weekend and let you know. >> >> Are you sure removing

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-12 Thread Andrei Mikhailovsky
Hi Orit, your workaround instructions have helped to solve the bucket creation problems that I had. Again, many thanks for your help. Andrei - Original Message - > From: "Orit Wasserman" > To: "Andrei Mikhailovsky" > Cc: "Yoann Moulin"

Re: [ceph-users] Big problems encoutered during upgrade from hammer 0.94.5 to jewel 10.2.3

2016-11-13 Thread Andrei Mikhailovsky
Hi Vincent, when I did the upgrade, I did all clients and servers at the same time. No issues during the upgrade at all. No downtime. However, when I set the tunables to optimal I lost all IO to the clients, which happened gradually; over a few hours the iowait went from a low figure

[ceph-users] renaming ceph server names

2016-11-29 Thread Andrei Mikhailovsky
Hello. As part of an infrastructure change we are planning to rename the servers running the ceph-osd, ceph-mon and radosgw services. The IP addresses will stay the same; it is only the server names which will need to change. I would like to find out the steps required to perform these changes. W
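
For reference, a rough outline of the renaming steps involved (an assumption-laden sketch, not a tested procedure): OSDs are addressed by id, so only their CRUSH host bucket needs renaming, while a monitor is easiest to remove and re-add under its new name one at a time; old-host/new-host, the mon names and the address are placeholders:

$ ceph osd crush rename-bucket old-host new-host     # keep the CRUSH tree consistent with the new hostname
$ ceph mon remove old-mon-name                       # only while enough mons remain for quorum
# reinstall/reconfigure the mon daemon under the new hostname, then register it:
$ ceph mon add new-mon-name 192.168.0.10:6789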

Re: [ceph-users] renaming ceph server names

2016-12-02 Thread Andrei Mikhailovsky
*BUMP* > From: "andrei" > To: "ceph-users" > Sent: Tuesday, 29 November, 2016 12:46:05 > Subject: [ceph-users] renaming ceph server names > Hello. > As a part of the infrastructure change we are planning to rename the servers > running ceph-osd, ceph-mon and radosgw services. The IP addresses

[ceph-users] checking rbd volumes modification times

2018-07-16 Thread Andrei Mikhailovsky
Dear cephers, Could someone tell me how to check the rbd volumes modification times in ceph pool? I am currently in the process of trimming our ceph pool and would like to start with volumes which were not modified for a long time. How do I get that information? Cheers Andrei
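
One rough way to approximate this, since RBD images of that era did not record a modification timestamp themselves, is to stat an image's RADOS objects and look at their mtimes. A sketch only; pool and image names are illustrative, and listing every data object of a large image can be slow:

$ rbd info mypool/vm-disk-1 | grep block_name_prefix      # e.g. rbd_data.1234567890ab
$ rados -p mypool stat rbd_header.1234567890ab            # prints that object's mtime
$ rados -p mypool ls | grep '^rbd_data.1234567890ab' | head -5 | \
      xargs -n1 rados -p mypool stat                      # mtimes of the first few data objects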

[ceph-users] how to swap osds between servers

2018-09-03 Thread Andrei Mikhailovsky
Hello everyone, I am in the process of adding an additional osd server to my small ceph cluster as well as migrating from filestore to bluestore. Here is my setup at the moment: Ceph 12.2.5, running on Ubuntu 16.04 with latest updates; 3 x osd servers with 10x3TB SAS drives, 2 x Intel S371
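
A rough outline of physically moving an OSD disk between hosts, under the assumption that ceph-volume/LVM OSDs are in use and that "osd crush update on start" is left at its default so the OSD re-homes itself in CRUSH when it starts; osd.N is a placeholder:

$ ceph osd set noout                      # avoid rebalancing while the disk is in transit
$ systemctl stop ceph-osd@N               # on the old host
# move the drive to the new host, then on the new host:
$ ceph-volume lvm activate --all          # detects and starts the transplanted OSD
$ ceph osd unset noout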

[ceph-users] bluestore osd journal move

2018-09-24 Thread Andrei Mikhailovsky
Hello everyone, I am wondering if it is possible to move the ssd journal for the bluestore osd? I would like to move it from one ssd drive to another. Thanks
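
For the record, later ceph-bluestore-tool versions grew a bluefs-bdev-migrate command that can move the DB/WAL to a new device. A sketch only, assuming the OSD is stopped, the target LV or partition already exists, and the block.db symlink is repointed afterwards if the tool does not do it; paths are placeholders:

$ systemctl stop ceph-osd@N
$ ceph-bluestore-tool bluefs-bdev-migrate \
      --path /var/lib/ceph/osd/ceph-N \
      --devs-source /var/lib/ceph/osd/ceph-N/block.db \
      --dev-target /dev/vg-ssd/osd-N-db
$ systemctl start ceph-osd@N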

Re: [ceph-users] bluestore osd journal move

2018-09-24 Thread Andrei Mikhailovsky
r us, there's no guarantee that they will work for you. > Read it very carefully and recheck every step before executing it. > > Regards, > Eugen > > [1] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-February/024913.html > [2] > http://heiterbiswolkig.blo

[ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-15 Thread Andrei Mikhailovsky
Hello, I am currently running Luminous 12.2.8 on Ubuntu with the 4.15.0-36-generic kernel from the official ubuntu repo. The cluster has 4 mon + osd servers. Each osd server has a total of 9 spinning osds and 1 ssd for the hdd and ssd pools. The hdds are backed by the S3710 ssds for journaling w
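
If it helps, scrub pressure on a Luminous cluster can usually be dialled down at runtime roughly like this (the values are illustrative, not recommendations):

$ ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_max_scrubs 1'
$ ceph tell osd.* injectargs '--osd_scrub_begin_hour 22 --osd_scrub_end_hour 6'
$ ceph pg stat                # shows how many PGs are currently scrubbing / deep-scrubbing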

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Andrei Mikhailovsky
Hi Christian, - Original Message - > From: "Christian Balzer" > To: "ceph-users" > Cc: "Andrei Mikhailovsky" > Sent: Tuesday, 16 October, 2018 08:51:36 > Subject: Re: [ceph-users] Luminous with osd flapping, slow requests when deep > sc

[ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
Hello everyone I am having trouble repairing one inconsistent and stubborn PG. I get the following error in ceph.log: 2018-06-19 11:00:00.000225 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 675 : cluster [ERR] overall HEALTH_ERR noout flag(s) set; 4 scrub errors; Possible data damage
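
For context, the usual first steps for an inconsistent PG look roughly like this (18.2 is a placeholder pgid, and list-inconsistent-obj only works while the scrub error details are still available):

$ ceph health detail | grep inconsistent           # find the affected PG, e.g. 18.2
$ rados list-inconsistent-obj 18.2 --format=json-pretty
$ ceph pg repair 18.2                              # only after understanding which replica is bad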

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
A quick update on my issue. I have noticed that while I was trying to move the problem object on osds, the file attributes got lost on one of the osds, which is I guess why the error messages showed the no attribute bit. I then copied the attributes metadata to the problematic object and restar
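
The attribute copy described above can be done offline with ceph-objectstore-tool. A hedged sketch only, assuming the OSDs are stopped and the pgid/object name are substituted; getting this wrong can make matters worse, so treat it as illustrative rather than a recipe:

$ systemctl stop ceph-osd@N
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N --pgid 18.2 '<object>' list-attrs
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-N --pgid 18.2 '<object>' get-attr _ > obj_attr
# on the OSD copy that lost the attribute:
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-M --pgid 18.2 '<object>' set-attr _ < obj_attr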

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-20 Thread Andrei Mikhailovsky
June, 2018 00:02:07 > Subject: Re: [ceph-users] fixing unrepairable inconsistent PG > Can you post the output of a pg query? > > On Tue, Jun 19, 2018 at 11:44 PM, Andrei Mikhailovsky > wrote: >> A quick update on my issue. I have noticed that while I was trying to move >> t

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-22 Thread Andrei Mikhailovsky
t; Subject: Re: [ceph-users] fixing unrepairable inconsistent PG > That seems like an authentication issue? > > Try running it like so... > > $ ceph --debug_monc 20 --debug_auth 20 pg 18.2 query > > On Thu, Jun 21, 2018 at 12:18 AM, Andrei Mikhailovsky > wrote: >> H

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-25 Thread Andrei Mikhailovsky
2.168.168.201:0/3046734987 wait complete. 2018-06-25 10:59:12.112764 7fe244b28700 1 -- 192.168.168.201:0/3046734987 >> 192.168.168.201:0/3046734987 conn(0x7fe240167220 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down 2018-06-25 10:59:12.112770 7fe244b28700 2 -- 192.168.168.201:0/3046734987 >>

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-26 Thread Andrei Mikhailovsky
r you > can query any other pg that has osd.21 as its primary? > > On Mon, Jun 25, 2018 at 8:04 PM, Andrei Mikhailovsky > wrote: >> Hi Brad, >> >> here is the output: >> >> -- >> >> root@arh-ibstorage1-ib:/home/andrei# ceph

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
"num_write_kb": 0, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 207,

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
; : "", "snapid" : -2, "max" : 0 }, "truncate_size" : 0, "version" : "120985'632942", "expected_object_size" : 0, "omap_digest" : "0x&q

Re: [ceph-users] Luminous Bluestore performance, bcache

2018-06-28 Thread Andrei Mikhailovsky
Hi Richard, It is an interesting test for me too as I am planning to migrate to Bluestore storage and was considering repurposing the ssd disks that we currently use for journals. I was wondering if you are using Filestore or Bluestore for the osds? Also, when you perform your testing,

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-28 Thread Andrei Mikhailovsky
index. All other pools look okay so far. I am wondering what could have gone horribly wrong with the above pool? Cheers Andrei - Original Message - > From: "Brad Hubbard" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Thursday, 28

Re: [ceph-users] Luminous Bluestore performance, bcache

2018-06-29 Thread Andrei Mikhailovsky
Thanks Richard, That sounds impressive, especially the around 30% hit ratio. That would be ideal for me, but we were only getting single digit results during my trials. I think around 5% was the figure if I remember correctly. However, most of our vms were created a bit chaotically (not using p

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-02-08 Thread Andrei Mikhailovsky
+1 Ever since upgrading to 10.2.x I have been seeing a lot of issues with our ceph cluster. I have been seeing osds down, osd servers running out of memory and killing all ceph-osd processes. Again, 10.2.5 on 4.4.x kernel. It seems that with every release there are more and more problems with

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-02-09 Thread Andrei Mikhailovsky
Hi Jim, I've got a few questions for you as it looks like we have a similar cluster for our ceph infrastructure. A quick overview of what we have. We are also running a small cluster of 3 storage nodes (30 osds in total) and 5 clients over 40gig/s infiniband link (ipoib). Ever since installin

[ceph-users] temp workaround for the unstable Jewel cluster

2017-02-16 Thread Andrei Mikhailovsky
Hello fellow cephers, I have been struggling with the stability of my Jewel cluster and from what I can see I am not the only person. My setup is: 3 osd+mon servers, 30 osds, half a dozen client host servers for rbd access, a 40gbit/s infiniband link; all ceph servers are running on Ubuntu 16.0

Re: [ceph-users] Ceph on XenServer

2017-02-24 Thread Andrei Mikhailovsky
Hi Max, I played around with ceph on XenServer about 2-3 years ago. I made it work, but it was all hackish and a lot of manual work. It didn't play well with the cloud orchestrator and I gave up, hoping that either the Citrix or Ceph team would make it work. Currently, I would not recommend usin

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-22 Thread Andrei Mikhailovsky
Hello guys, Was wondering if anyone has done testing on the Samsung PM863 120 GB version to see how it performs? IMHO the 480GB version seems like a waste for the journal, as you only need a small disk to fit 3-4 osd journals, unless you get far greater durability. I am planning to r
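
For anyone benchmarking journal candidates, the usual single-threaded sync-write test looks roughly like this (destructive to any data on the target device; the device name is a placeholder):

$ fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based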

[ceph-users] release of the next Infernalis

2015-12-22 Thread Andrei Mikhailovsky
Hello guys, I was planning to upgrade our ceph cluster over the holiday period and was wondering when are you planning to release the next point release of the Infernalis? Should I wait for it or just roll out 9.2.0 for the time being? thanks Andrei

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-26 Thread Andrei Mikhailovsky
From: "Tyler Bishop" > To: "Lionel Bouton" > Cc: "Andrei Mikhailovsky" , "ceph-users" > > Sent: Tuesday, 22 December, 2015 16:36:21 > Subject: Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio > results > Write endur

Re: [ceph-users] Unable to upload files with special characters like +

2016-02-02 Thread Andrei Mikhailovsky
Hi Eric, I remember having a very similar issue when I was setting up radosgw. It turned out to be an issue on the proxy server side and not in radosgw. After trying a different proxy server the problem was solved. Perhaps you have the same issue. Andrei > From: "Eric Magutu" > To: "

[ceph-users] rebalance near full osd

2016-04-05 Thread Andrei Mikhailovsky
Hi I've just had a warning (from ceph -s) that one of the osds is near full. Having investigated the warning, I've located that osd.6 is 86% full. The data distribution is nowhere near equal on my osds, as you can see from the df command output below: /dev/sdj1 2.8T 2.4T 413G 86% /v
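
A sketch of the rebalancing options discussed later in the thread (the threshold and weights are illustrative only):

$ ceph osd df tree                       # per-OSD utilisation and weights
$ ceph osd reweight-by-utilization 120   # reweight OSDs more than 20% above the average
$ ceph osd reweight 6 0.85               # or nudge a single overfull OSD by hand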

Re: [ceph-users] rebalance near full osd

2016-04-07 Thread Andrei Mikhailovsky
thanks for pointing it out. Cheers Andrei - Original Message - > From: "Christian Balzer" > To: "ceph-users" > Cc: "Andrei Mikhailovsky" > Sent: Wednesday, 6 April, 2016 04:36:30 > Subject: Re: [ceph-users] rebalance near full osd >

Re: [ceph-users] rebalance near full osd

2016-04-12 Thread Andrei Mikhailovsky
I've done the ceph osd reweight-by-utilization and it seems to have solved the issue. However, not sure if this will be the long term solution. Thanks for your help Andrei - Original Message - > From: "Shinobu Kinjo" > To: "Andrei Mikhailovsky" > Cc

[ceph-users] Ceph cluster upgrade - adding ceph osd server

2016-04-15 Thread Andrei Mikhailovsky
Hi all, Was wondering what is the best way to add a new osd server to the small ceph cluster? I am interested in minimising performance degradation as the cluster is live and actively used. At the moment I've got the following setup: 2 osd servers (9 osds each), journals on Intel 520/530 ss
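
One common approach, sketched under the assumption that the new OSDs are brought in at zero CRUSH weight and ramped up gradually (the values are illustrative):

# in ceph.conf on the new host, before creating the OSDs:
#   [osd]
#   osd crush initial weight = 0
$ ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'
$ ceph osd crush reweight osd.20 0.5     # repeat in small steps up to the disk's full weight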

[ceph-users] Hammer broke after adding 3rd osd server

2016-04-26 Thread Andrei Mikhailovsky
Hello everyone, I've recently performed a hardware upgrade on our small two osd server ceph cluster, which seems to have broken the ceph cluster. We are using ceph for cloudstack rbd images for vms. All of our servers are Ubuntu 14.04 LTS with latest updates and kernel 4.4.6 from the ubuntu repo.

Re: [ceph-users] performance in a small cluster

2019-05-29 Thread Andrei Mikhailovsky
It would be interesting to learn the types of improvements and the BIOS changes that helped you. Thanks > From: "Martin Verges" > To: "Robert Sander" > Cc: "ceph-users" > Sent: Wednesday, 29 May, 2019 10:19:09 > Subject: Re: [ceph-users] performance in a small cluster > Hello Robert, >> We h

[ceph-users] troubleshooting space usage

2019-06-28 Thread Andrei Mikhailovsky
Hi Could someone please explain / show how to troubleshoot the space usage in Ceph and how to reclaim the unused space? I have a small cluster with 40 osds, replica of 2, mainly used as a backend for cloud stack as well as the S3 gateway. The used space doesn't make any sense to me, especial
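
The commands that come up later in this thread for comparing reported versus stored space are roughly these (osd.0 is an arbitrary example; run the perf dump on a few OSDs):

$ ceph df detail                 # per-pool usage and object counts as ceph sees them
$ rados df                       # per-pool object and byte counts from RADOS
$ ceph daemon osd.0 perf dump | grep -E 'bluestore_(allocated|stored)'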

Re: [ceph-users] troubleshooting space usage

2019-07-02 Thread Andrei Mikhailovsky
Bump! > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Friday, 28 June, 2019 14:54:53 > Subject: [ceph-users] troubleshooting space usage > Hi > Could someone please explain / show how to troubleshoot the space usage in > Ceph > and how to

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
might also want to collect and share performance counter dumps (ceph > daemon > osd.N perf dump) and " > " reports from a couple of your OSDs. > Thanks, > Igor > On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote: >> Bump! >>> From: "Andrei Mik

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
ely different from allocation overhead. Looks > like > some orphaned objects in the pool. Could you please compare and share the > amounts of objects in the pool reported by "ceph (or rados) df detail" and > radosgw tools? > Thanks, > Igor > On 7/3/2019 12:56 PM, And

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
, 2019 13:49:02 > Subject: Re: [ceph-users] troubleshooting space usage > Looks fine - comparing bluestore_allocated vs. bluestore_stored shows a little > difference. So that's not the allocation overhead. > What's about comparing object counts reported by ceph and radosgw t

Re: [ceph-users] troubleshooting space usage

2019-07-04 Thread Andrei Mikhailovsky
Thanks for trying to help, Igor. > From: "Igor Fedotov" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Thursday, 4 July, 2019 12:52:16 > Subject: Re: [ceph-users] troubleshooting space usage > Yep, this looks fine.. > hmm... sorry,

Re: [ceph-users] RGW how to delete orphans

2019-08-13 Thread Andrei Mikhailovsky
Hello I was hoping to follow up on this email and see if Florian managed to get to the bottom of this. I have a case where I believe my RGW bucket is using too much space. For me, the ceph df command shows over 16TB usage, whereas the bucket stats show a total of about 6TB. So, it seems that the
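
For reference, the orphan scan tooling of that era worked roughly like this. A sketch only; the data pool name and job id are placeholders, the scan can take a long time on a large pool, and it reports leaked objects rather than deleting them:

$ radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphans1
$ radosgw-admin orphans list-jobs
$ radosgw-admin orphans finish --job-id=orphans1     # clean up the scan's own metadata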
