Hello,
On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote:
> Thanks for the helpful commentary Christian. Cluster is performing much
> better with 50% more spindles (12 to 18 drives), along with setting scrub
> sleep to 0.1. Didn't really see any gain from moving from the Samsung 850
> Pro jou
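For reference, the scrub sleep setting mentioned above is typically applied at runtime with injectargs or persisted in ceph.conf; the 0.1 value is just the one quoted, not a recommendation:
ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
(or "osd scrub sleep = 0.1" under the [osd] section of ceph.conf)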
Hello, sorry for the delay. I was pretty busy otherwise.
On 02/11/2016 03:13 PM, Jason Dillaman wrote:
> Assuming the partition table is still zeroed on that image, can you run:
>
> # rados -p get rbd_data.18394b3d1b58ba. - | cut
> -b 512 | hexdump
>
Here's the hexdump:
000
Hi,
I recently saw that a new osdmap is created with the sortbitwise flag. Can
this safely be enabled on an existing cluster and would there be any
advantages in doing so?
Markus
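For reference, checking and enabling the flag would look roughly like this (a sketch; whether it is safe on a given existing cluster is exactly the question above):
ceph osd dump | grep flags     (shows whether sortbitwise is already set)
ceph osd set sortbitwise
ceph osd unset sortbitwise     (reverts it)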
Thanks for your reply.
> > Let's consider both cases:
> > Journals on SSDs - for writes, the write operation returns right after
> > data lands on the Journal's SSDs, but before it's written to the backing
> > HDD. So, for writes, SSD journal approach should be comparable to having
> > a SSD cach
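As an illustration of the journal-on-SSD layout being discussed (device names are placeholders, not taken from this thread), something like
ceph-disk prepare /dev/sdb /dev/sdc
puts the OSD data on /dev/sdb and its journal on a partition of /dev/sdc, so a write is acknowledged once it hits the SSD journal and is only later flushed to the HDD.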
Hello!
On Wed, Feb 17, 2016 at 07:38:15AM +, ceph.user wrote:
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> 1: /usr/bin/ceph-osd() [0xbf03dc]
> 2: (()+0xf0a0) [0x7f29e4c4d0a0]
> 3: (gsignal()+0x35) [0x7f29e35b7165]
> 4: (abort()+0x180) [0x7f29e35ba3e0]
> 5: (__gnu_c
Thanks for posting your experiences, John, very interesting read. I think the
golden rule of around 1GHz is still a realistic goal to aim for. It looks like
you probably have around 16GHz for 60 OSDs, or 0.26GHz per OSD. Do you have any
idea on how much CPU you think you would need to just be abl
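For reference, the arithmetic behind that estimate: 16GHz / 60 OSDs ≈ 0.26-0.27GHz per OSD.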
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Christian Balzer
> Sent: 17 February 2016 04:22
> To: ceph-users@lists.ceph.com
> Cc: Piotr Wachowicz
> Subject: Re: [ceph-users] SSDs for journals vs SSDs for a cache tier,
> which is
> better?
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Christian Balzer
> Sent: 17 February 2016 02:41
> To: ceph-users
> Subject: Re: [ceph-users] Recomendations for building 1PB RadosGW with
> Erasure Code
>
>
> Hello,
>
> On Tue, 16 Feb 2016
On 05/02/16 11:43, John Spray wrote:
On Fri, Feb 5, 2016 at 9:36 AM, Kenneth Waegeman
wrote:
On 04/02/16 16:17, Gregory Farnum wrote:
On Thu, Feb 4, 2016 at 1:42 AM, Kenneth Waegeman
wrote:
Hi,
Hi, we are running ceph 9.2.0.
Overnight, our ceph state went to 'mds mds03 is laggy'. When I
Hello,
On Wed, 17 Feb 2016 10:04:11 +0100 Piotr Wachowicz wrote:
> Thanks for your reply.
>
>
> > > Let's consider both cases:
> > > Journals on SSDs - for writes, the write operation returns right
> > > after data lands on the Journal's SSDs, but before it's written to
> > > the backing HDD.
Hi,
We have been running Ceph in production for a few months and looking
at our first big expansion. We are going to be adding 8 new OSDs
across 3 hosts to our current cluster of 13 OSD across 5 hosts. We
obviously want to minimize the amount of disruption this is going to
cause but we are unsure
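One common way to limit the impact while the new OSDs backfill (a generic sketch, not advice specific to this cluster) is to throttle recovery beforehand:
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
and raise the values again once the cluster is back to HEALTH_OK.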
Hello,
On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote:
> Hi,
>
> We have been running Ceph in production for a few months and looking
> at our first big expansion. We are going to be adding 8 new OSDs
> across 3 hosts to our current cluster of 13 OSD across 5 hosts. We
> obviously want to m
Hello,
On Wed, 17 Feb 2016 09:23:11 - Nick Fisk wrote:
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Christian Balzer
> > Sent: 17 February 2016 04:22
> > To: ceph-users@lists.ceph.com
> > Cc: Piotr Wachowicz
> > Subject: Re:
Tyler,
The E5-2660 V2 is a 10-core 2.2GHz part, giving you roughly 44GHz, or 0.78GHz per
OSD. That seems to fall in line with Nick's "golden rule" of 0.5GHz - 1GHz
per OSD.
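For reference, that figure assumes a dual-socket node: 2 x 10 cores x 2.2GHz = 44GHz, and 44GHz / 0.78GHz per OSD works out to roughly 56 OSDs behind those two CPUs.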
Are you doing EC or Replication? If EC, what profile? Could you also
provide an average of CPU utilization?
I'm still researching,
On 02/17/2016 06:36 AM, Christian Balzer wrote:
Hello,
On Wed, 17 Feb 2016 09:23:11 - Nick Fisk wrote:
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
Of Christian Balzer
Sent: 17 February 2016 04:22
To: ceph-users@lists.ceph.com
Cc: Piotr
I hadn't come across this ratio before, but now that I've read that PDF you
linked and I've narrowed my search in the mailing list, I think that the
0.5 - 1GHz per OSD ratio is pretty spot on. The 100MHz per IOP is also
pretty interesting, and we do indeed use 7200 RPM drives.
I'll look up a few mo
I'm using 2x replication on that pool for storing RBD volumes. Our workload is
pretty heavy; I'd imagine objects on EC would be light in comparison.
Tyler Bishop
Chief Technical Officer
513-299-7108 x10
tyler.bis...@beyondhosting.net
Hello,
I have a problem with changing the port on my rados gateway node.
I don't know why, but I cannot change the listening port of civetweb.
My steps to install radosgw:
*ceph-deploy install --rgw gateway*
*ceph-deploy admin gateway*
*ceph-deploy create rgw gateway*
(gateway starts on port 7480 as expected)
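For what it's worth, the usual way to change the civetweb port is the rgw_frontends option in ceph.conf (the section name is inferred from the 'gateway' host above, and the port is only an example):
[client.rgw.gateway]
rgw frontends = "civetweb port=8080"
followed by a restart of the radosgw service. Ports below 1024 run into the permission problem discussed in the reply below.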
On Wed, Feb 17, 2016 at 04:28:38PM +0200, Alexandr Porunov wrote:
[...]
> set_ports_option: cannot bind to 80: 13 (Permission denied)
Hi,
The problem is that civetweb can't bind to privileged port 80 because it
currently drops permissions _before_ the bind.
https://github.com/ceph/ceph/pull/7313
Probably this is the reason:
https://www.w3.org/Daemon/User/Installation/PrivilegedPorts.html
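A workaround sometimes used for this (an assumption, not something confirmed in this thread, and assuming the standard binary path) is to grant the radosgw binary the capability to bind privileged ports:
setcap 'cap_net_bind_service=+ep' /usr/bin/radosgw
Alternatively, run civetweb on an unprivileged port such as 8080 and put a reverse proxy in front of it.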
Regards,
--
Jarosław Owsiewski
2016-02-17 15:28 GMT+01:00 Alexandr Porunov :
> Hello,
>
> I have problem with port changes of rados gateway node.
> I don't know why but I cannot change listening port
Hello,
On Wed, 17 Feb 2016 13:44:17 + Ed Rowley wrote:
> On 17 February 2016 at 12:04, Christian Balzer wrote:
> >
> > Hello,
> >
> > On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote:
> >
> >> Hi,
> >>
> >> We have been running Ceph in production for a few months and looking
> >> at our f
On 17 February 2016 at 14:59, Christian Balzer wrote:
>
> Hello,
>
> On Wed, 17 Feb 2016 13:44:17 + Ed Rowley wrote:
>
>> On 17 February 2016 at 12:04, Christian Balzer wrote:
>> >
>> > Hello,
>> >
>> > On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote:
>> >
>> >> Hi,
>> >>
>> >> We have bee
Thanks Christian,
> On 17-Feb-2016, at 7:25 AM, Christian Balzer wrote:
>
>
> Hello,
>
> On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote:
>
>> For most of you CEPH on ARMv7 might not sound good. This is our setup
>> and our FIO testing report. I am not able to understand ….
>>
> Jus
On Wed, Feb 17, 2016 at 12:13 AM, Christian Balzer wrote:
>
> Hello,
>
> On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote:
>
> > Thanks for the helpful commentary Christian. Cluster is performing much
> > better with 50% more spindles (12 to 18 drives), along with setting scrub
> > sleep to 0
Looks like the kernel bug affecting ceph on XFS was fixed; I haven't
tested it yet but just wanted to give an update.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527062
On Tue, Dec 8, 2015 at 8:05 AM Scottix wrote:
> I can confirm it seems to be kernels greater than 3.16, we had
When I try to create a bucket:
s3cmd mb s3://first-bucket
I always get this error:
ERROR: S3 error: 405 (MethodNotAllowed)
/var/log/ceph/ceph-client.rgw.gateway.log :
2016-02-17 20:22:49.282715 7f86c50f3700 1 handle_sigterm
2016-02-17 20:22:49.282750 7f86c50f3700 1 handle_sigterm set alarm for 12
First, it seems to me you should not delete the pools .rgw.buckets and
.rgw.buckets.index, because those are the pools where RGW actually stores
buckets.
But why did you do that?
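A quick, generic way to check which RGW pools currently exist:
rados lspools | grep rgw
radosgw normally creates the pools it needs on demand, so restarting the gateway and re-checking that list shows whether it is able to create what is missing.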
2016-02-18 3:08 GMT+08:00 Alexandr Porunov :
> When I try to create bucket:
> s3cmd mb s3://first-bucket
>
> I always get this er
Ah, typo, I meant to say 10MHz per IO. So a 7.2k disk does around 80 IOPS = ~
800MHz, which is close to the 1GHz figure.
From: John Hogenmiller [mailto:j...@hogenmiller.net]
Sent: 17 February 2016 13:15
To: Nick Fisk
Cc: Василий Ангапов ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Reco
Because I had created them manually and then installed Rados Gateway. After
that I realised that Rados Gateway didn't work. I thought that it was because
I had created the pools manually, so I removed those buckets which I had
created and reinstalled Rados Gateway. But without success, of course
Hi,
I'm running a very small setup of 2 nodes with 6 OSDs each. There are 2
pools, each of size=2. Today, one of our OSDs got full and another 2 are near
full. The cluster turned into the ERR state. I have noticed uneven space
distribution among the OSD drives, between 70 and 100 percent. I have realized
there's a low a
Ahoj ;-)
You can reweight them temporarily; that shifts data off the full drives.
ceph osd reweight osd.XX YY
(XX = the number of the full OSD, YY is the "weight", which defaults to 1)
This is different from "crush reweight", which defaults to the drive size in TB.
Beware that reweighting will (afaik) on
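A concrete (hypothetical) example: to push roughly 15% of the data off osd.5 you would run
ceph osd reweight osd.5 0.85
and watch 'ceph -s' and 'ceph osd df' while the data moves.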
Hi,
I'm experiencing a problem with poor performance of RadosGW while
operating on a bucket with many objects. That's a known issue with LevelDB
and can be partially resolved using sharding, but I have one more idea.
As I can see in the ceph osd logs, all slow requests occur while making calls to
rgw.bucket_list:
20
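In case it helps, bucket index sharding for newly created buckets can typically be enabled with a ceph.conf option like the following (the value 8 is only illustrative):
rgw override bucket index max shards = 8
in the RGW client section, followed by a radosgw restart; it only affects buckets created afterwards.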
If you are not sure about what weight to set, ‘ceph osd
reweight-by-utilization’ should also do the job for you automatically.
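For reference, the command also accepts an optional utilization threshold (in percent), for example
ceph osd reweight-by-utilization 110
which only touches OSDs that are more than 10% above the average utilization.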
Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan
Schermer
Sent: Wednesday, February 17, 2016 12:48 PM
To: Lukáš
Ahoj Jan, thanks for the quick hint!
Those 2 OSDs are currently full and down. How should I handle that? Is it
OK if I delete some PG directories again and start the OSD daemons on
both drives in parallel, then set the weights as recommended?
What effect should I expect then - will the cluste
Something must be on those 2 OSDs that ate all that space - ceph by default
doesn't allow an OSD to get completely full (filesystem-wise), and from what
you've shown, those filesystems are really, really full.
OSDs don't usually go down when "full" (95%)... or do they? I don't think so...
so the reaso
We have three replicas, so we just performed md5sum on all of them in order
to find the correct ones, then we deleted the bad file and ran pg repair.
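A rough sketch of that procedure on FileStore (the paths, PG id and object name are placeholders):
md5sum /var/lib/ceph/osd/ceph-*/current/<pgid>_head/<object>*   (run on each host holding a replica)
ceph pg repair <pgid>
i.e. compare the checksums across the replicas, remove the odd one out, then let the repair copy a good replica back into place.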
On 15 Feb 2016 10:42 a.m., "Zoltan Arnold Nagy"
wrote:
> Hi Bryan,
>
> You were right: we’ve modified our PG weights a little (from 1 to around
> 0
You're right, the "full" OSD was still up and in until I increased the PG
count of one of the pools. The redistribution has not completed yet, and
perhaps that's what is still filling the drive. With this info, do you
think I'm still safe to follow the steps suggested in the previous post?
Thanks!
L
Hello cephers,
due to an unfortunate sequence of events (disk crashes, network
problems), we are currently in a situation with one PG that reports
unfound objects. There is also an OSD which cannot start-up and
crashes with the following:
2016-02-17 18:40:01.919546 7fecb0692700 -1 os/FileStore.cc:
On Wed, Feb 17, 2016 at 3:05 PM, Kostis Fardelas wrote:
> Hello cephers,
> due to an unfortunate sequence of events (disk crashes, network
> problems), we are currently in a situation with one PG that reports
> unfound objects. There is also an OSD which cannot start-up and
> crashes with the foll
Hi, this has been bugging me for some time now: the distribution of data across
the OSDs is not balanced, so some OSDs are near full. I did ceph osd
reweight-by-utilization but it is not helping much.
[root@controller-node ~]# ceph osd tree
# id    weight  type name       up/down reweight
-1 98.28 root d
Hmm, it's possible there aren't any safeguards against filling the whole drive
when increasing PGs; actually, I think ceph only cares about free space when
backfilling, which is not what happened (at least directly) in your case.
However, having a completely full OSD filesystem is not going to end w
It would be helpful to see your crush map (there are also some tunables available
that help with this issue, if you're not running ancient versions).
However, distribution uniformity isn't that great, really.
It helps to increase the number of PGs, but beware that there's no turning back.
Other
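For the record, the commands being referred to look like this (check that the tunables profile matches your kernel clients before applying, since changing tunables triggers data movement):
ceph osd crush tunables optimal
ceph osd pool set <pool> pg_num <new-value>
ceph osd pool set <pool> pgp_num <new-value>
The PG increase is the "no turning back" part: pg_num can only be raised, not lowered.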
Vlad,
First off, your cluster is rather full (80.31%). Hopefully you have
hardware ordered for an expansion in the near future.
Based on your 'ceph osd tree' output, it doesn't look like the
reweight-by-utilization did anything for you. That last number for each
OSD is set to 1, which means it d
Thanks Greg,
I gather from reading about ceph_objectstore_tool that it acts at the
level of the PG. The fact is that I do not want to wipe the whole PG,
only export certain objects (the unfound ones) and import them again
into the cluster. To be precise the pg with the unfound objects is
mapped lik
On Wed, Feb 17, 2016 at 4:44 PM, Kostis Fardelas wrote:
> Thanks Greg,
> I gather from reading about ceph_objectstore_tool that it acts at the
> level of the PG. The fact is that I do not want to wipe the whole PG,
> only export certain objects (the unfound ones) and import them again
> into the c
What are your outputs of
ceph df
ceph osd df
Regards,
Don
> On Feb 17, 2016, at 5:31 PM, Stillwell, Bryan
> wrote:
>
> Vlad,
>
> First off your cluster is rather full (80.31%). Hopefully you have
> hardware ordered for an expansion in the near future.
>
> Based on your 'ceph osd tree' out
Right now the PG is served by two other OSDs and fresh data is written
to them. Is it safe to export the stale PG contents from the crashed
OSD and try to just import them back into the cluster (the PG is
not entirely lost; only some objects didn't make it)?
What could be the right sequence of
You probably don't want to try and replace the dead OSD with a new one
until stuff is otherwise recovered. Just import the PG into any osd in the
cluster and it should serve the data up for proper recovery (and then
delete it when done).
I've never done this or worked on the tooling though so that
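A sketch of what that export/import could look like with ceph-objectstore-tool (OSD ids, the PG id and the file path are placeholders; the affected OSDs must be stopped while the tool runs):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<crashed-id> \
  --journal-path /var/lib/ceph/osd/ceph-<crashed-id>/journal \
  --op export --pgid <pgid> --file /tmp/<pgid>.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<healthy-id> \
  --journal-path /var/lib/ceph/osd/ceph-<healthy-id>/journal \
  --op import --file /tmp/<pgid>.export
Start the healthy OSD again afterwards so the cluster can recover the unfound objects from it.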
Hello,
On Wed, 17 Feb 2016 07:00:38 -0600 Mark Nelson wrote:
> On 02/17/2016 06:36 AM, Christian Balzer wrote:
> >
> > Hello,
> >
> > On Wed, 17 Feb 2016 09:23:11 - Nick Fisk wrote:
> >
> >>> -Original Message-
> >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Beha
Hello,
On Wed, 17 Feb 2016 21:47:31 +0530 Swapnil Jain wrote:
> Thanks Christian,
>
>
>
> > On 17-Feb-2016, at 7:25 AM, Christian Balzer wrote:
> >
> >
> > Hello,
> >
> > On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote:
> >
> >> For most of you CEPH on ARMv7 might not sound good.
Hello,
On Wed, 17 Feb 2016 09:19:39 - Nick Fisk wrote:
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Christian Balzer
> > Sent: 17 February 2016 02:41
> > To: ceph-users
> > Subject: Re: [ceph-users] Recomendations for buildi
Hi,
Are you using the rgw_dns_name parameter in your config? Sometimes it's needed (when the S3
client sends the bucket name as a subdomain).
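If it is set, it usually looks like this in ceph.conf (the hostname and section name are examples only):
[client.rgw.gateway]
rgw dns name = s3.example.com
and it has to match the domain the S3 client uses when it puts the bucket name into the Host header.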
Arvydas
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Alexandr Porunov
Sent: Wednesday, February 17, 2016 10:37 PM
To: Василий Ангапов ; ceph-co