Thanks! I solved it using the ceph-osd command.
So... there is no script to install Upstart, is there?
Jae
On Fri, Nov 18, 2016 at 3:26 PM 钟佳佳 wrote:
> if you built from the git repo tag v10.2.3,
> refer to the links below from ceph.com:
> http://docs.ceph.com/docs/emperor/install/build-packages/
>
> h
Hi,
about 2 weeks ago I upgraded a rather small cluster from Ceph 0.94.2 to 0.94.9. The upgrade went fine and the cluster is
running stably. But I just noticed that one monitor is already eating 20 GB of memory, growing slowly over time. The
other 2 mons look fine. The disk space used by the probl
Hi,
We have support for an offline bucket resharding admin command:
https://github.com/ceph/ceph/pull/11230.
It will be available in Jewel 10.2.5.
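For anyone searching the archives later, the offline reshard is driven through radosgw-admin; here is a minimal sketch, assuming the flag names from that PR are unchanged in the released build and using a made-up bucket name and shard count:
# check bucket usage first to pick a sensible shard count
radosgw-admin bucket stats --bucket=mybucket
# reshard the bucket index offline into 64 shards
radosgw-admin bucket reshard --bucket=mybucket --num-shards=64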
Orit
On Thu, Nov 17, 2016 at 9:11 PM, Yoann Moulin wrote:
> Hello,
>
> is it possible to shard the index of existing buckets?
>
> I have more than
Hi All,
I want to submit a PR to include a fix for this tracker bug, as I have just
realised I've been experiencing it.
http://tracker.ceph.com/issues/9860
I understand that I would also need to update the debian/ceph-osd.* files to get
the file copied; however, I'm not quite sure where this
new file (/
Hi Sam,
Updated with some more info.
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Samuel Just
> Sent: 17 November 2016 19:02
> To: Nick Fisk
> Cc: Ceph Users
> Subject: Re: [ceph-users] After OSD Flap - FAILED assert(oi.version ==
>
Hi Nick,
Here are some logs. The system is in the IST TZ and I have filtered the logs to
get only the last 2 hours, during which we can observe the issue.
In that particular case, the issue is illustrated with the following OSDs:
Primary:
ID:607
PID:2962227
HOST:10.137.81.18
Secondary1
ID:528
PID:3721728
HO
Hi,
Following up on the suggestion to use any of the following options to
mitigate the time spent blocked on requests:
- client_mount_timeout
- rados_mon_op_timeout
- rados_osd_op_timeout
Is there really no other way around this?
If two OSDs go down that between them have both copi
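For reference, these are client-side options and would normally go in the [client] section of ceph.conf; a minimal sketch, with the timeout values (in seconds) picked arbitrarily for illustration:
[client]
# give up on hung mount / mon / osd requests instead of blocking forever
client_mount_timeout = 30
rados_mon_op_timeout = 30
rados_osd_op_timeout = 30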
On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
> Hi,
>
> Follow up from the suggestion to use any of the following options:
>
> - client_mount_timeout
> - rados_mon_op_timeout
> - rados_osd_op_timeout
>
> To mitigate the waiting time being blocked on requests. Is there
> really no other way
On 18 November 2016 at 13:14, John Spray wrote:
> On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
>> Hi,
>>
>> Follow up from the suggestion to use any of the following options:
>>
>> - client_mount_timeout
>> - rados_mon_op_timeout
>> - rados_osd_op_timeout
>>
>> To mitigate the waiting tim
On Fri, Nov 18, 2016 at 1:04 PM, Iain Buclaw wrote:
> On 18 November 2016 at 13:14, John Spray wrote:
>> On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
>>> Hi,
>>>
>>> Follow up from the suggestion to use any of the following options:
>>>
>>> - client_mount_timeout
>>> - rados_mon_op_timeo
Hi list, I wonder if there is anyone who has experience with Intel
P3700 SSD drives as journals and can share it?
I was thinking of using the P3700 400GB SSD as a journal in my Ceph
deployment. It is benchmarked on Sébastien Han's SSD page as well.
However, a vendor I spoke to didn't q
We've had this for a while. We just monitor memory usage and restart the mon
services when 1 or more reach 80%.
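Something along these lines, in case it helps; a crude watchdog sketch (the 80% threshold is arbitrary and the restart command is an assumption, Upstart-style for Hammer, with the systemd form in the comment):
#!/bin/sh
# restart the local mon if its RSS exceeds ~80% of total RAM
pid=$(pgrep -o ceph-mon)
rss_kb=$(awk '/VmRSS/ {print $2}' /proc/$pid/status)
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
if [ $((rss_kb * 100 / total_kb)) -ge 80 ]; then
    restart ceph-mon id=$(hostname -s)   # or: systemctl restart ceph-mon@$(hostname -s)
fi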
Sent from my iPhone
> On Nov 18, 2016, at 3:35 AM, Corin Langosch
> wrote:
>
> Hi,
>
> about 2 weeks ago I upgraded a rather small cluster from ceph 0.94.2 to
> 0.94.9. The upgrade
Thanks Yehuda and Brian. I'm not sure if you have ever seen this error
with radosgw (latest Hammer on CentOS 7), or can advise whether this is a
critical error? Appreciate any hints here. thx will
2016-11-12 13:49:08.905114 7fbba7fff700 20 RGWUserStatsCache: sync
user=myuserid1
2016-11-12 13:49:08.90
Hi Corin. We run the latest Hammer on CentOS 7.2, with 3 mons, and have not
seen this problem. Are there any other possible
differences between the healthy nodes and the one that has excessive
memory consumption? thx will
On Fri, Nov 18, 2016 at 6:35 PM, Corin Langosch
wrote:
> Hi,
>
I'm using the 400GB models as a journal for 12 drives. I know this is probably
pushing it a little bit, but it seems to work fine. I'm
guessing the reason may be related to the TBW figure being higher on the more
expensive models; maybe they don't want to have to
replace worn NVMe's under warranty?
Yes Nick, you're right, I can now see on page 16 here
www.intel.com/content/www/xa/en/solid-state-drives/ssd-dc-p3700-spec.html
that there is a difference in the durability.
However, I think 7.3 PBW isn't much worse than the Intel S3610, which is much
slower. thx will
400GB: 7.3 PBW
800GB: 14.6 PBW (10 drive
> I was wondering how exactly you accomplish that?
> Can you do this with a "ceph-deploy create" with "noin" or "noup" flags
> set, or does one need to follow the manual steps of adding an osd?
You can do it either way (manual or with ceph-deploy). Here are the
steps using ceph-deploy:
1. Add "os
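Roughly, the flagged approach looks like the sketch below; the host/device names and the host:data:journal ceph-deploy syntax are placeholders for illustration only:
ceph osd set noin                           # new OSDs join the map but stay "out"
ceph-deploy osd create node1:sdb:/dev/sdc1  # prepare and activate the new OSD
ceph osd tree                               # verify the new OSD appears as expected
ceph osd unset noin                         # let it take data when you are ready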
Hey Cephers,
Due to Dreamhost shutting down the old DreamCompute cluster in their
US-East 1 region, we are in the process of beginning the migration of
Ceph infrastructure. We will need to move download.ceph.com,
tracker.ceph.com, and docs.ceph.com to their US-East 2 region.
The current plan is
- On 3 Nov 16, at 5:18, Thomas wrote:
> Hi guys,
Hi Thomas,
This is a question I also asked myself ...
Maybe something like:
radosgw-admin zonegroup get
radosgw-admin zone get
And for each user :
radosgw-admin metadata get user:uid
Anyone?
Stephane.
> I'm not sure this was ask
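The per-user step Stephane describes can be scripted; a rough sketch, assuming jq is available to parse the JSON list of user ids:
for uid in $(radosgw-admin metadata list user | jq -r '.[]'); do
    radosgw-admin metadata get "user:$uid"
done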
- On 4 Nov 16, at 21:17, Andrey Ptashnik wrote:
> Hello Ceph team!
> I’m trying to create different pools in Ceph in order to have different tiers
> (some are fast, small and expensive and others are plain big and cheap), so
> certain users will be tied to one pool or another.
> - I crea
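Not sure if this is what Andrey ended up doing, but one common way to tie a plain RADOS/RBD client to a pool is via cephx caps; a sketch with made-up pool and client names (for radosgw users specifically, placement targets in the zone config would be the mechanism instead):
ceph osd pool create fast-ssd 128 128
ceph osd pool create big-sata 128 128
# each client key only gets access to its own pool
ceph auth get-or-create client.fastuser mon 'allow r' osd 'allow rwx pool=fast-ssd'
ceph auth get-or-create client.biguser mon 'allow r' osd 'allow rwx pool=big-sata'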
I often read that small IO writes and RBD work better with a bigger
filestore_max_sync_interval than the default value.
The default value is 5 sec and I saw many posts saying they are using 30 sec.
Also the slow request symptom is often linked to this parameter.
My journals are 10GB (collocated with O
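For context, the knob lives in the [osd] section of ceph.conf; a sketch with the commonly quoted value, to be treated as an example to test rather than a recommendation:
[osd]
filestore_min_sync_interval = 0.01
filestore_max_sync_interval = 30     # default is 5 seconds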
I've used the 400GB unit extensively for almost 18 months, one per six drives.
They've performed flawlessly.
In practice, journals will typically be quite small relative to the total
capacity of the SSD. As such, there will be plenty of room for wear leveling.
If there was some concern, one
We use the 800GB version as journal devices with up to a 1:18 ratio and have
had good experiences, with no bottleneck on the journal side. These also feature
good endurance characteristics. I would think that higher capacities are hard to
justify as journals
-----Original Message-----
From: ceph-use
Hi,
On 15/11/16 11:55, Craig Chi wrote:
> You can try to manually fix this by adding the
> /lib/systemd/system/ceph-mon.target file, which contains:
> and then execute the following command to tell systemd to start this
> target on bootup
> systemctl enable ceph-mon.target
This worked a treat
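For anyone hitting the same thing, a minimal ceph-mon.target along those lines; the contents below are reconstructed from memory, so compare against the unit shipped by your packages:
[Unit]
Description=ceph target allowing to start/stop all ceph-mon@.service instances at once
PartOf=ceph.target

[Install]
WantedBy=multi-user.target ceph.target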
Hi Nick and other Cephers,
Thanks for your reply.
> 2) Config Errors
> This can be an easy one to say you are safe from. But I would say most
> outages and data loss incidents I have seen on the mailing lists have
> been due to poor hardware choice or configuring options such as size=2,
> min_size=1
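As an aside, checking and raising those settings is quick to do per pool; a sketch using the default rbd pool purely as an example:
ceph osd pool get rbd size
ceph osd pool get rbd min_size
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2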
Never *ever* use nobarrier with ceph under *any* circumstances. I
cannot stress this enough.
-Sam
On Fri, Nov 18, 2016 at 10:39 AM, Craig Chi wrote:
> Hi Nick and other Cephers,
>
> Thanks for your reply.
>
>> 2) Config Errors
>> This can be an easy one to say you are safe from. But I would say
Hi,
MSI has an erasure coded ceph pool accessible by the radosgw interface.
We recently upgraded to Jewel from Hammer. Several days ago, we
experienced issues with a couple of the rados gateway servers and
inadvertently deployed older Hammer versions of the radosgw instances.
This configuration
On Fri, Nov 18, 2016 at 1:14 PM, Jeffrey McDonald wrote:
> Hi,
>
> MSI has an erasure coded ceph pool accessible by the radosgw interface.
> We recently upgraded to Jewel from Hammer. Several days ago, we
> experienced issues with a couple of the rados gateway servers and
> inadvertently deploye
+ceph-devel
On Fri, Nov 18, 2016 at 8:45 PM, Nick Fisk wrote:
> Hi All,
>
> I want to submit a PR to include fix in this tracker bug, as I have just
> realised I've been experiencing it.
>
> http://tracker.ceph.com/issues/9860
>
> I understand that I would also need to update the debian/ceph-osd
On 11/18/16 18:00, Thomas Danan wrote:
>
> I often read that small IO writes and RBD work better with
> a bigger filestore_max_sync_interval than the default value.
>
> The default value is 5 sec and I saw many posts saying they are using 30 sec.
>
> Also the slow request symptom is often linked to this
I have a Ceph cluster with 5 nodes. For some reason the sync went down and now I
don't know what I can do to restore it.
# ceph -s
    cluster 338bc0a5-c2f7-4c0a-9b35-25c7afee50c6
     health HEALTH_WARN
            1 pgs down
            6 pgs incomplete
            6 pgs stuck inactive
            6 p
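The usual next step for down/incomplete PGs is to find out which ones they are and query them; a sketch, where the pg id is just a placeholder:
ceph health detail
ceph pg dump_stuck inactive
ceph pg 1.23 query        # replace 1.23 with one of the reported pg ids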
On Sat, Nov 19, 2016 at 6:59 AM, Brad Hubbard wrote:
> +ceph-devel
>
> On Fri, Nov 18, 2016 at 8:45 PM, Nick Fisk wrote:
>> Hi All,
>>
>> I want to submit a PR to include fix in this tracker bug, as I have just
>> realised I've been experiencing it.
>>
>> http://tracker.ceph.com/issues/9860
>>
>
This is like your mother telling you not to cross the road when you were 4
years old, but not telling you it was because you could be flattened
by a car :)
Can you expand on your answer? If you are in a DC with A/B power,
redundant UPS, dual feeds from the electric company, onsite generators,
dual PSU
Yes, because these things happen:
http://www.theregister.co.uk/2016/11/15/memset_power_cut_service_interruption/
We had customers who had kit in this DC.
To use your analogy, it's like crossing the road at traffic lights but
not checking that cars have stopped. You might be OK 99% of the time, but
Many reasons:
1) You will eventually get a DC-wide power event anyway, at which point
probably most of the OSDs will have hopelessly corrupted internal XFS
structures (yes, I have seen this happen to a poor soul with a DC with
redundant power).
2) Even in the case of a single rack/node power failur
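A quick way to audit for this; a sketch assuming the usual /var/lib/ceph/osd mount points and XFS data partitions:
# any output here means an OSD filesystem is mounted with nobarrier
mount | grep /var/lib/ceph/osd | grep nobarrier
# a sane fstab entry keeps barriers enabled (the default), e.g.:
# /dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  rw,noatime,inode64  0 0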
Hi, thanks.
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 device5
devic
Hello Bruno,
I am not understanding your outputs.
The first 'ceph -s' says one mon is down, but your 'ceph health detail'
does not report it further.
On your crush map I count 7 OSDs (0,1,2,3,4,6,7) but 'ceph -s' says only 6 are
active.
Can you send the output of 'ceph osd tree' and 'ceph osd df'?
Just to update, this is still an issue as of the latest Git commit (
64bcf92e87f9fbb3045de49b7deb53aca1989123).
On Fri, Nov 11, 2016 at 1:31 PM, bobobo1...@gmail.com
wrote:
> Here's another: http://termbin.com/smnm
>
> On Fri, Nov 11, 2016 at 1:28 PM, Sage Weil wrote:
> > On Fri, 11 Nov 2016, b
Thanks Nick / Samuel,
It's definitely worthwhile to explain exactly why this is such a bad
idea. I think that will do more to prevent people from ever doing it than
just telling them not to do it.
On Sat, Nov 19, 2016 at 12:30 AM, Samuel Just wrote:
> Many reasons:
>
> 1) You will eventually get