Re: [ceph-users] ceph-fuse performance about hammer and jewel

2016-06-01 Thread Yan, Zheng
On Mon, May 30, 2016 at 10:22 PM, qisy wrote: > Hi, > After Jewel was released with a production-ready fs, I upgraded the old Hammer > cluster, but IOPS dropped a lot. > > I ran a test with 3 nodes, each with 8c, 16G and 1 OSD; the OSD device > gets 15000 IOPS. > > I found ceph-fuse client has be

[ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread David Riedl
Hello everybody, we want to upgrade/fix our SAN switches. I kinda screwed up when I was first planning our Ceph storage cluster. Right now we have 2 x HP 2530-24G switches (J9776A). We have 3 servers, each outfitted with 2 x 4 gigabit cards. (Don't judge me, I also was on a budget) Each card go

Re: [ceph-users] CephFS: slow writes over NFS when fs is mounted with kernel driver but fast with Fuse

2016-06-01 Thread Yan, Zheng
On Mon, May 30, 2016 at 10:29 PM, David wrote: > Hi All > > I'm having an issue with slow writes over NFS (v3) when cephfs is mounted > with the kernel driver. Writing a single 4K file from the NFS client is > taking 3 - 4 seconds, however a 4K write (with sync) into the same folder on > the serve

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-06-01 Thread Adrian Saul
Also if for political reasons you need a “vendor” solution – ask Dell about their DSS 7000 servers – 90 8TB disks and two compute nodes in 4RU would go a long way to making up a multi-PB Ceph solution. Supermicro also do a similar solution with some 36, 60 and 90 disk in 4RU models. Cisco ha

[ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Diego Castro
Hello, I have a cluster running Jewel 10.2.0, 25 OSDs + 4 Mons. Today my cluster suddenly went unhealthy with lots of stuck PGs due to unfound objects; no disk failures or node crashes, it just went bad. I managed to put the cluster back into a healthy state by marking lost objects for deletion "ceph pg

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Christian Balzer
Hello, firstly, I'm not the main network guy here by a long shot; OTOH I do know a thing or two, even if only from trial and error. On Wed, 1 Jun 2016 09:49:53 +0200 David Riedl wrote: > Hello everybody, > > we want to upgrade/fix our SAN switches. I kinda screwed up when I was > first pla

[ceph-users] Message sequence overflow

2016-06-01 Thread Yan, Zheng
On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: > Dear ceph-users... > > My team runs an internal buildfarm using ceph as a backend storage platform. > We’ve recently upgraded to Jewel and are having reliability issues that we > need some help with. > > Our infrastructure is the following: > -

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread David Riedl
> So 3 servers are the entirety of your Ceph storage nodes, right? Exactly. + 3 OpenStack compute nodes. > Have you been able to determine what causes the drops? > My first guess would be that this bonding is simply not compatible with what the switches can do/expect. Yeah, something like that.

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Christian Balzer
Hello, On Wed, 1 Jun 2016 11:03:16 +0200 David Riedl wrote: > > > So 3 servers are the entirety of your Ceph storage nodes, right? > Exactly. + 3 Openstack Compute Nodes > > > > Have you been able to determine what causes the drops? > > My first guess would be that this bonding is simply not

Re: [ceph-users] radosgw s3 errors after installation quickstart

2016-06-01 Thread hp cre
Hello Jean-Charles, Thanks for the tip. When I added my buckets as host aliases it worked with s3cmd. I just couldn't visualise how this is a thing, to add bucket names as hostname aliases. However, now, from the DragonDisk and CrossFTP S3 clients on another machine, when I try to put requests I get

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Nick Fisk
Just a couple of points. 1. I know you said 10G was not an option, but I would really push for it. You can pick up Dell 10G-T switches (N4032) for not a lot more than a 48 port 1G switch. They make a lot more difference than just 10x the bandwidth. With Ceph, latency is critical. As it's 10G-T, you

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread SCHAER Frederic
I do… In my case, I have collocated the MONs with some OSDs, and no later than Saturday when I lost data again, I found out that one of the MON+OSD nodes ran out of memory and started killing ceph-mon on that node… At the same moment, all OSDs started to complain about not being able to see oth

[ceph-users] Cache pool with replicated pool don't work properly.

2016-06-01 Thread 한승진
Hi All. My name is John Haan. I've been testing cache pools using the Jewel release on Ubuntu 16.04. I implemented 2 types of cache tiers. The first one is cache pool + erasure pool and the other one is cache pool + replicated pool. I chose writeback as the cache mode. vdbench and rados bench are

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Uwe Mesecke
> On 01.06.2016 at 10:25, Diego Castro wrote: > > Hello, I have a cluster running Jewel 10.2.0, 25 OSDs + 4 Mons. > Today my cluster suddenly went unhealthy with lots of stuck PGs due to unfound > objects; no disk failures or node crashes, it just went bad. > > I managed to put the cluster on

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Ilya Dryomov
> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: >> Dear ceph-users... >> >> My team runs an internal buildfarm using ceph as a backend storage platform. >> We’ve recently upgraded to Jewel and are having reliability issues that we >> need some help with. >> >> Our infrastructure is the follo

Re: [ceph-users] ceph-fuse performance about hammer and jewel

2016-06-01 Thread qisy
My fio test: fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=1G -filename=test.iso -name="CEPH 4KB randwrite test" -iodepth=32 -runtime=60 On 16/6/1 15:22, Yan, Zheng wrote: On Mon, May 30, 2016 at 10:22 PM, qisy wrote: Hi, After Jewel was released with a production-ready fs,

[ceph-users] rbd mirror : space and io requirements ?

2016-06-01 Thread Alexandre DERUMIER
Hi, I'm beginning to look at the rbd mirror features. How much space does it take? Is it only a journal with some kind of list of block changes? And how much IO/s does it take? Worst case, for a 4k block write, how much is written to the journal? My OSDs are SSDs, with 1 journal + data on each OSD/SSD, but I don't ov

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Diego Castro
Hello Uwe, I also have the sortbitwise flag enabled and I see exactly the same behavior as you. Perhaps this is also the root of my issues. Does anybody know if it is safe to disable it? --- Diego Castro / The CloudFather GetupCloud.com - Eliminamos a Gravidade 2016-06-01 7:17 GMT-03:00 Uwe Mesecke : >

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Sage Weil
On Wed, 1 Jun 2016, Yan, Zheng wrote: > On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: > > Dear ceph-users... > > > > My team runs an internal buildfarm using ceph as a backend storage > > platform. We’ve recently upgraded to Jewel and are having reliability > > issues that we need some help

[ceph-users] OOM on OSDS with erasure coding

2016-06-01 Thread Sharath Gururaj
Hi All, We are testing an erasure-coded pool fronted by the RADOS gateway. Recently many OSDs have been going down due to out-of-memory. Here are the details. *Description of the cluster:* 32 hosts, 6 disks (osds) per host so 32*6 = 192 osds 17024 pgs, 15 pools, 107 TB data, 57616 kobjects 167 TB used, 508

[ceph-users] Problems with Calamari setup

2016-06-01 Thread fridifree
*Hello, Everyone. * I'm trying to install a Calamari server in my organisation and I'm encountering some problems. I have a small dev environment, just 4 OSD nodes and 5 monitors (one of them is also the RADOS GW). We chose to use Ubuntu 14.04 LTS for all our servers. The Calamari server is provi

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Ilya Dryomov
On Wed, Jun 1, 2016 at 2:49 PM, Sage Weil wrote: > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: >> > Dear ceph-users... >> > >> > My team runs an internal buildfarm using ceph as a backend storage >> > platform. We’ve recently upgraded to Jewel and a

Re: [ceph-users] ceph-fuse performance about hammer and jewel

2016-06-01 Thread Yan, Zheng
On Wed, Jun 1, 2016 at 6:52 PM, qisy wrote: > My fio test: > > fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=1G > -filename=test.iso -name="CEPH 4KB randwrite test" -iodepth=32 -runtime=60 > You were testing direct-IO performance. Hammer does not handle direct-IO correctly, da
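
The reply above hinges on the -direct=1 option in the posted fio command. As a minimal sketch in Python (the helper name parse_fio_options is hypothetical and not part of fio or Ceph), the posted command line can be split into options to confirm which I/O mode the benchmark requested:

```python
import shlex

# The fio invocation exactly as posted in the thread.
FIO_CMD = (
    'fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=1G '
    '-filename=test.iso -name="CEPH 4KB randwrite test" -iodepth=32 -runtime=60'
)


def parse_fio_options(command):
    """Split a fio command line into an {option: value} dict (hypothetical helper)."""
    options = {}
    for token in shlex.split(command)[1:]:  # skip the "fio" program name
        key, _, value = token.lstrip("-").partition("=")
        options[key] = value or True  # bare flags such as -thread become True
    return options


opts = parse_fio_options(FIO_CMD)
uses_direct_io = opts.get("direct") == "1"
print("direct I/O requested:", uses_direct_io)
```

With direct I/O requested, the benchmark bypasses the client page cache, which is why the reply treats the numbers as a direct-IO measurement rather than a buffered one.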

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Yan, Zheng
On Wed, Jun 1, 2016 at 8:49 PM, Sage Weil wrote: > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: >> > Dear ceph-users... >> > >> > My team runs an internal buildfarm using ceph as a backend storage >> > platform. We’ve recently upgraded to Jewel and a

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Sage Weil
On Wed, 1 Jun 2016, Yan, Zheng wrote: > On Wed, Jun 1, 2016 at 8:49 PM, Sage Weil wrote: > > On Wed, 1 Jun 2016, Yan, Zheng wrote: > >> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: > >> > Dear ceph-users... > >> > > >> > My team runs an internal buildfarm using ceph as a backend storage > >

[ceph-users] CDM at 12:30p EST Today

2016-06-01 Thread Patrick McGarry
Hey cephers, Just a reminder that our monthly Ceph developer call is today in just under 2 hours. Come join us to talk about current work going in to Ceph. Thanks! http://wiki.ceph.com/Planning -- Best Regards, Patrick McGarry Director Ceph Community || Red Hat http://ceph.com || http://com

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Ilya Dryomov
On Wed, Jun 1, 2016 at 4:22 PM, Sage Weil wrote: > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> On Wed, Jun 1, 2016 at 8:49 PM, Sage Weil wrote: >> > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> >> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: >> >> > Dear ceph-users... >> >> > >> >> > My team runs

[ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
Hello all, I'm running into an issue with ceph osds crashing over the last 4 days. I'm running Jewel (10.2.1) on CentOS 7.2.1511. A little setup information: 26 hosts 2x 400GB Intel DC P3700 SSDs 12x6TB spinning disks 4x4TB spinning disks. The SSDs are used for both journals and as an OSD (for t

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread David Riedl
> 4. As Ceph has lots of connections on lots of IPs and ports, LACP or the Linux ALB mode should work really well to balance connections. Linux ALB mode looks promising. Does that work with two switches? Each server has 4 ports which are 'split' and connected to each switch.

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Samuel Just
Can either of you reproduce with logs? That would make it a lot easier to track down if it's a bug. I'd want debug osd = 20 debug ms = 1 debug filestore = 20 On all of the osds for a particular pg from when it is clean until it develops an unfound object. -Sam On Wed, Jun 1, 2016 at 5:36 AM, D
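
The three debug settings requested above can be written out as a config fragment. A minimal sketch in Python, assuming they go into the [osd] section of ceph.conf (the message only lists the settings; the section name and the helper name osd_debug_section are assumptions made for illustration):

```python
import configparser
import io


def osd_debug_section():
    """Render the requested debug levels as an ini-style [osd] section."""
    cfg = configparser.ConfigParser()
    # Values taken from the message; the [osd] section is an assumption.
    cfg["osd"] = {
        "debug osd": "20",
        "debug ms": "1",
        "debug filestore": "20",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


snippet = osd_debug_section()
print(snippet)
```

Logging at these levels is verbose, so it would normally be enabled only on the OSDs serving the affected PG and turned off again once the unfound object has been reproduced.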

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
Adam, We ran into similar issues when we got too many objects in a bucket (around 300 million). The .rgw.buckets.index pool became unable to complete backfill operations. The only way we were able to get past it was to export the offending placement group with the ceph-objectstore-tool and

[ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Francois Lafont
Hi, I have a Jewel Ceph cluster in OK state and I have a "ceph-fuse" Ubuntu Trusty client with ceph Infernalis. The cephfs is mounted automatically and perfectly during the boot via ceph-fuse and this line in /etc/fstab : ~# grep ceph /etc/fstab id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyr

[ceph-users] OSD issue: unable to obtain rotating service keys

2016-06-01 Thread Jeffrey McDonald
Hi, I just performed a minor ceph upgrade on my Ubuntu 14.04 cluster from ceph version 0.94.6-1trusty to 0.94.7-1trusty. Upon restarting the OSDs, I receive the error message: 2016-06-01 12:17:49.219512 7f64a70ea8c0 0 monclient: wait_auth_rotating timed out after 30 2016-06-01 12:17:49.219

Re: [ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Gregory Farnum
On Wed, Jun 1, 2016 at 10:23 AM, Francois Lafont wrote: > Hi, > > I have a Jewel Ceph cluster in OK state and I have a "ceph-fuse" Ubuntu > Trusty client with ceph Infernalis. The cephfs is mounted automatically > and perfectly during the boot via ceph-fuse and this line in /etc/fstab : > > ~# gre

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Diego Castro
Hello Samuel, I'm a bit afraid of restarting my OSDs again, I'll wait until the weekend to push the config. BTW, I just unset the sortbitwise flag. --- Diego Castro / The CloudFather GetupCloud.com - Eliminamos a Gravidade 2016-06-01 13:39 GMT-03:00 Samuel Just : > Can either of you reproduce with l

[ceph-users] CephFS in the wild

2016-06-01 Thread Brady Deetz
Question: I'm curious if there is anybody else out there running CephFS at the scale I'm planning for. I'd like to know some of the issues you didn't expect that I should be looking out for. I'd also like to simply see when CephFS hasn't worked out and why. Basically, give me your war stories. Pr

[ceph-users] Client not finding keyring

2016-06-01 Thread RJ Nowling
Hi all, I'm trying to set up a Ceph cluster with an S3 gateway using the ceph-ansible playbooks. I'm running into an issue where the radosgw-admin client can't find the keyring. The path to the keyring is listed in the ceph.conf file. I confirmed with strace that the client opens the conf file

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
I've been attempting to work through this, finding the pgs that are causing hangs, determining if they are "safe" to remove, and removing them with ceph-objectstore-tool on osd 16. I'm now getting hangs (followed by suicide timeouts) referencing pgs that I've just removed, so this doesn't seem to

Re: [ceph-users] Client not finding keyring

2016-06-01 Thread LOPEZ Jean-Charles
Hi, radosgw-admin is not radosgw. It’s the RADOS Gateway CLI admin utility. All ceph components by default use the client.admin user name to connect to the Ceph cluster. If you deployed the radosgw, the gateway itself was properly configured by Ansible and the files were placed where they have

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Gregory Farnum
If that pool is your metadata pool, it looks at a quick glance like it's timing out somewhere while reading and building up the omap contents (ie, the contents of a directory). Which might make sense if, say, you have very fragmented leveldb stores combined with very large CephFS directories. Tryin

Re: [ceph-users] Client not finding keyring

2016-06-01 Thread RJ Nowling
I did use ceph-ansible to deploy the gateway -- using the default settings. It should work out of the box but does not. So... can the radosgw-admin CLI utility take a keyring path in the conf file or does the path need to be manually specified? And secondly, after copying the keyring to one of t

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
I tried to compact the leveldb on osd 16 and the osd is still hitting the suicide timeout. I know I've got some users with more than 1 million files in single directories. Now that I'm in this situation, can I get some pointers on how can I use either of your options? Thanks, Adam On Wed, Jun 1,

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Samuel Just
Was this cluster upgraded to Jewel? If so, at what version did it start? -Sam On Wed, Jun 1, 2016 at 1:48 PM, Diego Castro wrote: > Hello Samuel, I'm a bit afraid of restarting my OSDs again, I'll wait until > the weekend to push the config. > BTW, I just unset the sortbitwise flag. > > > --- > Diego

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Diego Castro
Yes, it was created as Hammer. I haven't faced any issues on the upgrade (apart from the well-known systemd one), and after that the cluster didn't show any suspicious behavior. --- Diego Castro / The CloudFather GetupCloud.com - Eliminamos a Gravidade 2016-06-01 18:57 GMT-03:00 Samuel Just : > Was thi

Re: [ceph-users] Client not finding keyring

2016-06-01 Thread LOPEZ Jean-Charles
Looks like I missed the paste: http://docs.ceph.com/docs/master/man/8/ceph/#options There you have the options available from the command line. In your case the user id is radosgw-rgw0 so the command line should be radosgw-admin --id radosgw.rgw0 usage show or radosgw-admin --name client.rados

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Samuel Just
http://tracker.ceph.com/issues/16113 I think I found the bug. Thanks for the report! Turning off sortbitwise should be an ok workaround for the moment. -Sam On Wed, Jun 1, 2016 at 3:00 PM, Diego Castro wrote: > Yes, it was created as Hammer. > I haven't faced any issues on the upgrade (despite

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Mike Lovell
On Wed, Jun 1, 2016 at 9:13 AM, Adam Tygart wrote: > Hello all, > > I'm running into an issue with ceph osds crashing over the last 4 > days. I'm running Jewel (10.2.1) on CentOS 7.2.1511. > > A little setup information: > 26 hosts > 2x 400GB Intel DC P3700 SSDs > 12x6TB spinning disks > 4x4TB spi

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Gregory Farnum
On Wed, Jun 1, 2016 at 2:47 PM, Adam Tygart wrote: > I tried to compact the leveldb on osd 16 and the osd is still hitting > the suicide timeout. I know I've got some users with more than 1 > million files in single directories. > > Now that I'm in this situation, can I get some pointers on how ca

[ceph-users] Ceph API Announcement

2016-06-01 Thread chris holcombe
Hey Ceph Community, I'd like to show everyone a project I've been working on. It parses the ceph/src/mon/MonCommands.h file and produces a Python file that allows you to call every possible command Ceph exposes. It also has sub modules for every release since firefly so you can import the module

Re: [ceph-users] OSD issue: unable to obtain rotating service keys

2016-06-01 Thread Shinobu Kinjo
Would you enable debug for osd.177? debug osd = 20 debug filestore = 20 debug ms = 1 Cheers, Shinobu On Thu, Jun 2, 2016 at 2:31 AM, Jeffrey McDonald wrote: > Hi, > > I just performed a minor ceph upgrade on my Ubuntu 14.04 cluster from ceph > version 0.94.6-1trusty to 0.94.7-1trusty. Upo

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Uwe Mesecke
Hey Sam, glad you found the bug. As another data point, I just did the whole round of "healthy -> set sortbitwise -> osd restarts -> unfound objects -> unset sortbitwise -> healthy" with the debug settings as described by you earlier. I uploaded the logfiles... https://www.dropbox.com/s/f5hhptb

Re: [ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Francois Lafont
Hi, On 01/06/2016 23:16, Florent B wrote: > Don't have this problem on Debian migration from Infernalis to Jewel, > check all permissions... Ok, it's probably the reason (I hope) but so far I can't find the right Unix permissions. I have this (which doesn't work): ~# ll -d /etc/ceph drwxr-xr-x 2 r

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
I concur with Greg. The only way that I was able to get back to Health_OK was to export/import. * Please note, any time you use the ceph_objectstore_tool you risk data loss if not done carefully. Never remove a PG until you have a known good export * Here are the steps I used: 1. set

Re: [ceph-users] OSD Restart results in "unfound objects"

2016-06-01 Thread Samuel Just
Yep, looks like the same issue: 2016-06-02 00:45:27.977064 7fc11b4e9700 10 osd.17 pg_epoch: 11108 pg[34.4a( v 11104'1080336 lc 11104'1080335 (11069'1077294,11104'1080336] local-les=11108 n=50593 ec=2051 les/c/f 11104/11104/0 11106/11107/11107) [17,13] r=0 lpr=11107 pi=11101-11106/3 crt=11104'10803

Re: [ceph-users] Ceph Status - Segmentation Fault

2016-06-01 Thread Brad Hubbard
Could this be the call in RotatingKeyRing::get_secret() failing? Mathias, I'd suggest opening a tracker for this with the information in your last post and let us know the number here. Cheers, Brad On Wed, Jun 1, 2016 at 3:15 PM, Mathias Buresch < mathias.bure...@de.clara.net> wrote: > Hi, > >

Re: [ceph-users] OSD issue: unable to obtain rotating service keys

2016-06-01 Thread Christian Balzer
Hello, On Wed, 1 Jun 2016 12:31:41 -0500 Jeffrey McDonald wrote: > Hi, > > I just performed a minor ceph upgrade on my Ubuntu 14.04 cluster from > ceph version 0.94.6-1trusty to 0.94.7-1trusty. Upon restarting the > OSDs, I receive the error message: > Unfortunately (despite what common s

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Christian Balzer
On Wed, 1 Jun 2016 18:11:54 +0200 David Riedl wrote: > > > 4. As Ceph has lots of connections on lots of IPs and ports, LACP or > > the Linux ALB mode should work really well to balance connections. > Linux ALB mode looks promising. Does that work with two switches? Each > server has 4 ports w

Re: [ceph-users] inkscope version 1.4

2016-06-01 Thread David Wang
Hi Eric, can the new 1.4 release be used with Ceph Jewel? 2016-05-31 15:05 GMT+08:00 eric mourgaya : > hi guys, > > Inkscope 1.4 is released. > You can find the rpms and debian packages at > https://github.com/inkscope/inkscope-packaging. > This release adds a monitor panel using coll

Re: [ceph-users] RGW Could not create user

2016-06-01 Thread David Wang
First, please check that your ceph cluster is HEALTH_OK and then check if you have the caps to create users. 2016-05-31 16:11 GMT+08:00 Khang Nguyễn Nhật : > Thanks, Wasserman! > I followed the instructions here: > http://docs.ceph.com/docs/master/radosgw/multisite/ > Step 1: radosgw-admin realm cre

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Adrian Saul
I am currently running our Ceph POC environment using dual Nexus 9372TX 10G-T switches, each OSD host has two connections to each switch and they are formed into a single 4 link VPC (MC-LAG), which is bonded under LACP on the host side. What I have noticed is that the various hashing policies f

Re: [ceph-users] CephFS in the wild

2016-06-01 Thread Christian Balzer
Hello, On Wed, 1 Jun 2016 15:50:19 -0500 Brady Deetz wrote: > Question: > I'm curious if there is anybody else out there running CephFS at the > scale I'm planning for. I'd like to know some of the issues you didn't > expect that I should be looking out for. I'd also like to simply see > when Ce

Re: [ceph-users] OSD issue: unable to obtain rotating service keys

2016-06-01 Thread Jeffrey McDonald
Thanks Christian, I did google the error and I actually found this link. (Of course, I wouldn't want to waste others' time either.) It appears to me to be a different issue than what I see because the OSDs actually fail to start. Anyways, after a few minutes I restarted the OSDs and they s

Re: [ceph-users] civetweb vs Apache for rgw

2016-06-01 Thread Tyler Bishop
Use Haproxy. sudomakeinstall.com/uncategorized/ceph-radosgw-nginx-tengine-apache-and-now-civetweb - Original Message - From: c...@jack.fr.eu.org To: ceph-users@lists.ceph.com Sent: Tuesday, May 24, 2016 5:01:05 AM Subject: Re: [ceph-users] civetweb vs Apache for rgw I'm using mod_rewrit

Re: [ceph-users] OSD issue: unable to obtain rotating service keys

2016-06-01 Thread Christian Balzer
Hello, On Wed, 1 Jun 2016 20:21:29 -0500 Jeffrey McDonald wrote: > Thanks Christian, > I did google the error and I actually found this link. (Of course, I > wouldn't want to waste others' time either.) It appears to me to be a > different issue than what I see because the OSDs actually fa

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Christian Balzer
Hello Adrian, On Thu, 2 Jun 2016 00:53:41 + Adrian Saul wrote: > > I am currently running our Ceph POC environment using dual Nexus 9372TX > 10G-T switches, each OSD host has two connections to each switch and > they are formed into a single 4 link VPC (MC-LAG), which is bonded under > LACP

Re: [ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Francois Lafont
Now, I have an explanation and it's _very_ strange, absolutely not related to a problem of Unix permissions. As a reminder, my client node is an updated Ubuntu Trusty and I use ceph-fuse. Here is my fstab line: ~# grep ceph /etc/fstab id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyring,client_mountpoint

Re: [ceph-users] Message sequence overflow

2016-06-01 Thread Yan, Zheng
On Wed, Jun 1, 2016 at 10:22 PM, Sage Weil wrote: > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> On Wed, Jun 1, 2016 at 8:49 PM, Sage Weil wrote: >> > On Wed, 1 Jun 2016, Yan, Zheng wrote: >> >> On Wed, Jun 1, 2016 at 6:15 AM, James Webb wrote: >> >> > Dear ceph-users... >> >> > >> >> > My team runs

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Adrian Saul
> > For two links it should be quite good - it seemed to balance across > > that quite well, but with 4 links it seemed to really prefer 2 in my case. > > > Just for the record, did you also change the LACP policies on the switches? > > From what I gather, having fancy pants L3+4 hashing on the Li

Re: [ceph-users] RGW Could not create user

2016-06-01 Thread Khang Nguyễn Nhật
Thanks, Wang! I will check it again. 2016-06-02 7:37 GMT+07:00 David Wang : > First, please check that your ceph cluster is HEALTH_OK and then check if you > have the caps to create users. > > 2016-05-31 16:11 GMT+08:00 Khang Nguyễn Nhật < > nguyennhatkhang2...@gmail.com>: > >> Thanks, Wasserman! >> I f

[ceph-users] Ceph Pool JERASURE issue.

2016-06-01 Thread Khang Nguyễn Nhật
Hi, I have 1 cluster as described below: - OSD-host1 runs 2 ceph-osd daemons, mounted at /var/ceph/osd0 and /var/ceph/osd1. - OSD-host2 runs 2 ceph-osd daemons, mounted at /var/ceph/osd2 and /var/ceph/osd3. - OSD-host3 runs only 1 ceph-osd daemon, mounted at /var/ceph/osd4. - This is my mypr

Re: [ceph-users] Ceph Pool JERASURE issue.

2016-06-01 Thread Somnath Roy
You need to either change the failure domain to osd or have at least 5 hosts to satisfy the host failure domain. Since the failure domain is not satisfied, the PGs are undersized and degraded. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Khang Nguyễn Nh
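
The rule in the reply above can be checked with simple arithmetic: an erasure-coded PG stores k + m chunks, and each chunk must land in a different failure-domain bucket. A minimal sketch in Python, assuming k + m = 5 as the reply implies (the k = 3, m = 2 split and the function name can_place_all_chunks are hypothetical; the original profile is cut off in the question):

```python
def can_place_all_chunks(k, m, failure_domain_buckets):
    """An erasure-coded PG needs k + m chunks, each in a distinct bucket."""
    return failure_domain_buckets >= k + m


# Layout from the question: 3 hosts holding 5 OSDs in total (2 + 2 + 1).
HOSTS, OSDS = 3, 5
K, M = 3, 2  # hypothetical split; the thread only implies k + m = 5

host_ok = can_place_all_chunks(K, M, HOSTS)  # failure domain = host
osd_ok = can_place_all_chunks(K, M, OSDS)    # failure domain = osd
print("host failure domain satisfiable:", host_ok)
print("osd failure domain satisfiable:", osd_ok)
```

Under these assumptions only the osd failure domain can place every chunk, which matches the two options given in the reply: switch the failure domain to osd, or grow the cluster to 5 hosts.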

[ceph-users] mark_unfound_lost revert|delete behaviour

2016-06-01 Thread Richard Bade
Hi Everyone, Can anyone tell me how the ceph pg x.x mark_unfound_lost revert|delete command is meant to work? Due to some strange, not fully understood circumstances I have 1 unfound object in one of my pools. I've read through http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#unf