I filed http://tracker.ceph.com/issues/17858 recently. I am seeing this
problem on 10.2.3 ceph-fuse, but maybe the kernel client is affected too.
It is easy to reproduce; just do a deep "mkdir -p", e.g. "mkdir -p
1/2/3/4/5/6/7/8/9/0/1/2/3/4/5/6/7/8/9"
On 16-11-11 10:46, Dan van der Ster wrote:
Hi
Hi,
I have two OSDs that are failing with an assert which looks related to
missing objects. This happened after a large RBD snapshot
was deleted, causing several OSDs to start flapping as they experienced high
load. The cluster is fully recovered and I don't need any
help from a recovery perspective
Hi All,
Just a slight note of caution. I had been running the 4.7 kernel (with Ubuntu
16.04) on the majority of my OSD nodes, as when I
installed the cluster there was that outstanding panic bug with the 4.4 kernel.
I have been experiencing a lot of flapping OSDs
every time the cluster was p
Hi,
On 15/11/16 01:27, Craig Chi wrote:
> What's your Ceph version?
> I am using Jewel 10.2.3 and systemd seems to work normally. I deployed
> Ceph by ansible, too.
The version in Ubuntu 16.04, which is 10.2.2-0ubuntu0.16.04.2
> You can check whether you have /lib/systemd/system/ceph-mon.target
Hi,
We observed the same behavior with kernel 4.7 and Ubuntu 14.04 under heavy
load. Kernel 4.2 is stable. We use only S3 gateway.
--
Jarek
--
Jarosław Owsiewski
2016-11-15 11:31 GMT+01:00 Nick Fisk :
> Hi All,
>
>
>
> Just a slight note of caution. I had been running the 4.7 kernel (With
>
Hi Chris,
We checked memory as well, and we have plenty of free memory (12 GB used / 125 GB
available) on each and every DN.
Actually, we activated some debug logs yesterday and found many messages
like:
1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7ff9bdb42700' had timed out
after 1
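For context (the values below are not taken from this thread, just the stock defaults): that warning is printed when an OSD work-queue thread exceeds its internal heartbeat timeout, which is tunable in ceph.conf along these lines:

[osd]
# threshold (seconds) for the "had timed out" warning above
osd op thread timeout = 15
# a thread stuck beyond this causes the OSD to abort
osd op thread suicide timeout = 150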
On Mon, Nov 14, 2016 at 11:35 PM, Goncalo Borges
wrote:
> Hi John...
>
> Thanks for replying.
>
> Some of the requested input is inline.
>
> Cheers
>
> Goncalo
>
>
>>>
>>>
>>> We are currently undergoing an infrastructure migration. One of the first
>>> machines to go through this migration process
Hi,
You can try to manually fix this by adding the
/lib/systemd/system/ceph-mon.target file, which contains:
===
[Unit]
Description=ceph target allowing to start/stop all ceph-mon@.service instances
at once
PartOf=ceph.target
[Install]
WantedBy
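The [Install] section is cut off above; a complete unit along those lines looks roughly like the sketch below (the exact WantedBy line is an assumption, so compare it against the file shipped by your ceph-mon package):

===
[Unit]
Description=ceph target allowing to start/stop all ceph-mon@.service instances at once
PartOf=ceph.target

[Install]
WantedBy=multi-user.target ceph.target
===

After creating the file, run "systemctl daemon-reload" and "systemctl enable ceph-mon.target" so the ceph-mon@.service instances are pulled in on boot.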
Hello!
I have a problem with slow requests on kernel 4.4.0-45; I rolled all
nodes back to 4.4.0-42.
Ubuntu 16.04.1 LTS (Xenial Xerus)
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
Оралов Алексей
Corporate network and technology department
Hi,
after running a cephfs on my ceph cluster I got stuck with the following
health status:
# ceph status
cluster ac482f5b-dce7-410d-bcc9-7b8584bd58f5
health HEALTH_WARN
128 pgs degraded
128 pgs stuck unclean
128 pgs undersized
recovery 24/4
Also, I instructed all unclean PGs to repair and nothing happened. I did it
like this:
~# for pg in `ceph pg dump_stuck unclean 2>&1 | grep -Po
'[0-9]+\.[A-Za-z0-9]+'`; do ceph pg repair $pg; done
On Tue, Nov 15, 2016 at 9:58 AM Webert de Souza Lima
wrote:
> Hi,
>
> after running a cephfs on my c
On Tue, Nov 15, 2016 at 11:58 AM, Webert de Souza Lima
wrote:
> Hi,
>
> after running a cephfs on my ceph cluster I got stuck with the following
> health status:
>
> # ceph status
> cluster ac482f5b-dce7-410d-bcc9-7b8584bd58f5
> health HEALTH_WARN
> 128 pgs degraded
>
Hey John.
Just to be sure; by "deleting the pools" you mean the *cephfs_metadata* and
*cephfs_metadata* pools, right?
Does it have any impact over radosgw? Thanks.
On Tue, Nov 15, 2016 at 10:10 AM John Spray wrote:
> On Tue, Nov 15, 2016 at 11:58 AM, Webert de Souza Lima
> wrote:
> > Hi,
> >
>
I'm sorry, I meant *cephfs_data* and *cephfs_metadata*
On Tue, Nov 15, 2016 at 10:15 AM Webert de Souza Lima
wrote:
> Hey John.
>
> Just to be sure; by "deleting the pools" you mean the *cephfs_metadata*
> and *cephfs_metadata* pools, right?
> Does it have any impact over radosgw? Thanks.
>
> O
On Tue, Nov 15, 2016 at 12:14 PM, Webert de Souza Lima
wrote:
> Hey John.
>
> Just to be sure; by "deleting the pools" you mean the cephfs_metadata and
> cephfs_metadata pools, right?
> Does it have any impact over radosgw? Thanks.
Yes, I meant the cephfs pools. It doesn't affect rgw (assuming y
Not that I know of. On 5 other clusters it works just fine, and the
configuration is the same for all.
On this cluster I was using only radosgw; cephfs was not in use, but it
had already been created following our procedures.
This happened right after mounting it.
On Tue, Nov 15, 2016 at 10:24 AM J
Hi,
On 11/15/2016 01:27 PM, Webert de Souza Lima wrote:
Not that I know of. On 5 other clusters it works just fine and
configuration is the same for all.
On this cluster I was using only radosgw, but cephfs was not in use
but it had been already created following our procedures.
This happene
On 11/15/16 12:58, Оралов Алексей wrote:
>
>
>
> Hello!
>
>
>
> I have problem with slow requests on kernel 4.4.0-45 , rolled back all
> nodes to 4.4.0-42
>
>
>
> Ubuntu 16.04.1 LTS (Xenial Xerus)
>
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
>
>
>
Can you describe yo
Which kernel version are you using?
I have a similar issue: Ubuntu 14.04, kernel 3.13.0-96-generic, and Ceph
Jewel 10.2.3.
I get logs like this:
2016-11-15 13:13:57.295067 osd.9 10.3.0.132:6817/24137 98 : cluster
[WRN] 16 slow requests, 5 included below; oldest blocked for > 7.957045 secs
I set o
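Not something mentioned in the thread, but a quick way to see what those slow requests are actually waiting on (osd.9 taken from the log line above) is the OSD admin socket on its host:

ceph health detail                    # lists the PGs/OSDs with blocked requests
ceph daemon osd.9 dump_ops_in_flight  # ops currently blocked on that OSD
ceph daemon osd.9 dump_historic_ops   # recent slow ops with per-step timings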
Sure, as requested:
*cephfs* was created using the following commands:
ceph osd pool create cephfs_metadata 128 128
ceph osd pool create cephfs_data 128 128
ceph fs new cephfs cephfs_metadata cephfs_data
*ceph.conf:*
https://paste.debian.net/895841/
*# ceph osd crush tree:* https://paste.debian.n
Hi Peter,
The Ceph cluster version is 0.94.5; we are running with Firefly tunables, and
we have 10K PGs instead of the 30K / 40K we should have.
The Linux kernel version is 3.10.0-327.36.1.el7.x86_64 with RHEL 7.2.
On our side we have the following settings:
mon_osd_adjust_heartbeat_grace = false
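Only the first setting above survives the truncation; for context, heartbeat-related overrides of that kind usually sit in ceph.conf roughly like this (the last two values are just the stock defaults, shown for comparison, not taken from this cluster):

[global]
mon osd adjust heartbeat grace = false
# stock defaults, for comparison
osd heartbeat grace = 20
osd heartbeat interval = 6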
Hi,
On 11/15/2016 01:55 PM, Webert de Souza Lima wrote:
sure, as requested:
*cephfs* was created using the following command:
ceph osd pool create cephfs_metadata 128 128
ceph osd pool create cephfs_data 128 128
ceph fs new cephfs cephfs_metadata cephfs_data
*ceph.conf:*
https://paste.debian
Hello,
we have set up a Ceph cluster with 10.0.2.3 under CentOS 7.
We have some directories with more than 100k entries.
Unfortunately, we cannot reduce the entry count of the 100k directories, and
we do not want to run a Ceph cluster with development features.
We installed the jewel release bec
Right, thank you.
On this particular cluster it would be OK to have everything on the HDD. No
big traffic here.
In order to do that, do I need to delete this cephfs, delete its pools and
create them again?
After that I assume I would run ceph osd pool set cephfs_metadata
crush_ruleset 0, as 0 is
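For the record, a rough sketch of that procedure on Jewel (pool and fs names taken from earlier in the thread; stop the MDS and any clients first, and double-check the removal flags before running them):

ceph fs rm cephfs --yes-i-really-mean-it
ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it
ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
# recreate on the default (HDD) ruleset
ceph osd pool create cephfs_metadata 128 128
ceph osd pool create cephfs_data 128 128
ceph osd pool set cephfs_metadata crush_ruleset 0
ceph osd pool set cephfs_data crush_ruleset 0
ceph fs new cephfs cephfs_metadata cephfs_data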
Hi,
has anyone ever tried to run Ceph monitors in containers?
Could it lead to performance issues?
Can I run monitor containers on the OSD nodes?
I don't want to buy 3 dedicated servers. Is there any other solution?
Thanks
Best regards
Matteo Dacrema
I've had lots of success running monitors in VMs. Never tried the
container route, but there is a ceph-docker project
(https://github.com/ceph/ceph-docker) if you want to give it a shot. I don't
know how highly recommended it is, though; I've got no personal experience
with it.
No matter what you w
In addition, Red Hat is shipping a containerized Ceph (all daemons, not
just mons) as a tech preview in RHCS, and the plan is to support it
going forward. We have not seen performance issues related to being
containerized. It's based on the ceph-docker and ceph-ansible projects.
Daniel
On 1
We have been running all Ceph services inside LXC containers with XFS bind
mounts for a few years and it works great. Additionally, we use macvlan
for networking, so each container has its own IP address without any NAT.
As for Docker (and specifically aufs/overlay), I would advise testing
for data in
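Not the poster's actual configuration, but for anyone curious, the macvlan part of an LXC container config from that era would look roughly like this (interface name and address are made-up placeholders):

# /var/lib/lxc/<container>/config, networking excerpt
lxc.network.type = macvlan
lxc.network.macvlan.mode = bridge
lxc.network.link = eth0            # host NIC the macvlan sits on
lxc.network.ipv4 = 192.0.2.10/24   # container gets its own address, no NAT
lxc.network.flags = up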
Hello,
We have 3 RGW servers set up with 5 OSDs. We have an application that is
doing pretty steady writes, as well as a bunch of reads from that and other
applications.
Over the last week or so we have been seeing the writing application's
connections getting blocked randomly, and in the RGW logs
I removed cephfs and its pools, created everything again using the default
crush ruleset, which is for the HDD, and now ceph health is OK.
I appreciate your help. Thank you very much.
On Tue, Nov 15, 2016 at 11:48 AM Webert de Souza Lima
wrote:
> Right, thank you.
>
> On this particular cluster
http://tracker.ceph.com/issues/17916
I just pushed a branch wip-17916-jewel based on v10.2.3 with some
additional debugging. Once it builds, would you be able to start the
afflicted osds with that version of ceph-osd and
debug osd = 20
debug ms = 1
debug filestore = 20
and get me the log?
-Sam
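For anyone else who needs to collect the same data: those options can go under [osd] in ceph.conf before restarting the daemon, or be injected at runtime if the OSD is still up (a sketch, with osd.N standing in for the afflicted OSD id):

[osd]
debug osd = 20
debug ms = 1
debug filestore = 20

# or at runtime:
ceph tell osd.N injectargs '--debug-osd 20 --debug-ms 1 --debug-filestore 20'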
On 11/15/16 14:05, Thomas Danan wrote:
> Hi Peter,
>
> Ceph cluster version is 0.94.5 and we are running with Firefly tunables and
> also we have 10KPGs instead of the 30K / 40K we should have.
> The linux kernel version is 3.10.0-327.36.1.el7.x86_64 with RHEL 7.2
>
> On our side we havethe follow
Very interesting ...
Any idea why optimal tunables would help here? On our cluster we have 500 TB of
data, and I am a bit concerned about changing them without taking a lot of
precautions.
I am curious to know how much time it took you to change the tunables, the size
of your cluster, and the observed impact.
I think you may need to re-evaluate your situation. If you aren't
willing to spend the money on 3 dedicated servers, is your platform big
enough to warrant the need for Ceph?
On 16/11/16 01:25, Matteo Dacrema wrote:
Hi,
has anyone ever tried to run Ceph monitors in containers?
Could it lead to
I forgot to mention that we are running 2 of our 3 monitors in VMs on our
OSD nodes. It's a small cluster with only two OSD nodes. The third monitor
is on a VM on a separate host. It works well, but we made sure the OSDs had
plenty of extra resources to accommodate the VMs, and the host OS is
runn
On 11/15/16 22:13, Thomas Danan wrote:
> Very interesting ...
>
> Any idea why optimal tunable would help here ?
I think there are some versions where it rebalances data a bunch to even
things out... I don't know why I think that, or where I read it.
Maybe it was only argonaut vs. newer. B
Hi everyone,
There was a regression in jewel that can trigger long OSD stalls during
scrub. How long the stalls are depends on how many objects are in your
PGs, how fast your storage device is, and what is cached, but in at least
one case they were long enough that the OSD internal heartbeat c
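The advisory is cut off above; a common stopgap while waiting for a fixed build (my suggestion, not quoted from the advisory) is to pause scrubbing cluster-wide and re-enable it after upgrading:

ceph osd set noscrub
ceph osd set nodeep-scrub
# ... upgrade / apply the fix ...
ceph osd unset noscrub
ceph osd unset nodeep-scrub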
Hello,
How does the rgw cache work ? Is there any situation in which it would be
better to disable it ?
Regards,
Martin
On Tue, Nov 15, 2016 at 8:40 AM, Hauke Homburg wrote:
> In the last weeks we enabled directory fragmentation for testing. The result
> is that we sometimes get error messages from rsync about unlink and no space
> left on device.
Enabling directory fragmentation would not cause the unlink and ENO
Dear All,
Any suggestions in this regard would be helpful.
Thanks,
Daleep Singh Bais
Forwarded Message
Subject: iSCSI LUN issue after MON Out Of Memory
Date: Tue, 15 Nov 2016 11:58:07 +0530
From: Daleep Singh Bais
To: ceph-users
Hello friends,
I had RBD imag
Hi All,
We have a Ceph storage cluster, and it has been integrated with our OpenStack
private cloud.
We have created a pool for volumes, which allows our OpenStack private cloud
users to create a volume from an image and boot from that volume.
Additionally, our images (both Ubuntu 14.04 and CentOS 7) are in a raw