Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-04 Thread Nikola Ciprich
On 4 February 2016 08:33:55 CET, Gregory Farnum wrote: >The quick and dirty cleanup is to restart the OSDs hosting those PGs. >They might have gotten some stuck ops which didn't get woken up; a few >bugs like that have gone by and are resolved in various stable >branches (I'm not sure what relea
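
For reference, finding and restarting the OSDs that host a stuck PG on a Hammer/upstart setup (as used elsewhere in this thread) would look roughly like the sketch below; the PG and OSD ids are placeholders:

  ceph health detail      # lists the stuck PGs
  ceph pg map 3.45        # shows the up/acting OSD set for that PG
  # then, on the node hosting e.g. osd.13:
  restart ceph-osd id=13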

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-04 Thread Michael Metz-Martini | SpeedPartner GmbH
Hi, On 03.02.2016 at 15:55, Yan, Zheng wrote: >> On Feb 3, 2016, at 21:50, Michael Metz-Martini | SpeedPartner GmbH >> wrote: >> On 03.02.2016 at 12:11, Yan, Zheng wrote: On Feb 3, 2016, at 17:39, Michael Metz-Martini | SpeedPartner GmbH wrote: On 03.02.2016 at 10:26, G
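
Assuming admin-socket access on the active MDS (the daemon name "a" below is a placeholder), the client sessions and their cap counts can be listed with:

  ceph daemon mds.a session ls    # shows client ids, hostnames and num_caps per session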

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-04 Thread Yan, Zheng
On Thu, Feb 4, 2016 at 4:36 PM, Michael Metz-Martini | SpeedPartner GmbH wrote: > Hi, > > On 03.02.2016 at 15:55, Yan, Zheng wrote: >>> On Feb 3, 2016, at 21:50, Michael Metz-Martini | SpeedPartner GmbH >>> wrote: >>> On 03.02.2016 at 12:11, Yan, Zheng wrote: > On Feb 3, 2016, at 17:39, Mi

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-04 Thread Michael Metz-Martini | SpeedPartner GmbH
Hi, On 04.02.2016 at 09:43, Yan, Zheng wrote: > On Thu, Feb 4, 2016 at 4:36 PM, Michael Metz-Martini | SpeedPartner > GmbH wrote: >> On 03.02.2016 at 15:55, Yan, Zheng wrote: On Feb 3, 2016, at 21:50, Michael Metz-Martini | SpeedPartner GmbH wrote: On 03.02.2016 at 12:11

[ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-04 Thread Kenneth Waegeman
Hi, Hi, we are running ceph 9.2.0. Overnight, our ceph state went to 'mds mds03 is laggy' . When I checked the logs, I saw this mds crashed with a stacktrace. I checked the other mdss, and I saw the same there. When I try to start the mds again, I get again a stacktrace and it won't come up:
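
To capture a fuller stack trace, the crashing MDS can be started by hand in the foreground with verbose logging; the daemon id is taken from the report above and the debug levels are only an example:

  ceph-mds -i mds03 -d --debug_mds=20 --debug_ms=1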

Re: [ceph-users] can not umount ceph osd partition

2016-02-04 Thread Yoann Moulin
Hello, >>> I am using 0.94.5. When I try to umount partition and fsck it I have issue: >>> root@storage003:~# stop ceph-osd id=13 >>> ceph-osd stop/waiting >>> root@storage003:~# umount /var/lib/ceph/osd/ceph-13 >>> root@storage003:~# fsck -yf /dev/sdf >>> fsck from util-linux 2.20.1 >>> e2fsck 1.
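
When umount refuses to let go (or fsck reports the device busy), it usually helps to confirm that the daemon really exited and that nothing else still holds the mount, e.g.:

  ps aux | grep ceph-osd               # confirm the osd.13 process is really gone
  fuser -vm /var/lib/ceph/osd/ceph-13  # anything still using the mount?
  mount | grep ceph-13                 # which device is actually mounted there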

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
On 03.02.2016 at 17:24, Wade Holler wrote: > AFAIK when using XFS, parallel write as you described is not enabled. Not sure I'm getting this. If I have multiple OSDs on the same NVMe (separated by different data-partitions) I have multiple parallel writes (one "stream" per OSD), or am I mistaken?

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Wade Holler
You referenced parallel writes for journal and data, which is the default for btrfs but not XFS. Now you are mentioning multiple parallel writes to the drive, which of course will occur. Also, our Dell 400 GB NVMe drives do not top out at around 5-7 sequential writes as you mentioned. That would be 5

Re: [ceph-users] can not umount ceph osd partition

2016-02-04 Thread Max A. Krasilnikov
Hello! On Thu, Feb 04, 2016 at 11:10:06AM +0100, yoann.moulin wrote: > Hello, I am using 0.94.5. When I try to umount partition and fsck it I have issue: root@storage003:~# stop ceph-osd id=13 ceph-osd stop/waiting root@storage003:~# umount /var/lib/ceph/osd/ceph-13 root

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
Hi Robert, On 04.02.2016 at 00:45, Robert LeBlanc wrote: > Once we put in our cache tier the I/O on the spindles was so low, we > just moved the journals off the SSDs onto the spindles and left the > SSD space for cache. There has been testing showing that better > performance can be achieved by
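
For the record, relocating an existing journal is a short procedure; a sketch with a placeholder OSD id (the new location goes into ceph.conf as "osd journal", or you replace the journal symlink in the OSD data dir):

  stop ceph-osd id=13
  ceph-osd -i 13 --flush-journal
  # point the OSD at the new journal location, then:
  ceph-osd -i 13 --mkjournal
  start ceph-osd id=13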

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
Hi, On 04.02.2016 at 12:59, Wade Holler wrote: > You referenced parallel writes for journal and data, which is the default > for btrfs but not XFS. Now you are mentioning multiple parallel writes > to the drive, which of course will occur. Ah, that is good to know. So if I want to create more "p
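
Carving one NVMe into several OSDs is mostly a partitioning exercise; a rough sketch with hypothetical device names (partitions created beforehand with parted or sgdisk, journals co-located as files on each data partition):

  ceph-disk prepare --fs-type xfs /dev/nvme0n1p1
  ceph-disk prepare --fs-type xfs /dev/nvme0n1p2
  ceph-disk activate /dev/nvme0n1p1
  ceph-disk activate /dev/nvme0n1p2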

Re: [ceph-users] Ceph Stats back to Calamari

2016-02-04 Thread Daniel Rolfe
Hi John, Thanks for the help, it was related to the calamari branch of diamond not working with the latest version of ceph. I've given you credit on the github issue also: https://github.com/ceph/calamari/issues/384 On Mon, Feb 1, 2016 at 11:22 PM, John Spray wrote: > Th

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Wade Holler
First on your comment of: "we found that during times where the cache pool flushed to the storage pool client IO took a severe hit" We found the same thing. http://blog.wadeit.io/ceph-cache-tier-performance-random-writes/ -- I don't claim this is a great write up, and not what a lot of folks are
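
The flushing behaviour that causes that hit is governed by per-pool thresholds; illustrative values on a placeholder cache pool name (not a recommendation):

  ceph osd pool set nvme-cache target_max_bytes 400000000000    # cap the cache at ~400 GB
  ceph osd pool set nvme-cache cache_target_dirty_ratio 0.4     # start flushing at 40% dirty
  ceph osd pool set nvme-cache cache_target_full_ratio 0.8      # start evicting at 80% full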

[ceph-users] Default CRUSH Weight Set To 0 ?

2016-02-04 Thread Kyle Harris
Hello, I have been working on a very basic cluster with 3 nodes and a single OSD per node. I am using Hammer installed on CentOS 7 (ceph-0.94.5-0.el7.x86_64) since it is the LTS version. I kept running into an issue of not getting past the status of undersized+degraded+peered. I finally discove
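
For anyone hitting the same thing: a zero CRUSH weight can be fixed after the fact, or avoided up front for small test disks; the OSD id and weight below are examples:

  ceph osd tree                        # shows the current weights
  ceph osd crush reweight osd.0 1.0    # give the OSD a non-zero weight

  # or, in ceph.conf before the OSDs are created:
  [osd]
  osd crush initial weight = 1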

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-04 Thread Yan, Zheng
> On Feb 4, 2016, at 17:00, Michael Metz-Martini | SpeedPartner GmbH > wrote: > > Hi, > > On 04.02.2016 at 09:43, Yan, Zheng wrote: >> On Thu, Feb 4, 2016 at 4:36 PM, Michael Metz-Martini | SpeedPartner >> GmbH wrote: >>> On 03.02.2016 at 15:55, Yan, Zheng wrote: > On Feb 3, 2016, at 21

Re: [ceph-users] Default CRUSH Weight Set To 0 ?

2016-02-04 Thread Burkhard Linke
Hi, On 02/04/2016 03:17 PM, Kyle Harris wrote: Hello, I have been working on a very basic cluster with 3 nodes and a single OSD per node. I am using Hammer installed on CentOS 7 (ceph-0.94.5-0.el7.x86_64) since it is the LTS version. I kept running into an issue of not getting past the sta

Re: [ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-04 Thread Gregory Farnum
On Thu, Feb 4, 2016 at 1:42 AM, Kenneth Waegeman wrote: > Hi, > > Hi, we are running ceph 9.2.0. > Overnight, our ceph state went to 'mds mds03 is laggy' . When I checked the > logs, I saw this mds crashed with a stacktrace. I checked the other mdss, > and I saw the same there. > When I try to sta

[ceph-users] hb in and hb out from pg dump

2016-02-04 Thread WRIGHT, JON R (JON R)
New ceph user, so a basic question :) I have a newly setup Ceph cluster. Seems to be working ok. But . . . I'm looking at the output of ceph pg dump, and I see that in the osdstat list at the bottom of the output, there are empty brackets [] in the 'hb out' column for all of the OSDs. It
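
For just the osdstat table (including the hb in / hb out columns) there is a narrower dump:

  ceph pg dump osds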

Re: [ceph-users] Performance issues related to scrubbing

2016-02-04 Thread Cullen King
Replies in-line: On Wed, Feb 3, 2016 at 9:54 PM, Christian Balzer wrote: > > Hello, > > On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote: > > > Hello, > > > > I've been trying to nail down a nasty performance issue related to > > scrubbing. I am mostly using radosgw with a handful of buckets
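
For context, the scrub knobs usually tuned for this kind of problem can be injected at runtime; the values below are only a starting point, not a recommendation:

  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_scrub_chunk_max 5'
  ceph tell osd.* injectargs '--osd_deep_scrub_stride 1048576'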

Re: [ceph-users] Upgrading with mon & osd on same host

2016-02-04 Thread Robert LeBlanc
Just make sure that your monitors and OSDs are on the very latest of Hammer or else your Infernalis OSDs won't activate. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Feb 4, 2016 at 12:
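
An easy way to double-check that everything really is on the latest Hammer before bringing up Infernalis OSDs (this assumes the mon id is the short hostname, which is the usual default):

  ceph tell osd.* version
  ceph daemon mon.$(hostname -s) version    # run on each monitor host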

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Zoltan Arnold Nagy
One option you left out: you could put the journals on NVMe plus use the leftover space for a writeback bcache device which caches those 5 OSDs. This is exactly what I’m testing at the moment - 4xNVMe + 20 disks per box. Or just use the NVMe itself as a bcache cache device (don’t partition it) and l
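
For completeness, the bcache variant looks roughly like this; device names are placeholders and the cache-set UUID comes from the make-bcache / bcache-super-show output:

  make-bcache -C /dev/nvme0n1                                  # whole NVMe as the cache device
  make-bcache -B /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf  # the backing disks
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach          # attach each bcacheN to the cache set
  echo writeback > /sys/block/bcache0/bcache/cache_mode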

[ceph-users] pg dump question

2016-02-04 Thread WRIGHT, JON R (JON R)
New ceph user, so a basic question I have a newly setup Ceph cluster. Seems to be working ok. But . . . I'm looking at the output of ceph pg dump, and I see that in the osdstat list at the bottom of the output, there are empty brackets [] in the 'hb out' column for all of the OSDs. It seem

[ceph-users] Feb Ceph Developer Monthly

2016-02-04 Thread Patrick McGarry
Hey cephers, For those of you that weren’t able to make the inaugural Ceph Developer Monthly (CDM) call yesterday (or for those who wish to review), the recording is now available on the Ceph YouTube channel: https://youtu.be/0gIqgxrmrJw If you are working on something related to Ceph, or would

[ceph-users] Vacation - Hernán Pinto

2016-02-04 Thread hpinto
I am on vacation until Friday, February 12. For requests regarding ongoing projects, please contact Ariel Muñoz at the email address amu...@iia.cl or by phone at +56228401000. For incidents, please contact soporte.inter...@iia.cl or

Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)

2016-02-04 Thread Zoltan Arnold Nagy
Might be totally wrong here, but it’s not layering them but replacing hdfs:// URLs with ceph:// URLs so all the mapreduce/spark/hbase/whatever is on top can use CephFS directly which is not a bad thing to do (if it works) :-) > On 02 Feb 2016, at 16:50, John Spray wrote: > > On Tue, Feb 2, 201

Re: [ceph-users] pg dump question

2016-02-04 Thread Gregory Farnum
On Thu, Feb 4, 2016 at 10:23 AM, WRIGHT, JON R (JON R) wrote: > New ceph user, so a basic question > > I have a newly setup Ceph cluster. Seems to be working ok. But . . . > > I'm looking at the output of ceph pg dump, and I see that in the osdstat > list at the bottom of the output, there are

[ceph-users] why is there heavy read traffic during object delete?

2016-02-04 Thread Stephen Lord
I setup a cephfs file system with a cache tier over an erasure coded tier as an experiment: ceph osd erasure-code-profile set raid6 k=4 m=2 ceph osd pool create cephfs-metadata 512 512 ceph osd pool set cephfs-metadata size 3 ceph osd pool create cache-data 2048 2048 ceph osd pool cre
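
The quoted setup is cut off above; the remainder of such a configuration presumably continued roughly as follows (pool names taken from later in the thread, not the poster's verbatim commands):

  ceph osd pool create cephfs-data 2048 2048 erasure raid6
  ceph osd tier add cephfs-data cache-data
  ceph osd tier cache-mode cache-data writeback
  ceph osd tier set-overlay cephfs-data cache-data
  ceph fs new cephfs cephfs-metadata cephfs-data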

Re: [ceph-users] why is there heavy read traffic during object delete?

2016-02-04 Thread Gregory Farnum
On Thu, Feb 4, 2016 at 4:37 PM, Stephen Lord wrote: > I setup a cephfs file system with a cache tier over an erasure coded tier as > an experiment: > > ceph osd erasure-code-profile set raid6 k=4 m=2 > ceph osd pool create cephfs-metadata 512 512 > ceph osd pool set cephfs-metadata size 3 >

Re: [ceph-users] why is there heavy read traffic during object delete?

2016-02-04 Thread Stephen Lord
> On Feb 4, 2016, at 6:51 PM, Gregory Farnum wrote: > > I presume we're doing reads in order to gather some object metadata > from the cephfs-data pool; and the (small) newly-created objects in > cache-data are definitely whiteout objects indicating the object no > longer exists logically. > >

Re: [ceph-users] why is there heavy read traffic during object delete?

2016-02-04 Thread Gregory Farnum
On Thu, Feb 4, 2016 at 5:07 PM, Stephen Lord wrote: > >> On Feb 4, 2016, at 6:51 PM, Gregory Farnum wrote: >> >> I presume we're doing reads in order to gather some object metadata >> from the cephfs-data pool; and the (small) newly-created objects in >> cache-data are definitely whiteout objects

[ceph-users] network connectivity test tool?

2016-02-04 Thread Nigel Williams
I thought I had book-marked a neat shell script that used the Ceph.conf definitions to do an all-to-all, all-to-one check of network connectivity for a Ceph cluster (useful for discovering problems with jumbo frames), but I've lost the bookmark and after trawling github and trying various keywords
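
A few lines of shell cover the jumbo-frame part of such a check; the hostnames are placeholders, and 8972 bytes of ICMP payload plus IP/ICMP headers makes a full 9000-byte frame with fragmentation forbidden:

  for h in ceph01 ceph02 ceph03; do
      ping -c 3 -M do -s 8972 "$h" >/dev/null && echo "$h: jumbo OK" || echo "$h: jumbo FAIL"
  done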

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-04 Thread Christian Balzer
On Wed, 3 Feb 2016 22:42:32 -0700 Robert LeBlanc wrote: > On Wed, Feb 3, 2016 at 9:00 PM, Christian Balzer wrote: > > On Wed, 3 Feb 2016 16:57:09 -0700 Robert LeBlanc wrote: > > > That's an interesting strategy, I suppose you haven't run
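
For reference, flipping a cache tier into forward mode (and back) is one command per direction, optionally with a drain of the cache in between; the pool name is a placeholder:

  ceph osd tier cache-mode hot-pool forward
  rados -p hot-pool cache-flush-evict-all
  ceph osd tier cache-mode hot-pool writeback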

Re: [ceph-users] Performance issues related to scrubbing

2016-02-04 Thread Christian Balzer
Hello, On Thu, 4 Feb 2016 08:44:25 -0800 Cullen King wrote: > Replies in-line: > > On Wed, Feb 3, 2016 at 9:54 PM, Christian Balzer > wrote: > > > > > Hello, > > > > On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote: > > > > > Hello, > > > > > > I've been trying to nail down a nasty perform

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-04 Thread Robert LeBlanc
On Thu, Feb 4, 2016 at 8:32 PM, Christian Balzer wrote: > On Wed, 3 Feb 2016 22:42:32 -0700 Robert LeBlanc wrote: > I just finished downgrading my test cluster from testing to Jessie and > then upgrading Ceph from Firefly to Hammer (that was fun fe

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-04 Thread Christian Balzer
On Thu, 4 Feb 2016 21:33:10 -0700 Robert LeBlanc wrote: > On Thu, Feb 4, 2016 at 8:32 PM, Christian Balzer wrote: > > On Wed, 3 Feb 2016 22:42:32 -0700 Robert LeBlanc wrote: > > > I just finished downgrading my test cluster from testing to J

[ceph-users] Confusing message when (re)starting OSDs (location)

2016-02-04 Thread Christian Balzer
Hello, This is the latest version of Hammer, when restarting an OSD I get this output: --- === osd.23 === create-or-move updated item name 'osd.23' weight 0 at location {host=engtest03,root=default} to crush map --- However that host and all OSDs on it reside under a different root and thankfu
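
If the message is just noise because the OSDs are placed by hand (or by a custom location hook), the init script's create-or-move call can be disabled in ceph.conf:

  [osd]
  osd crush update on start = false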

[ceph-users] [rgw][hammer] quota. how it should work?

2016-02-04 Thread Odintsov Vladislav
Hi all, I'm trying to set up a bucket quotas in hammer and I don't get any errors using S3 API and ceph documentation, when I exceed the limit (of total count of objects in a bucket, of bucket size), so I've got questions: There are two places, where I can configure quotas: user and bucket of t
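
For comparison, the admin-side way to set and enable the same limits (uid and values are made up for the example):

  radosgw-admin quota set --quota-scope=bucket --uid=johndoe --max-objects=10000 --max-size=10737418240
  radosgw-admin quota enable --quota-scope=bucket --uid=johndoe
  radosgw-admin quota set --quota-scope=user --uid=johndoe --max-size=53687091200
  radosgw-admin quota enable --quota-scope=user --uid=johndoe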