[ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Nikola Ciprich
Hello fellow ceph users and developers. A few days ago, I've updated one of our small clusters (three nodes) to kernel 4.1.15. Today I got cephfs stuck on one of the nodes. ceph -s reports: mds0: Behind on trimming (155/30). Restarting all MDS servers didn't help. All three cluster nodes are running ham
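
A minimal sketch of inspecting and raising the MDS journal trimming limit, run on the active MDS host. The "(155/30)" reading is log segments versus mds_log_max_segments; the daemon/socket name and the value 200 below are assumptions, not taken from this thread.
    # admin-socket access on the MDS host; adjust the daemon name to your setup
    ceph daemon mds.$(hostname -s) config show | grep mds_log_max_segments
    ceph daemon mds.$(hostname -s) config set mds_log_max_segments 200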

Re: [ceph-users] how to monit ceph bandwidth?

2016-02-03 Thread Gregory Farnum
On Tue, Feb 2, 2016 at 9:23 PM, yang wrote: > Hello everyone, > I have a ceph cluster (v0.94.5) with CephFS. There are several clients in the > cluster, > every client uses their own directory in CephFS with ceph-fuse. > > I want to monitor the IO bandwidth of the cluster and the clients. > r/w bandwi
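
A rough sketch of the built-in counters that can answer this; the per-client part assumes ceph-fuse exposes an admin socket on the client host, and the socket path below is an assumption.
    ceph osd pool stats                 # per-pool client read/write rates
    ceph -w                             # cluster-wide throughput, updated live
    # per-client counters via the ceph-fuse admin socket on the client host
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok perf dump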

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-03 Thread Gregory Farnum
On Tue, Feb 2, 2016 at 10:09 PM, Michael Metz-Martini | SpeedPartner GmbH wrote: > Hi, > > we're experiencing some strange issues running ceph 0.87 in our, I > think, quite large cluster (taking number of objects as a measurement). > > mdsmap e721086: 1/1/1 up {0=storagemds01=up:active}, 2 up

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-03 Thread Michael Metz-Martini | SpeedPartner GmbH
Hi, Am 03.02.2016 um 10:26 schrieb Gregory Farnum: > On Tue, Feb 2, 2016 at 10:09 PM, Michael Metz-Martini | SpeedPartner > GmbH wrote: >> Putting some higher load via cephfs on the cluster leads to messages >> like mds0: Client X failing to respond to capability release after some >> minutes. Re

[ceph-users] MDS: bad/negative dir size

2016-02-03 Thread Markus Blank-Burian
Hi, on ceph mds startup, I see the following two errors in our logfiles (using ceph 9.2.0 and the linux 4.4 cephfs kernel client): Feb 2 19:27:13 server1 ceph-mds[1809]: 2016-02-02 19:27:13.363416 7fce9effd700 -1 log_channel(cluster) log [ERR] : bad/negative dir size on 603 f(v2008 m2016-0

Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Gregory Farnum
On Wed, Feb 3, 2016 at 1:21 AM, Nikola Ciprich wrote: > Hello fellow ceph users and developers > > a few days ago, I've updated one of our small clusters > (three nodes) to kernel 4.1.15. Today I got cephfs > stuck on one of the nodes. > > ceph -s reports: > mds0: Behind on trimming (155/30) > > restarti

[ceph-users] Same SSD-Cache-Pool for multiple Spinning-Disks-Pools?

2016-02-03 Thread Udo Waechter
Hello everyone, I'm using ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43) with debian 8. I have now implemented an SSD (2 OSDs) cache tier for one of my pools. I am now wondering whether it is possible to use the same SSD pool for multiple pools as a cache tier? Or do I need to creat

Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Nikola Ciprich
Hello Gregory, in the meantime, I managed to break it further :( I tried getting rid of active+remapped pgs and got some undersized ones instead.. not sure whether this can be related.. anyways here's the status: ceph -s cluster ff21618e-5aea-4cfe-83b6-a0d2d5b4052a health HEALTH_WARN

Re: [ceph-users] Same SSD-Cache-Pool for multiple Spinning-Disks-Pools?

2016-02-03 Thread Ferhat Ozkasgarli
Hello Udo, You cannot use one cache pool for multiple back-end pools. You must create a new cache pool for every back-end pool. On Wed, Feb 3, 2016 at 12:32 PM, Udo Waechter wrote: > Hello everyone, > > I'm using ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43) > with debian 8 >
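
A minimal sketch of attaching one dedicated cache pool per backing pool, following the cache-tiering docs referenced later in this digest; the pool names are placeholders.
    # repeat for each backing pool, each with its own cache pool
    ceph osd tier add images images-cache
    ceph osd tier cache-mode images-cache writeback
    ceph osd tier set-overlay images images-cache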

[ceph-users] Upgrading with mon & osd on same host

2016-02-03 Thread Udo Waechter
Hi, I would like to upgrade my ceph cluster from hammer to infernalis. I'm reading the upgrade notes, that I need to upgrade & restart the monitors first, then the OSDs. Now, my cluster has OSDs and Mons on the same hosts (I know that should not be the case, but it is :( ). I'm just wondering:

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-03 Thread Yan, Zheng
> On Feb 3, 2016, at 17:39, Michael Metz-Martini | SpeedPartner GmbH > wrote: > > Hi, > > Am 03.02.2016 um 10:26 schrieb Gregory Farnum: >> On Tue, Feb 2, 2016 at 10:09 PM, Michael Metz-Martini | SpeedPartner >> GmbH wrote: >>> Putting some higher load via cephfs on the cluster leads to messa

Re: [ceph-users] MDS: bad/negative dir size

2016-02-03 Thread Yan, Zheng
> On Feb 3, 2016, at 17:51, Markus Blank-Burian wrote: > > Hi, > > on ceph mds startup, I see the following two errors in our logfiles > (using ceph 9.2.0 and the linux 4.4 cephfs kernel client): > > Feb 2 19:27:13 server1 ceph-mds[1809]: 2016-02-02 19:27:13.363416 > 7fce9effd700 -1 log_c

[ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Sascha Vogt
Hi all, we recently tried adding a cache tier to our ceph cluster. We had 5 spinning disks per host with a single journal NVMe disk hosting the 5 journals (1 OSD per spinning disk). We have 4 hosts up to now, so overall 4 NVMes hosting 20 journals for 20 spinning disks. As we had some space lef

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Wade Holler
Hi Sascha, What is your file system type, XFS or Btrfs ? Thanks Wade On Wed, Feb 3, 2016 at 7:01 AM Sascha Vogt wrote: > Hi all, > > we recently tried adding a cache tier to our ceph cluster. We had 5 > spinning disks per hosts with a single journal NVMe disk, hosting the 5 > journals (1 OSD pe

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Sascha Vogt
Hi Wade, Am 03.02.2016 um 13:26 schrieb Wade Holler: > What is your file system type, XFS or Btrfs ? We're using XFS, though for the new cache tier we could also switch to btrfs if that suggests a significant performance improvement... Greetings -Sascha-

[ceph-users] hammer - remapped / undersized pgs + related questions

2016-02-03 Thread Nikola Ciprich
Hello, I'd like to ask a few rebalancing and related questions. On one of my clusters, I got a nearfull warning for one of the OSDs. Apart from that, the cluster health was perfectly OK, all PGs active+clean. Therefore I used reweight-by-utilization, which changed weights a bit, causing about 30% of data

[ceph-users] Monthly Dev Meeting Today

2016-02-03 Thread Patrick McGarry
Hey cephers, Just a reminder that the monthly replacement for CDS (Ceph Developer Summit) is today at 12:30p EST. This will be a short meeting to discuss all pending work going on with Ceph, so if you have anything to share or discuss, please drop a very brief summary in the wiki: http://tracker.

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-03 Thread Michael Metz-Martini | SpeedPartner GmbH
Hi, Am 03.02.2016 um 12:11 schrieb Yan, Zheng: >> On Feb 3, 2016, at 17:39, Michael Metz-Martini | SpeedPartner GmbH >> wrote: >> Am 03.02.2016 um 10:26 schrieb Gregory Farnum: >>> On Tue, Feb 2, 2016 at 10:09 PM, Michael Metz-Martini | SpeedPartner >>> GmbH wrote: >>> Or maybe your kernels are

[ceph-users] HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread M Ranga Swami Reddy
Hi, I am using ceph for my storage cluster and its health shows a WARN state with too few pgs. == health HEALTH_WARN pool volumes has too few pgs == The volumes pool has 4096 pgs -- ceph osd pool get volumes pg_num pg_num: 4096 --- and >ceph df NAME ID USED %USED

Re: [ceph-users] mds0: Client X failing to respond to capability release

2016-02-03 Thread Yan, Zheng
> On Feb 3, 2016, at 21:50, Michael Metz-Martini | SpeedPartner GmbH > wrote: > > Hi, > > Am 03.02.2016 um 12:11 schrieb Yan, Zheng: >>> On Feb 3, 2016, at 17:39, Michael Metz-Martini | SpeedPartner GmbH >>> wrote: >>> Am 03.02.2016 um 10:26 schrieb Gregory Farnum: On Tue, Feb 2, 2016 a

[ceph-users] Adding Cache Tier breaks rbd access

2016-02-03 Thread Udo Waechter
Hello, I am experimenting with adding a SSD-Cache tier to my existing Ceph 0.94.5 Cluster. Currently I have: 10 OSDs on 5 hosts (spinning disks). 2 OSDs on 1 host (SSDs) I have followed the cache tier docs: http://docs.ceph.com/docs/master/rados/operations/cache-tiering/ 1st I created a new (sp

Re: [ceph-users] Adding Cache Tier breaks rbd access

2016-02-03 Thread Mihai Gheorghe
Did you set read/write permissions on the cache pool? For example, in OpenStack I need to set read/write permissions for cinder to be able to use the cache pool. On 3 Feb 2016 17:25, "Udo Waechter" wrote: > Hello, > > I am experimenting with adding a SSD-Cache tier to my existing Ceph > 0.94.5 Clu

Re: [ceph-users] Adding Cache Tier breaks rbd access

2016-02-03 Thread Udo Waechter
Ah, I might have found the solution: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg26441.html Add access to the Cache-tier for libvirt. I'll try that later. Talking about it sometimes really helps ;) Thanks, udo. On 02/03/2016 04:25 PM, Udo Waechter wrote: > Hello, > > I am exper
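
A sketch of what that fix likely looks like: the libvirt/cinder client key needs rwx on the cache pool as well as on the backing pool. The client name and pool names below are assumptions.
    ceph auth caps client.libvirt mon 'allow r' \
        osd 'allow rwx pool=vms, allow rwx pool=vms-cache'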

[ceph-users] can not umount ceph osd partition

2016-02-03 Thread Max A. Krasilnikov
Hello! I am using 0.94.5. When I try to umount a partition and fsck it, I have an issue: root@storage003:~# stop ceph-osd id=13 ceph-osd stop/waiting root@storage003:~# umount /var/lib/ceph/osd/ceph-13 root@storage003:~# fsck -yf /dev/sdf fsck from util-linux 2.20.1 e2fsck 1.42.9 (4-Feb-2014) /dev/sdf i
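
A quick sketch of checking what still holds the mount point or device busy before fsck; the OSD id and device come from the message above.
    fuser -vm /var/lib/ceph/osd/ceph-13     # processes still using the mount point
    lsof /dev/sdf                           # open handles on the device itself
    ps aux | grep '[c]eph-osd'              # confirm osd.13 has really stopped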

Re: [ceph-users] can not umount ceph osd partition

2016-02-03 Thread Yoann Moulin
Hello, > I am using 0.94.5. When I try to umount partition and fsck it I have issue: > root@storage003:~# stop ceph-osd id=13 > ceph-osd stop/waiting > root@storage003:~# umount /var/lib/ceph/osd/ceph-13 > root@storage003:~# fsck -yf /dev/sdf > fsck from util-linux 2.20.1 > e2fsck 1.42.9 (4-Feb-20

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Wade Holler
AFAIK when using XFS, parallel writes as you described are not enabled. Regardless, the NVMe drives are so fast that it shouldn't matter much whether you use a partitioned journal or another layout. What I would be more interested in is your replication size on the cache pool. This might sound crazy but

Re: [ceph-users] can not umount ceph osd partition

2016-02-03 Thread Max A. Krasilnikov
Hello! On Wed, Feb 03, 2016 at 04:59:30PM +0100, yoann.moulin wrote: > Hello, >> I am using 0.94.5. When I try to umount a partition and fsck it, I have an issue: >> root@storage003:~# stop ceph-osd id=13 >> ceph-osd stop/waiting >> root@storage003:~# umount /var/lib/ceph/osd/ceph-13 >> root@s

Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)

2016-02-03 Thread Noah Watkins
Hi Jose, I believe what you are referring to is using Hadoop over Ceph via the VFS implementation of the Ceph client vs the user-space libcephfs client library. The current Hadoop plugin for Ceph uses the client library. You could run Hadoop over Ceph using a local Ceph mount point, but it would t

[ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Mihai Gheorghe
Hi, Is there a built-in setting in ceph that would set the cache pool from writeback to forward state automatically in case of an OSD failure in the pool? Let's say the size of the cache pool is 2. If an OSD fails, ceph blocks writes to the pool, making the VMs that use this pool inaccessible. B

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Nick Fisk
I think this would be better done outside of Ceph. It should be quite simple for whatever monitoring software you are using to pick up the disk failure and set the cache_target_dirty_ratio to a very low value or change the actual caching mode. Doing it in Ceph would be complicated as you are the
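
A minimal sketch of the two knobs described above, assuming a cache pool named hot-pool (a placeholder); in Hammer the dirty-ratio setting is the per-pool cache_target_dirty_ratio.
    # flush dirty objects aggressively once a failure is detected ...
    ceph osd pool set hot-pool cache_target_dirty_ratio 0.01
    # ... or drop the tier into forward mode entirely
    ceph osd tier cache-mode hot-pool forward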

Re: [ceph-users] MDS: bad/negative dir size

2016-02-03 Thread Gregory Farnum
On Wed, Feb 3, 2016 at 3:16 AM, Yan, Zheng wrote: > >> On Feb 3, 2016, at 17:51, Markus Blank-Burian wrote: >> >> Hi, >> >> on ceph mds startup, I see the following two errors in the our logfiles >> (using ceph 9.2.0 and linux 4.4 cephfs kernel client): >> >> Feb 2 19:27:13 server1 ceph-mds[180

Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Gregory Farnum
On Wed, Feb 3, 2016 at 2:32 AM, Nikola Ciprich wrote: > Hello Gregory, > > in the meantime, I managed to break it further :( > > I tried getting rid of active+remapped pgs and got some undersized > instead.. nto sure whether this can be related.. > > anyways here's the status: > > ceph -s > cl

[ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread M Ranga Swami Reddy
Hi, I am using ceph for my storage cluster and its health shows a WARN state with too few pgs. == health HEALTH_WARN pool volumes has too few pgs == The volumes pool has 4096 pgs -- ceph osd pool get volumes pg_num pg_num: 4096 --- and >ceph df NAME ID USED %USED

[ceph-users] e9 handle_probe ignoring

2016-02-03 Thread Oliver Dzombic
Hi, after the cluster changed its cluster id, because we reissued a ceph-deploy by mistake, we had to change everything to the new id. Now we see on the nodes: 2016-02-03 19:59:51.729969 7f11ef540700 0 mon.ceph2@1(peon) e9 handle_probe ignoring fsid != What does this mean? Thank you! -

Re: [ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread Ferhat Ozkasgarli
As the message states, you must increase the placement group number for the pool, because 108T of data requires a larger pg number. On Feb 3, 2016 8:09 PM, "M Ranga Swami Reddy" wrote: > Hi, > > I am using ceph for my storage cluster and health shows as WARN state > with too few pgs. > > == > health HEALTH_WA

Re: [ceph-users] Ceph Tech Talk - High-Performance Production Databases on Ceph

2016-02-03 Thread Josef Johansson
I was fascinated as well. This is how it should be done ☺ We are in the middle of ordering and I saw the notice that they use single socket systems for the OSDs due to latency issues. I have only seen dual socket systems on the OSD setups here. Is this something you should do with new SSD clusters

Re: [ceph-users] Ceph Tech Talk - High-Performance Production Databases on Ceph

2016-02-03 Thread Mark Nelson
Basically sticking to a single socket lets you avoid a lot of NUMA issues that can crop up on dual socket machines so long as you still have enough overall CPU power. Ben England and Joe Mario here at Red Hat have been looking into some of these issues using C2C to observe things like remote c

[ceph-users] placement group lost by using force_create_pg ?

2016-02-03 Thread Nikola Ciprich
Hello cephers, I think I've got into a pretty bad situation :( I mistakenly ran force_create_pg on one placement group in a live cluster. Now it's stuck in the creating state. Now I suppose the placement group content is lost, right? Is there a way to recover it? Or at least a way to find out which objects
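
There is no single command for this, but as a rough sketch: ceph osd map reports which PG a given object name maps to, so for data whose object names can be enumerated (e.g. the rbd_data.<prefix>.<index> blocks of an RBD image) one can check which objects fell into the recreated PG. The pool name, image name and PG id below are placeholders.
    ceph pg 2.3f query                                   # state of the recreated PG
    rbd info rbd/myimage | grep block_name_prefix        # object-name prefix for this image
    ceph osd map rbd rbd_data.102a6b8b4567.0000000000000000   # does this object map to 2.3f?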

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Robert LeBlanc
Once we put in our cache tier the I/O on the spindles was so low, we just moved the journals off the SSDs onto the spindles and left the SSD space for cache. There has been testing showing that better performance can be achieved by putting more OSDs

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Robert LeBlanc
My experience with Hammer is showing that setting the pool to forward mode is not evicting objects, nor do I think it is flushing objects. We have had our pool in forward mode for weeks now and we still have almost the same amount of I/O to it. There

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Mihai Gheorghe
Does the cache pool flush when setting a min value ratio if the pool doesn't meet min_size? I mean, does ceph block only writes when an OSD fails in a pool of size 2, or does it block reads too? Because on paper it looks good on a small cache pool, in case of OSD failure, to set the lowest ratio for flu

[ceph-users] Performance issues related to scrubbing

2016-02-03 Thread Cullen King
Hello, I've been trying to nail down a nasty performance issue related to scrubbing. I am mostly using radosgw with a handful of buckets containing millions of various sized objects. When ceph scrubs, both regular and deep, radosgw blocks on external requests, and my cluster has a bunch of request

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Christian Balzer
On Wed, 3 Feb 2016 16:57:09 -0700 Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > My experience with Hammer is showing that setting the pool to forward > mode is not evicting objects, nor do I think it is flushing objects. > Same here (with Firefly). > We have had o

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Robert LeBlanc
I think it depends. If there are no writes, then there probably won't be any blocking if there are less than min_size OSDs to service a PG. In an RBD workload, that is highly unlikely. If there is no blocking then setting the full_ratio to near zero

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Robert LeBlanc
On Wed, Feb 3, 2016 at 9:00 PM, Christian Balzer wrote: > On Wed, 3 Feb 2016 16:57:09 -0700 Robert LeBlanc wrote: > That's an interesting strategy, I suppose you haven't run into the issue I > wrote about 2 days ago when switching to forward while

Re: [ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread M Ranga Swami Reddy
Current pg_num: 4096. As per the PG num formula, no. of OSDs * 100 / pool size -> 184 * 100 / 3 = 6133, so I can increase to 8192. Does this solve the problem? Thanks Swami On Thu, Feb 4, 2016 at 2:14 AM, Ferhat Ozkasgarli wrote: > As the message states, you must increase the placement group number for the pool
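
A minimal sketch of the increase itself, assuming the pool is named volumes as above; pgp_num has to follow pg_num, and the jump can be done in smaller steps to spread the rebalancing out.
    ceph osd pool set volumes pg_num 8192
    ceph osd pool set volumes pgp_num 8192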

Re: [ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread Somnath Roy
You can increase it, but that will trigger rebalancing, and based on the amount of data it will take some time before the cluster comes back into a clean state. Client IO performance will be affected during this. BTW this is not really an error, it is a warning because performance on that pool will be

Re: [ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread M Ranga Swami Reddy
Yes, if I change the pg_num on the current pool, a cluster rebalance starts.. Alternatively, I plan to do as below: 1. Create a new pool with the max possible pg_num (as per the pg calc). 2. Copy the current pool to the new pool (during this step, IO should be stopped). 3. Rename the current pool current.old an
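
A sketch of those steps with rados cppool, under the assumption that client IO to the pool really is stopped for the whole copy (cppool gives no consistency guarantees if writes continue).
    ceph osd pool create volumes.new 8192 8192
    rados cppool volumes volumes.new
    ceph osd pool rename volumes volumes.old
    ceph osd pool rename volumes.new volumes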

Re: [ceph-users] Performance issues related to scrubbing

2016-02-03 Thread Christian Balzer
Hello, On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote: > Hello, > > I've been trying to nail down a nasty performance issue related to > scrubbing. I am mostly using radosgw with a handful of buckets containing > millions of various sized objects. When ceph scrubs, both regular and > deep,

Re: [ceph-users] Upgrading with mon & osd on same host

2016-02-03 Thread Mika c
Hi, > Do the packages (Debian) restart the services upon upgrade? No, you restart them yourself. > Do I need to actually stop all OSDs, or can I upgrade them one by one? No need to stop them all. Just upgrade the OSD servers one by one and restart each OSD daemon. Best wishes, Mika 2016-02-03 18:
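
A rough sketch of the per-host sequence, assuming a Debian box still using the Hammer sysvinit scripts; the daemon ids and init commands are assumptions, and Infernalis itself switches to systemd and a dedicated ceph user, so check the release notes before the OSD step.
    apt-get update && apt-get install ceph      # upgrade packages on one host at a time
    service ceph restart mon.$(hostname -s)     # monitors first, on every mon host
    service ceph restart osd.13                 # then each OSD on this host, one by one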

Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Nikola Ciprich
> Yeah, these inactive PGs are basically guaranteed to be the cause of > the problem. There are lots of threads about getting PGs healthy > again; you should dig around the archives and the documentation > troubleshooting page(s). :) > -Greg Hello Gregory, well, I wouldn't doubt it, but when the

Re: [ceph-users] hammer-0.94.5 + kernel-4.1.15 - cephfs stuck

2016-02-03 Thread Gregory Farnum
The quick and dirty cleanup is to restart the OSDs hosting those PGs. They might have gotten some stuck ops which didn't get woken up; a few bugs like that have gone by and are resolved in various stable branches (I'm not sure what release binaries they're in). On Wed, Feb 3, 2016 at 11:32 PM, Nik
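
A minimal sketch of that cleanup, assuming osd.13 hosts one of the stuck PGs (the id is a placeholder).
    ceph pg dump_stuck inactive                 # list stuck PGs and their acting OSDs
    ceph health detail                          # cross-check which OSDs are involved
    service ceph restart osd.13                 # restart the hosting OSDs one at a time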