On Tue, Apr 14, 2015 at 6:24 AM, yangruifeng.09...@h3c.com
wrote:
> Hi all!
>
>
>
> I am testing rbd performance based on the kernel rbd driver. When I compared
> the results of kernel 3.13.6 with 3.18.11, I was quite confused.
>
>
>
> Look at the results: down by a third.
>
>
>
> 3.13.6 IOPS
>
On 04/14/2015 04:42 AM, Francois Lafont wrote:
> Joao Eduardo wrote:
>
>> To be more precise, it's the lowest IP:PORT combination:
>>
>> 10.0.1.2:6789 = rank 0
>> 10.0.1.2:6790 = rank 1
>> 10.0.1.3:6789 = rank 2
>>
>> and so on.
>
> Ok, so if there are 2 possible quorums, the quorum with the
> lowe
2015-03-27 18:27 GMT+01:00 Gregory Farnum :
> Ceph has per-pg and per-OSD metadata overhead. You currently have 26000 PGs,
> suitable for use on a cluster of the order of 260 OSDs. You have placed
> almost 7GB of data into it (21GB replicated) and have about 7GB of
> additional overhead.
>
> You mi
Yes you can.
You have to write your own crushmap.
At the end of the crushmap you have rulesets.
Write a ruleset that selects only the OSDs you want. Then you have to
assign the pool to that ruleset.
I have seen examples online where people wanted some pools only on SSD
disks and other pools only
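For illustration, a ruleset in a decompiled crushmap could look like the sketch below. The `ssd` root, the OSD ids, and ruleset number 4 are all made-up placeholders; your hierarchy will differ:

```
# hypothetical bucket grouping only the OSDs you want (here: two SSDs)
root ssd {
        id -10                  # any unused negative id
        alg straw
        hash 0                  # rjenkins1
        item osd.0 weight 1.000
        item osd.1 weight 1.000
}

# rule that places replicas only under that bucket
rule ssd_only {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type osd
        step emit
}
```

After compiling and injecting the modified map, the pool can be pointed at the rule with something like `ceph osd pool set <pool> crush_ruleset 4`.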
On Tue, Mar 31, 2015 at 10:44:51PM +0300, koukou73gr wrote:
> On 03/31/2015 09:23 PM, Sage Weil wrote:
> >
> >It's nothing specific to peering (or ceph). The symptom we've seen is
> >just that bytes stop passing across a TCP connection, usually when there are
> >some largish messages being sent. Th
cluster detail:
ceph version 0.94
3 host, 3 mon, 18 osd
1 ssd as journal + 6 hdd per host.
1 pool, name is rbd , pg_num is 1024, 3 replicated.
step:
1.
rbd create test1 -s 81920
rbd create test2 -s 81920
rbd create test3 -s 81920
2.
on host1, rbd map test1, get /dev/rbd0 on kernel 3.18.11 or /dev
Hi Giuseppe,
There is also this article from Sébastien Han that you might find useful:
http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/
Best regards,
Vincenzo.
2015-04-14 10:34 GMT+02:00 Saverio Proto :
> Yes you can.
> You have to write your own crushmap.
Hi All,
Am I alone in having this need?
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
ghislain.cheval...@orange.com
Sent: Friday, March 20, 2015 11:47
To: ceph-users
Subject: [ceph-users] how to compute Ceph durability?
Hi all,
I would like to compute the durability
Hi Ghislain,
Mark Kampe was working on durability models a couple of years ago, but
I'm not sure if they ever were completed or if anyone has reviewed them.
The source code is available here:
https://github.com/ceph/ceph-tools/tree/master/models/reliability
This was before EC was in Ceph, s
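For what it's worth, here is a back-of-envelope sketch (not the model linked above) of how such a computation usually starts: assume independent disk failures with a given annual failure rate (AFR), and ask how likely it is that the remaining replicas also die before recovery finishes. All numbers below are illustrative assumptions:

```python
def annual_loss_probability(num_osds, replicas, afr=0.04, recovery_hours=8.0):
    """Back-of-envelope durability estimate for a replicated pool.

    Assumes independent disk failures: data is lost when, after one disk
    fails, the remaining (replicas - 1) copies also fail before recovery
    completes. The afr and recovery_hours defaults are illustrative guesses,
    not measured values.
    """
    # chance a given disk fails during one recovery window
    p_fail_during_recovery = afr * recovery_hours / (24 * 365)
    # expected number of first failures per year across the cluster
    expected_first_failures = num_osds * afr
    # all remaining copies must fail inside the window
    p_copies_lost = p_fail_during_recovery ** (replicas - 1)
    return expected_first_failures * p_copies_lost

# e.g. 18 OSDs with 3 replicas: a very small number, i.e. many "nines"
print(annual_loss_probability(18, 3))
```

Real models (like the one in ceph-tools) also account for correlated failures, rebalance speed, and unrecoverable read errors, which can dominate in practice.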
I am fairly new to ceph and so far things are going great. That said, when
I try to replace a failed OSD, I can't seem to get it to use the same OSD
id#. I have gotten to the point where a "ceph osd create" does use the
correct id#, but when I try to use ceph-deploy to instantiate the
replacement, I
Hello,
On Tue, 14 Apr 2015 12:04:35 + ghislain.cheval...@orange.com wrote:
> Hi All,
>
> Am I alone in having this need?
>
No, but for starters, there have been a number of threads about that topic
in this ML, for example the "Failure probability with largish deployments"
one nearly 1.5 year
Hi,
I hope you are following this:
http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
After removing the osd successfully, run the following command:
# ceph-deploy --overwrite-conf osd create <host>:<disk> --zap-disk
It will give the new osd the same id as the old one.
Hi Josh,
We are using firefly 0.80.9 and see both cinder create/delete numbers slow
down compared to 0.80.7.
I don't see any specific tuning requirements and our cluster is run pretty
much on default configuration.
Do you recommend any tuning or can you please suggest some log signatures
we need to b
The C++ librados API uses STL strings so it can properly handle embedded NULLs.
You can make a backup copy of rbd_children using 'rados cp'. However, if you
don't care about the snapshots and you've already flattened all the images,
you could just delete the rbd_children object so that lib
Hmmm... I've been deleting the OSD (ceph osd rm X; ceph osd crush rm osd.X)
along with removing the auth key. This has caused data movement, but
reading your reply and thinking about it made me think it should be done
differently. I should just remove the auth key and leave the OSD in the
CRUSH map
OK, I remember now, if I don't remove the OSD from the CRUSH, ceph-disk
will get a new OSD ID and the old one will hang around as a zombie. This
will change the host/rack/etc weights causing cluster wide rebalance.
On Tue, Apr 14, 2015 at 9:31 AM, Robert LeBlanc
wrote:
> Hmmm... I've been deleti
Hi all,
I've been following this tutorial to realize my setup:
http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/
I got this CRUSH map from my test lab:
http://paste.openstack.org/show/203887/
then I modified the map and uploaded it. This is the final version:
On 04/14/2015 08:45 AM, Jason Dillaman wrote:
> The C++ librados API uses STL strings so it can properly handle embedded
> NULLs. You can make a backup copy of rbd_children using 'rados cp'.
> However, if you don't care about the snapshots and you've already flattened
> all the images, you cou
You only have 4 OSDs ?
How much RAM per server ?
I think you already have too many PGs. Check your RAM usage.
Check on Ceph wiki guidelines to dimension the correct number of PGs.
Remember that every time you create a new pool you add PGs to the
system.
Saverio
2015-04-14 17:58 GMT+02:00 Giuseppe
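The usual rule of thumb from the Ceph docs can be sketched like this (assuming the commonly cited target of ~100 PGs per OSD; check the current docs for the exact guidance):

```python
def suggested_pg_num(num_osds, replicas, pgs_per_osd=100):
    """Round (num_osds * pgs_per_osd / replicas) up to a power of two."""
    target = num_osds * pgs_per_osd / replicas
    pg_num = 1
    while pg_num < target:
        pg_num *= 2
    return pg_num

print(suggested_pg_num(4, 3))    # a small 4-OSD lab -> 256
print(suggested_pg_num(18, 3))   # the 18-OSD cluster above -> 1024
```

Note this is a per-cluster budget: every additional pool adds its own PGs on top.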
Vikhyat,
I went through the steps as I did yesterday but with one small change. I
was putting the --zap-disk option before the host:disk option. Now it
works as expected. I'll try it again with the "wrong" syntax to see if
that's really the problem but it's the only difference between working an
Hi Saverio,
I first made a test on my test staging lab where I have only 4 OSD.
On my mon servers (which run other services) I have 16GB RAM, 15GB used but
5GB cached. On the OSD servers I have 3GB RAM, 3GB used but 2GB cached.
"ceph -s" tells me nothing about PGs, shouldn't I get an error message fro
I use this to quickly check pool stats:
[root@ceph-mon01 ceph]# ceph osd dump | grep pool
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 flags hashpspool crash_replay_interval 45
stripe_width 0
pool 1 'metadata' replicated size
No error message. You just exhaust the RAM and blow up the
cluster because of too many PGs.
Saverio
2015-04-14 18:52 GMT+02:00 Giuseppe Civitella :
> Hi Saverio,
>
> I first made a test on my test staging lab where I have only 4 OSD.
> On my mon servers (which run other services) I have
You may also be interested in the cbt code that does this kind of thing
for creating cache tiers:
https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L295
The idea is that you create a parallel crush hierarchy for the SSDs and
then you can assign that to the pool used for the cache tier.
You won't get a PG warning message from ceph -s unless you have < 20 PGs per
OSD in your cluster.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bruce
McFarland
Sent: Tuesday, April 14, 2015 10:00 AM
To: Giuseppe Civitella; Saverio Proto
Cc: ceph-users@lists.ceph.com
S
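To check where a cluster sits relative to that threshold, a quick sketch (each PG counts once per replica):

```python
def pgs_per_osd(pg_num, replicas, num_osds):
    """Average number of PG copies hosted by each OSD."""
    return pg_num * replicas / num_osds

# the 1024-PG, size-3, 18-OSD pool described earlier in the thread:
# roughly 170 PG copies per OSD
print(pgs_per_osd(1024, 3, 18))
```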
I don't see any commits that would be likely to affect that between 0.80.7 and
0.80.9.
Is this after upgrading an existing cluster?
Could this be due to fs aging beneath your osds?
How are you measuring create/delete performance?
You can try increasing rbd concurrent management ops in ceph.conf
Robert LeBlanc wrote:
> Hmmm... I've been deleting the OSD (ceph osd rm X; ceph osd crush rm osd.X)
> along with removing the auth key. This has caused data movement,
Maybe, but if the flag "noout" is set, removing an OSD from the cluster doesn't
trigger any data movement at all (I have tested with Fire
On Tue, Apr 14, 2015 at 1:18 PM, Francois Lafont wrote:
> Robert LeBlanc wrote:
>
>> Hmmm... I've been deleting the OSD (ceph osd rm X; ceph osd crush rm osd.X)
>> along with removing the auth key. This has caused data movement,
>
> Maybe but if the flag "noout" is set, removing an OSD of the clus
Hi,
I have a small cluster of 7 machines. Can I just individually upgrade each of
them (using apt-get upgrade) from Firefly to the Hammer release, or is there
more to it than that?
Thanks,
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
- Original Message -
> From: "Francois Lafont"
> To: ceph-users@lists.ceph.com
> Sent: Monday, April 13, 2015 7:11:49 PM
> Subject: Re: [ceph-users] Radosgw: upgrade Firefly to Hammer, impossible to
> create bucket
>
> Hi,
>
> Yehuda Sadeh-Weinraub wrote:
>
> > The 405 in this case u
Things *mostly* work if hosts on the same network have different MTUs, at
least with TCP, because the hosts will negotiate the MSS for each
connection. UDP will still break, but large UDP packets are less common.
You don't want to run that way for very long, but there's no need for an
atomic MTU s
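The reason TCP mostly copes is that each end advertises an MSS derived from its own MTU during the handshake, and the connection uses the smaller of the two. A sketch with standard IPv4/TCP header sizes (no options):

```python
def tcp_mss(mtu, ip_header=20, tcp_header=20):
    """MSS a host advertises for a given interface MTU (IPv4, no options)."""
    return mtu - ip_header - tcp_header

# one end at MTU 1500, the other at 9000: the connection settles on 1460,
# so neither side sends a segment too big for the smaller link
print(min(tcp_mss(1500), tcp_mss(9000)))
```

UDP has no equivalent negotiation, which is why large UDP datagrams break across the mismatch.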
On 04/13/2015 07:35 PM, Yehuda Sadeh-Weinraub wrote:
>
>
> - Original Message -
>> From: "Francois Lafont"
>> To: ceph-users@lists.ceph.com
>> Sent: Monday, April 13, 2015 5:17:47 PM
>> Subject: Re: [ceph-users] Purpose of the s3gw.fcgi script?
>>
>> Hi,
>>
>> Yehuda Sadeh-Weinraub wrote
Hi,
Garg, Pankaj wrote:
> I have a small cluster of 7 machines. Can I just individually upgrade each of
> them (using apt-get upgrade) from Firefly to the Hammer release, or is there
> more to it than that?
Not exactly; it's the "individually" part that is not correct. ;)
You should indeed "apt-get upgrad
I have a ceph cluster with 125 osds with the same weight.
But I found that data is not well distributed.
df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 47929224 2066208 43405264 5% /
udev 16434372 4 16434368 1% /dev
tmpfs
Hi, all,
There is an image attached with ceph osd log output. It prints
"fault with nothing to send, going to standby". What does it mean? Thanks. :)
Best Regards,
Star Guo
Star Guo writes:
> There is an image attached with ceph osd log output. It prints
> "fault with nothing to send, going to standby". What does it mean? Thanks. :)
Logs like this are OK.
On 04/14/2015 08:58 PM, Yujian Peng wrote:
> I have a ceph cluster with 125 osds with the same weight.
> But I found that data is not well distributed.
> df
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/sda1 47929224 2066208 43405264 5% /
> udev
On Tue, 14 Apr 2015, Mark Nelson wrote:
> On 04/14/2015 08:58 PM, Yujian Peng wrote:
> > I have a ceph cluster with 125 osds with the same weight.
> > But I found that data is not well distributed.
> > df
> > Filesystem 1K-blocks Used Available Use% Mounted on
> > /dev/sda1
Thanks for your advices!
I'll increase the number of PGs to improve the balance.
The clusters are in test environment, so its a new deployment of 0.80.9. OS
on the cluster nodes is reinstalled as well, so there shouldn't be any fs
aging unless the disks are slowing down.
The perf measurement is done initiating multiple cinder create/delete
commands and tracking the volume to b
We have a tiny script which does CRUSH re-weighting based on PGs/OSD to
achieve balance across OSDs, and we run the script right after setting up the
cluster to avoid data migration after the cluster is filled up.
A couple of experiences to share:
1> As suggested, it is helpful to choose a 2-po
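The script itself isn't shown in the thread, so the following is an assumed reconstruction of the idea: scale each OSD's CRUSH weight by (average PGs) / (its PGs), so over-loaded OSDs are weighted down before any data is written:

```python
def reweight(weights, pg_counts):
    """Scale each OSD's CRUSH weight toward equal PG counts.

    weights and pg_counts are dicts keyed by OSD id. This is a sketch of
    the approach described in the thread, not the actual script.
    """
    avg = sum(pg_counts.values()) / len(pg_counts)
    return {osd: round(w * avg / pg_counts[osd], 3)
            for osd, w in weights.items()}

# an OSD holding more PGs than average gets its weight reduced,
# one holding fewer gets it increased
print(reweight({0: 1.0, 1: 1.0}, {0: 120, 1: 80}))
```

The resulting values would then be applied with `ceph osd crush reweight osd.<id> <weight>`, possibly iterating a few times since CRUSH placement is not perfectly linear in the weights.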
Hi Michal,
Really nice work on the ZFS testing.
I've been thinking about this myself from time to time; however, I wasn't sure
if ZoL was ready to use in production with Ceph.
Instead of using multiple OSDs in zfs/ceph, I would like to see running, say,
a z+2 across 8-12 3-4TB spinners and
Retried the test by setting rbd_concurrent_management_ops and
rbd-concurrent-management-ops to 20 (default 10?) and didn't see any
difference in the delete time.
Steps:
1. Create 20, 500GB volumes
2. run: rbd -n clientkey -p cindervols rm $volumeId &
3. run the rbd ls command with 1 seco