Re: [ceph-users] How does crush select different osds using hash(pg) in different iterations

2015-03-23 Thread Gregory Farnum
On Sat, Mar 21, 2015 at 10:46 AM, shylesh kumar wrote: > Hi , > > I was going through this simplified crush algorithm given in ceph website. > > def crush(pg): >all_osds = ['osd.0', 'osd.1', 'osd.2', ...] >result = [] ># size is the number of copies; primary+replicas >while len(res
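
The snippet being discussed is the toy placement loop from the Ceph architecture docs. Below is a minimal runnable sketch of the idea, not the real CRUSH code: the key detail is that the hash input includes the attempt (replica) number, not just the pg, so each iteration can land on a different OSD.

    import hashlib

    def crush(pg, all_osds, size):
        # size is the number of copies: primary + replicas
        result = []
        attempt = 0
        while len(result) < size:
            # salt the hash with the attempt number so retries differ
            h = int(hashlib.md5(("%s:%d" % (pg, attempt)).encode()).hexdigest(), 16)
            chosen = all_osds[h % len(all_osds)]
            attempt += 1
            if chosen in result:
                continue  # an OSD may only be picked once
            result.append(chosen)
        return result

    print(crush("1.2f", ["osd.0", "osd.1", "osd.2", "osd.3"], 3))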

Re: [ceph-users] Can't Start OSD

2015-03-23 Thread Gregory Farnum
On Sun, Mar 22, 2015 at 11:22 AM, Somnath Roy wrote: > You should be having replicated copies on other OSDs (disks), so, no need to > worry about the data loss. You add a new drive and follow the steps in the > following link (either 1 or 2) Except that's not the case if you only had one copy o

Re: [ceph-users] Ceph in Production: best practice to monitor OSD up/down status

2015-03-23 Thread Gregory Farnum
"mon osd down out interval" time has elapsed from a failure. :) -Greg > > Thanks to all for helping ! > > Saverio > > > > 2015-03-23 14:58 GMT+01:00 Gregory Farnum : >> On Sun, Mar 22, 2015 at 2:55 AM, Saverio Proto wrote: >>> Hello, >>>
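
For reference, the interval Greg mentions is a monitor option; a sketch of how it is typically set (the value here is only an example):

    # ceph.conf on the monitor hosts
    [mon]
    mon osd down out interval = 600    # seconds a down OSD waits before being marked out

    # or injected at runtime
    ceph tell mon.* injectargs '--mon-osd-down-out-interval 600'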

Re: [ceph-users] More writes on filestore than on journal ?

2015-03-23 Thread Gregory Farnum
On Mon, Mar 23, 2015 at 6:21 AM, Olivier Bonvalet wrote: > Hi, > > I'm still trying to find why there are many more write operations on > filestore since Emperor/Firefly than with Dumpling. Do you have any history around this? It doesn't sound familiar, although I bet it's because of the WBThrottl

Re: [ceph-users] Issue with free Inodes

2015-03-24 Thread Gregory Farnum
On Tue, Mar 24, 2015 at 12:13 AM, Christian Balzer wrote: > On Tue, 24 Mar 2015 09:41:04 +0300 Kamil Kuramshin wrote: > >> Yes I read it and do not understand what you mean when you say *verify >> this*? All 3335808 inodes are definitely files and directories created by >> the ceph OSD process: >> > What

Re: [ceph-users] Does crushtool --test --simulate do what cluster should do?

2015-03-24 Thread Gregory Farnum
On Tue, Mar 24, 2015 at 10:48 AM, Robert LeBlanc wrote: > I'm not sure why crushtool --test --simulate doesn't match what the > cluster actually does, but the cluster seems to be executing the rules > even though crushtool doesn't. Just kind of stinks that you have to > test the rules on actual da
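
A typical way to exercise a rule offline with crushtool, for comparison with what the cluster does (rule number and replica count are examples):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings
    crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-statistics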

Re: [ceph-users] error creating image in rbd-erasure-pool

2015-03-24 Thread Gregory Farnum
On Tue, Mar 24, 2015 at 12:09 PM, Brendan Moloney wrote: > >> Hi Loic and Markus, >> By the way, Inktank do not support snapshot of a pool with cache tiering : >> >>* >> https://download.inktank.com/docs/ICE%201.2%20-%20Cache%20and%20Erasure%20Coding%20FAQ.pdf > > Hi, > > You seem to be talki

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Gregory Farnum
On Wed, Mar 25, 2015 at 1:20 AM, Udo Lembke wrote: > Hi, > due to two more hosts (now 7 storage nodes) I want to create a new > ec-pool and get a strange effect: > > ceph@admin:~$ ceph health detail > HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean; 2 > pgs stuck undersized

Re: [ceph-users] ceph -w: Understanding "MB data" versus "MB used"

2015-03-25 Thread Gregory Farnum
On Wed, Mar 25, 2015 at 1:24 AM, Saverio Proto wrote: > Hello there, > > I started to push data into my ceph cluster. There is something I > cannot understand in the output of ceph -w. > > When I run ceph -w I get this kind of output: > > 2015-03-25 09:11:36.785909 mon.0 [INF] pgmap v278788: 2605

Re: [ceph-users] error creating image in rbd-erasure-pool

2015-03-25 Thread Gregory Farnum
Yes. On Wed, Mar 25, 2015 at 4:13 AM, Frédéric Nass wrote: > Hi Greg, > > Thank you for this clarification. It helps a lot. > > Does this "can't think of any issues" apply to both rbd and pool snapshots ? > > Frederic. > > > > On Tue, Mar 24, 2015 at 12:09 PM, Bre

Re: [ceph-users] how do I destroy cephfs? (interested in cephfs + tiering + erasure coding)

2015-03-25 Thread Gregory Farnum
On Wed, Mar 25, 2015 at 10:36 AM, Jake Grimmett wrote: > Dear All, > > Please forgive this post if it's naive, I'm trying to familiarise myself > with cephfs! > > I'm using Scientific Linux 6.6. with Ceph 0.87.1 > > My first steps with cephfs using a replicated pool worked OK. > > Now trying now t

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Gregory Farnum
You shouldn't rely on "rados ls" when working with cache pools. It doesn't behave properly and is a silly operation to run against a pool of any size even when it does. :) More specifically, "rados ls" is invoking the "pgls" operation. Normal read/write ops will go query the backing store for obje

Re: [ceph-users] All pools have size=3 but "MB data" and "MB used" ratio is 1 to 5

2015-03-26 Thread Gregory Farnum
le on the same storage disk? If so, *that* is why the "data used" is large. I promise that your 5:1 ratio won't persist as you write more than 2GB of data into the cluster. -Greg > > Thank you > > Saverio > > > > 2015-03-25 14:55 GMT+01:00 Gregory Farnum : >>

Re: [ceph-users] ceph falsely reports clock skew?

2015-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell wrote: > I have a virtual test environment of an admin node and 3 mon + osd nodes, > built by just following the quick start guide. It seems to work OK but ceph > is constantly complaining about clock skew much greater than reality. > Clocksource on the

Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS

2015-03-26 Thread Gregory Farnum
; I have used the core-site.xml configurations as mentioned in > http://ceph.com/docs/master/cephfs/hadoop/ > Please tell me how can this problem be solved? > > Regards, > > Ridwan Rashid Noel > > Doctoral Student, > Department of Computer Science, > University of Texas a

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Gregory Farnum
-2" > fill the variable with name_vm-409-disk-2 and not with the content of the > file... > > Are there other tools for the rbd_directory? > > regards > > Udo > > Am 26.03.2015 15:03, schrieb Gregory Farnum: >> You shouldn't rely on "rados ls"

Re: [ceph-users] how do I destroy cephfs? (interested in cephfs + tiering + erasure coding)

2015-03-26 Thread Gregory Farnum
a newly-created small pool, but was never > able to actually remove cephfs altogether. > > On Thu, Mar 26, 2015 at 12:45 PM, Jake Grimmett > wrote: >> >> On 03/25/2015 05:44 PM, Gregory Farnum wrote: >>> >>> On Wed, Mar 25, 2015 at 10:36 AM, Jake Grimmett

Re: [ceph-users] All client writes block when 2 of 3 OSDs down

2015-03-26 Thread Gregory Farnum
Has the OSD actually been detected as down yet? You'll also need to set that min size on your existing pools ("ceph osd pool set min_size 1" or similar) to change their behavior; the config option only takes effect for newly-created pools. (Thus the "default".) On Thu, Mar 26, 2015 at 1:29 PM, L
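
A sketch of applying that to existing pools (pool names come from rados lspools; min_size 1 is only appropriate for small test setups like the one in this thread):

    for pool in $(rados lspools); do
        ceph osd pool set "$pool" min_size 1
    done
    ceph osd pool get rbd min_size    # verify; 'rbd' is an example pool name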

Re: [ceph-users] All client writes block when 2 of 3 OSDs down

2015-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2015 at 2:30 PM, Lee Revell wrote: > On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum wrote: >> >> Has the OSD actually been detected as down yet? >> > > I believe it has, however I can't directly check because "ceph health" > starts

Re: [ceph-users] Migrating objects from one pool to another?

2015-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen wrote: > >> On 26/03/2015, at 21.07, J-P Methot wrote: >> >> That's a great idea. I know I can setup cinder (the openstack volume >> manager) as a multi-backend manager and migrate from one backend to the >> other, each backend linking to diff

Re: [ceph-users] Migrating objects from one pool to another?

2015-03-26 Thread Gregory Farnum
t; > On 26/03/2015, at 23.01, Gregory Farnum wrote: > > On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen wrote: > > > On 26/03/2015, at 21.07, J-P Methot wrote: > > That's a great idea. I know I can setup cinder (the openstack volume > manager) as a multi-b

Re: [ceph-users] All client writes block when 2 of 3 OSDs down

2015-03-26 Thread Gregory Farnum
are showing you. -Greg > > Thanks & Regards > Somnath > > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Gregory Farnum > Sent: Thursday, March 26, 2015 2:40 PM > To: Lee Revell > Cc: ceph-users@lists.ceph.com

Re: [ceph-users] All client writes block when 2 of 3 OSDs down

2015-03-26 Thread Gregory Farnum
tradeoff between consistency and availability. The monitors are a Paxos cluster and Ceph is a 100% consistent system. -Greg > > Thanks & Regards > Somnath > > -Original Message- > From: Gregory Farnum [mailto:g...@gregs42.com] > Sent: Thursday, March 26, 2015 3:29 P

Re: [ceph-users] All client writes block when 2 of 3 OSDs down

2015-03-26 Thread Gregory Farnum
why not in case of point 3 ? > If this is the way Paxos works, should we say that in a cluster with say 3 > monitors it should be able to tolerate only one mon failure ? Yes, that is the case. > > Let me know if I am missing a point here. > > Thanks & Regards > Somnath &g

Re: [ceph-users] ceph -s slow return result

2015-03-27 Thread Gregory Farnum
Are all your monitors running? Usually a temporary hang means that the Ceph client tries to reach a monitor that isn't up, then times out and contacts a different one. I have also seen it just be slow if the monitors are processing so many updates that they're behind, but that's usually on a very

Re: [ceph-users] All pools have size=3 but "MB data" and "MB used" ratio is 1 to 5

2015-03-27 Thread Gregory Farnum
Ceph has per-pg and per-OSD metadata overhead. You currently have 26000 PGs, suitable for use on a cluster of the order of 260 OSDs. You have placed almost 7GB of data into it (21GB replicated) and have about 7GB of additional overhead. You might try putting a suitable amount of data into the clus
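
The sizing heuristic behind that comment, roughly as given in the placement-group docs (the numbers below are illustrative):

    # total PGs across all pools ~ (number of OSDs * 100) / replica size,
    # rounded up to a power of two
    #   e.g. 12 OSDs, size 3  ->  12 * 100 / 3 = 400  ->  512 PGs total
    ceph osd pool get rbd pg_num    # check what a given pool currently uses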

Re: [ceph-users] Snapshots and fstrim with cache tiers ?

2015-03-27 Thread Gregory Farnum
On Wed, Mar 25, 2015 at 3:14 AM, Frédéric Nass wrote: > Hello, > > > I have a few questions regarding snapshots and fstrim with cache tiers. > > > In the "cache tier and erasure coding FAQ" related to ICE 1.2 (based on > Firefly), Inktank says "Snapshots are not supported in conjunction with > cac

Re: [ceph-users] CephFS Slow writes with 1MB files

2015-03-27 Thread Gregory Farnum
So this is exactly the same test you ran previously, but now it's on faster hardware and the test is slower? Do you have more data in the test cluster? One obvious possibility is that previously you were working entirely in the MDS' cache, but now you've got more dentries and so it's kicking data

Re: [ceph-users] CephFS Slow writes with 1MB files

2015-03-27 Thread Gregory Farnum
et to explore a bit what client sessions there are and what they have permissions on and check; otherwise you'll have to figure it out from the client side. -Greg > > Thanks for the input! > > > On Fri, Mar 27, 2015 at 3:04 PM, Gregory Farnum wrote: >> So this is exactly

Re: [ceph-users] CephFS Slow writes with 1MB files

2015-03-30 Thread Gregory Farnum
g on the Ceph version you're running you can also examine the mds perfcounters (http://ceph.com/docs/master/dev/perf_counters/) and the op history (dump_ops_in_flight etc) and look for any operations which are noticeably slow. -Greg > > On Fri, Mar 27, 2015 at 4:50 PM, Gregory Farnum w
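
The counters and op history Greg refers to are reachable through the MDS admin socket; a sketch (the daemon name 'a' is an example):

    ceph daemon mds.a perf dump              # per-subsystem counters
    ceph daemon mds.a dump_ops_in_flight     # operations currently being processed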

Re: [ceph-users] SSD Journaling

2015-03-30 Thread Gregory Farnum
On Mon, Mar 30, 2015 at 1:01 PM, Garg, Pankaj wrote: > Hi, > > I’m benchmarking my small cluster with HDDs vs HDDs with SSD Journaling. I > am using both RADOS bench and Block device (using fio) for testing. > > I am seeing significant Write performance improvements, as expected. I am > however se
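
For comparison, the two kinds of tests being run look roughly like this (pool and image names are examples; fio must be built with the rbd engine):

    # raw object-store throughput
    rados bench -p testpool 60 write --no-cleanup
    rados bench -p testpool 60 seq            # sequential reads of the data just written

    # block-level test against an RBD image
    fio --name=rbdtest --ioengine=rbd --clientname=admin --pool=testpool --rbdname=test1 \
        --rw=write --bs=4M --iodepth=16 --direct=1 --size=4G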

Re: [ceph-users] Is it possible to change the MDS node after its been created

2015-03-30 Thread Gregory Farnum
On Mon, Mar 30, 2015 at 1:51 PM, Steve Hindle wrote: > > Hi! > > I mistakenly created my MDS node on the 'wrong' server a few months back. > Now I realized I placed it on a machine lacking IPMI and would like to move > it to another node in my cluster. > > Is it possible to non-destructively m

Re: [ceph-users] Is it possible to change the MDS node after its been created

2015-03-30 Thread Gregory Farnum
On Mon, Mar 30, 2015 at 3:15 PM, Francois Lafont wrote: > Hi, > > Gregory Farnum wrote: > >> The MDS doesn't have any data tied to the machine you're running it >> on. You can either create an entirely new one on a different machine, >> or simply copy

Re: [ceph-users] One host failure bring down the whole cluster

2015-03-30 Thread Gregory Farnum
On Mon, Mar 30, 2015 at 8:02 PM, Lindsay Mathieson wrote: > On Tue, 31 Mar 2015 02:42:27 AM Kai KH Huang wrote: >> Hi, all >> I have a two-node Ceph cluster, and both are monitor and osd. When >> they're both up, osd are all up and in, everything is fine... almost: > > > > Two things. > > 1 -

Re: [ceph-users] One of three monitors can not be started

2015-03-31 Thread Gregory Farnum
On Tue, Mar 31, 2015 at 2:50 AM, 张皓宇 wrote: > Who can help me? > > One monitor in my ceph cluster can not be started. > Before that, I added '[mon] mon_compact_on_start = true' to > /etc/ceph/ceph.conf on three monitor hosts. Then I did 'ceph tell > mon.computer05 compact ' on computer05, which ha

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Gregory Farnum
On Tue, Mar 31, 2015 at 7:50 AM, Quentin Hartman wrote: > I'm working on redeploying a 14-node cluster. I'm running giant 0.87.1. Last > friday I got everything deployed and all was working well, and I set noout > and shut all the OSD nodes down over the weekend. Yesterday when I spun it > back up

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Gregory Farnum
t. Unfortunately I don't remember the right keywords as I wasn't involved in the fix. -Greg > > QH > > On Tue, Mar 31, 2015 at 1:35 PM, Gregory Farnum wrote: >> >> On Tue, Mar 31, 2015 at 7:50 AM, Quentin Hartman >> wrote: >> > I'm working on re

Re: [ceph-users] Spurious MON re-elections

2015-04-01 Thread Gregory Farnum
On Wed, Apr 1, 2015 at 5:03 AM, Sylvain Munaut wrote: > Hi, > > > For some unknown reason, periodically, the master is kicked out and > another one becomes leader. And then a couple second later, the > original master calls for re-election and becomes leader again. > > This also seems to cause som

Re: [ceph-users] New Intel 750 PCIe SSD

2015-04-02 Thread Gregory Farnum
On Thu, Apr 2, 2015 at 10:03 AM, Mark Nelson wrote: > Thought folks might like to see this: > > http://hothardware.com/reviews/intel-ssd-750-series-nvme-pci-express-solid-state-drive-review > > Quick summary: > > - PCIe SSD based on the P3700 > - 400GB for $389! > - 1.2GB/s writes and 2.4GB/s read

Re: [ceph-users] How to unset lfor setting (from cache pool)

2015-04-06 Thread Gregory Farnum
On Mon, Apr 6, 2015 at 2:21 AM, Ta Ba Tuan wrote: > Hi all, > > I once set up a cache pool for my pool. > But I had some problems with the cache pool running, so I removed the cache pool > from my Ceph cluster. > > The DATA pool currently doesn't use a cache pool, but the "lfor" setting is still > app

Re: [ceph-users] [Ceph-community] Interesting problem: 2 pgs stuck in EC pool with missing OSDs

2015-04-06 Thread Gregory Farnum
On Mon, Apr 6, 2015 at 7:48 AM, Patrick McGarry wrote: > moving this to ceph-user where it needs to be for eyeballs and responses. :) > > > On Mon, Apr 6, 2015 at 1:34 AM, Paul Evans wrote: >> Hello Ceph Community & thanks to anyone with advice on this interesting >> situation... >> =

Re: [ceph-users] CephFS as HDFS

2015-04-06 Thread Gregory Farnum
On Mon, Apr 6, 2015 at 4:17 AM, Dmitry Meytin wrote: > Hello, > > I want to use CephFS instead of vanilla HDFS. > > I have a question in regards to data locality. > > When I configure the object size (ceph.object.size) as 64MB what will happen > with data striping > (http://ceph.com/docs/master/ar

Re: [ceph-users] Getting placement groups to place evenly (again)

2015-04-08 Thread Gregory Farnum
Is this a problem with your PGs being placed unevenly, with your PGs being sized very differently, or both? CRUSH is never going to balance perfectly, but the numbers you're quoting look a bit worse than usual at first glance. -Greg On Tue, Apr 7, 2015 at 8:16 PM J David wrote: > Getting placeme

Re: [ceph-users] Getting placement groups to place evenly (again)

2015-04-08 Thread Gregory Farnum
"ceph pg dump" will output the size of each pg, among other things. On Wed, Apr 8, 2015 at 8:34 AM J David wrote: > On Wed, Apr 8, 2015 at 11:33 AM, Gregory Farnum wrote: > > Is this a problem with your PGs being placed unevenly, with your PGs > being > > siz

Re: [ceph-users] OSDs not coming up on one host

2015-04-08 Thread Gregory Farnum
I'm on my phone so I can't check exactly what those threads are trying to do, but the osd has several threads which are stuck. The FileStore threads are certainly trying to access the disk/local filesystem. You may not have a hardware fault, but it looks like something in your stack is not behaving wh

Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the cluster unless you made changes to the layout requiring it. If you did, the clients have to be upgraded to understand it. You could disable all the v4 features; that should let them connect again. -Greg On Thu, Apr 9, 2015 at 7
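
A sketch of checking and rolling back the tunables (the firefly profile is an example target; changing profiles can trigger data movement):

    ceph osd crush show-tunables        # see which feature bits the map requires
    ceph osd crush tunables firefly     # roll back to a profile pre-Hammer clients understand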

Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
Can you dump your crush map and post it on pastebin or something? On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson wrote: > Nope - it's 64-bit. > > (Sorry, I missed the reply-all last time.) > > On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum wrote: >> >> [Re-added t

Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
hing to enable anything else. Just changed my ceph repo from > 'giant' to 'hammer', then did 'yum update' and restarted services. > > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum wrote: >> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't b

Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 8:14 AM, Jacob Reid wrote: > On Thu, Apr 09, 2015 at 06:43:45AM -0700, Gregory Farnum wrote: >> You can turn up debugging ("debug osd = 10" and "debug filestore = 10" >> are probably enough, or maybe 20 each) and see what comes out to ge

Re: [ceph-users] cache-tier do not evict

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 4:56 AM, Patrik Plank wrote: > Hi, > > > I have built a cache-tier pool (replica 2) with 3 x 512gb ssd for my kvm > pool. > > These are my settings: > > > ceph osd tier add kvm cache-pool > > ceph osd tier cache-mode cache-pool writeback > > ceph osd tier set-overlay kvm cac
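
For comparison, eviction only starts once the tiering agent has hit sets and size targets to work against; a sketch of the knobs that are usually missing (all values are examples):

    ceph osd pool set cache-pool hit_set_type bloom
    ceph osd pool set cache-pool hit_set_count 1
    ceph osd pool set cache-pool hit_set_period 3600
    ceph osd pool set cache-pool target_max_bytes 1000000000000     # ~1 TB
    ceph osd pool set cache-pool cache_target_dirty_ratio 0.4
    ceph osd pool set cache-pool cache_target_full_ratio 0.8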

Re: [ceph-users] ceph-osd failure following 0.92 -> 0.94 upgrade

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 2:05 PM, Dirk Grunwald wrote: > Ceph cluster, U14.10 base system, OSD's using BTRFS, journal on same disk as > partition > (done using ceph-deploy) > > I had been running 0.92 without (significant) issue. I upgraded > to Hammer (0.94) be modifying /etc/apt/sources.list, apt-

Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Gregory Farnum
controller (or maybe its disks), regardless of what it's admitting to. ;) -Greg On Thu, Apr 9, 2015 at 1:28 AM, Jacob Reid wrote: > On Wed, Apr 08, 2015 at 03:42:29PM +, Gregory Farnum wrote: >> Im on my phone so can't check exactly what those threads are trying to do,

Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
//dpaste.de/POr1 > > > On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum wrote: >> >> Can you dump your crush map and post it on pastebin or something? >> >> On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson wrote: >> > Nope - it's 64-bit. >> > >> &

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-12 Thread Gregory Farnum
On Sun, Apr 12, 2015 at 1:58 PM, Francois Lafont wrote: > Somnath Roy wrote: > >> Interesting scenario :-).. IMHO, I don't think cluster will be in healthy >> state here if the connections between dc1 and dc2 is cut. The reason is the >> following. >> >> 1. only osd.5 can talk to both data cente

Re: [ceph-users] norecover and nobackfill

2015-04-14 Thread Gregory Farnum
On Tue, Apr 14, 2015 at 1:18 PM, Francois Lafont wrote: > Robert LeBlanc wrote: > >> HmmmI've been deleting the OSD (ceph osd rm X; ceph osd crush rm osd.X) >> along with removing the auth key. This has caused data movement, > > Maybe but if the flag "noout" is set, removing an OSD of the clus

Re: [ceph-users] Ceph site is very slow

2015-04-15 Thread Gregory Farnum
People are working on it but I understand there was/is a DoS attack going on. :/ -Greg On Wed, Apr 15, 2015 at 1:50 AM Ignazio Cassano wrote: > Many thanks > > 2015-04-15 10:44 GMT+02:00 Wido den Hollander : > >> On 04/15/2015 10:20 AM, Ignazio Cassano wrote: >> > Hi all, >> > why ceph.com is ver

Re: [ceph-users] ceph on Debian Jessie stopped working

2015-04-16 Thread Gregory Farnum
On Wed, Apr 15, 2015 at 9:31 AM, Chad William Seys wrote: > Hi All, > Earlier ceph on Debian Jessie was working. Jessie is running 3.16.7 . > > Now when I modprobe rbd , no /dev/rbd appear. > > # dmesg | grep -e rbd -e ceph > [ 15.814423] Key type ceph registered > [ 15.814461] libceph: loade

Re: [ceph-users] OSDs not coming up on one host

2015-04-16 Thread Gregory Farnum
-w" or in "ceph osd dump", although I can't say for certain in Firefly. -Greg On Fri, Apr 10, 2015 at 1:57 AM, Jacob Reid wrote: > On Fri, Apr 10, 2015 at 09:55:20AM +0100, Jacob Reid wrote: >> On Thu, Apr 09, 2015 at 05:21:47PM +0100, Jacob Reid wrote: >> >

Re: [ceph-users] ceph-osd failure following 0.92 -> 0.94 upgrade

2015-04-16 Thread Gregory Farnum
data. > > On Thu, Apr 9, 2015 at 7:13 PM, Dirk Grunwald > wrote: >> >> The solution to prevent this now (hours long) fix on my part was buried in >> material >> labeled as "upgrade form 0.80x giant". >> >> To prevent others from having the sam

Re: [ceph-users] Getting placement groups to place evenly (again)

2015-04-16 Thread Gregory Farnum
On Sat, Apr 11, 2015 at 12:11 PM, J David wrote: > On Thu, Apr 9, 2015 at 7:20 PM, Gregory Farnum wrote: >> Okay, but 118/85 = 1.38. You say you're seeing variance from 53% >> utilization to 96%, and 53%*1.38 = 73.5%, which is *way* off your >> numbers. > > 53% t

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Gregory Farnum
On Mon, Apr 20, 2015 at 10:46 AM, Colin Corr wrote: > Greetings Cephers, > > I have hit a bit of a wall between the available documentation and my > understanding of it with regards to CRUSH rules. I am trying to determine if > it is possible to replicate 3 copies across 2 hosts, such that if on

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Gregory Farnum
On Mon, Apr 20, 2015 at 11:17 AM, Dan van der Ster wrote: > I haven't tried, but wouldn't something like this work: > > step take default > step chooseleaf firstn 2 type host > step emit > step take default > step chooseleaf firstn -2 type osd > step emit > > We use something like that for an asym
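
Spelled out as a complete rule, Dan's suggestion would look roughly like this (the rule name and ruleset number are placeholders; note the two take/emit passes):

    rule replicated_3_copies_2_hosts {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 2 type host
        step emit
        step take default
        step chooseleaf firstn -2 type osd
        step emit
    }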

[ceph-users] CephFS development since Firefly

2015-04-20 Thread Gregory Farnum
We’ve been hard at work on CephFS over the last year since Firefly was released, and with Hammer coming out it seemed like a good time to go over some of the big developments users will find interesting. Much of this is cribbed from John’s Linux Vault talk (http://events.linuxfoundation.org/sit

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-21 Thread Gregory Farnum
The CRUSH min and max sizes are part of the "ruleset" facilities that we're slowly removing because they turned out to have no utility and be overly complicated to understand. You should probably just set them all to 1 and 10. The intention behind them was that you could have a single ruleset whic

Re: [ceph-users] Is CephFS ready for production?

2015-04-22 Thread Gregory Farnum
On Tue, Apr 21, 2015 at 9:53 PM, Mohamed Pakkeer wrote: > Hi sage, > > When can we expect the fully functional fsck for cephfs?. Can we get at next > major release?. Is there any roadmap or time frame for the fully functional > fsck release? We're working on it as fast as we can, and it'll be don

Re: [ceph-users] OSDs failing on upgrade from Giant to Hammer

2015-04-22 Thread Gregory Farnum
If you just rm the directory you're leaving behind all of the leveldb data about it. :) -Greg On Wed, Apr 22, 2015 at 3:23 AM Dan van der Ster wrote: > Hi, > > On Tue, Apr 21, 2015 at 6:05 PM, Scott Laird wrote: > > > > ceph-objectstore-tool --op remove --data-path /var/lib/ceph/osd/ceph-36/ > >

Re: [ceph-users] removing a ceph fs

2015-04-22 Thread Gregory Farnum
If you look at the "ceph --help" output you'll find some commands for removing MDSes from the system. -Greg On Wed, Apr 22, 2015 at 6:46 AM Kenneth Waegeman wrote: > forgot to mention I'm running 0.94.1 > > On 04/22/2015 03:02 PM, Kenneth Waegeman wrote: > > Hi, > > > > I tried to recreate a ceph

Re: [ceph-users] CephFS and Erasure Codes

2015-04-22 Thread Gregory Farnum
On Fri, Apr 17, 2015 at 3:29 PM, Loic Dachary wrote: > Hi, > > Although erasure coded pools cannot be used with CephFS, they can be used > behind a replicated cache pool as explained at > http://docs.ceph.com/docs/master/rados/operations/cache-tiering/. > > Cheers > > On 18/04/2015 00:26, Ben Ra

Re: [ceph-users] Tiering to object storage

2015-04-22 Thread Gregory Farnum
On Mon, Apr 20, 2015 at 3:31 PM, Blair Bethwaite wrote: > Hi all, > > I understand the present pool tiering infrastructure is intended to work for >>2 layers? We're presently considering backup strategies for large pools and > wondered how much of a stretch it would be to have a base tier sitting

Re: [ceph-users] Still CRUSH problems with 0.94.1 ? (explained)

2015-04-22 Thread Gregory Farnum
On Wed, Apr 22, 2015 at 3:18 AM, f...@univ-lr.fr wrote: > Hi all, > > responding to my email from yesterday, I have interesting information confirming > that the problem is not at all related to Hammer. > Seeing really nothing explaining the weird behavior, I've reinstalled a > Giant and had the sam

Re: [ceph-users] cluster not coming up after reboot

2015-04-22 Thread Gregory Farnum
On Wed, Apr 22, 2015 at 8:17 AM, Kenneth Waegeman wrote: > Hi, > > I changed the cluster network parameter in the config files, restarted the > monitors , and then restarted all the OSDs (shouldn't have done that). Do you mean that you changed the IP addresses of the monitors in the config files
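
For reference, the relevant ceph.conf settings look roughly like this (subnets are examples); monitors stay on the public network, and OSDs need working routes to both:

    [global]
    public network  = 192.168.1.0/24
    cluster network = 10.10.10.0/24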

Re: [ceph-users] Getting placement groups to place evenly (again)

2015-04-22 Thread Gregory Farnum
On Wed, Apr 22, 2015 at 11:04 AM, J David wrote: > On Thu, Apr 16, 2015 at 8:02 PM, Gregory Farnum wrote: >> Since I now realize you did a bunch of reweighting to try and make >> data match up I don't think you'll find something like badly-sized >> LevelDB instan

Re: [ceph-users] cephfs map command deprecated

2015-04-22 Thread Gregory Farnum
On Wed, Apr 22, 2015 at 12:35 PM, Stillwell, Bryan wrote: > I have a PG that is in the active+inconsistent state and found the > following objects to have differing md5sums: > > -fa8298048c1958de3c04c71b2f225987 > ./DIR_5/DIR_0/DIR_D/DIR_9/1008a75.017c__head_502F9D05__0 > +b089c2dcd4f1d8b4
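
The usual follow-up for an inconsistent PG in this era is below (the pg id is an example; note that repair trusts the primary's copy, so check which replica is actually good first):

    ceph health detail | grep inconsistent    # find the affected pg id
    ceph pg repair 0.5                        # re-replicate from the primary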

Re: [ceph-users] systemd unit files and multiple daemons

2015-04-22 Thread Gregory Farnum
On Wed, Apr 22, 2015 at 2:57 PM, Ken Dreyer wrote: > I could really use some eyes on the systemd change proposed here: > http://tracker.ceph.com/issues/11344 > > Specifically, on bullet #4 there, should we have a single > "ceph-mon.service" (implying that users should only run one monitor > daemon

Re: [ceph-users] Cephfs: proportion of data between data pool and metadata pool

2015-04-23 Thread Gregory Farnum
On Thu, Apr 23, 2015 at 12:55 AM, Steffen W Sørensen wrote: >> But in the menu, the use case "cephfs only" doesn't exist and I have >> no idea of the %data for each pools metadata and data. So, what is >> the proportion (approximatively) of %data between the "data" pool and >> the "metadata" pool

Re: [ceph-users] "Compacting" btrfs file storage

2015-04-23 Thread Gregory Farnum
On Thu, Apr 23, 2015 at 1:25 AM, Burkhard Linke wrote: > Hi, > > I've noticed that the btrfs file storage is performing some > cleanup/compacting operations during OSD startup. > > Before OSD start: > /dev/sdc1 2.8T 2.4T 390G 87% /var/lib/ceph/osd/ceph-58 > > After OSD start: > /de

Re: [ceph-users] removing a ceph fs

2015-04-23 Thread Gregory Farnum
I think you have to "ceph mds fail" the last one up, then you'll be able to remove it. -Greg On Thu, Apr 23, 2015 at 7:52 AM, Kenneth Waegeman wrote: > > > On 04/22/2015 06:51 PM, Gregory Farnum wrote: >> >> If you look at the "ceph --help" output y
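
A sketch of the full teardown sequence on 0.94 (the filesystem and pool names are examples):

    ceph mds fail 0                              # fail the last active MDS
    ceph fs rm cephfs --yes-i-really-mean-it
    ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
    ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it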

Re: [ceph-users] Is CephFS ready for production?

2015-04-24 Thread Gregory Farnum
I think the VMWare plugin was going to be contracted out by the business people, and it was never going to be upstream anyway -- I've not heard anything since then but you'd need to ask them I think. -Greg On Fri, Apr 24, 2015 at 7:17 AM Marc wrote: > On 22/04/2015 16:04, Gregory

Re: [ceph-users] Radosgw and mds hardware configuration

2015-04-24 Thread Gregory Farnum
The MDS will run in 1GB, but the more RAM it has the more of the metadata you can cache in memory. The faster single-threaded performance your CPU has, the more metadata IOPS you'll get. We haven't done much work characterizing it, though. -Greg On Wed, Apr 22, 2015 at 5:39 PM Francois Lafont wrot

Re: [ceph-users] decrease pg number

2015-04-24 Thread Gregory Farnum
You can't migrate RBD objects via cppool right now as it doesn't handle snapshots at all. I think a few people have done it successfully by setting up existing pools as cache tiers on top of the target pool and then flushing them out, but I've not run through that. You can also just set the PG war
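
An untested sketch of the cache-tier trick mentioned above, for migrating objects from an old pool into a new one (pool names are examples; Greg notes he has not run through this himself):

    ceph osd tier add newpool oldpool --force-nonempty   # old pool becomes a cache of the new pool
    ceph osd tier cache-mode oldpool forward             # stop caching, pass writes through
    rados -p oldpool cache-flush-evict-all               # push every object down into newpool
    ceph osd tier remove newpool oldpool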

Re: [ceph-users] Cephfs: proportion of data between data pool and metadata pool

2015-04-25 Thread Gregory Farnum
ger, I have no idea of the code behind > it. This is just what it happens to be for us. > -- > Adam > > > On Sat, Apr 25, 2015 at 11:29 AM, François Lafont > wrote: > > Thanks Greg and Steffen for your answer. I will make some tests. > > > > Gregory Farn

Re: [ceph-users] Cephfs: proportion of data between data pool and metadata pool

2015-04-25 Thread Gregory Farnum
people.beocat.cis.ksu.edu/~mozes/ceph/getfattr_cephfs.txt > > rsync is ongoing, moving data into cephfs. It would seem the data is > truly there, both with metadata and file data. md5sums match for files > that I've tested. > -- > Adam > > On Sat, Apr 25, 2015 at 12:16 PM

Re: [ceph-users] Cephfs: proportion of data between data pool and metadata pool

2015-04-25 Thread Gregory Farnum
ssuming all of that usage is for > the metadata, it comes out to ~1.4KB per file. Still *much* less than > the 4K estimate, but probably more reasonable than a few bytes per > file :). > > -- > Adam > > On Sat, Apr 25, 2015 at 1:03 PM, Gregory Farnum wrote: > > That&

Re: [ceph-users] IOWait on SATA-backed with SSD-journals

2015-04-27 Thread Gregory Farnum
On Sat, Apr 25, 2015 at 11:36 PM, Josef Johansson wrote: > Hi, > > With inspiration from all the other performance threads going on here, I > started to investigate on my own as well. > > I’m seeing a lot iowait on the OSD, and the journal utilised at 2-7%, with > about 8-30MB/s (mostly around 8

Re: [ceph-users] [cephfs][ceph-fuse] cache size or memory leak?

2015-04-29 Thread Gregory Farnum
On Wed, Apr 29, 2015 at 1:33 AM, Dexter Xiong wrote: > The output of status command of fuse daemon: > "dentry_count": 128966, > "dentry_pinned_count": 128965, > "inode_count": 409696, > I saw the pinned dentry is nearly the same as dentry. > So I enabled debug log(debug client = 20/20) and read

Re: [ceph-users] Kicking 'Remapped' PGs

2015-04-30 Thread Gregory Farnum
On Wed, Apr 29, 2015 at 6:06 PM, Paul Evans wrote: > In one of our clusters we sometimes end up with PGs that are mapped > incorrectly and settle into a ‘remapped’ state (forever). Is there a way to > nudge a specific PG to recalculate placement and relocate the data? One > option that we’re *da

Re: [ceph-users] cache pool parameters and pressure

2015-04-30 Thread Gregory Farnum
On Thu, Apr 30, 2015 at 2:03 AM, Kenneth Waegeman wrote: > So the cache is empty, but I get warning when I check the health: > health HEALTH_WARN > mds0: Client cephtst.cubone.os failing to respond to cache > pressure > > Someone an idea what is happening here? This means that th
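
Two admin-socket checks that help here (the daemon name 'a' is an example):

    ceph daemon mds.a session ls                 # per-client session info, including caps held
    ceph daemon mds.a config get mds_cache_size  # the inode count the MDS tries to stay under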

Re: [ceph-users] ceph-dokan mount error

2015-04-30 Thread Gregory Farnum
On Thu, Apr 30, 2015 at 9:49 AM, James Devine wrote: > So I am trying to get ceph-dokan to work. Upon running it with > ./ceph-dokan.exe -c ceph.conf -l e it indicates there was a mount > error and the monitor it connects to logs cephx server client.admin: > unexpected key: req.key=0 expected_key

Re: [ceph-users] Ceph Fuse Crashed when Reading and How to Backup the data

2015-04-30 Thread Gregory Farnum
The not permitted bit usually means that your client doesn't have access permissions to the data pool in use. I'm not sure why it would be getting aborted without any output though — is there any traceback at all in the logs? A message about the OOM-killer zapping it or something? -Greg On Thu,
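
A sketch of granting a CephFS client access to its data pool (client and pool names are examples):

    ceph auth get client.foo     # inspect the current caps
    ceph auth caps client.foo mon 'allow r' mds 'allow' osd 'allow rw pool=cephfs_data'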

Re: [ceph-users] RHEL7/HAMMER cache tier doesn't flush or evict?

2015-04-30 Thread Gregory Farnum
On Thu, Apr 30, 2015 at 10:57 AM, Nick Fisk wrote: > I'm using Inkscope to monitor my cluster and looking at the pool details I > saw that mode was set to none. I'm pretty sure there must be a ceph cmd line > to get the option state but I couldn't find anything obvious when I was > looking for it.

Re: [ceph-users] Kicking 'Remapped' PGs

2015-05-04 Thread Gregory Farnum
On Sun, May 3, 2015 at 5:18 AM, Paul Evans wrote: > Thanks, Greg. Following your lead, we discovered the proper > 'set_choose_tries xxx’ value had not been applied to *this* pool’s rule, and > we updated the cluster accordingly. We then moved a random OSD out and back > in to ‘kick’ things, but n

Re: [ceph-users] Kicking 'Remapped' PGs

2015-05-07 Thread Gregory Farnum
e why from these settings. You might have hit a bug I'm not familiar with that will be jostled by just restarting the OSDs in question... :/ -Greg On Tue, May 5, 2015 at 7:46 AM, Paul Evans wrote: > Gregory Farnum wrote: > > Oh. That's strange; they are all mapped to two

Re: [ceph-users] CephFS unexplained writes

2015-05-07 Thread Gregory Farnum
movexattr("/var/lib/ceph/osd/ceph-10/current/4.1es1_head/DIR_E/DIR_1/DIR_D/DIR_3", > "user.cephos.phash.contents@1") = -1 ENODATA (No data available) > > So it appears that the osd's aren't writing actual data to disk, but > metadata in the form of xattr'

Re: [ceph-users] RFC: Deprecating ceph-tool commands

2015-05-08 Thread Gregory Farnum
On Fri, May 8, 2015 at 4:55 PM, Joao Eduardo Luis wrote: > All, > > While working on #11545 (mon: have mon-specific commands under 'ceph mon > ...') I crashed into a slightly tough brick wall. > > The purpose of #11545 is to move certain commands, such as 'ceph scrub', > 'ceph compact' and 'ceph s

Re: [ceph-users] ceph-fuse options: writeback cache

2015-05-11 Thread Gregory Farnum
On Mon, May 11, 2015 at 1:57 AM, Kenneth Waegeman wrote: > Hi all, > > I have a few questions about ceph-fuse options: > - Is the fuse writeback cache being used? How can we see this? Can it be > turned on with allow_wbcache somehow? I'm not quite sure what you mean here. ceph-fuse does maintain
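
The data cache in ceph-fuse is the client ObjectCacher; the knobs look roughly like this (values are examples, defaults vary by release):

    [client]
    client oc = true                  # enable the object cacher
    client oc size = 209715200        # total cache size in bytes
    client oc max dirty = 104857600   # max dirty bytes held back for writeback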

Re: [ceph-users] [cephfs][ceph-fuse] cache size or memory leak?

2015-05-11 Thread Gregory Farnum
On Fri, May 8, 2015 at 1:34 AM, Yan, Zheng wrote: > On Fri, May 8, 2015 at 11:15 AM, Dexter Xiong wrote: >> I tried "echo 3 > /proc/sys/vm/drop_caches" and dentry_pinned_count dropped. >> >> Thanks for your help. >> > > could you please try the attached patch I haven't followed the whole convers

Re: [ceph-users] cache pool parameters and pressure

2015-05-12 Thread Gregory Farnum
On Tue, May 12, 2015 at 5:54 AM, Kenneth Waegeman wrote: > > > On 04/30/2015 07:50 PM, Gregory Farnum wrote: >> >> On Thu, Apr 30, 2015 at 2:03 AM, Kenneth Waegeman >> wrote: >>> >>> So the cache is empty, but I get warning when I c

Re: [ceph-users] questions about CephFS

2015-05-12 Thread Gregory Farnum
[ Adding ceph-users to the CC ] On Mon, May 11, 2015 at 8:22 PM, zhao.ming...@h3c.com wrote: > Hi: > > I've been learning CephFS recently, and now I have some questions about it; > > > > 1. I've seen the typical configuration is 'single MDS', and found some > resources from the Internet which said 'singl

Re: [ceph-users] Cluster always in WARN state, failing to respond to cache pressure

2015-05-12 Thread Gregory Farnum
On Tue, May 12, 2015 at 12:03 PM, Cullen King wrote: > I'm operating a fairly small ceph cluster, currently three nodes (with plans > to expand to five in the next couple of months) with more than adequate > hardware. Node specs: > > 2x Xeon E5-2630 > 64gb ram > 2x RAID1 SSD for system > 2x 256gb

Re: [ceph-users] Write freeze when writing to rbd image and rebooting one of the nodes

2015-05-12 Thread Gregory Farnum
On Tue, May 12, 2015 at 11:39 PM, Vasiliy Angapov wrote: > Hi, colleagues! > > I'm testing a simple Ceph cluster in order to use it in a production > environment. I have 8 OSDs (1Tb SATA drives) which are evenly distributed > between 4 nodes. > > I've mapped an rbd image on the client node and started
