Re: [ceph-users] Deep scrub, cache pools, replica 1

2014-11-11 Thread Gregory Farnum
On Mon, Nov 10, 2014 at 10:58 PM, Christian Balzer wrote: > > Hello, > > One of my clusters has become busy enough (I'm looking at you, evil Window > VMs that I shall banish elsewhere soon) to experience client noticeable > performance impacts during deep scrub. > Before this I instructed all OSDs
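
For reference, a rough sketch of the scrub-tuning knobs this thread revolves around (values are purely illustrative, and injected settings do not survive an OSD restart):

  # spread deep scrubs over a longer window and keep one scrub per OSD at a time
  ceph tell osd.* injectargs '--osd_deep_scrub_interval 1209600 --osd_max_scrubs 1'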

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-11-11 Thread Gregory Farnum
On Tue, Nov 11, 2014 at 5:06 AM, Jasper Siero wrote: > No problem, thanks for helping. > I don't want to disable the deep scrubbing process itself because it's very > useful, but one placement group (3.30) is continuously deep scrubbing and it > should finish after some time but it won't. Hmm, how

Re: [ceph-users] Triggering shallow scrub on OSD where scrub is already in progress

2014-11-12 Thread Gregory Farnum
log [INF] > : 1.a8 scrub ok > 2014-11-12 16:25:53.012220 7f5026f31700 0 log_channel(default) log [INF] > : 1.a9 scrub ok > 2014-11-12 16:25:54.009265 7f5026f31700 0 log_channel(default) log [INF] > : 1.cb scrub ok > 2014-11-12 16:25:56.516569 7f5026f31700 0 log_channel(default

Re: [ceph-users] rados -p cache-flush-evict-all surprisingly slow

2014-11-12 Thread Gregory Farnum
My recollection is that the RADOS tool is issuing a special eviction command on every object in the cache tier using primitives we don't use elsewhere. Their existence is currently vestigial from our initial tiering work (rather than the present caching), but I have some hope we'll extend them agai

Re: [ceph-users] Deep scrub, cache pools, replica 1

2014-11-12 Thread Gregory Farnum
On Tue, Nov 11, 2014 at 2:32 PM, Christian Balzer wrote: > On Tue, 11 Nov 2014 10:21:49 -0800 Gregory Farnum wrote: > >> On Mon, Nov 10, 2014 at 10:58 PM, Christian Balzer wrote: >> > >> > Hello, >> > >> > One of my clusters has become busy enough

Re: [ceph-users] Log reading/how do I tell what an OSD is trying to connect to

2014-11-12 Thread Gregory Farnum
On Tue, Nov 11, 2014 at 6:28 PM, Scott Laird wrote: > I'm having a problem with my cluster. It's running 0.87 right now, but I > saw the same behavior with 0.80.5 and 0.80.7. > > The problem is that my logs are filling up with "replacing existing (lossy) > channel" log lines (see below), to the p

Re: [ceph-users] Very Basic question

2014-11-13 Thread Gregory Farnum
What does "ceph -s" output when things are working? Does the ceph.conf on your admin node contain the address of each monitor? (Paste in the relevant lines.) It will need to, or the ceph tool won't be able to find the monitors even though the system is working. -Greg On Thu, Nov 13, 2014 at 9:11 AM

Re: [ceph-users] Multiple rules in a ruleset: any examples? Which rule wins?

2014-11-13 Thread Gregory Farnum
On Thu, Nov 13, 2014 at 2:58 PM, Anthony Alba wrote: > Hi list, > > > When there are multiple rules in a ruleset, is it the case that "first > one wins"? > > When a rule fails, does it fall through to the next rule? > Are min_size, max_size the only determinants? > > Are there any examples?

Re: [ceph-users] Multiple rules in a ruleset: any examples? Which rule wins?

2014-11-13 Thread Gregory Farnum
On Thu, Nov 13, 2014 at 3:11 PM, Anthony Alba wrote: > Thanks! What happens when the lone rule fails? Is there a fallback > rule that will place the blob in a random PG? Say I misconfigure, and > my choose/chooseleaf don't add up to pool min size. There's no built-in fallback rule or anything li

Re: [ceph-users] Recreating the OSD's with same ID does not seem to work

2014-11-14 Thread Gregory Farnum
You didn't remove them from the auth monitor's keyring. If you're removing OSDs you need to follow the steps in the documentation. -Greg On Fri, Nov 14, 2014 at 4:42 PM, JIten Shah wrote: > Hi Guys, > > I had to rekick some of the hosts where OSD’s were running and after > re-kick, when I try to
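
For reference, the documented removal sequence Greg is pointing at looks roughly like this (osd.12 is a placeholder ID):

  ceph osd out 12                # stop mapping data to the OSD
  ceph osd crush remove osd.12   # drop it from the CRUSH map
  ceph auth del osd.12           # delete its cephx key -- the step that was missed here
  ceph osd rm 12                 # finally remove it from the OSD map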

Re: [ceph-users] Recreating the OSD's with same ID does not seem to work

2014-11-14 Thread Gregory Farnum
. > > —Jiten > > On Nov 14, 2014, at 4:44 PM, Gregory Farnum wrote: > >> You didn't remove them from the auth monitor's keyring. If you're >> removing OSDs you need to follow the steps in the documentation. >> -Greg >> >> On Fri, Nov 14, 2

Re: [ceph-users] Concurrency in ceph

2014-11-18 Thread Gregory Farnum
On Tue, Nov 18, 2014 at 1:26 PM, hp cre wrote: > Hello everyone, > > I'm new to ceph but have been working with proprietary clustered filesystems for > quite some time. > > I almost understand how ceph works, but have a couple of questions which > have been asked before here, but I didn't understand t

Re: [ceph-users] rados mkpool fails, but not ceph osd pool create

2014-11-18 Thread Gregory Farnum
On Tue, Nov 11, 2014 at 11:43 PM, Gauvain Pocentek wrote: > Hi all, > > I'm facing a problem on a ceph deployment. rados mkpool always fails: > > # rados -n client.admin mkpool test > error creating pool test: (2) No such file or directory > > rados lspool and rmpool commands work just fine, and t

Re: [ceph-users] Concurrency in ceph

2014-11-18 Thread Gregory Farnum
't need to create vm > instances on filesystems, am I correct? Right; these systems are doing the cache coherency (by duplicating all the memory, including that of ext4/whatever) so that they work. -Greg > > On 18 Nov 2014 23:33, "Gregory Farnum" wrote: >> >> On Tu

Re: [ceph-users] mds continuously crashing on Firefly

2014-11-18 Thread Gregory Farnum
On Thu, Nov 13, 2014 at 9:34 AM, Lincoln Bryant wrote: > Hi all, > > Just providing an update to this -- I started the mds daemon on a new server > and rebooted a box with a hung CephFS mount (from the first crash) and the > problem seems to have gone away. > > I'm still not sure why the mds was

Re: [ceph-users] Log reading/how do I tell what an OSD is trying to connect to

2014-11-18 Thread Gregory Farnum
rst, but it's logging tons of the same errors while > trying to talk to 10.2.0.34. > > On Wed Nov 12 2014 at 10:47:30 AM Gregory Farnum wrote: >> >> On Tue, Nov 11, 2014 at 6:28 PM, Scott Laird wrote: >> > I'm having a problem with my cluster. It's runnin

Re: [ceph-users] Unclear about CRUSH map and more than one "step emit" in rule

2014-11-18 Thread Gregory Farnum
On Sun, Nov 16, 2014 at 4:17 PM, Anthony Alba wrote: > The step emit documentation states > > "Outputs the current value and empties the stack. Typically used at > the end of a rule, but may also be used to pick from different trees > in the same rule." > > What use case is there for more than one

Re: [ceph-users] mds cluster degraded

2014-11-18 Thread Gregory Farnum
Hmm, last time we saw this it meant that the MDS log had gotten corrupted somehow and was a little short (in that case due to the OSDs filling up). What do you mean by "rebuilt the OSDs"? -Greg On Mon, Nov 17, 2014 at 12:52 PM, JIten Shah wrote: > After i rebuilt the OSD’s, the MDS went into the

Re: [ceph-users] Cache tiering and cephfs

2014-11-18 Thread Gregory Farnum
I believe the reason we don't allow you to do this right now is that there was not a good way of coordinating the transition (so that everybody starts routing traffic through the cache pool at the same time), which could lead to data inconsistencies. Looks like the OSDs handle this appropriately no
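
For context, attaching a cache tier to an existing base pool is done with the tier commands, roughly as below (pool names are placeholders):

  ceph osd tier add cephfs_data cache_pool          # attach cache_pool in front of the base pool
  ceph osd tier cache-mode cache_pool writeback     # serve reads and writes from the cache
  ceph osd tier set-overlay cephfs_data cache_pool  # route client traffic through the cache pool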

Re: [ceph-users] Bug or by design?

2014-11-18 Thread Gregory Farnum
On Tue, Nov 18, 2014 at 3:38 PM, Robert LeBlanc wrote: > I was going to submit this as a bug, but thought I would put it here for > discussion first. I have a feeling that it could be behavior by design. > > ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) > > I'm using a cache pool an

Re: [ceph-users] incorrect pool size, wrong ruleset?

2014-11-18 Thread Gregory Farnum
On Wed, Nov 12, 2014 at 1:41 PM, houmles wrote: > Hi, > > I have 2 hosts with 8 2TB drive in each. > I want to have 2 replicas between both hosts and then 2 replicas between osds > on each host. That way even when I lost one host I still have 2 replicas. > > Currently I have this ruleset: > > rul

Re: [ceph-users] OSD balancing problems

2014-11-19 Thread Gregory Farnum
I think these numbers are about what is expected. You could try a couple of things to improve it, but neither of them is common: 1) increase the number of PGs (and pgp_num) a lot more. If you decide to experiment with this, watch your CPU and memory numbers carefully. 2) try to correct for the inequ
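
A minimal sketch of option 1), assuming a pool named rbd; pgp_num has to follow pg_num before data actually rebalances:

  ceph osd pool set rbd pg_num 2048
  ceph osd pool set rbd pgp_num 2048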

Re: [ceph-users] How to add/remove/move an MDS?

2014-11-19 Thread Gregory Farnum
You don't really need to do much. There are some "ceph mds" commands that let you clean things up in the MDSMap if you like, but moving an MDS essentially boils down to: 1) make sure your new node has a cephx key (probably for a new MDS entity named after the new host, but not strictly necessary
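
A rough sketch of step 1), assuming the new host is called mds2 and the usual MDS capabilities of this era (check your release's documentation for the exact caps):

  ceph auth get-or-create mds.mds2 mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
      -o /var/lib/ceph/mds/ceph-mds2/keyring
  # then start ceph-mds on the new host and simply stop the daemon on the old one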

Re: [ceph-users] Ceph inconsistency after deep-scrub

2014-11-21 Thread Gregory Farnum
On Fri, Nov 21, 2014 at 2:35 AM, Paweł Sadowski wrote: > Hi, > > During deep-scrub Ceph discovered some inconsistency between OSDs on my > cluster (size 3, min size 2). I have fund broken object and calculated > md5sum of it on each OSD (osd.195 is acting_primary): > osd.195 - md5sum_ > osd.
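
For context, the usual first steps when deep scrub flags an inconsistency (the PG ID is a placeholder; note that repair in this era pushes the primary's copy to the replicas, so verify the primary is good first):

  ceph health detail | grep inconsistent   # find the affected PG(s)
  ceph pg repair <pgid>                    # ask the primary to repair the PG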

Re: [ceph-users] OSD in uninterruptible sleep

2014-11-21 Thread Gregory Farnum
On Fri, Nov 21, 2014 at 4:56 AM, Jon Kåre Hellan wrote: > We are testing a Giant cluster - on virtual machines for now. We have seen > the same > problem two nights in a row: One of the OSDs gets stuck in uninterruptible > sleep. > The only way to get rid of it is apparently to reboot - kill -9, -

Re: [ceph-users] Problems starting up OSD

2014-11-22 Thread Gregory Farnum
Can you post the OSD log somewhere? It should have a few more details about what's going on here. (This backtrace looks like it's crashing in a call to pthreads, which is a little unusual.) -Greg On Sat, Nov 22, 2014 at 1:01 PM, Jeffrey Ollie wrote: > -- One of my OSDs lost network connectivity fo

Re: [ceph-users] Problems starting up OSD

2014-11-22 Thread Gregory Farnum
On Sat, Nov 22, 2014 at 11:39 AM, Jeffrey Ollie wrote: > On Sat, Nov 22, 2014 at 1:22 PM, Gregory Farnum wrote: >> Can you post the OSD log somewhere? It should have a few more details >> about what's going on here. (This backtrace looks like it's crashing >> i

Re: [ceph-users] non-posix cephfs page deprecated

2014-11-23 Thread Gregory Farnum
On Thu, Nov 20, 2014 at 6:32 PM, Shawn Edwards wrote: > This page is marked for removal: > > http://ceph.com/docs/firefly/dev/differences-from-posix/ I'm not quite sure what that TODO means, there, but > Is the bug in the above webpage still in the code? If not, in which version > was it fixed?

Re: [ceph-users] Multiple MDS servers...

2014-11-23 Thread Gregory Farnum
On Fri, Nov 21, 2014 at 3:21 PM, JIten Shah wrote: > I am trying to setup 3 MDS servers (one on each MON) but after I am done > setting up the first one, it give me below error when I try to start it on > the other ones. I understand that only 1 MDS is functional at a time, but I > thought you can

Re: [ceph-users] Multiple MDS servers...

2014-11-24 Thread Gregory Farnum
On Sun, Nov 23, 2014 at 10:36 PM, JIten Shah wrote: > Hi Greg, > > I haven’t setup anything in ceph.conf as mds.cephmon002 nor in any ceph > folders. I have always tried to set it up as mds.lab-cephmon002, so I am > wondering where is it getting that value from? No idea, sorry. Probably some odd

Re: [ceph-users] ceph-announce list

2014-11-24 Thread Gregory Farnum
On Fri, Nov 21, 2014 at 12:34 AM, JuanFra Rodriguez Cardoso wrote: > Hi all: > > As it was asked weeks ago.. what is the way the ceph community uses to > stay tuned on new features and bug fixes? I asked Sage about this today and he said he'd set one up. Seems like a good idea; just not something

Re: [ceph-users] Client forward compatibility

2014-11-24 Thread Gregory Farnum
On Thu, Nov 20, 2014 at 9:08 AM, Dan van der Ster wrote: > Hi all, > What is compatibility/incompatibility of dumpling clients to talk to firefly > and giant clusters? We sadly don't have a good matrix about this yet, but in general you should assume that anything which changed the way the data i

Re: [ceph-users] Giant + nfs over cephfs hang tasks

2014-11-29 Thread Gregory Farnum
Ilya, do you have a ticket reference for the bug? Andrei, we run NFS tests on CephFS in our nightlies and it does pretty well so in the general case we expect it to work. Obviously not at the moment with whatever bug Ilya is looking at, though. ;) -Greg On Sat, Nov 29, 2014 at 4:51 AM Ilya Dryomov

Re: [ceph-users] Tip of the week: don't use Intel 530 SSD's for journals

2014-11-29 Thread Gregory Farnum
That's not actually so unusual: http://techreport.com/review/26058/the-ssd-endurance-experiment-data-retention-after-600tb The manufacturers are pretty conservative with their ratings and warranties. ;) -Greg On Thu, Nov 27, 2014 at 2:41 AM Andrei Mikhailovsky wrote: > Mark, if it is not too much

Re: [ceph-users] Client forward compatibility

2014-12-01 Thread Gregory Farnum
On Tue, Nov 25, 2014 at 1:00 AM, Dan Van Der Ster wrote: > Hi Greg, > > >> On 24 Nov 2014, at 22:01, Gregory Farnum wrote: >> >> On Thu, Nov 20, 2014 at 9:08 AM, Dan van der Ster >> wrote: >>> Hi all, >>> What is compatibility/incompatibility o

Re: [ceph-users] Giant + nfs over cephfs hang tasks

2014-12-01 Thread Gregory Farnum
On Sun, Nov 30, 2014 at 1:15 PM, Andrei Mikhailovsky wrote: > Greg, thanks for your comment. Could you please share what OS, kernel and > any nfs/cephfs settings you've used to achieve the pretty well stability? > Also, what kind of tests have you ran to check that? We're just doing it on our te

Re: [ceph-users] Revisiting MDS memory footprint

2014-12-01 Thread Gregory Farnum
On Mon, Dec 1, 2014 at 8:06 AM, John Spray wrote: > I meant to chime in earlier here but then the weekend happened, comments > inline > > On Sun, Nov 30, 2014 at 7:20 PM, Wido den Hollander wrote: >> Why would you want all CephFS metadata in memory? With any filesystem >> that will be a problem.

Re: [ceph-users] Official CentOS7 support

2014-12-02 Thread Gregory Farnum
We aren't currently doing any of the ongoing testing which that page covers on CentOS 7. I think that's because it's going to flow through the same Red Hat mechanisms as the RHEL7 builds, but I'm not on that team so I can't say for sure. -Greg On Tue, Dec 2, 2014 at 9:39 AM Frank Even wrote: > He

Re: [ceph-users] Official CentOS7 support

2014-12-02 Thread Gregory Farnum
On Tue, Dec 2, 2014 at 10:55 AM, Ken Dreyer wrote: > On 12/02/2014 10:59 AM, Gregory Farnum wrote: >> We aren't currently doing any of the ongoing testing which that page >> covers on CentOS 7. I think that's because it's going to flow through >> the same Re

Re: [ceph-users] Failed lossy con, dropping message

2014-12-04 Thread Gregory Farnum
It means that the connection from the client to the osd went away. This could happen just because the client shut down, but if so it quit before it had gotten commits from all its disk writes, which seems bad. It could also mean there was a networking problem of some kind. -Greg On Thu, Dec 4, 2014

Re: [ceph-users] experimental features

2014-12-05 Thread Gregory Farnum
On Fri, Dec 5, 2014 at 9:36 AM, Sage Weil wrote: > A while back we merged Haomai's experimental OSD backend KeyValueStore. > We named the config option 'keyvaluestore_dev', hoping to make it clear to > users that it was still under development, not fully tested, and not yet > ready for production.

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2014-12-05 Thread Gregory Farnum
On Thu, Dec 4, 2014 at 7:03 PM, Christian Balzer wrote: > > Hello, > > This morning I decided to reboot a storage node (Debian Jessie, thus 3.16 > kernel and Ceph 0.80.7, HDD OSDs with SSD journals) after applying some > changes. > > It came back up one OSD short, the last log lines before the reb

Re: [ceph-users] Unexplainable slow request

2014-12-08 Thread Gregory Farnum
n, 8 Dec 2014 19:51:00 -0800 Gregory Farnum wrote: > >> On Mon, Dec 8, 2014 at 6:39 PM, Christian Balzer wrote: >> > >> > Hello, >> > >> > Debian Jessie cluster, thus kernel 3.16, ceph 0.80.7. >> > 3 storage nodes with 8 OSDs (journals on 4 S

Re: [ceph-users] Unexplainable slow request

2014-12-08 Thread Gregory Farnum
On Mon, Dec 8, 2014 at 8:51 PM, Christian Balzer wrote: > On Mon, 8 Dec 2014 20:36:17 -0800 Gregory Farnum wrote: > >> They never fixed themselves? > As I wrote, it took a restart of OSD 8 to resolve this on the next day. > >> Did the reported times ever increase? >

Re: [ceph-users] Unexplainable slow request

2014-12-09 Thread Gregory Farnum
On Mon, Dec 8, 2014 at 6:39 PM, Christian Balzer wrote: > > Hello, > > Debian Jessie cluster, thus kernel 3.16, ceph 0.80.7. > 3 storage nodes with 8 OSDs (journals on 4 SSDs) each, 3 mons. > 2 compute nodes, everything connected via Infiniband. > > This is pre-production, currently there are only

Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Gregory Farnum
It looks like your OSDs all have weight zero for some reason. I'd fix that. :) -Greg On Tue, Dec 9, 2014 at 6:24 AM Giuseppe Civitella < giuseppe.civite...@gmail.com> wrote: > Hi, > > thanks for the quick answer. > I did try the force_create_pg on a pg but is stuck on "creating": > root@ceph-mon1:
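
A minimal example of fixing the zero weights (IDs and weights are placeholders; the weight is conventionally the disk size in TB):

  ceph osd tree                       # confirm the CRUSH weights really are 0
  ceph osd crush reweight osd.0 1.0   # repeat for each OSD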

Re: [ceph-users] Multiple MDS servers...

2014-12-09 Thread Gregory Farnum
MDSes. -Greg On Mon, Dec 8, 2014 at 10:48 AM, JIten Shah wrote: > Do I need to update the ceph.conf to support multiple MDS servers? > > —Jiten > > On Nov 24, 2014, at 6:56 AM, Gregory Farnum wrote: > >> On Sun, Nov 23, 2014 at 10:36 PM, JIten Shah wrote: >>>

Re: [ceph-users] Query about osd pool default flags & hashpspool

2014-12-09 Thread Gregory Farnum
On Tue, Dec 9, 2014 at 10:24 AM, Abhishek L wrote: > Hi > > I was going through various conf options to customize a ceph cluster and > came across `osd pool default flags` in pool-pg config ref[1]. The > value specifies an integer, though I couldn't find a mention of > possible values this

Re: [ceph-users] Is mon initial members used after the first quorum?

2014-12-10 Thread Gregory Farnum
On Tue, Dec 9, 2014 at 3:11 PM, Christopher Armstrong wrote: > Hi folks, > > I think we have a bit of confusion around how initial members is used. I > understand that we can specify a single monitor (or a subset of monitors) so > that the cluster can form a quorum when it first comes up. This is

Re: [ceph-users] Is mon initial members used after the first quorum?

2014-12-10 Thread Gregory Farnum
dr = 192.168.2.202:6789 > > > > [client.radosgw.gateway] > host = deis-store-gateway > keyring = /etc/ceph/ceph.client.radosgw.keyring > rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock > log file = /dev/stdout > > > On Wed, Dec 10, 2014 at 11:40 AM,

Re: [ceph-users] RESOLVED Re: Cluster with pgs in active (unclean) status

2014-12-11 Thread Gregory Farnum
Was there any activity against your cluster when you reduced the size from 3 -> 2? I think maybe it was just taking time to percolate through the system if nothing else was going on. When you reduced them to size 1 then data needed to be deleted so everything woke up and started processing. -Greg
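
For reference, the size changes being discussed are plain pool settings, e.g. (rbd is a placeholder pool name):

  ceph osd pool set rbd size 2
  ceph -w   # watch whether the PGs actually start transitioning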

Re: [ceph-users] unable to repair PG

2014-12-11 Thread Gregory Farnum
On Thu, Dec 11, 2014 at 2:57 AM, Luis Periquito wrote: > Hi, > > I've stopped OSD.16, removed the PG from the local filesystem and started > the OSD again. After ceph rebuilt the PG in the removed OSD I ran a > deep-scrub and the PG is still inconsistent. What led you to remove it from osd 16? Is

Re: [ceph-users] Is mon initial members used after the first quorum?

2014-12-11 Thread Gregory Farnum
On Thu, Dec 11, 2014 at 2:21 AM, Joao Eduardo Luis wrote: > On 12/11/2014 04:28 AM, Christopher Armstrong wrote: >> >> If someone could point me to where this fix should go in the code, I'd >> actually love to dive in - I've been wanting to contribute back to Ceph, >> and this bug has hit us perso

Re: [ceph-users] Missing some pools after manual deployment

2014-12-12 Thread Gregory Farnum
On Fri, Dec 12, 2014 at 11:06 AM, Patrick Darley wrote: > Hi there, > > I am using a custom Linux OS, with ceph v0.89. > > > I have been following the monitor bootstrap instructions [1]. > > I have a problem in that the OS is firmly on the systemd bandwagon > and lacks support to run the provided

Re: [ceph-users] unable to repair PG

2014-12-12 Thread Gregory Farnum
What version of Ceph are you running? Is this a replicated or erasure-coded pool? On Fri, Dec 12, 2014 at 1:11 AM, Luis Periquito wrote: > Hi Greg, > > thanks for your help. It's always highly appreciated. :) > > On Thu, Dec 11, 2014 at 6:41 PM, Gregory Farnum wrote: >&g

Re: [ceph-users] Is cache tiering production ready?

2014-12-17 Thread Gregory Farnum
Cache tiering is a stable, functioning system. Those particular commands are for testing and development purposes, not something you should run (although they ought to be safe). -Greg On Wed, Dec 17, 2014 at 1:44 AM Yujian Peng wrote: > Hi, > Since firefly, ceph can support cache tiering. > Cache

Re: [ceph-users] Double-mounting of RBD

2014-12-17 Thread Gregory Farnum
On Wed, Dec 17, 2014 at 2:31 PM, McNamara, Bradley wrote: > I have a somewhat interesting scenario. I have an RBD of 17TB formatted > using XFS. I would like it accessible from two different hosts, one > mapped/mounted read-only, and one mapped/mounted as read-write. Both are > shared using Sam

Re: [ceph-users] Content-length error uploading "big" files to radosgw

2014-12-18 Thread Gregory Farnum
On Thu, Dec 18, 2014 at 4:04 AM, Daniele Venzano wrote: > Hello, > > I have been trying to upload multi-gigabyte files to CEPH via the object > gateway, using both the swift and s3 APIs. > > With file up to about 2GB everything works as expected. > > With files bigger than that I get back a "400 B

Re: [ceph-users] Reproducable Data Corruption with cephfs kernel driver

2014-12-18 Thread Gregory Farnum
On Wed, Dec 17, 2014 at 8:52 PM, Lindsay Mathieson wrote: > I've been experimenting with CephFS for running KVM images (proxmox). > > cephfs fuse version - 0.87 > > cephfs kernel module - kernel version 3.10 > > > Part of my testing involves running a Windows 7 VM up and running > CrystalDiskMark

Re: [ceph-users] 1256 OSD/21 server ceph cluster performance issues.

2014-12-18 Thread Gregory Farnum
What kind of uploads are you performing? How are you testing? Have you looked at the admin sockets on any daemons yet? Examining the OSDs to see if they're behaving differently on the different requests is one angle of attack. The other is look into is if the RGW daemons are hitting throttler limit

Re: [ceph-users] 1256 OSD/21 server ceph cluster performance issues.

2014-12-19 Thread Gregory Farnum
On Thu, Dec 18, 2014 at 8:44 PM, Sean Sullivan wrote: > Thanks for the reply, Gregory, > > Sorry if this is in the wrong direction or something. Maybe I do not > understand > > To test uploads I use bash time and either python-swiftclient or boto > key.set_contents_from_filename to the radosg

Re: [ceph-users] Running ceph in Deis/Docker

2014-12-22 Thread Gregory Farnum
On Sun, Dec 21, 2014 at 8:20 PM, Jimmy Chu wrote: > Hi, > > This is a followup question to my previous question. When the last monitor > in a ceph monitor set is down, what is the proper way to boot up the ceph > monitor set again? > > On one hand, we could try not to make this happen, but on the

Re: [ceph-users] Ceph on ArmHF Ubuntu 14.4LTS?

2014-12-22 Thread Gregory Farnum
On Sun, Dec 21, 2014 at 11:54 PM, Christopher Kunz wrote: > Hi all, > > I'm trying to get a working PoC installation of Ceph done on an armhf > platform. I'm failing to find working Ceph packages (so does > ceph-deploy, too) for Ubuntu Trusty LTS. The ceph.com repos don't have > anything besides c

Re: [ceph-users] Slow requests: waiting_for_osdmap

2014-12-22 Thread Gregory Farnum
On Mon, Dec 22, 2014 at 8:20 AM, Wido den Hollander wrote: > Hi, > > While investigating slow requests on a Firefly (0.80.7) I looked at the > historic ops from the admin socket. > > On a OSD which just spitted out some slow requests I noticed: > > "received_at": "2014-12-22 17:08:41.496
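
For reference, the admin-socket queries for this kind of digging look like the following (the OSD ID is a placeholder):

  ceph daemon osd.12 dump_historic_ops    # recently completed slow ops with per-phase timestamps
  ceph daemon osd.12 dump_ops_in_flight   # ops currently stuck, e.g. in waiting_for_osdmap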

Re: [ceph-users] Slow requests: waiting_for_osdmap

2014-12-22 Thread Gregory Farnum
On Mon, Dec 22, 2014 at 10:30 AM, Wido den Hollander wrote: > For example, two ops: > > #1: > > { "description": "osd_sub_op(client.2433432.0:61603164 20.424 > 19038c24\/rbd_data.d7c912ae8944a.08b6\/head\/\/20 [] v > 63283'8301089 snapset=0=[]:[] snapc=0=[])", > "received_at

Re: [ceph-users] Not running multiple services on the same machine?

2015-01-02 Thread Gregory Farnum
I think it's just for service isolation that people recommend splitting them. The only technical issue I can think of is that you don't want to put kernel clients on the same OS as an OSD (due to deadlock scenarios under memory pressure and writeback). -Greg On Sat, Dec 27, 2014 at 12:11 PM Christo

Re: [ceph-users] RadosGW slow gc

2015-01-02 Thread Gregory Farnum
You can store radosgw data in a regular EC pool without any caching in front. I suspect this will work better for you, as part of the slowness is probably the OSDs trying to look up all the objects in the ec pool before deleting them. You should be able to check if that's the case by looking at the

Re: [ceph-users] Weighting question

2015-01-02 Thread Gregory Farnum
The meant-for-human-consumption free space estimates and things won't be accurate if you weight evenly instead of by size, but otherwise things should work just fine -- you'll simply get full OSD warnings when you have 1TB/OSD. -Greg On Thu, Jan 1, 2015 at 3:10 PM Lindsay Mathieson < lindsay.mathie

Re: [ceph-users] Adding Crush Rules

2015-01-02 Thread Gregory Farnum
I'm on my phone at the moment, but I think if you run "ceph osd crush rule" it will prompt you with the relevant options? On Tue, Dec 30, 2014 at 6:00 PM Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > Is there a command to do this without decompiling/editing/compiling the > crush > set?
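
A sketch of the CLI route Greg means (rule and pool names are placeholders; crush_ruleset is the pool setting used in releases of this era):

  ceph osd crush rule ls                                   # list existing rules
  ceph osd crush rule create-simple by-host default host   # simple replicated rule choosing hosts under 'default'
  ceph osd pool set rbd crush_ruleset 1                    # point a pool at the new rule by its ruleset id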

Re: [ceph-users] OSD weights and space usage

2015-01-03 Thread Gregory Farnum
On Saturday, January 3, 2015, Max Power < mailli...@ferienwohnung-altenbeken.de> wrote: > Ceph is a cool software but from time to time I am getting gray hairs > with it. And I hope that's because of a misunderstanding. This time I > want to balance the load between three osd's evenly (same usage

Re: [ceph-users] Added OSD's, weighting

2015-01-03 Thread Gregory Farnum
You might try temporarily increasing the backfill allowance params so that the stuff can move around more quickly. Given the cluster is idle it's definitely hitting those limits. ;) -Greg On Saturday, January 3, 2015, Lindsay Mathieson wrote: > I just added 4 OSD's to my 2 OSD "cluster" (2 Nodes
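
For example (values illustrative; injected settings revert on OSD restart, or can be set back down once backfill finishes):

  ceph tell osd.* injectargs '--osd-max-backfills 5 --osd-recovery-max-active 5'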

Re: [ceph-users] rbd snapshot slow restore

2015-01-05 Thread Gregory Farnum
On Mon, Jan 5, 2015 at 12:11 PM, Robert LeBlanc wrote: > If Ceph snapshots work like VM snapshots (and I don't have any reason to > believe otherwise), the snapshot will never grow larger than the size of the > base image. If the same blocks are rewritten, then they are just rewritten > in the sna

Re: [ceph-users] Added OSD's, weighting

2015-01-05 Thread Gregory Farnum
On Sat, Jan 3, 2015 at 8:53 PM, Christian Balzer wrote: > On Sat, 3 Jan 2015 16:21:29 +1000 Lindsay Mathieson wrote: > >> I just added 4 OSD's to my 2 OSD "cluster" (2 Nodes, now have 3 OSD's per >> node). >> >> Given its the weekend and not in use, I've set them all to weight 1, but >> looks like

Re: [ceph-users] Cache tiers flushing logic

2015-01-06 Thread Gregory Farnum
On Tue, Dec 30, 2014 at 11:38 AM, Erik Logtenberg wrote: >> >> Hi Erik, >> >> I have tiering working on a couple test clusters. It seems to be >> working with Ceph v0.90 when I set: >> >> ceph osd pool set POOL hit_set_type bloom >> ceph osd pool set POOL hit_set_count 1 >> ceph osd pool set PO
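
For context, the flush/evict thresholds that drive this logic are pool settings as well, e.g. (POOL as above; the ratios are fractions of the cache pool's configured target size):

  ceph osd pool set POOL cache_target_dirty_ratio 0.4   # start flushing dirty objects at 40% full
  ceph osd pool set POOL cache_target_full_ratio 0.8    # start evicting clean objects at 80% full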

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Gregory Farnum
On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc wrote: > Before we base thousands of VM image clones off of one or more snapshots, I > want to test what happens when the snapshot becomes corrupted. I don't > believe the snapshot will become corrupted through client access to the > snapshot, but so

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Gregory Farnum
On Sun, Jan 4, 2015 at 8:10 AM, Lionel Bouton wrote: > On 01/04/15 16:25, Jiri Kanicky wrote: >> Hi. >> >> I have been experiencing same issues on both nodes over the past 2 >> days (never both nodes at the same time). It seems the issue occurs >> after some time when copying a large number of f

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Gregory Farnum
I'm afraid I don't know what would happen if you change those options. Hopefully we've set it up so things continue to work, but we definitely don't test it. -Greg On Tue, Jan 6, 2015 at 8:22 AM Lionel Bouton wrote: > On 01/06/15 02:36, Gregory Farnum wrote: > > [.

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-08 Thread Gregory Farnum
On Wed, Jan 7, 2015 at 9:55 PM, Christian Balzer wrote: > On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote: > >> On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva wrote: >> >> > However, I suspect that temporarily setting min size to a lower number >> > could be enough for the PGs to recover.

Re: [ceph-users] Uniform distribution

2015-01-09 Thread Gregory Farnum
100GB objects (or ~40 on a hard drive!) are way too large for you to get an effective random distribution. -Greg On Thu, Jan 8, 2015 at 5:25 PM, Mark Nelson wrote: > On 01/08/2015 03:35 PM, Michael J Brewer wrote: >> >> Hi all, >> >> I'm working on filling a cluster to near capacity for testing p

Re: [ceph-users] Documentation of ceph pg query

2015-01-09 Thread Gregory Farnum
On Fri, Jan 9, 2015 at 1:24 AM, Christian Eichelmann wrote: > Hi all, > > as mentioned last year, our ceph cluster is still broken and unusable. > We are still investigating what has happened and I am taking more deep > looks into the output of ceph pg query. > > The problem is that I can find so

Re: [ceph-users] ceph on peta scale

2015-01-09 Thread Gregory Farnum
On Thu, Jan 8, 2015 at 5:46 AM, Zeeshan Ali Shah wrote: > I just finished configuring ceph up to 100 TB with openstack ... Since we > are also using Lustre in our HPC machines , just wondering what is the > bottle neck in ceph going on Peta Scale like Lustre . > > any idea ? or someone tried it I

Re: [ceph-users] Is ceph production ready? [was: Ceph PG Incomplete = Cluster unusable]

2015-01-09 Thread Gregory Farnum
On Fri, Jan 9, 2015 at 2:00 AM, Nico Schottelius wrote: > Lionel, Christian, > > we do have the exactly same trouble as Christian, > namely > > Christian Eichelmann [Fri, Jan 09, 2015 at 10:43:20AM +0100]: >> We still don't know what caused this specific error... > > and > >> ...there is currently

Re: [ceph-users] cephfs modification time

2015-01-12 Thread Gregory Farnum
What versions of all the Ceph pieces are you using? (Kernel client/ceph-fuse, MDS, etc) Can you provide more details on exactly what the program is doing on which nodes? -Greg On Fri, Jan 9, 2015 at 5:15 PM, Lorieri wrote: > first 3 stat commands shows blocks and size changing, but not the times

Re: [ceph-users] ceph on peta scale

2015-01-12 Thread Gregory Farnum
> On Fri, Jan 9, 2015 at 7:15 PM, Gregory Farnum wrote: >> >> On Thu, Jan 8, 2015 at 5:46 AM, Zeeshan Ali Shah >> wrote: >> > I just finished configuring ceph up to 100 TB with openstack ... Since >> > we >> > are also using Lustre in our HPC machines , just wondering what is the >> > bottle neck in ceph going on Peta Scale like Lustre .

Re: [ceph-users] cephfs modification time

2015-01-12 Thread Gregory Farnum
> https://github.com/ActiveState/tail > FAILED -> /usr/bin/tail of a Google docker image running debian wheezy > PASSED -> /usr/bin/tail of a ubuntu 14.04 docker image > PASSED -> /usr/bin/tail of the coreos release 494.5.0 > > > Tests in machine #1 (same machine that

Re: [ceph-users] reset osd perf counters

2015-01-12 Thread Gregory Farnum
"perf reset" on the admin socket. I'm not sure what version it went in to; you can check the release logs if it doesn't work on whatever you have installed. :) -Greg On Mon, Jan 12, 2015 at 2:26 PM, Shain Miley wrote: > Is there a way to 'reset' the osd perf counters? > > The numbers for osd 73

Re: [ceph-users] cephfs modification time

2015-01-14 Thread Gregory Farnum
Awesome, thanks for the bug report and the fix, guys. :) -Greg On Mon, Jan 12, 2015 at 11:18 PM, 严正 wrote: > I tracked down the bug. Please try the attached patch > > Regards > Yan, Zheng > > > > >> On 13 Jan 2015, at 07:40, Gregory Farnum wrote: >> >> Zheng, t

Re: [ceph-users] NUMA zone_reclaim_mode

2015-01-14 Thread Gregory Farnum
On Mon, Jan 12, 2015 at 8:25 AM, Dan Van Der Ster wrote: > > On 12 Jan 2015, at 17:08, Sage Weil wrote: > > On Mon, 12 Jan 2015, Dan Van Der Ster wrote: > > Moving forward, I think it would be good for Ceph to a least document > this behaviour, but better would be to also detect when > zone_recla

Re: [ceph-users] How to tell a VM to write more local ceph nodes than to the network.

2015-01-14 Thread Gregory Farnum
On Tue, Jan 13, 2015 at 1:03 PM, Roland Giesler wrote: > I have a 4 node ceph cluster, but the disks are not equally distributed > across all machines (they are substantially different from each other) > > One machine has 12 x 1TB SAS drives (h1), another has 8 x 300GB SAS (s3) and > two machines

Re: [ceph-users] How to tell a VM to write more local ceph nodes than to the network.

2015-01-16 Thread Gregory Farnum
On Fri, Jan 16, 2015 at 2:52 AM, Roland Giesler wrote: > On 14 January 2015 at 21:46, Gregory Farnum wrote: >> >> On Tue, Jan 13, 2015 at 1:03 PM, Roland Giesler >> wrote: >> > I have a 4 node ceph cluster, but the disks are not equally distributed >

Re: [ceph-users] Cache data consistency among multiple RGW instances

2015-01-19 Thread Gregory Farnum
On Sun, Jan 18, 2015 at 6:40 PM, ZHOU Yuan wrote: > Hi list, > > I'm trying to understand the RGW cache consistency model. My Ceph > cluster has multiple RGW instances with HAProxy as the load balancer. > HAProxy would choose one RGW instance to serve the request(with > round-robin). > The questio

Re: [ceph-users] Cache data consistency among multiple RGW instances

2015-01-19 Thread Gregory Farnum
n > > > On Mon, Jan 19, 2015 at 10:58 PM, Gregory Farnum wrote: > > On Sun, Jan 18, 2015 at 6:40 PM, ZHOU Yuan wrote: > >> Hi list, > >> > >> I'm trying to understand the RGW cache consistency model. My Ceph > >> cluster has multiple RGW inst

Re: [ceph-users] Behaviour of Ceph while OSDs are down

2015-01-20 Thread Gregory Farnum
On Tue, Jan 20, 2015 at 2:40 AM, Christian Eichelmann wrote: > Hi all, > > I want to understand what Ceph does if several OSDs are down. First of all, > some words about our setup: > > We have 5 monitors and 12 OSD servers, each with 60x2TB disks. These servers > are spread across 4 racks in our datace

Re: [ceph-users] Automatically timing out/removing dead hosts?

2015-01-20 Thread Gregory Farnum
On Tue, Jan 20, 2015 at 1:32 AM, Christopher Armstrong wrote: > Hi folks, > > We have many users who run Deis on AWS, and our default configuration places > hosts in an autoscaling group. Ceph runs on all hosts in the cluster > (monitors and OSDs), and users have reported losing quorum after havin
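
For reference, a monitor that has scaled away never leaves the monmap on its own; it has to be removed explicitly before the quorum requirement shrinks (the monitor name is a placeholder):

  ceph mon remove deis-3   # drop the dead monitor from the monmap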

Re: [ceph-users] Is it possible to compile and use ceph with Raspberry Pi single-board computers?

2015-01-21 Thread Gregory Farnum
Joao has done it in the past so it's definitely possible, but I confess I don't know what if anything he had to hack up to make it work or what's changed since then. ARMv6 is definitely not something we worry about when adding dependencies. :/ -Greg On Thu, Jan 15, 2015 at 12:17 AM, Prof. Dr. Chri

Re: [ceph-users] CEPHFS with Erasure Coded Pool for Data and Replicated Pool for Meta Data

2015-01-21 Thread Gregory Farnum
On Tue, Jan 20, 2015 at 5:48 AM, Mohamed Pakkeer wrote: > > Hi all, > > We are trying to create 2 PB scale Ceph storage cluster for file system > access using erasure coded profiles in giant release. Can we create Erasure > coded pool (k+m = 10 +3) for data and replicated (4 replicas) pool for > m

Re: [ceph-users] CEPHFS with Erasure Coded Pool for Data and Replicated Pool for Meta Data

2015-01-21 Thread Gregory Farnum
> release of CephFS happen with erasure coded pool? We are ready to test >> > peta-byte scale CephFS cluster with erasure coded pool. >> > >> > >> > -Mohammed Pakkeer >> > >> > On Wed, Jan 21, 2015 at 9:11 AM, Gregory Farnum

Re: [ceph-users] How to do maintenance without falling out of service?

2015-01-21 Thread Gregory Farnum
On Mon, Jan 19, 2015 at 8:40 AM, J David wrote: > A couple of weeks ago, we had some involuntary maintenance come up > that required us to briefly turn off one node of a three-node ceph > cluster. > > To our surprise, this resulted in failure to write on the VM's on that > ceph cluster, even thoug
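
For reference, the usual way to take a node down briefly without the cluster trying to rebalance around it (this does not by itself explain the failed writes described in the thread):

  ceph osd set noout     # before the maintenance window: don't mark stopped OSDs out
  # ...stop the node, do the work, bring it back up...
  ceph osd unset noout   # once everything has rejoined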

Re: [ceph-users] 4 GB mon database?

2015-01-21 Thread Gregory Farnum
On Mon, Jan 19, 2015 at 2:48 PM, Brian Rak wrote: > Awhile ago, I ran into this issue: http://tracker.ceph.com/issues/10411 > > I did manage to solve that by deleting the PGs, however ever since that > issue my mon databases have been growing indefinitely. At the moment, I'm > up to 3404 sst file

Re: [ceph-users] CEPHFS with Erasure Coded Pool for Data and Replicated Pool for Meta Data

2015-01-21 Thread Gregory Farnum
creating CephFS? Also we would like to know, when will the production > release of CephFS happen with erasure coded pool ? We are ready to test > peta-byte scale CephFS cluster with erasure coded pool. > > > -Mohammed Pakkeer > > On Wed, Jan 21, 2015 at 9:11 AM, Gregory Farnum
