> ...rebalance in both
> Rack buckets and continue running, or only rebalance in one rack bucket,
> resulting in exceeding the full ratio and locking up?
>
> Thanks,
>
> -Tom
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@inktank.com]
> Sent: Tuesday, Marc
[ Re-adding the list. ]
On Mon, Mar 3, 2014 at 3:28 PM, Chris Kitzmiller
wrote:
> On Mar 3, 2014, at 4:19 PM, Gregory Farnum wrote:
>> The apply latency is how long it's taking for the backing filesystem to ack
>> (not sync to disk) writes from the OSD. Either it's gett
Hmm, at first glance it looks like you're using multiple active MDSes
and you've created some snapshots and part of that state got corrupted
somehow. The log files should have a slightly more helpful (including
line numbers) stack trace at the end, and might have more context for
what's gone wrong.
If the stripe size and object size are the same it's just chunking --
that's our default. Should work fine.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
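For illustration only, a minimal sketch of spelling the striping out explicitly at RBD image creation time (flag behaviour can vary a little between releases); a 4 MB stripe unit with a stripe count of 1 is the same as the plain-chunking default described above:

    rbd create test --size 10240 --image-format 2 --order 22 \
        --stripe-unit 4194304 --stripe-count 1
    rbd info test    # shows the resulting order / stripe settings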
On Tue, Mar 11, 2014 at 8:23 AM, Jean-Charles LOPEZ
wrote:
> Hi Dieter,
>
> you have a problem with your command.
>
> You
On Tue, Mar 11, 2014 at 1:38 PM, Sushma Gurram
wrote:
> Hi,
>
>
>
> I'm trying to follow the instructions for QEMU rbd installation at
> http://ceph.com/docs/master/rbd/qemu-rbd/
>
>
>
> I tried to write a raw qemu image to ceph cluster using the following
> command
>
> qemu-img convert -f raw -O
On Tue, Mar 11, 2014 at 2:24 PM, Sushma Gurram
wrote:
> It seems good with master branch. Sorry about the confusion.
>
> On a side note, is it possible to create/access the block device using librbd
> and run fio on it?
...yes? librbd is the userspace library that QEMU is using to access
it to b
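For example, recent fio builds ship an rbd ioengine that talks to librbd directly, so no kernel mapping is needed. A minimal sketch (the pool, image, and client names are assumptions):

    # the image must already exist, e.g.: rbd create fio-test --size 10240
    fio --name=rbd-test --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=fio-test --rw=randwrite --bs=4k --iodepth=32 \
        --direct=1 --runtime=60 --time_based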
On Wednesday, March 12, 2014, Florian Krauß
wrote:
> Hello everyone,
>
> this is the first time I have ever written to a mailing list, so please be patient
> with me (especially with my poor English)...
> I'm trying to reach my Bachelor's degree in Computer Science, and I'm doing a
> project which involves Ceph.
>
On Thu, Mar 13, 2014 at 3:56 PM, Greg Poirier wrote:
> We've been seeing this issue on all of our dumpling clusters, and I'm
> wondering what might be the cause of it.
>
> In dump_historic_ops, the time between op_applied and sub_op_commit_rec or
> the time between commit_sent and sub_op_applied i
> rbd_data.67b14a2ae8944a.8fac [write 3325952~868352] 6.5255f5fd
> e660)",
> "received_at": "2014-03-13 20:41:40.227813",
> "age": "320.017087",
> "duration": "0.086852",
>
alleviated by migrating journals to SSDs, but I am looking to
> rebuild in the near future--so am willing to hobble in the meantime.
>
> I am surprised that our all SSD cluster is also underperforming. I am trying
> colocating the journal on the same disk with all SSDs at the moment
That seems a little high; how do you have your system configured? That
latency is how long it takes for the hard drive to durably write out
something to the journal.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
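One way to look at that latency on a single OSD is the admin socket; a sketch assuming osd.0 and the default socket path:

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump \
        | python -mjson.tool | grep -A 3 journal_latency
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops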
On Sun, Mar 16, 2014 at 9:59 PM, wrote:
>
> [root@storage1 ~]#
> [osd.8]
> host = storage1
>
> [osd.9]
> host = storage1
>
> [osd.10]
> host = storage1
>
> [osd.11]
> host = storage1
>
> [osd.12]
> host = storage1
>
> [osd.13]
> host = storage1
>
> [osd.14]
> host = storage1
>
> [osd.15]
> h
On Tue, Mar 18, 2014 at 12:20 PM, Sage Weil wrote:
> On Tue, 18 Mar 2014, John Spray wrote:
>> Hi Matt,
>>
>> This is expected behaviour: pool IDs are not reused.
>
> The IDs go up, but I think the 'count' shown there should not.. i.e.
> num_pools != max_pool_id. So probably a subtle bug, I expec
Exactly what errors did you see, from which log? In general the OSD
does suicide on filesystem errors.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Mar 19, 2014 at 4:06 AM, Mike Bryant wrote:
> So I've done some more digging, and running the radosgw in debug mode I
I haven't worked with Hadoop in a while, but from the error it sounds
like the map reduce server needs another config option set specifying
which filesystem to work with. I don't think those instructions you
linked to are tested with hadoop 2.
-Greg
Software Engineer #42 @ http://inktank.com | http
When starting this you should be aware that the filesystem is not yet fully
supported.
On Thursday, March 20, 2014, Jordi Sion wrote:
> Hello,
>
> I plan to setup a Ceph cluster for a small size hosting company. The aim
> is to have customers data (website and mail folders) in a distributed
> cl
I don't remember what features should exist where, but I expect that
the cluster is making use of features that the kernel client doesn't
support yet (despite the very new kernel). Have you checked to see if
there's anything interesting in dmesg?
-Greg
Software Engineer #42 @ http://inktank.com | h
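A sketch of what to check (the exact messages vary by kernel version):

    dmesg | tail -50    # look for lines like "libceph: ... feature set mismatch"
    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt   # decompiled map lists the tunables in use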
That is pretty strange, but I think probably there's somehow a mismatch
between the installed versions. Can you check with the --version flag on
both binaries?
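For example:

    ceph --version
    ceph-mon --version
    ceph-osd --version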
On Monday, March 24, 2014, Mark Kirkwood
wrote:
> Hi,
>
> I'm redeploying my development cluster after building 0.78 from src on
> Ubunt
How long does it take for the OSDs to restart? Are you just issuing a
restart command via upstart/sysvinit/whatever? How many OSDMaps are
generated from the time you issue that command to the time the cluster
is healthy again?
This sounds like an issue we had for a while where OSDs would start
pee
On Mon, Mar 24, 2014 at 6:26 PM, hjcho616 wrote:
> I tried the patch twice. First time, it worked. There was no issue.
> Connected back to MDS and was happily running. All three MDS daemons were
> running ok.
>
> Second time though... all three daemons were alive. Health was reported OK.
> Howev
On Tue, Mar 25, 2014 at 9:24 AM, Travis Rhoden wrote:
> Okay, last one until I get some guidance. Sorry for the spam, but wanted to
> paint a full picture. Here are debug logs from all three mons, capturing
> what looks like an election sequence to me:
>
> ceph0:
> 2014-03-25 16:17:24.324846 7fa
evels showing what the individual pipes are doing will
narrow it down on the Ceph side.)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Mar 25, 2014 at 10:05 AM, Travis Rhoden wrote:
>
>
>
> On Tue, Mar 25, 2014 at 12:53 PM, Gregory Farnum wrote:
>>
>
on 0.78-325-ge5a4f5e (e5a4f5ed005c9349be94b19ef33d6fe08271c798)
>>>
>>> On 25/03/14 14:16, Mark Kirkwood wrote:
>>>>
>>>> Yeah, that is my feeling too - however both ceph and ceph-mon claim to
>>>> be the same version...and the dates on the var
On Tue, Mar 25, 2014 at 9:56 AM, hjcho616 wrote:
> I am merely putting the client to sleep and waking it up. When it is up,
> running ls on the mounted directory. As far as I am concerned at very high
> level I am doing the same thing. All are running 3.13 kernel Debian
> provided.
>
> When tha
es to 300+
>
> http://pastie.org/pastes/8968950/text?key=0e0bs1ojbm2arnexn52iwq
>
> Regards,
> Quenten
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@inktank.com]
> Sent: Wednesday, 26 March 2014 2:02 AM
> To: Quenten Grasso
> Cc: Kyle Bader; ceph-users@lists.cep
I believe it's just that there was an issue for a while where the
return codes were incorrectly not being filled in, and now they are.
So the prval params you pass in when constructing the compound ops are
where the values will be set.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.
That's not an expected error when running this test; have you
validated that your cluster is working in any other ways? Eg, what's
the output of "ceph -s".
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Fri, Mar 28, 2014 at 5:53 AM, wrote:
> Hi,
>
> When I try "rados -p
At present, the only security permission on the MDS is "allowed to do
stuff", so "rwx" and "*" are synonymous. In general "*" means "is an
admin", though, so you'll be happier in the future if you use "rwx".
You may also want a more restrictive set of monitor capabilities as
somebody else recently
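A minimal sketch of a more restrictive key along those lines (the client name and pool are assumptions; adjust to your setup):

    ceph auth get-or-create client.cephfs-user \
        mon 'allow r' mds 'allow rwx' osd 'allow rwx pool=data'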
Hmm, this might be considered a bit of a design oversight. Looking at
the auth keys is a read operation, and the client has read
permissions...
You might want to explore the more fine-grained command-based monitor
permissions as a workaround, but I've created a ticket to try and
close that read per
Is the mon process doing anything (that is, does it have any CPU
usage)? This looks to be an internal leveldb issue, but not one that
we've run into before, so I think there must be something unique about
the leveldb store involved.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
Not directly. However, that "used" total is compiled by summing up the
output of "df" from each individual OSD disk. There's going to be some
space used up by the local filesystem metadata, by RADOS metadata like
OSD maps, and (depending on configuration) your journal files. 2350MB
/ 48 OSDs = ~49M
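You can see where that overhead lives by comparing the cluster-wide totals with the per-disk numbers on each OSD host; a sketch assuming the default mount paths:

    ceph df
    df -h /var/lib/ceph/osd/ceph-*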
If you wait longer, you should see the remaining OSDs get marked down.
We detect down OSDs in two ways:
1) OSDs heartbeat each other frequently and issue reports when the
heartbeat responses take too long. (This is the main way.)
2) OSDs periodically send statistics to the monitors, and if these
st
> dk
>
> On Mon, Mar 31, 2014 at 12:47 PM, Gregory Farnum wrote:
>>
>> If you wait longer, you should see the remaining OSDs get marked down.
>> We detect down OSDs in two ways:
>> 1) OSDs heartbeat each other frequently and issue reports when the
>> heartbeat res
Can you reproduce this with "debug osd = 20" and "debug ms = 1" set on
the OSD? I think we'll need that data to track down what exactly has
gone wrong here.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
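For example, to bump those settings on a running OSD without a restart (osd.3 is a placeholder), or to make them persistent in ceph.conf:

    ceph tell osd.3 injectargs '--debug-osd 20 --debug-ms 1'

    # or under [osd] in ceph.conf, then restart the daemon:
    #   debug osd = 20
    #   debug ms = 1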
On Mon, Mar 31, 2014 at 1:22 PM, Aaron Ten Clay wrote:
> Hello fellow Ce
Yes, Zheng's fix for the MDS crash is in current mainline and will be
in the next Firefly RC release.
Sage, is there something else we can/should be doing when a client
goes to sleep that we aren't already? (ie, flushing out all dirty data
or something and disconnecting?)
-Greg
Software Engineer #
On Tue, Apr 1, 2014 at 7:12 AM, Yan, Zheng wrote:
> On Tue, Apr 1, 2014 at 10:02 PM, Kenneth Waegeman
> wrote:
>> After some more searching, I've found that the source of the problem is with
>> the mds and not the mon.. The mds crashes, generates a core dump that eats
>> the local space, and in t
fails (the first of
"mds.0.16 is_laggy 600.641332 > 15 since last acked beacon") and see
if there's anything tell-tale going on at the time.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Apr 2, 2014 at 3:39 AM, Kenneth Waegeman
wrote:
>
>
Emperor release (0.72) ?
>
> I think I have the same problem than hjcho616 : Debian Wheezy with 3.13
> backports, and MDS dying when a client shutdown.
>
> On 03/31/2014 11:46 PM, Gregory Farnum wrote:
>> Yes, Zheng's fix for the MDS crash is in current mainline and will be
It's been a while, but I think you need to use the long form
"client_mountpoint" config option here instead. If you search the list
archives it'll probably turn up; this is basically the only reason we
ever discuss "-r". ;)
Software Engineer #42 @ http://inktank.com | http://ceph.com
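A minimal sketch of the long form in ceph.conf (the subdirectory is an assumption); ceph-fuse will then mount only that subtree:

    [client]
        client mountpoint = /some/subdir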
On Wed, Apr
>
> Since the mds permissions are functionally equivalent, either I need extra
> rights on the monitor, or the OSDs. Does a client need to access the
> metadata pool in order to do a CephFS mount?
>
> I'll experiment a bit and report back.
>
>
> On Mon, Mar 31, 2014 at
>
>>> --- begin dump of recent events ---
>>> -1> 2014-04-01 11:59:27.137779 7ffec89c6700 5 mds.0.10 initiating
>>> monitor reconnect; maybe we're not the slow one
>>> -> 2014-04-01 11:59:27.137787 7ffec89c6700 10 monclient(hunting):
>>> _reo
Yes.
On Thu, Apr 3, 2014 at 12:56 AM, Florent B wrote:
> Thank you Gregory !
>
> I think I found all options :
> https://github.com/ceph/ceph/blob/master/src/common/config_opts.h
>
> Is that right ?
>
> On 04/02/2014 04:19 PM, Gregory Farnum wrote:
>> It's be
The filesystem interprets nonexistent file objects as holes -- so, zeroes.
This is expected. If you actually deleted *metadata* objects it would
detect that and fail.
-Greg
On Thursday, April 3, 2014, Danny Luhde-Thompson <
da...@meantradingsystems.com> wrote:
> I accidentally removed some MDS ob
On Thursday, April 3, 2014, Chad Seys wrote:
> On Thursday, April 03, 2014 07:57:58 Dan Van Der Ster wrote:
> > Hi,
> > By my observation, I don't think that marking it out before crush rm
> would
> > be any safer.
> >
> > Normally what I do (when decommissioning an OSD or whole server) is stop
>
Ceph will allow anything; it's just providing a block device. How it
performs will depend quite a lot on the database workload you're
applying, though. We've heard from people who think it's wonderful and
others who don't, depending on what hardware they're using and what
their use case is. You'll
On Fri, Apr 4, 2014 at 11:15 AM, Milosz Tanski wrote:
> Loic,
>
> The writeup has been helpful.
>
> What I'm curious about (and hasn't been mentioned) is can we use
> erasure with CephFS? What steps have to be taken in order to setup
> erasure coding for CephFS?
Lots. CephFS takes advantage of al
On Sat, Apr 5, 2014 at 10:00 AM, Max Kutsevol wrote:
> Hello!
>
> I am new to ceph, please take that into account.
>
> I'm experimenting with 3mons+2osds setup and got into situation when I
> recreated both of osds.
>
> My pools:
> ceph> osd lspools
> 0 data,1 metadata,
>
> These are just the def
Nope, that's not supported. See
http://ceph.com/docs/master/radosgw/s3/#features-support
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Mon, Apr 7, 2014 at 6:41 PM, Craig Lewis wrote:
> Does RGW support the S3 Object Lifecycle Management?
> http://docs.aws.amazon.com/Amazo
On Tuesday, April 8, 2014, Christian Balzer wrote:
> On Tue, 08 Apr 2014 14:19:20 +0200 Josef Johansson wrote:
> >
> > On 08/04/14 10:39, Christian Balzer wrote:
> > > On Tue, 08 Apr 2014 10:31:44 +0200 Josef Johansson wrote:
> > >
> > >> On 08/04/14 10:04, Christian Balzer wrote:
> > >>> Hello,
On Tue, Apr 8, 2014 at 4:57 PM, Craig Lewis wrote:
>
>>
>> pg query says the recovery state is:
>> "might_have_unfound": [
>> { "osd": 11,
>> "status": "querying"},
>> { "osd": 13,
>> "status": "already probed"}],
>>
> I
This flag won't be listed as required if you don't have any erasure
coding parameters in your OSD/crush maps. So if you aren't using it,
you should remove the EC rules and the kernel should be happy.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
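A sketch of checking for and removing unused EC pools and rules (names are placeholders; a rule can only be removed once no pool references it, and deleting a pool destroys its data):

    ceph osd crush rule ls
    ceph osd pool delete ecpool ecpool --yes-i-really-really-mean-it
    ceph osd crush rule rm ecpool_ruleset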
On Tue, Apr 8, 2014 at 6:08 PM
Ceph is designed to handle reliability in its system rather than in an
external one. You could set it up to use that storage and not do its
own replication, but then you lose availability if the OSD process
hosts disappear, etc. And the filesystem (which I guess is the part
you're interested in) is
I don't think the backing store should be seeing any effects like
that. What are the filenames which are using up that space inside the
folders?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Apr 9, 2014 at 1:58 AM, Mark Kirkwood
wrote:
> Hi all,
>
> I've noticed that
much is waiting to get into
the durable journal, not waiting to get flushed out of it.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Apr 9, 2014 at 3:06 AM, Christian Balzer wrote:
> On Tue, 8 Apr 2014 09:35:19 -0700 Gregory Farnum wrote:
>
>> On Tuesd
e +1.714.602.1309
> Email cle...@centraldesktop.com
>
> Central Desktop. Work together in ways you never thought possible.
> Connect with us Website | Twitter | Facebook | LinkedIn | Blog
>
> On 4/8/14 18:27 , Gregory Farnum wrote:
>
> On Tue, Apr 8, 2014 at 4:57 PM, Cr
On Wed, Apr 9, 2014 at 8:03 AM, Christian Balzer wrote:
>
> Hello,
>
> On Wed, 9 Apr 2014 07:31:53 -0700 Gregory Farnum wrote:
>
>> journal_max_write_bytes: the maximum amount of data the journal will
>> try to write at once when it's coalescing multiple pen
neral? Even if the kernel
> doesn't support EC pools directly, but would work in a cluster with EC pools
> in use?
>
> Thanks,
> -mike
>
>
> On Wed, 9 Apr 2014, Gregory Farnum wrote:
>
>> This flag won't be listed as required if you don't have any erasure
>
y on these multi host configurations that
> have osd's using whole devices (both setups installed using ceph-deploy, so
> in theory nothing exotic about 'em except for the multi 'hosts' are actually
> VMs).
>
> Regards
>
> Mark
>
> On 10/04/14 02:27
Sounds like you want to explore the auto-in settings, which can
prevent new OSDs from being automatically accepted into the cluster.
Should turn up if you search ceph.com/docs. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
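Two related knobs, as a sketch (check the docs for your release): the cluster-wide noin flag, and the option controlling whether brand-new OSDs are marked in automatically:

    ceph osd set noin      # booting OSDs stay "out" until you run: ceph osd in <id>
    ceph osd unset noin

    # in ceph.conf under [mon]:
    #   mon osd auto mark new in = false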
On Thu, Apr 10, 2014 at 1:45 PM, wrote:
> Hi All
Yes. It's awkward and the whole "two weights" thing needs a bit of UI
reworking, but it's expected behavior.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
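For reference, the two weights are adjusted with different commands; a sketch using osd.12 as a placeholder:

    ceph osd crush reweight osd.12 1.82   # CRUSH weight, conventionally the disk size in TB
    ceph osd reweight 12 0.85             # override weight, a value between 0 and 1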
On Thu, Apr 10, 2014 at 3:59 PM, Craig Lewis wrote:
> I've got some OSDs that are nearfull. Hardware is ordered, and I'
I don't know if there's any formal documentation, but it's a lot simpler
than the other components because it doesn't use any local storage (except
for the keyring). You basically just need to generate a key and turn it on.
Have you set one up by hand before?
-Greg
On Thursday, April 10, 2014, Ada
How many monitors do you have?
It's also possible that re-used numbers won't get caught in this,
depending on the process you went through to clean them up, but I
don't remember the details of the code here.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Thu, Apr 10, 2014 a
If you never ran "osd rm" then the monitors still believe it's an existing
OSD. You can run that command after doing the crush rm stuff, but you
should definitely do so.
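The usual full removal sequence looks something like this (osd.12 is a placeholder):

    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12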
On Friday, April 11, 2014, Chad Seys wrote:
> Hi Greg,
> > How many monitors do you have?
>
> 1 . :)
>
> > It's also possible
On Wed, Apr 9, 2014 at 8:41 PM, Mark Kirkwood
wrote:
> Redoing (attached, 1st file is for 2x space, 2nd for normal). I'm seeing:
>
> $ diff osd-du.0.txt osd-du.1.txt
> 924,925c924,925
> < 2048 /var/lib/ceph/osd/ceph-1/current/5.1a_head/file__head_2E6FB49A__5
> < 2048 /var/lib/ceph/osd/ceph-1/cu
On Fri, Apr 11, 2014 at 11:12 PM, Christian Balzer wrote:
>
> Hello,
>
> 3 node cluster (2 storage with 2 OSDs one dedicated mon), 3 mons total.
> Debian Jessie, thus 3.13 kernel and Ceph 0.72.2.
>
> 2 of the mons (including the leader) are using around 100MB RSS and one
> was using about 1.1GB.
>
That bug was resolved a long time ago; as long as you're using one of the
Emperor point releases you'll be fine.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Mon, Apr 14, 2014 at 1:46 AM, Stanislav Yanchev
wrote:
> Hello, I have a question about upgrading from the latest
On Thu, Apr 10, 2014 at 7:27 PM, Adam Clark wrote:
> Wow, that was quite simple
>
> mkdir /var/lib/ceph/mds/ceph-0
> ceph auth get-or-create mds.0 mds 'allow' osd 'allow *' mon 'allow *' >
> /var/lib/ceph/mds/ceph-0/keyring
> ceph-mds --id 0
>
> mount -t ceph ceph-mon01:6789:/ /mnt -o name=admin,
You just need to wait for the ondisk or complete ack in whatever
interface you choose. It won't come back until the data is persisted
to all extant copies.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Mon, Apr 7, 2014 at 4:08 PM, Steven Paster wrote:
> I am using the Cep
This looks like some kind of HBase issue to me (which I can't help
with; I've never used it), but I guess if I were looking at Ceph I'd
check if it was somehow configured such that the needed files are
located in different pools (or other separate security domains) that
might be set up wrong.
-Greg
Don't do that. I'm pretty sure it doesn't actually work, and if it
does it certainly won't perform better than with it off.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Apr 15, 2014 at 1:53 PM, Qing Zheng wrote:
> Hi -
>
> We have a question on mds journaling.
>
> Is
ion to?
>
> Cheers,
>
> -- Qing
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@inktank.com]
> Sent: Tuesday, April 15, 2014 5:02 PM
> To: Qing Zheng
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph mds log
>
> Don't do tha
What are the results of "ceph pg 11.483 query"?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Apr 15, 2014 at 4:01 PM, Craig Lewis wrote:
> I have 1 incomplete PG. The data is gone, but I can upload it again. I
> just need to make the cluster start working so I
What's the backtrace from the MDS crash?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Apr 16, 2014 at 7:11 AM, Georg Höllrigl
wrote:
> Hello,
>
> Using Ceph MDS with one active and one standby server - a day ago one of the
> mds crashed and I restarted it.
> Tonight
On Wed, Apr 16, 2014 at 8:08 AM, Dan van der Ster
wrote:
> Dear ceph-users,
>
> I've recently started looking through our FileStore logs to better
> understand the VM/RBD IO patterns, and noticed something interesting. Here
> is a snapshot of the write lengths for one OSD server (with 24 OSDs) --
s
> Senior Systems Engineer
> Office +1.714.602.1309
> Email cle...@centraldesktop.com
>
> Central Desktop. Work together in ways you never thought possible.
> Connect with us Website | Twitter | Facebook | LinkedIn | Blog
>
> On 4/15/14 16:07 , Gregory Farnum
On Thu, Apr 17, 2014 at 12:45 AM, Georg Höllrigl
wrote:
> Hello Greg,
>
> I've searched - but don't see any backtraces... I've tried to get some more
> info out of the logs. I really hope, there is something interesting in it:
>
> It all started two days ago with an authentication error:
>
> 2014-
On Monday, April 21, 2014, Loic Dachary wrote:
> Hi,
>
> I would like to allow users to create,use and delete RBD volumes, up to X
> GB, from a single pool. The user is a Debian GNU/Linux box using krbd. The
> sysadmin of the box is not trusted to have unlimited access to the Ceph
> cluster but (
On Thursday, April 24, 2014, Georg Höllrigl
wrote:
>
> And that's exactly what it sounds like — the MDS isn't finding objects
>> that are supposed to be in the RADOS cluster.
>>
>
> I'm not sure, what I should think about that. MDS shouldn't access data
> for RADOS and vice versa?
The metadata
Yehuda says he's fixed several of these bugs in recent code, but if
you're seeing it from a recent dev release, please file a bug!
Likewise if you're on a named release and would like to see a backport. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Thu, Apr 24, 2014 at
If you had it working in Havana I think you must have been using a
customized code base; you can still do the same for Icehouse.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Fri, Apr 25, 2014 at 12:55 AM, Maciej Gałkiewicz
wrote:
> Hi
>
> After upgrading my OpenStack clu
Hmm, it looks like your on-disk SessionMap is horrendously out of
date. Did your cluster get full at some point?
In any case, we're working on tools to repair this now but they aren't
ready for use yet. Probably the only thing you could do is create an
empty sessionmap with a higher version than t
The monitor requires at least "mon osd min down reports" reports, from a set
of OSDs whose size is at least "mon osd min down reporters". So with 9
reporters and 3 reports, it would wait until 9 OSDs had reported an OSD down
(basically ignoring the reports setting, as it is smaller).
-Greg
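As a sketch, those thresholds are set in ceph.conf under [mon]:

    [mon]
        mon osd min down reporters = 9
        mon osd min down reports = 3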
On Friday, April 25, 2014, Craig Lewis wrote:
>
Bobtail is really too old to draw any meaningful conclusions from; why
did you choose it?
That's not to say that performance on current code will be better
(though it very much might be), but the internal architecture has
changed in some ways that will be particularly important for the futex
profi
This usually means that your OSDs all stopped running at the same time, and
will eventually be marked down by the monitors. You should verify that
they're running.
-Greg
On Saturday, April 26, 2014, Srinivasa Rao Ragolu
wrote:
> Hi,
>
> My monitor node and osd nodes are running fine. But my clus
It is not. My guess from looking at the time stamps is that maybe you have
a log rotation system set up that isn't working properly?
-Greg
On Sunday, April 27, 2014, Indra Pramana wrote:
> Dear all,
>
> I have multiple OSDs per node (normally 4) and I realised that for all the
> nodes that I hav
3 ceph-osd.15.log.3.gz
>
> Any advice?
>
> Thank you.
>
>
> On Mon, Apr 28, 2014 at 11:26 PM, Gregory Farnum
>
> > wrote:
>
>> It is not. My guess from looking at the time stamps is that maybe you
>> have a log rotation system set up that isn't wor
one.
>
> Is there a way I can verify if the logs are actually being written by the
> ceph-osd processes?
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
>
> On Tue, Apr 29, 2014 at 12:28 PM, Gregory Farnum wrote:
>>
>> Are your OSDs actual
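One way to check is to see which log file each running daemon actually holds open; a sketch to run on the OSD host:

    for pid in $(pidof ceph-osd); do ls -l /proc/$pid/fd | grep '\.log'; done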
That's not quite how Ceph works. I recommend perusing some of the
introductory documentation at ceph.com/docs, but in short:
When you set up a ceph pool, you are specifying groups of hard drives which
will be used together.
When you create an RBD volume in a pool, you are saying "I want this volume
Monitor keys don't change; I think something else must be going on. Did you
remove any of their stores? Are the local filesystems actually correct
(fsck)?
The ceph-create-keys process is a red herring and will stop as soon as the
monitors do get into a quorum.
-Greg
On Tuesday, April 29, 2014, Marc wro
ind it
> when needed... maybe somewhere in the debugging section of the wiki?
>
> On 29/04/2014 18:25, Gregory Farnum wrote:
>> Monitor keys don't change; I think something else must be going on. Did you
>> remove any of their stores? Are the local filesystems actually correct
You'll need to go look at the individual OSDs to determine why they
aren't on. All the cluster knows is that the OSDs aren't communicating
properly.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Apr 29, 2014 at 3:06 AM, Gandalf Corvotempesta
wrote:
> After a simple "
It looks like the OSD is expecting a file to be there, and it is, but
it's incorrectly empty or something. Did you lose power to the node?
Have you run fsck on the local filesystem?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Apr 29, 2014 at 3:05 AM, vernon1...@126.
Hmm, I think this might actually be another instance of
http://tracker.ceph.com/issues/8232, which was just reported
yesterday.
That said, I think that if you restart one OSD at a time, you should
be able to avoid the race condition. It was restarting all of them
simultaneously that got you into tr
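A rough sketch of a one-at-a-time restart that waits for the cluster to settle in between (the service invocation depends on your init system; with upstart it would be "restart ceph-osd id=$id"):

    for id in 0 1 2 3; do
        service ceph restart osd.$id
        while ! ceph health | grep -q HEALTH_OK; do sleep 10; done
    done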
On Tue, Apr 29, 2014 at 3:28 PM, Marc wrote:
> Thank you for the help so far! I went for option 1 and that did solve
> that problem. However quorum has not been restored. Here's the
> information I can get:
>
> mon a+b are in state Electing and have been for more than 2 hours now.
> mon c does rep
to reboot the hosts as Guang Yang reported in the issue
> tracking #8232 to resolve this issue?
>
> Best regards,
> Thanh Tran
>
>
> On Wed, Apr 30, 2014 at 12:53 AM, Gregory Farnum wrote:
>>
>> Hmm, I think this might actually be another instance of
>> http
What's your cluster look like? I wonder if you can just remove the bad
PG from osd.4 and let it recover from the existing osd.1
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
wrote:
> This is all on firefly rc1 on CentOS 6
>
> I ha
"Need" means "I know this version of the object has existed at some
time in the cluster". "Have" means "this is the newest version of the
object I currently have available". If you're missing OSDs (or have
been in the past) you may need to invoke some of the "lost" commands
to tell the OSDs to just
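For example, if a PG reports unfound objects and the OSDs that might hold them are permanently gone, the relevant commands look like this (pgid/osdid are placeholders; this can discard data, so be certain first):

    ceph pg <pgid> list_missing
    ceph osd lost <osdid> --yes-i-really-mean-it
    ceph pg <pgid> mark_unfound_lost revert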
e
>> tried downing osd.4 and manually deleting the pg directory in question with
>> the hope that the cluster would roll back epochs for 0.2f, but all it does
>> is recreate the pg directory (empty) on osd.4.
>>
>> Jeff
>>
>> On 05/05/2014 04:33 PM, Gregory Fa
down.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> ________
> vernon1...@126.com
>
> From: Gregory Farnum
> Date: 2014-05-06 05:06
> To: vernon1...@126.com
> CC: ceph-users
> Subject: Re: [ceph-users] some unfound object
>
On Wed, May 7, 2014 at 5:05 AM, Gandalf Corvotempesta
wrote:
> Very simple question: what happen if server bound to the cache pool goes down?
> For example, a read-only cache could be archived by using a single
> server with no redudancy.
> Is ceph smart enough to detect that cache is unavailable