Would love to hear if you discover a way to zap incomplete PGs!
Perhaps this is common enough to warrant opening a tracker issue?
Chad.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Find out which OSD it is:
ceph health detail
Squeeze blocks off the affected OSD:
ceph osd reweight OSDNUM 0.8
Repeat with any OSD which becomes toofull.
Your cluster is only about 50% used, so I think this will be enough.
Then when it finishes, allow data back on OSD:
ceph osd reweight OSDN
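The recipe above, collected into one sketch (the OSD id 12 and the 0.8 factor are placeholders, not from the thread):

```shell
# Find which OSDs are toofull / near full
ceph health detail | grep -i full

# Temporarily lower the weight of an affected OSD so data drains off it
ceph osd reweight 12 0.8

# Wait for backfill to finish, repeating the reweight for any OSD
# that becomes toofull, then restore the original weight
ceph osd reweight 12 1.0
```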
Thanks Craig,
I'll jiggle the OSDs around to see if that helps.
Otherwise, I'm almost certain removing the pool will work. :/
Have a good one,
Chad.
> I had the same experience with force_create_pg too.
>
> I ran it, and the PGs sat there in creating state. I left the cluster
> overnight, and
Hi all,
Did I notice correctly that firefly is going to be supported "long term"
whereas Giant is not going to be supported as long?
http://ceph.com/releases/v0-80-firefly-released/
This release will form the basis for our long-term supported release Firefly,
v0.80.x.
http://ceph.com/uncategor
Hi Craig,
> If all of your PGs now have an empty down_osds_we_would_probe, I'd run
> through this discussion again.
Yep, looks to be true.
So I ran:
# ceph pg force_create_pg 2.5
and it has been creating for about 3 hours now. :/
# ceph health detail | grep creating
pg 2.5 is stuck inactive
Hi Craig and list,
> > > If you create a real osd.20, you might want to leave it OUT until you
> > > get things healthy again.
I created a real osd.20 (and it turns out I needed an osd.21 also).
ceph pg x.xx query no longer lists down osds for probing:
"down_osds_we_would_probe": [],
But I ca
Hi Craig,
> You'll have trouble until osd.20 exists again.
>
> Ceph really does not want to lose data. Even if you tell it the osd is
> gone, ceph won't believe you. Once ceph can probe any osd that claims to
> be 20, it might let you proceed with your recovery. Then you'll probably
> need to
Hi Sam,
> > Amusingly, that's what I'm working on this week.
> >
> > http://tracker.ceph.com/issues/7862
Well, thanks for any bugfixes in advance! :)
> Also, are you certain that osd 20 is not up?
> -Sam
Yep.
# ceph osd metadata 20
Error ENOENT: osd.20 does not exist
So part of ceph thinks
Hi Sam,
> 'ceph pg query'.
Thanks.
Looks like ceph is looking for an osd.20 which no longer exists:
"probing_osds": [
"1",
"7",
"15",
"16"],
"down_osds_we_would_probe": [
20],
So perhaps during
Hi Sam,
> Incomplete usually means the pgs do not have any complete copies. Did
> you previously have more osds?
No. But could OSDs quitting after hitting assert(0 == "we got a bad
state machine event"), or interaction with kernel 3.14 clients, have caused
the incomplete copies?
How can
On Monday, November 03, 2014 17:34:06 you wrote:
> If you have osds that are close to full, you may be hitting 9626. I
> pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
> -Sam
Thanks Sam, I may have been hitting that as well. I certainly hit too_full
conditions often. I am abl
>
> No, it is a change, I just want to make sure I understand the
> scenario. So you're reducing CRUSH weights on full OSDs, and then
> *other* OSDs are crashing on these bad state machine events?
That is right. The other OSDs shut down sometime later. (Not immediately.)
I really haven't tested
On Monday, November 03, 2014 13:50:05 you wrote:
> On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys wrote:
> > On Monday, November 03, 2014 13:22:47 you wrote:
> >> Okay, assuming this is semi-predictable, can you start up one of the
> >> OSDs that is going to fail wi
On Monday, November 03, 2014 13:22:47 you wrote:
> Okay, assuming this is semi-predictable, can you start up one of the
> OSDs that is going to fail with "debug osd = 20", "debug filestore =
> 20", and "debug ms = 1" in the config file and then put the OSD log
> somewhere accessible after it's cras
> There's a "ceph osd metadata" command, but i don't recall if it's in
> Firefly or only giant. :)
It's in firefly. Thanks, very handy.
All the OSDs are running 0.80.7 at the moment.
What next?
Thanks again,
Chad.
P.S. The OSDs interacted with some 3.14 krbd clients before I realized that
kernel version was too old for the firefly CRUSH map.
Chad.
Hi All,
I upgraded from emperor to firefly. Initial upgrade went smoothly and all
placement groups were active+clean .
Next I executed
'ceph osd crush tunables optimal'
to upgrade CRUSH mapping.
Now I keep having OSDs go down or have requests blocked for long periods of
time.
I start
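For reference, a hedged sketch of inspecting the tunables change and backing it out if the rebalance proves too disruptive (command names as of firefly; note that reverting triggers yet another rebalance):

```shell
# Show which CRUSH tunables are currently in effect
ceph osd crush show-tunables

# If 'optimal' causes too much disruption, revert to the legacy profile
# (this kicks off another round of data movement)
ceph osd crush tunables legacy
```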
Hi Craig,
> It's part of the way the CRUSH hashing works. Any change to the CRUSH map
> causes the algorithm to change slightly.
Dan@cern could not replicate my observations, so I plan to follow his
procedure (fake create an OSD, wait for rebalance, remove fake OSD) in the
near future to see i
Hi Dan,
I'd like to decommission a node to reproduce the problem and post enough
information for you (at least) to understand what is going on.
Unfortunately I'm a ceph newbie, so I'm not sure what info would be of
interest before/during the drain.
Probably the crushmap would be of interest
Hi Dan,
I'm using Emperor (0.72). Though I would think CRUSH maps have not changed
that much between versions?
> That sounds bizarre to me, and I can't reproduce it. I added an osd (which
> was previously not in the crush map) to a fake host=test:
>
>ceph osd crush create-or-move osd.52 1.0 r
Hi Mariusz,
> Usually removing OSD without removing host happens when you
> remove/replace dead drives.
>
> Hosts are in map so
>
> * CRUSH won't put 2 copies on the same node
> * you can balance around network interface speed
That does not answer the original question IMO: "Why does the CRUSH map d
Hi all,
When I remove all OSDs on a given host, then wait for all objects (PGs?) to
become active+clean, then remove the host (ceph osd crush remove hostname),
that causes the objects to shuffle around the cluster again.
Why does the CRUSH map depend on hosts that no longer have OSDs on the
Hi All,
Does anyone have a script or sequence of commands to prepare all drives on a
single computer for use by ceph, and then start up all OSDs on the computer at
one time?
I feel this would be faster and generate less network traffic than adding one drive
at a time, which is what the current script
Hi All,
Is it possible to decrease pg_num? I was able to decrease pgp_num, but when
I try to decrease pg_num I get an error:
# ceph osd pool set tibs pg_num 1024
specified pg_num 1024 <= current 2048
Thanks!
C.
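As far as I know (general ceph behaviour in this era, not something confirmed in the thread), pg_num can only ever be increased; the usual workaround is to create a new pool with the smaller pg_num and copy the data over. Pool names below are placeholders:

```shell
# pg_num cannot be decreased; create a smaller pool and copy instead
ceph osd pool create tibs-small 1024
rados cppool tibs tibs-small    # copies objects; snapshots are not preserved
ceph osd pool delete tibs tibs --yes-i-really-really-mean-it
ceph osd pool rename tibs-small tibs
```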
Thanks for the link Blairo!
I can think of a use case already! (combo replicated pool / erasure pool for
a virtual tape library)
Chad.
Hi All,
Could someone point me to a document (possibly a FAQ :) ) describing the
limitations of erasure coded pools? Hopefully it would contain the when and
how to use them as well.
E.g. I read about people using replicated pools as a front end to erasure
coded pools, but I don't know why
Hi John,
Thanks for the reply! Yes, I agree Ceph is exciting! Keep up the good
work!
> Using librbd, as you've pointed out, doesn't run afoul of potential Linux
> kernel deadlocks; however, you normally wouldn't encounter this type of
> situation in a production cluster anyway as you'd likel
Hi All,
What are the pros and cons of running a virtual machine (with qemu-kvm)
whose image is accessed via librbd or by mounting /dev/rbdX ?
I've heard that the librbd method has the advantage of not being vulnerable
to deadlocks due to memory allocation problems. Is that right?
Would one also benefit
> This is for mapping kernel rbd devices on system startup, and belong with
> ceph-common (which hasn't yet been but soon will be split out from ceph)
Great! Yeah, I was hoping to map /dev/rbd without installing all the ceph
daemons!
> along with the 'rbd' cli utility. It isn't directly relat
Hi all,
Also /etc/ceph/rbdmap in librbd1 rather than ceph?
Thanks,
Chad.
Hi all,
Shouldn't /etc/init.d/rbdmap be in the librbd package rather than in "ceph"?
Thanks,
Chad.
> a given PG will start up to 5 recovery operations at a time, out of a total
> of 15 operations active at a time. This allows recovery to spread operations
> across more or fewer PGs at any given time.
>
> David Zafman
> Senior Developer
> http://www.inktank.com
>
> On Apr 24,
Hi All,
What does osd_recovery_max_single_start do? I could not find a description
of it.
Thanks!
Chad.
Thanks for the tip Brian!
Chad.
Hello all,
I want to set the following value for ceph:
osd recovery max active = 1
Where do I place this setting? And how do I ensure that it is active?
Do I place it only in /etc/ceph/ceph.conf on the monitor in a section like so:
[osd]
osd recovery max active = 1
Or do I have to place i
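A common approach (general ceph practice, not confirmed in this thread) is to persist the setting in ceph.conf on every OSD host, and also inject it into the running daemons so it takes effect without a restart:

```shell
# Persist in /etc/ceph/ceph.conf on each OSD host:
#   [osd]
#   osd recovery max active = 1

# Apply to already-running OSDs without restarting them:
ceph tell osd.* injectargs '--osd-recovery-max-active 1'

# Verify on an OSD host (osd.0 is an example id):
ceph daemon osd.0 config show | grep osd_recovery_max_active
```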
Hi Greg,
> How many monitors do you have?
1. :)
> It's also possible that re-used numbers won't get caught in this,
> depending on the process you went through to clean them up, but I
> don't remember the details of the code here.
Yeah, too bad. I'm following the standard removal procedure in
Hi Sage et al,
Thanks for the info! How stable are cutting-edge kernels like 3.13?
Is 3.8 (e.g. from Ubuntu Raring) a better choice?
Thanks again!
Hi,
I'm running Debian Wheezy which has kernel version 3.2.54-2 .
Should I be using rbd-fuse 0.72.2 or the kernel client to mount rbd devices?
I.e., this is an old kernel relative to Emperor, but maybe fixes are
backported to the kernel?
Thanks!
Chad.
On Thursday, April 03, 2014 07:57:58 Dan Van Der Ster wrote:
> Hi,
> By my observation, I don't think that marking it out before crush rm would
> be any safer.
>
> Normally what I do (when decommissioning an OSD or whole server) is stop
> the OSD process, then crush rm / osd rm / auth del the OSD
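Dan's procedure, written out as a sketch (osd.52 is a placeholder id; the init command assumes sysvinit-era packaging):

```shell
/etc/init.d/ceph stop osd.52     # stop the OSD process first
ceph osd crush remove osd.52     # remove it from the CRUSH map
ceph osd rm 52                   # remove the OSD id from the cluster
ceph auth del osd.52             # delete its cephx key
```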
> Backfilling process can be stopped/paused at some point due to config
> settings or other reasons, so ceph reflects current state of PGs that are
> in fact degraded because replica is missing on fresh OSD. Those PGs
> actually being backfilled display 'degraded+backfilling' state.
Also makes se
y exist and might be appropriate.
Thanks again,
Chad.
On Friday, March 28, 2014 04:49:02 you wrote:
> On 28.03.14, 0:38, Chad Seys wrote:
> > Hi all,
> >
> > > Beginning with a cluster with only "active+clean" PGs, adding an OSD causes
> >
> > o
Hi all,
Beginning with a cluster with only "active+clean" PGs, adding an OSD causes
objects to be "degraded".
Does this mean that ceph deletes replicas before copying them to the new
OSD?
Or does "degraded" also mean that there are no replicas on the target OSD,
even though there are alrea