Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam
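These settings go in the [osd] section of ceph.conf on the affected node; a
minimal sketch, assuming the stock /etc/ceph/ceph.conf layout (restart the osd
afterwards; the log then lands in /var/log/ceph/ceph-osd.<id>.log by default):

    [osd]
        debug osd = 20
        debug filestore = 20
        debug ms = 1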
On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
wrote:
> Hi,
>
> I attach:
> - osd.20 is one of the osds that I detected as making other OSDs crash.
> - osd.23 is one of the osds which crashes when I start osd.20
> - mds, is one
> osd.20's ?
>
> Thank you so much for the help
>
> Regards
> Pierre
>
> On 01/07/2014 23:51, Samuel Just wrote:
>
>> Can you reproduce with
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>> ?
>> -Sam
>>
>> On
the osd.20, some other osds crash. I went from 31 osds down to 16.
> I noticed that after this the number of down+peering PGs decreased from 367 to
> 248. Is that "normal"? Maybe it's temporary, while the cluster
> verifies all the PGs?
>
> Regards
> Pierre
>
Also, what version did you upgrade from, and how did you upgrade?
-Sam
On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just wrote:
> Ok, in current/meta on osd 20 and osd 23, please attach all files matching
>
> ^osdmap.13258.*
>
> There should be one such file on each osd. (should look
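A sketch of how to locate them, assuming the default OSD data path:

    find /var/lib/ceph/osd/ceph-20/current/meta -name 'osdmap.13258*'
    find /var/lib/ceph/osd/ceph-23/current/meta -name 'osdmap.13258*'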
Joao: this looks like divergent osdmaps, osd 20 and osd 23 have
differing ideas of the acting set for pg 2.11. Did we add hashes to
the incremental maps? What would you want to know from the mons?
-Sam
On Wed, Jul 2, 2014 at 3:10 PM, Samuel Just wrote:
> Also, what version did you upgrade f
OSD and 3 mds.
>
> Pierre
>
> PS: I also found "inc\uosdmap.13258__0_469271DE__none" in each meta
> directory.
>
> On 03/07/2014 00:10, Samuel Just wrote:
>
>> Also, what version did you upgrade from, and how did you upgrade?
>> -Sam
>>
>>
Pierre: do you recall how and when that got set?
-Sam
On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just wrote:
> Yeah, divergent osdmaps:
> 555ed048e73024687fc8b106a570db4f osd-20_osdmap.13258__0_4E62BB79__none
> 6037911f31dc3c18b05499d24dcdbe5c osd-23_osdmap.13258__0_4E62BB79__none
>
> Joao:
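A sketch of the checksum comparison, assuming the two osdmap files were copied
off the osds and renamed as above; the same epoch should hash identically on
every osd, so differing md5sums are what indicate the divergence:

    md5sum osd-20_osdmap.13258__0_4E62BB79__none \
           osd-23_osdmap.13258__0_4E62BB79__none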
Can you confirm from the admin socket that all monitors are running
the same version?
-Sam
On Wed, Jul 2, 2014 at 4:15 PM, Pierre BLONDEAU
wrote:
> On 03/07/2014 00:55, Samuel Just wrote:
>
>> Ah,
>>
>> ~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crus
n":"0.82"}
> # ceph --admin-daemon /var/run/ceph/ceph-mon.joe.asok version
> {"version":"0.82"}
>
> Pierre
>
> On 03/07/2014 01:17, Samuel Just wrote:
>
>> Can you confirm from the admin socket that all monitors are running
>> t
Can you attach your ceph.conf for your osds?
-Sam
On Thu, Jul 10, 2014 at 8:01 AM, Christian Eichelmann
wrote:
> I can also confirm that after upgrading to firefly both of our clusters
> (test and live) were going from 0 scrub errors each for about 6 months to
> about 9-12 per week...
> This also
b. is this reconciliation done automatically during deep-scrub
> or does it have to be done "manually" because there is no majority?
>
> Thanks,
>
> -Sudip
>
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
>> failures than with inconsistencies found during deep scrub - would you
>> agree?
>>
>> Re: repair - do you mean the "repair" process during deep scrub - if yes,
>> this is automatic - correct?
>> Or
>> Are you referring to the explicit
When you get the next inconsistency, can you copy the actual objects
from the osd store trees and get them to us? That might provide a
clue.
-Sam
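A sketch of pulling an object's file out of a filestore tree, assuming the
default data path and a stopped or quiesced osd; the file sits under the pg's
hashed DIR_* subdirectories, so find is the easiest way to locate it (the name
fragment here is just an example):

    find /var/lib/ceph/osd/ceph-<id>/current/<pgid>_head -name '*rb.0.b0ce3*'
    cp '<path printed above>' /tmp/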
On Fri, Jul 11, 2014 at 6:52 AM, Randy Smith wrote:
>
>
>
> On Thu, Jul 10, 2014 at 4:40 PM, Samuel Just wrote:
>>
>> It cou
gt;
> sage
>
>
> On Fri, 11 Jul 2014, Samuel Just wrote:
>
>> When you get the next inconsistency, can you copy the actual objects
>> from the osd store trees and get them to us? That might provide a
>> clue.
>> -Sam
>>
>> On Fri, Jul 11, 2014 at 6
ead/DIR_6/DIR_C/DIR_5/rb.0.b0ce3.238e1f29.000b__head_34DC35C6__3
> ?
>
>
> On Fri, Jul 11, 2014 at 2:00 PM, Samuel Just wrote:
>>
>> Also, what filesystem are you using?
>> -Sam
>>
>> On Fri, Jul 11, 2014 at 10:37 AM, Sage Weil wrote:
>> > One other thing w
And grab the xattrs as well.
-Sam
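A sketch for dumping them, assuming the attr package is installed; filestore
keeps object metadata in extended attributes on the file, so capture those
alongside the object itself:

    getfattr -d -m '.*' -e hex <object-file>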
On Fri, Jul 11, 2014 at 2:39 PM, Samuel Just wrote:
> Right.
> -Sam
>
> On Fri, Jul 11, 2014 at 2:05 PM, Randy Smith wrote:
>> Greetings,
>>
>> I'm using xfs.
>>
>> Also, when, in a previous email, you asked if I
> www.adams.edu
> 719-587-7741
>
> On Jul 12, 2014 10:34 AM, "Samuel Just" wrote:
>>
>> Here's a diff of the two files. One of the two files appears to
>> contain ceph leveldb keys? Randy, do you have an idea of what this
>> rbd image is being use
That seems reasonable. Bug away!
-Sam
On Mon, Sep 8, 2014 at 5:11 PM, Somnath Roy wrote:
> Hi Sage/Sam,
>
>
>
> I faced a crash in OSD with latest Ceph master. Here is the log trace for
> the same.
>
>
>
> ceph version 0.85-677-gd5777c4 (d5777c421548e7f039bb2c77cb0df2e9c7404723)
>
> 1: ceph-osd(
Added a comment about the approach.
-Sam
On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy wrote:
> Hi Sam/Sage,
>
> As we discussed earlier, enabling the present OpTracker code degrades
> performance severely. For example, in my setup a single OSD node with 10
> clients is reaching ~103K read iops wi
.@vger.kernel.org
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel Just
> Sent: Wednesday, September 10, 2014 11:17 AM
> To: Somnath Roy
> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org;
> ceph-users@lists.ceph.com
> Subject: Re: OpTracker optimization
>
I don't quite understand.
-Sam
On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy wrote:
> Thanks Sam.
> So, you want me to go with optracker/shardedOpWq, right?
>
> Regards
> Somnath
>
> -----Original Message-----
> From: Samuel Just [mailto:sam.j...@inktank.com]
acker for the ios going through
> ms_dispatch path.
>
> 2. Additionally, for ios going through ms_fast_dispatch, you want me to
> implement optracker (without internal shard) per opwq shard
>
> Am I right ?
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
>
> -----Original Message-----
> From: Sage Weil [mailto:sw...@redhat.com]
> Sent: Wednesday, September 10, 2014 8:33 PM
> To: Somnath Roy
> Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: RE: OpTracker optimization
>
> I had two s
lander)
> * osd: add fadvise flags to ObjectStore API (Jianpeng Ma)
> * osd: add get_latest_osdmap asok command (#9483 #9484 Mykola Golub)
> * osd: EIO on whole-object reads when checksum is wrong (Sage Weil)
> * osd: filejournal: don't cache journal when not using direct IO (Jianp
The fix for this should be in 0.93, so this must be something different.
Can you reproduce with
debug osd = 20
debug ms = 1
debug filestore = 20
and post the log to http://tracker.ceph.com/issues/11027?
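If restarting with new conf settings is inconvenient, the same levels can
usually be injected at runtime; a sketch, assuming osd.N is the affected daemon:

    ceph tell osd.N injectargs '--debug-osd 20 --debug-ms 1 --debug-filestore 20'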
On Wed, 2015-03-04 at 00:04 +0100, Yann Dupont wrote:
> On 03/03/2015 22:03, Italo Santos wrote:
You'll probably have to recreate osds with the same ids (empty ones),
let them boot, stop them, and mark them lost. There is a feature in the
tracker to improve this behavior: http://tracker.ceph.com/issues/10976
-Sam
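A sketch of that sequence for one id, assuming osd.N was among the lost ones
and that the usual keyring/auth registration happens in between (details vary
by deployment tool):

    ceph osd create                  # repeat until it hands back id N
    ceph-osd -i N --mkfs --mkkey     # create an empty store
    # register the key, start osd.N, let it boot, stop it, then:
    ceph osd lost N --yes-i-really-mean-it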
On Mon, 2015-03-09 at 12:24 +, joel.merr...@gmail.com wrote:
> Hi,
>
> I'm
What do you mean by "unblocked" but still "stuck"?
-Sam
On Mon, 2015-03-09 at 22:54 +, joel.merr...@gmail.com wrote:
> On Mon, Mar 9, 2015 at 2:28 PM, Samuel Just wrote:
> > You'll probably have to recreate osds with the same ids (empty ones),
> > let
Can you reproduce this with
debug osd = 20
debug filestore = 20
debug ms = 1
on the crashing osd? Also, what sha1 are the other osds and mons running?
-Sam
- Original Message -
From: "Malcolm Haak"
To: ceph-users@lists.ceph.com
Sent: Tuesday, March 10, 2015 3:28:26 AM
Subject: [ceph-us
Joao, it looks like map 2759 is causing trouble, how would he get the
full and incremental maps for that out of the mons?
-Sam
On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> Hi Samuel,
>
> The sha1? I'm going to admit ignorance as to what you are looking for. They
> are all running the
tree, osd dump etc). There were blocked_by
> operations that no longer exist after doing the OSD addition.
>
> Side note, spent some time yesterday writing some bash to do this
> programmatically (might be useful to others, will throw on github)
>
> On Tue, Mar 10, 2015 at
.467.query
https://gist.github.com/05dbcdc9ee089bd52d0c
On Tue, Mar 10, 2015 at 2:49 PM, Samuel Just wrote:
Yeah, get a ceph pg query on one of the stuck ones.
-Sam
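For example (pgid is a placeholder, matching the .query files in the gist above):

    ceph pg <pgid> query > /tmp/<pgid>.query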
On Tue, 2015-03-10 at 14:41 +, joel.merr...@gmail.com wrote:
Stuck unclean and stuck inactive. I can fire up a full query and
ances). When you say complicated
and fragile, could you expand?
Thanks again!
Joel
On Wed, Mar 11, 2015 at 1:21 PM, Samuel Just wrote:
Ok, you lost all copies from an interval where the pgs went active. The
recovery from this is going to be complicated and fragile. Are the pools
valuable?
-Sam
Most likely fixed in firefly.
-Sam
- Original Message -
From: "Kostis Fardelas"
To: "ceph-users"
Sent: Tuesday, March 17, 2015 12:30:43 PM
Subject: [ceph-users] Random OSD failures - FAILED assert
Hi,
we are running Ceph v.0.72.2 (emperor) from the ceph emperor repo. The
latest week we
You'll want to at least include the backtrace.
-Sam
On 03/27/2015 10:55 AM, samuel wrote:
Hi all,
In a fully functional ceph installation, today we suffered a problem with the
ceph monitors, which started crashing with the following error:
include/interval_set.h: 340: FAILED assert(0)
Is there any relat
I have a suspicion about what caused this. Can you restart one of the problem
osds with
debug osd = 20
debug filestore = 20
debug ms = 1
and attach the resulting log from startup to crash along with the osdmap binary
(ceph osd getmap -o <file>).
-Sam
- Original Message -
From: "Scott Laird"
m the
crashing osds using the ceph-objectstore-tool.
-Sam
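A sketch of such an export, assuming default paths and a stopped osd (the tool
needs exclusive access to the store):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
        --pgid <pgid> --op export --file /tmp/<pgid>.export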
- Original Message -
From: "Scott Laird"
To: "Samuel Just"
Cc: "Robert LeBlanc" , "'ceph-users@lists.ceph.com'
(ceph-users@lists.ceph.com)"
Sent: Monday, April 20, 2015 6:13:06 AM
Can you explain exactly what you mean by:
"Also I created one pool for tier to be able to move data without outage."
-Sam
- Original Message -
From: "tuomas juntunen"
To: "Ian Colle"
Cc: ceph-users@lists.ceph.com
Sent: Monday, April 27, 2015 4:23:44 AM
Subject: Re: [ceph-users] Upgrade
force-nonempty flag
for that operation, I think. I think the immediate answer is probably to
disallow pools with snapshots as a cache tier altogether until we think of a
good way to make it work.
-Sam
- Original Message -
From: "tuomas juntunen"
To: "Samuel J
I took a bit of time to get a feel for how different the straw2 mappings are vs
straw1 mappings. For a bucket in which all weights are the same, I saw no
changed mappings, which is as expected. However, on a map with 3 hosts each of
which has 4 osds with weights 1,2,3, and 4 (crush-different-w
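A sketch of how such a comparison can be run, assuming straw and straw2
variants of the same map were compiled to cm.straw1 and cm.straw2 (names
hypothetical):

    crushtool -i cm.straw1 --test --show-mappings --rule 0 --num-rep 3 \
        --min-x 0 --max-x 100000 > straw1.out
    crushtool -i cm.straw2 --test --show-mappings --rule 0 --num-rep 3 \
        --min-x 0 --max-x 100000 > straw2.out
    diff straw1.out straw2.out | grep -c '^<'   # count of changed mappings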
In short, the drawback is false positives which can cause unnecessary cluster
churn.
-Sam
- Original Message -
From: "Robert LeBlanc"
To: "Vasiliy Angapov"
Cc: "Sage Weil" , "ceph-users"
Sent: Wednesday, May 13, 2015 12:21:16 PM
Subject: Re: [ceph-users] Write freeze when writing to rb
You have most likely hit http://tracker.ceph.com/issues/11429. There are some
workarounds in the bugs marked as duplicates of that bug, or you can wait for
the next hammer point release.
-Sam
- Original Message -
From: "Berant Lemmenes"
To: ceph-users@lists.ceph.com
Sent: Monday, May 1
You appear to be using pool snapshots with radosgw, I suspect that's what is
causing the issue. Can you post a longer log? Preferably with
debug osd = 20
debug filestore = 20
debug ms = 1
from startup to crash on an osd?
-Sam
- Original Message -
From: "Daniel Schneller"
To: ceph-use
If 2.14 is part of a non-existent pool, you should be able to rename it out of
current/ in the osd directory to prevent the osd from seeing it on startup.
-Sam
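A sketch, assuming the default data path and the filestore layout where each
pg lives in a <pgid>_head directory (stop the osd first):

    mv /var/lib/ceph/osd/ceph-<id>/current/2.14_head /root/2.14_head.bak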
- Original Message -
From: "Berant Lemmenes"
To: "Samuel Just"
Cc: ceph-users@lists.ceph.com
Sent: Tuesday
Many people have reported that they need to lower the osd recovery config
options to minimize the impact of recovery on client io. We are talking about
changing the defaults as follows:
osd_max_backfills to 1 (from 10)
osd_recovery_max_active to 3 (from 15)
osd_recovery_op_priority to 1 (from 1
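As a ceph.conf sketch of the proposed values (they can also be applied at
runtime with ceph tell osd.* injectargs):

    [osd]
        osd max backfills = 1
        osd recovery max active = 3
        osd recovery op priority = 1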
ObjectWriteOperations currently allow you to perform a clone_range from another
object with the same object locator. Years ago, rgw used this as part of
multipart upload. Today, the implementation complicates the OSD considerably,
and it doesn't appear to have any users left. Is there anyone
Looks like it's just a stat error. The primary appears to have the correct
stats, but the replica for some reason doesn't (thinks there's an object for
some reason). I bet it clears itself if you perform a write on the pg since
the primary will send over its stats. We'd need information from
Annoying that we don't know what caused the replica's stat structure to get out
of sync. Let us know if you see it recur. What were those pools used for?
-Sam
- Original Message -
From: "Dan van der Ster"
To: "Samuel Just"
Cc: ceph-users@lists.ceph.com
Oh, if you were running dev releases, it's not super surprising that the stat
tracking was at some point buggy.
-Sam
- Original Message -
From: "Dan van der Ster"
To: "Samuel Just"
Cc: ceph-users@lists.ceph.com
Sent: Thursday, July 23, 2015 8:21:07 AM
Subj
Hmm, that's odd. Can you attach the osdmap and ceph pg dump prior to the
addition (with all pgs active+clean), then the osdmap and ceph pg dump
afterwards?
-Sam
- Original Message -
From: "Chad William Seys"
To: "Samuel Just" , "ceph-users"
Sent
- Original Message -
From: "Chad William Seys"
To: "Samuel Just"
Cc: "ceph-users"
Sent: Tuesday, July 28, 2015 7:40:31 AM
Subject: Re: [ceph-users] why are there "degraded" PGs when adding OSDs?
Hi Sam,
Trying again today with crush tunables set to firefly. Degraded pe
Hrm, that's certainly supposed to work. Can you make a bug? Be sure
to note what version you are running (output of ceph-osd -v).
-Sam
On Mon, Aug 3, 2015 at 12:34 PM, Andras Pataki
wrote:
> Summary: I am having problems with inconsistent PG's that the 'ceph pg
> repair' command does not fix.
It seems like it's about time for us to make the jump to C++11. This
is probably going to have an impact on users of the librados C++
bindings. It seems like such users would have to recompile code using
the librados C++ libraries after upgrading the librados library
version. Is that reasonable?
It will cause a large amount of data movement. Each new pg after the
split will relocate. It might be ok if you do it slowly. Experiment
on a test cluster.
-Sam
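A sketch of doing it gradually, assuming a pool named rbd: raise pg_num in
small steps, let the cluster settle, then raise pgp_num to actually trigger
the data movement:

    ceph osd pool set rbd pg_num 256
    ceph osd pool set rbd pgp_num 256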
On Mon, Aug 3, 2015 at 12:57 AM, 乔建峰 wrote:
> Hi Cephers,
>
> This is a greeting from Jevon. Currently, I'm experiencing an issue whi
Is the number of inconsistent objects growing? Can you attach the
whole ceph.log from the 6 hours before and after the snippet you
linked above? Are you using cache/tiering? Can you attach the osdmap
(ceph osd getmap -o <file>)?
-Sam
On Tue, Aug 18, 2015 at 4:15 AM, Voloshanenko Igor
wrote:
> ceph -
Also, what command are you using to take snapshots?
-Sam
On Tue, Aug 18, 2015 at 8:48 AM, Samuel Just wrote:
> Is the number of inconsistent objects growing? Can you attach the
> whole ceph.log from the 6 hours before and after the snippet you
> linked above? Are you using cache/tier
1. We've kicked this around a bit. What kind of failure semantics
would you be comfortable with here (that is, what would be reasonable
behavior if the client side cache fails)?
2. We've got a branch which should merge soon (tomorrow probably)
which actually does allow writes to be proxied, so th
lume to new one.
>
> But after that - scrub errors keep growing... Was 15 errors... Now 35... We also
> tried to out the OSD which was lead, but after rebalancing these 2 pgs still have
> 35 scrub errors...
>
> ceph osd getmap -o - attached
>
>
> 2015-08-18 18:48 GMT+03:00 Samuel Just :
Also, was there at any point a power failure/power cycle event,
perhaps on osd 56?
-Sam
On Thu, Aug 20, 2015 at 9:23 AM, Samuel Just wrote:
> Ok, you appear to be using a replicated cache tier in front of a
> replicated base tier. Please scrub both inconsistent pgs and post the
> ceph
What was the issue?
-Sam
On Thu, Aug 20, 2015 at 9:41 AM, Voloshanenko Igor
wrote:
> Samuel, we turned off the cache layer a few hours ago...
> I will post ceph.log in a few minutes
>
> For snap - we found the issue, it was connected with the cache tier..
>
> 2015-08-20 19:23 GMT+03:00 Samuel Ju
requested from cache layer.
>
> 2015-08-20 19:53 GMT+03:00 Samuel Just :
>>
>> What was the issue?
>> -Sam
>>
>> On Thu, Aug 20, 2015 at 9:41 AM, Voloshanenko Igor
>> wrote:
>> > Samuel, we turned off the cache layer a few hours ago...
>> > I
Which docs?
-Sam
On Thu, Aug 20, 2015 at 9:57 AM, Voloshanenko Igor
wrote:
> Not yet. I will create.
> But according to mail lists and Inktank docs - it's expected behaviour when
> cache is enabled
>
> 2015-08-20 19:56 GMT+03:00 Samuel Just :
>>
>> Is there a bug
/ceph-users@lists.ceph.com/msg18338.html
>
> 2015-08-20 20:06 GMT+03:00 Samuel Just :
>>
>> Which docs?
>> -Sam
>>
>> On Thu, Aug 20, 2015 at 9:57 AM, Voloshanenko Igor
>> wrote:
>> > Not yet. I will create.
>> > But according to mail li
The feature bug for the tool is http://tracker.ceph.com/issues/12740.
-Sam
On Thu, Aug 20, 2015 at 2:52 PM, Samuel Just wrote:
> Ah, this is kind of silly. I think you don't have 37 errors, but 2
> errors. pg 2.490 object
> 3fac9490/rbd_data.eb5f22eb141f2.04ba/snapdir
it?
>
> I mean I can help with coding/testing/etc...
>
> 2015-08-21 0:52 GMT+03:00 Samuel Just :
>>
>> Ah, this is kind of silly. I think you don't have 37 errors, but 2
>> errors. pg 2.490 object
>> 3fac9490/rbd_data.eb5f22eb141f2.04ba/snapd
block names started with this...
>
>> Actually, now that I think about it, you probably didn't remove the
>> images for 3fac9490/rbd_data.eb5f22eb141f2.04ba/snapdir//2
>> and 22ca30c4/rbd_data.e846e25a70bf7.0307/snapdir//2
>
>
>
>
> 2015-08
you'll probably just remove the images.
-Sam
On Thu, Aug 20, 2015 at 3:45 PM, Voloshanenko Igor
wrote:
> Image? One?
>
> We started deleting images only to fix this (export/import); before, 1-4
> times per day (when a VM was destroyed)...
>
>
>
> 2015-08-21 1:44 GMT+03:00 Sa
Snapshotting with cache/tiering *is* supposed to work. Can you open a bug?
-Sam
On Thu, Aug 20, 2015 at 3:36 PM, Andrija Panic wrote:
> This was related to the caching layer, which doesn't support snapshotting per
> the docs... for the sake of closing the thread.
>
> On 17 August 2015 at 21:15, Voloshanen
Also, can you include the kernel version?
-Sam
On Thu, Aug 20, 2015 at 3:51 PM, Samuel Just wrote:
> Snapshotting with cache/tiering *is* supposed to work. Can you open a bug?
> -Sam
>
> On Thu, Aug 20, 2015 at 3:36 PM, Andrija Panic
> wrote:
>> This was related to the
17 17:37:22 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> 2015-08-21 1:54 GMT+03:00 Samuel Just :
>>
>> Also, can you include the kernel version?
>> -Sam
>>
>> On Thu, Aug 20, 2015 at 3:51 PM, Samuel Just wrote:
>> > Snapshotting with cache/t
x-s5 4.0.4-040004-generic #201505171336 SMP Sun May 17 17:37:22 UTC
>> 2015 x86_64 x86_64 x86_64 GNU/Linux
>>
>> 2015-08-21 1:54 GMT+03:00 Samuel Just :
>>>
>>> Also, can you include the kernel version?
>>> -Sam
>>>
>>> On Thu, Aug 2
Certainly, don't reproduce this with a cluster you care about :).
-Sam
On Thu, Aug 20, 2015 at 4:02 PM, Samuel Just wrote:
> What's supposed to happen is that the client transparently directs all
> requests to the cache pool rather than the cold pool when there is a
> cache p
So you started draining the cache pool before you saw either the
inconsistent pgs or the anomalous snap behavior? (That is, writeback
mode was working correctly?)
-Sam
On Thu, Aug 20, 2015 at 4:07 PM, Voloshanenko Igor
wrote:
> Good joke )
>
> 2015-08-21 2:06 GMT+03:00 Sa
Created a ticket to improve our testing here -- this appears to be a hole.
http://tracker.ceph.com/issues/12742
-Sam
On Thu, Aug 20, 2015 at 4:09 PM, Samuel Just wrote:
> So you started draining the cache pool before you saw either the
> inconsistent pgs or the anomalous snap behavior?
th data and evict/flush started...
>
>
>
> 2015-08-21 2:09 GMT+03:00 Samuel Just :
>>
>> So you started draining the cache pool before you saw either the
>> inconsistent pgs or the anomalous snap behavior? (That is, writeback
>> mode was working correctly?)
>> -Sam
>
Also, what do you mean by "change journal side"?
-Sam
On Thu, Aug 20, 2015 at 4:15 PM, Samuel Just wrote:
> Not sure what you mean by:
>
> but it's stop to work in same moment, when cache layer fulfilled with
> data and evict/flush started...
> -Sam
>
notification from monitoring that we collect about 750GB in
> hot pool.) So I changed the value of max_object_bytes to be 0.9 of the disk
> size... And then evicting/flushing started...
>
> And issue with snapshots arrived
>
> 2015-08-21 2:15 GMT+03:00 Samuel Just :
>>
>> Not sur
Yeah, I'm trying to confirm that the issues did happen in writeback mode.
-Sam
On Thu, Aug 20, 2015 at 4:21 PM, Voloshanenko Igor
wrote:
> Right. But issues started...
>
> 2015-08-21 2:20 GMT+03:00 Samuel Just :
>>
>> But that was still in writeback mode, right?
Specifically, the snap behavior (we already know that the pgs went
inconsistent while the pool was in writeback mode, right?).
-Sam
On Thu, Aug 20, 2015 at 4:22 PM, Samuel Just wrote:
> Yeah, I'm trying to confirm that the issues did happen in writeback mode.
> -Sam
>
> On Thu,
And you adjusted the journals by removing the osd, recreating it with
a larger journal, and reinserting it?
-Sam
On Thu, Aug 20, 2015 at 4:24 PM, Voloshanenko Igor
wrote:
> Right (but there was also a rebalancing cycle 2 days before the pgs corrupted)
>
> 2015-08-21 2:23 GMT+03:00 Sa
Ok, create a ticket with a timeline and all of this information, I'll
try to look into it more tomorrow.
-Sam
On Thu, Aug 20, 2015 at 4:25 PM, Voloshanenko Igor
wrote:
> Exactly
>
> On Friday, 21 August 2015, Samuel Just wrote:
>
>> And you adjusted the j
}-${TYPE} weight 9.000/item ${HOST}-${TYPE} weight
> 1.000/" cm
>
> echo "Compile new CRUSHMAP"
> crushtool -c cm -o cm.new
>
> echo "Inject new CRUSHMAP"
> ceph osd setcrushmap -i cm.new
>
> #echo "Clean..."
> #rm -rf cm cm.new
>
Odd, did you happen to capture osd logs?
-Sam
On Thu, Aug 20, 2015 at 8:10 PM, Ilya Dryomov wrote:
> On Fri, Aug 21, 2015 at 2:02 AM, Samuel Just wrote:
>> What's supposed to happen is that the client transparently directs all
>> requests to the cache pool rather than the
I think I found the bug -- need to whiteout the snapset (or decache
it) upon evict.
http://tracker.ceph.com/issues/12748
-Sam
On Fri, Aug 21, 2015 at 8:04 AM, Ilya Dryomov wrote:
> On Fri, Aug 21, 2015 at 5:59 PM, Samuel Just wrote:
>> Odd, did you happen to capture osd logs?
>
David, does this look familiar?
-Sam
On Fri, Aug 28, 2015 at 10:43 AM, Aaron Ten Clay wrote:
> Hi Cephers,
>
> I'm trying to resolve an inconsistent pg on an erasure-coded pool, running
> Ceph 9.0.2. I can't seem to get Ceph to run a repair or even deep-scrub the
> pg again. Here's the background
What version are you running? How did you move the osds from 2TB to 4TB?
-Sam
On Wed, Jul 17, 2013 at 12:59 AM, Ta Ba Tuan wrote:
> Hi everyone,
>
> I converted every osd from 2TB to 4TB, and when the move completed, the
> realtime Ceph log ("ceph -w")
> displays the error: "I don't have pgid 0.2c8"
>
don't know how to remove those pgs.
> Please guide me with this error!
>
> Thank you!
> --tuantaba
> TA BA TUAN
>
>
> On 07/18/2013 01:16 AM, Samuel Just wrote:
>
> What version are you running? How did you move the osds from 2TB to 4TB?
> -Sam
>
> On We
Can you attach the output of ceph -s?
-Sam
On Fri, Aug 9, 2013 at 11:10 AM, Suresh Sadhu wrote:
> how to repair a laggy storage cluster? I'm able to create images on the pools even
> though the HEALTH state shows WARN.
>
>
>
> sudo ceph
>
> HEALTH_WARN 181 pgs degraded; 676 pgs stuck unclean; recovery 2/107 degr
Can you post more of the log? There should be a line towards the bottom
indicating the line with the failed assert. Can you also attach ceph pg
dump, ceph osd dump, ceph osd tree?
-Sam
On Mon, Aug 12, 2013 at 11:54 AM, John Wilkins wrote:
> Stephane,
>
> You should post any crash bugs with sta
Did you try using ceph-deploy disk zap ceph001:sdaa first?
-Sam
On Mon, Aug 12, 2013 at 6:21 AM, Pavel Timoschenkov
wrote:
> Hi.
>
> I have some problems with creating the journal on a separate disk, using the
> ceph-deploy osd prepare command.
>
> When I try to execute the next command:
>
> ceph-deploy osd prepare
Can you elaborate on what behavior you are looking for?
-Sam
On Fri, Aug 9, 2013 at 4:37 AM, Georg Höllrigl
wrote:
> Hi,
>
> I'm using ceph 0.61.7.
>
> When using ceph-fuse, I couldn't find a way to only mount one pool.
>
> Is there a way to mount a pool - or is it simply not supported?
>
>
>
>
Can you attach the output of ceph osd tree?
Also, can you run
ceph osd getmap -o /tmp/osdmap
and attach /tmp/osdmap?
-Sam
On Fri, Aug 9, 2013 at 4:28 AM, Jeff Moskow wrote:
> Thanks for the suggestion. I had tried stopping each OSD for 30 seconds,
> then restarting it, waiting 2 minutes and t
I have referred you to someone more conversant with the details of
mkcephfs, but for dev purposes, most of us use the vstart.sh script in
src/ (http://ceph.com/docs/master/dev/).
-Sam
On Fri, Aug 9, 2013 at 2:59 AM, Nulik Nol wrote:
> Hi,
> I am configuring a single node for developing purposes,
Are you using any kernel clients? Will osds 3,14,16 be coming back?
-Sam
On Mon, Aug 12, 2013 at 2:26 PM, Jeff Moskow wrote:
> Sam,
>
> I've attached both files.
>
> Thanks!
> Jeff
>
> On Mon, Aug 12, 2013 at 01:46:57PM -0700, Samuel Just wrote:
I think the docs you are looking for are
http://ceph.com/docs/master/man/8/cephfs/ (specifically the set_layout
command).
-Sam
On Thu, Aug 8, 2013 at 7:48 AM, Da Chun wrote:
> Hi list,
> I saw the info about data striping in
> http://ceph.com/docs/master/architecture/#data-striping .
> But couldn
Can you attach the output of:
ceph -s
ceph pg dump
ceph osd dump
and run
ceph osd getmap -o /tmp/osdmap
and attach /tmp/osdmap
-Sam
On Wed, Aug 7, 2013 at 1:58 AM, Howarth, Chris wrote:
> Hi,
>
> One of our OSD disks failed on a cluster and I replaced it, but when it
> failed it did not
Can you give a step by step account of what you did prior to the error?
-Sam
On Tue, Aug 6, 2013 at 10:52 PM, 於秀珠 wrote:
> using ceph-deploy to manage an existing cluster, I followed the steps in the
> document, but there are some errors and I cannot gather the keys.
> When I run the command "ce
> On Mon, Aug 12, 2013 at 02:41:11PM -0700, Samuel Just wrote:
>> Are you using any kernel clients? Will osds 3,14,16 be coming back?
>> -Sam
>>
>> On Mon, Aug 12, 2013 at 2:26 PM, Jeff Moskow wrote:
>> > Sam,
>> >
>> > I've attac
You can run 'ceph pg 0.cfa mark_unfound_lost revert'. (Revert Lost
section of http://ceph.com/docs/master/rados/operations/placement-groups/).
-Sam
On Tue, Aug 13, 2013 at 6:50 AM, Jens-Christian Fischer
wrote:
> We have a cluster with 10 servers, 64 OSDs and 5 Mons on them. The OSDs are
> 3TB di
vered": 45,
> "num_bytes_recovered": 188743680,
> "num_keys_recovered": 0},
> "stat_cat_sum": {},
> "up": [
> 5,
> 4],
> "acting": [
> 5,
>
Cool!
-Sam
On Tue, Aug 13, 2013 at 4:49 AM, Jeff Moskow wrote:
> Sam,
>
> Thanks that did it :-)
>
>health HEALTH_OK
>monmap e17: 5 mons at
> {a=172.16.170.1:6789/0,b=172.16.170.2:6789/0,c=172.16.170.3:6789/0,d=172.16.170.4:6789/0,e=172.16.170.5:6789/0},
> election epoch 9794, quorum