Hi Team,
We upgraded our Ceph cluster from 0.94.9 (hammer) to 10.2.5 (jewel). Some
clients are still showing the older version when mounting in debug mode. Does
this cause any issues with the OSDs and MONs, and how do we go about fixing it?
New version and properly working client
root@172.20.25.162
Are you sure you have ceph-fuse upgraded?
#ceph-fuse --version
2017-02-23 16:07 GMT+08:00 gjprabu :
> Hi Team,
>
> We upgraded ceph version from 0.94.9 hammer to 10.2.5 jewel .
> Still some clients are showing older version while mounting with debug
> mode, is this caused any issue w
Hi Team,
I have recently deployed a new Ceph cluster on OEL6 boxes for testing.
I am getting the error below on the admin host and am not sure how to fix it.
2017-02-23 02:13:04.166366 7f9c85efb700 0 librados: client.admin
authentication error (1) Operation not permitted
Error connecting to cluster
You need ceph.client.admin.keyring in /etc/ceph/
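A rough sketch of one way to get it there (the host names below are
placeholders, and ceph-deploy is only one option):

# Push ceph.conf and the admin keyring from the deploy node
ceph-deploy admin oel6-admin

# Or copy it by hand from a monitor and make it readable to the
# user that runs the ceph CLI
scp mon-host:/etc/ceph/ceph.client.admin.keyring /etc/ceph/
chmod 600 /etc/ceph/ceph.client.admin.keyring

# Sanity check that the key matches what the cluster expects
ceph auth get client.admin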
On Thu, Feb 23, 2017 at 8:13 PM, Chaitanya Ravuri
wrote:
> Hi Team,
>
> I have recently deployed a new CEPH cluster for OEL6 boxes for my testing. I
> am getting below error on the admin host. not sure how can i fix it.
>
> 2017-02-23 02:13:04.1663
I have a RADOS pool with nearly a million objects in it--but I don't
exactly know how many, and that's the point.
I ran a long list_objects() overnight and, at first glance this morning,
the output looks good, but it is thousands of objects fewer than
get_stats() said are there. I am just doin
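A rough cross-check at the CLI level (not the librados calls above; the pool
name below is a placeholder) would be to compare the listing count against the
pool statistics:

rados -p mypool ls | wc -l    # count what a full listing actually returns
rados df                      # per-pool object counts as reported by the OSDs
ceph df detail                # the same numbers from the cluster's view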
On Wed, Feb 22, 2017 at 4:06 PM, Marius Vaitiekunas <
mariusvaitieku...@gmail.com> wrote:
> Hi Cephers,
>
> We are running latest jewel (10.2.5). Bucket index sharding is set to 8.
> rgw pools except data are placed on SSD.
> Today I've done some testing and run bucket index check on a bucket with
You can copy the corrupted osdmap file from osd.1 and then restart the OSD;
we ran into this before, and that worked for us.
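A sketch of one way to do that, assuming your ceph-objectstore-tool build has
the get-osdmap/set-osdmap ops (OSD ids and paths below are placeholders, and
the OSDs must be stopped first; service commands vary by distro):

# Stop both OSDs before touching their stores
systemctl stop ceph-osd@1 ceph-osd@2

# Export the healthy copy of the osdmap from osd.1 ...
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 \
    --op get-osdmap --file /tmp/osdmap.good

# ... inject it into the OSD holding the corrupt copy, e.g. osd.2 ...
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
    --op set-osdmap --file /tmp/osdmap.good

# ... then start the OSDs again
systemctl start ceph-osd@1 ceph-osd@2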
2017-02-23 22:33 GMT+08:00 tao chang :
> HI,
>
> I have a ceph cluster (ceph 10.2.5) with 3 nodes, each with two OSDs.
>
> There was a power outage last night and all the servers are resta
On 02/23/2017 07:43 AM, Kent Borg wrote:
I ran a long list_objects() overnight and, at first glance this
morning, the output looks good, but it is thousands of objects fewer
than get_stats() said are there.
Update: I scripted up a quick check and every object name I would expect
to be in my p
Hi zhong,
Yes, one of the clients had not upgraded its ceph-fuse version; it's working now.
Thank you, regards,
Prabu GJ
On Thu, 23 Feb 2017 15:08:42 +0530 zhong2p...@gmail.com wrote
are you sure you have ceph-fuse upgraded?
#ceph-fuse --version
2017-02-23 16:07 GMT+08:00 gjprabu :
Hi T
Since we need this pool to work again, we decided to take the data loss and try
to move on.
So far, no luck. We tried a force create but, as expected, with a PG that is
not peering this did absolutely nothing.
We also tried rm-past-intervals and remove from ceph-objectstore-tool and
manually de
On Thu, Feb 23, 2017 at 6:55 AM, Kent Borg wrote:
> On 02/23/2017 07:43 AM, Kent Borg wrote:
>>
>> I ran a long list_objects() overnight and, at first glance this morning,
>> the output looks good, but it is thousands of objects fewer than get_stats()
>> said are there.
>
>
> Update: I scripted up
On 02/23/2017 02:13 PM, Gregory Farnum wrote:
Did you run a pg split or something? That's the only off-hand way I can
think of for the object count to come out too high, though I don't recall
how snapshots impact those numbers, and obviously it's very wonky if you
were to use a cache tier.
We did increa
Yeah, that's why. It'll fix itself once all the newly-split PGs have
scrubbed, but in order to keep the splitting operation constant-time it has
to estimate how many objects ended up in each of the new ones.
-Greg
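A sketch of one way to hurry that along rather than wait for the scrub
schedule (the pool name is a placeholder, and the column layout of
ceph pg ls-by-pool differs a bit between releases, so check the header first):

# Queue a scrub on every PG of the pool so the per-PG object counts get fixed up
for pg in $(ceph pg ls-by-pool mypool | awk 'NR>1 {print $1}'); do
    ceph pg scrub "$pg"
done

# Once the scrubs finish, the pool-level counts should line up again
ceph df detail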
On Thu, Feb 23, 2017 at 11:26 AM Kent Borg wrote:
> On 02/23/2017 02:13 PM, Gregor
On 02/23/2017 02:51 PM, Gregory Farnum wrote:
Yeah, that's why. It'll fix itself once all the newly-split PGs have
scrubbed, but in order to keep the splitting operation constant-time
it has to estimate how many objects ended up in each of the new ones.
That makes some sense. Thanks!
While I
On Thu, Feb 23, 2017 at 12:11 PM Kent Borg wrote:
> On 02/23/2017 02:51 PM, Gregory Farnum wrote:
> > Yeah, that's why. It'll fix itself once all the newly-split PGS have
> > scrubbed, but in order to keep the splitting operation constant-time
> > it has to estimate how many objects ended up in e
On 02/23/2017 03:13 PM, Gregory Farnum wrote:
If your PG count isn't a power of two, some of them will have double
the number of objects of the others. It mostly doesn't matter, though
at low counts it can improve balance. There's no breakage that Ceph
cares about.
-Greg
Good to know. This s
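To make the "double the objects" point concrete, here is a toy bash sketch of
the folding idea as I understand it, an illustration rather than Ceph's actual
code path, with made-up numbers:

pg_num=12          # not a power of two
bmask=15           # next power of two minus one
declare -A count
for hash in $(seq 0 1599); do               # pretend object hashes are uniform
    pg=$(( hash & bmask ))
    (( pg >= pg_num )) && pg=$(( hash & (bmask >> 1) ))  # fold top PGs back down
    count[$pg]=$(( ${count[$pg]:-0} + 1 ))
done
for pg in $(seq 0 $(( pg_num - 1 ))); do
    echo "pg.$pg -> ${count[$pg]} objects"  # pgs 4-7 end up with twice as many
done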
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
We are seeing some weird behavior and are not sure how to diagnose what could
be going on. We started monitoring the overall_status from the JSON query, and
every once in a while we would get a HEALTH_WARN for a minute or two.
Monitoring logs.
On Thu, Feb 23, 2017 at 09:49:21PM +, Scottix wrote:
> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>
> We are seeing a weird behavior or not sure how to diagnose what could be
> going on. We started monitoring the overall_status from the json query and
> every once in a whil
On Thu, Feb 16, 2017 at 9:19 AM, Benjeman Meekhof wrote:
> I tried starting up just a couple OSD with debug_osd = 20 and
> debug_filestore = 20.
>
> I pasted a sample of the ongoing log here. To my eyes it doesn't look
> unusual but maybe someone else sees something in here that is a
> problem:
Hi Greg,
Appreciate you looking into it. I'm concerned about CPU power per
daemon as well...though we never had this issue when restarting our
dense nodes under Jewel. Is the rapid rate of OSDmap generation a
one-time condition particular to post-update processing or to Kraken
in general?
We di
On Thu, Feb 23, 2017 at 2:34 PM, Benjeman Meekhof wrote:
> Hi Greg,
>
> Appreciate you looking into it. I'm concerned about CPU power per
> daemon as well...though we never had this issue when restarting our
> dense nodes under Jewel. Is the rapid rate of OSDmap generation a
> one-time condition
Yeah, the ceph-mon.$ID.log.
I was running ceph -w when one of them occurred too, and it never output
anything.
Here is a snippet for the 5:11 AM occurrence.
On Thu, Feb 23, 2017 at 1:56 PM Robin H. Johnson wrote:
> On Thu, Feb 23, 2017 at 09:49:21PM +, Scottix wrote:
> > ceph version 10.2.5
On Thu, Feb 23, 2017 at 9:49 PM, Scottix wrote:
> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>
> We are seeing a weird behavior or not sure how to diagnose what could be
> going on. We started monitoring the overall_status from the json query and
> every once in a while we woul
On Thu, Feb 23, 2017 at 10:40:31PM +, Scottix wrote:
> Ya the ceph-mon.$ID.log
>
> I was running ceph -w when one of them occurred too and it never output
> anything.
>
> Here is a snippet for the the 5:11AM occurrence.
Yep, I don't see anything in there that should have triggered
HEALTH_WARN
There are multiple approaches that give you more information about the health
state. The CLI has these two options:
ceph health detail
ceph status
I also like using ceph-dash ( https://github.com/Crapworks/ceph-dash ). It
has an associated Nagios check to scrape the ceph-dash page.
I personally do `
That sounds about right; I do see blocked requests sometimes when it is
under really heavy load.
Looking at some examples, I think summary should list the issues.
"summary": [],
"overall_status": "HEALTH_OK",
I'll try logging that too.
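A rough polling loop for that kind of logging might look like this; it assumes
jq is installed and the jewel-era JSON layout with .health.overall_status and
.health.summary quoted above, and the log path is just a placeholder:

# Poll once a minute and keep both the one-word status and the summary lines
while sleep 60; do
    ts=$(date -u +%FT%TZ)
    ceph status -f json |
      jq -r --arg ts "$ts" \
        '"\($ts) \(.health.overall_status) \(.health.summary | map(.summary) | join("; "))"' \
      >> /var/log/ceph-health-poll.log
done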
Scott
On Thu, Feb 23, 2017 at 3:00 PM David Turner
wrote:
On Thu, Feb 23, 2017 at 5:18 PM, Schlacta, Christ wrote:
> So I updated suse leap, and now I'm getting the following error from
> ceph. I know I need to disable some features, but I'm not sure what
> they are.. Looks like 14, 57, and 59, but I can't figure out what
> they correspond to, nor ther
aarcane@densetsu:~$ ceph --cluster rk osd crush show-tunables
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"chooseleaf_stable": 1,
"straw_calc_version": 1,
"allowed_bucket
On Fri, Feb 24, 2017 at 11:00 AM, Schlacta, Christ wrote:
> aarcane@densetsu:~$ ceph --cluster rk osd crush show-tunables
> {
> "choose_local_tries": 0,
> "choose_local_fallback_tries": 0,
> "choose_total_tries": 50,
> "chooseleaf_descend_once": 1,
> "chooseleaf_vary_r": 1,
>
-- Forwarded message --
From: Schlacta, Christ
Date: Thu, Feb 23, 2017 at 5:56 PM
Subject: Re: [ceph-users] Upgrade Woes on suse leap with OBS ceph.
To: Brad Hubbard
They're from the suse leap ceph team. They maintain ceph, and build
up to date versions for suse leap. What I d
-- Forwarded message --
From: Schlacta, Christ
Date: Thu, Feb 23, 2017 at 6:06 PM
Subject: Re: [ceph-users] Upgrade Woes on suse leap with OBS ceph.
To: Brad Hubbard
So setting the above to 0 by sheer brute force didn't work, so it's not a
crush or OSD problem. Also, the errors
Is your change reflected in the current crushmap?
On Fri, Feb 24, 2017 at 12:07 PM, Schlacta, Christ wrote:
> -- Forwarded message --
> From: Schlacta, Christ
> Date: Thu, Feb 23, 2017 at 6:06 PM
> Subject: Re: [ceph-users] Upgrade Woes on suse leap with OBS ceph.
> To: Brad Hubb
Insofar as I can tell, yes. Everything indicates that they are in effect.
On Thu, Feb 23, 2017 at 7:14 PM, Brad Hubbard wrote:
> Is your change reflected in the current crushmap?
>
> On Fri, Feb 24, 2017 at 12:07 PM, Schlacta, Christ
> wrote:
>> -- Forwarded message --
>> From:
Did you dump out the crushmap and look?
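Something like this, using the cluster name from your earlier command (the
output paths are just placeholders):

ceph --cluster rk osd getcrushmap -o /tmp/crush.bin
crushtool -d /tmp/crush.bin -o /tmp/crush.txt
grep ^tunable /tmp/crush.txt    # the tunable lines actually compiled into the map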
On Fri, Feb 24, 2017 at 1:36 PM, Schlacta, Christ wrote:
> insofar as I can tell, yes. Everything indicates that they are in effect.
>
> On Thu, Feb 23, 2017 at 7:14 PM, Brad Hubbard wrote:
>> Is your change reflected in the current crushmap?
>>
>> On Fri
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
# typ
Hmm,
What's interesting is that the feature set reported by the servers has only
changed from
e0106b84a846a42
Bit 1 set Bit 6 set Bit 9 set Bit 11 set Bit 13 set Bit 14 set Bit 18
set Bit 23 set Bit 25 set Bit 27 set Bit 30 set Bit 35 set Bit 36 set
Bit 37 set Bit 39 set Bit 41 set Bit 42 set Bit 48
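For reference, that bit list can be reproduced from the hex mask with a few
lines of plain bash arithmetic (nothing Ceph-specific):

v=$(( 16#e0106b84a846a42 ))
for i in $(seq 0 63); do
    (( (v >> i) & 1 )) && echo "Bit $i set"
done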
Kefu has just pointed out that this has the hallmarks of
https://github.com/ceph/ceph/pull/13275
On Fri, Feb 24, 2017 at 3:00 PM, Brad Hubbard wrote:
> Hmm,
>
> What's interesting is the feature set reported by the servers has only
> changed from
>
> e0106b84a846a42
>
> Bit 1 set Bit 6 set Bit 9
So hopefully when the suse ceph team gets 11.2 released, it should fix this,
yes?
On Feb 23, 2017 21:06, "Brad Hubbard" wrote:
> Kefu has just pointed out that this has the hallmarks of
> https://github.com/ceph/ceph/pull/13275
>
> On Fri, Feb 24, 2017 at 3:00 PM, Brad Hubbard wrote:
> > Hmm,
> >
On Fri, Feb 24, 2017 at 3:07 PM, Schlacta, Christ wrote:
> So hopefully when the suse ceph team get 11.2 released it should fix this,
> yes?
Definitely not a question I can answer.
What I can tell you is the fix is only in master atm, not yet
backported to kraken http://tracker.ceph.com/issues/1