Hey,
I am just playing around with the Luminous RC. As far as I can see it works nicely.
While reading around I found the following discussion about WAL and block.db size:
http://marc.info/?l=ceph-devel&m=149978799900866&w=2
I am creating an OSD with the following command:
ceph-deploy osd create --bluestore -
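In case it's useful context for the sizing question: with bluestore the
db/wal partition sizes can be pinned in ceph.conf before the OSD is created.
The values below are just placeholders, not recommendations:

[osd]
bluestore_block_db_size  = 10737418240   # 10 GiB for block.db
bluestore_block_wal_size = 1073741824    # 1 GiB for block.wal

and then point --block-db / --block-wal at the fast device when running
ceph-deploy osd create --bluestore (the exact flags vary between ceph-deploy
versions, so check ceph-deploy osd create --help).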
Hi,
I'm trying to find the reason for strange recovery issues I'm seeing on
our cluster..
It's a mostly idle, 4 node cluster with 26 OSDs evenly distributed
across the nodes, running jewel 10.2.9.
The problem is that after some disk replacements and data moves, recovery
is progressing extremely slowly.. pgs seem to be
I forgot to add that the OSD daemons really seem to be idle, no disk
activity, no CPU usage.. it just looks to me like some kind of
deadlock, as if they were waiting for each other..
and so I've been trying to get the last 1.5% of misplaced / degraded PGs
recovered for almost a week..
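For reference, the standard things to look at in a situation like this
(all stock jewel commands), in case anyone wants to compare notes:

# ceph health detail                        # which PGs are degraded/misplaced and why
# ceph pg dump_stuck unclean                # PGs that are not active+clean
# ceph pg <pgid> query                      # "recovery_state" shows what the PG is waiting on
# ceph daemon osd.<id> dump_ops_in_flight   # in-flight ops on a given OSD (run on the OSD host)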
On Fri, Jul 28, 2017 at 10:56:02AM +
It looks like the OSDs in your cluster are not all the same size.
Can you show the ceph osd df output?
At 2017-07-28 17:24:29, "Nikola Ciprich" wrote:
>I forgot to add that OSD daemons really seem to be idle, no disk
>activity, no CPU usage.. it just looks to me like some kind of
>deadlock, as they
On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote:
>
>
> It looks like the OSDs in your cluster are not all the same size.
>
> Can you show the ceph osd df output?
you're right, they're not.. here's the output:
[root@v1b ~]# ceph osd df tree
ID WEIGHT REWEIGHT SIZE USE AVAIL %U
You have two crush rules? One is SSD and the other is HDD?
Can you show ceph osd dump | grep pool
ceph osd crush dump
At 2017-07-28 17:47:48, "Nikola Ciprich" wrote:
>
>On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote:
>>
>>
>> It looks like the OSDs in your cluster are not all the same size.
On Fri, Jul 28, 2017 at 05:52:29PM +0800, linghucongsong wrote:
>
>
>
> You have two crush rules? One is SSD and the other is HDD?
yes, exactly..
>
> Can you show ceph osd dump|grep pool
>
pool 3 'vm' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 1024 pgp_num 1024 last
1. You have a size 3 pool; I do not know why you set min_size 1. It is too dangerous.
2. You had better use the same size and the same number of OSDs on each host for CRUSH.
Now you can try the ceph osd reweight-by-utilization command, when there are
no users on your cluster.
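For example (the dry run first and the 110% threshold are just the usual
cautious pattern, adjust as needed):

# ceph osd test-reweight-by-utilization 110    # dry run, shows what would change
# ceph osd reweight-by-utilization 110

and for the min_size point, something like:

# ceph osd pool set vm min_size 2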
and I will go home.
At 2017-07-28 17
Hello,
Recently we got an underlying issue with osd.10, which mapped to /dev/sde.
So we tried to remove it from the CRUSH map:
===
# systemctl stop ceph-osd@10.service
# for x in {10..10}; do ceph osd out $x; ceph osd crush remove osd.$x; ceph auth del osd.$x; ceph osd rm osd.$x; done
marked out osd.10.
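For reference, the same sequence written out step by step (only the
comments are added; the commands are the ones from the loop above):

# systemctl stop ceph-osd@10.service    # stop the daemon
# ceph osd out 10                       # mark it out so its data is rebalanced away
# ceph osd crush remove osd.10          # remove it from the CRUSH map
# ceph auth del osd.10                  # delete its cephx key
# ceph osd rm osd.10                    # remove it from the OSD map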
Hi all,
We are trying to outsource the disk replacement process for our ceph
clusters to some non-expert sysadmins.
We could really use a tool that reports if a Ceph OSD *would* or
*would not* be safe to stop, e.g.
# ceph-osd-safe-to-stop osd.X
Yes it would be OK to stop osd.X
(which of course m
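Just to make the idea concrete, a very rough sketch of the check such a
tool could do (a heuristic, not a finished tool; it only asks whether any
PG mapped to the OSD is already degraded or undersized, in which case
losing one more copy could take it below min_size):

#!/bin/sh
OSD=${1:?usage: $0 osd.N}
# list the PGs this OSD carries and look for ones already short on copies
if ceph pg ls-by-osd "$OSD" | grep -Eq 'degraded|undersized'; then
    echo "NOT safe to stop $OSD"
    exit 1
fi
echo "Probably OK to stop $OSD"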
Hi David,
Thanks a lot for your comments!
I just want to utilize a different network than the public one (where DNS
resolves the name) for ceph-deploy and client connections.
For example with 3 NICs (roughly as in the sketch below):
Nic1: Public (internet access)
Nic2: Ceph-mon (clients and ceph-deploy)
Nic3: Ceph-osd
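If I read the layout right, in Ceph terms Nic2 would be the "public
network" (mons and clients) and Nic3 the "cluster network" (OSD replication
traffic); Nic1 would not be referenced by Ceph at all. A sketch of the
relevant ceph.conf bits (the subnets are made up):

[global]
public network  = 10.0.2.0/24    # Nic2: mons, clients, ceph-deploy
cluster network = 10.0.3.0/24    # Nic3: OSD <-> OSD replication and heartbeats

and the mon addresses (mon host / the monmap) need to live on the Nic2
subnet rather than on the name the public DNS resolves.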
Thanks a
Hi!
Just found a strange thing while testing deep-scrub on 10.2.7.
1. Stop OSD
2. Change primary copy's contents (using vi)
3. Start OSD
Then 'rados get' returns "No such file or directory". No error messages are
seen in the OSD log, and the cluster status is "HEALTH_OK".
4. ceph pg repair
Then 'rados get' works
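For anyone wanting to dig into the same thing, the sequence of commands
I'd expect to need (the pool and object names here are just examples):

# ceph osd map rbd testobject                      # find which PG the object maps to
# ceph pg deep-scrub <pgid>                        # should flag the PG as inconsistent
# rados list-inconsistent-obj <pgid> --format=json-pretty
# ceph pg repair <pgid>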
Hello Dan,
Something like this maybe?
https://github.com/CanonicalLtd/ceph_safe_disk
Cheers,
Alex
2017-07-28 9:36 GMT-04:00 Dan van der Ster :
> Hi all,
>
> We are trying to outsource the disk replacement process for our ceph
> clusters to some non-expert sysadmins.
> We could really use a to
Hello Dan,
Based on what I know and what people told me on IRC, this means basically
the condition that the OSD is neither acting nor up for any PG. And one
person (fusl on IRC) said there was an unfound-objects bug when he
had size = 1; he also said that if reweight (and I assume crush weight) is 0
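A quick (if crude) way to check that condition from the CLI; just a
sketch, and the column positions ($3 = up set, $5 = acting set) are from
jewel's "ceph pg dump pgs_brief" output, so double-check on other releases:

# ceph pg dump pgs_brief 2>/dev/null | \
    awk -v id=10 '$3 ~ ("[^0-9]" id "[^0-9]") || $5 ~ ("[^0-9]" id "[^0-9]")'

No output means no PG has osd.10 in its up or acting set, which is the
condition described above.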
On Fri, Jul 28, 2017 at 8:16 AM Дмитрий Глушенок wrote:
> Hi!
>
> Just found a strange thing while testing deep-scrub on 10.2.7.
> 1. Stop OSD
> 2. Change primary copy's contents (using vi)
> 3. Start OSD
>
> Then 'rados get' returns "No such file or directory". No error messages
> seen in OSD log,
yaoning, haomai, Json
what about the "recovery what is really modified" feature? I didn't see any
update on GitHub recently; will it be developed further?
https://github.com/ceph/ceph/pull/3837 (PG:: recovery optimazation: recovery
what is really modified)
Thanks a lot.
donglifec...@gmail.co