Hi,
is there a way to find out which files on CephFS are using a given
PG? I'd like to check whether those files are corrupted...
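A rough sketch of one way to do that mapping; the pool name, PG id,
mount point, and inode number below are placeholders, not values from
this thread. It loops over every object in the data pool, so it can
take a while on a large pool:

# keep only the objects whose placement maps to the PG in question
$ rados -p cephfs_data ls | while read obj; do ceph osd map cephfs_data "$obj" | grep -q '(1\.2a)' && echo "$obj"; done
# the hex part of an object name before the dot is the file's inode
# number; resolve it back to a path on the mounted filesystem
$ find /mnt/cephfs -inum $((0x10000000003))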
Also, how do I translate a bluestore error like this:
2017-12-17 03:04:29.512839 7f86c6347700 -1
bl
BTW: Ceph version is 12.2.2 (the cluster was setup with 12.2.1, then
updated to 12.2.2 on Debian 9).
  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3
    mgr: ceph1(active), standbys: ceph2
    mds: cephfs-1/1/1 up {0=ceph1=up:active}, 2 up:standby
    osd: 10 osds: 10 up, 10 in

  data:
Hi,
We have a live cluster with 8 OSD nodes, each having 5-6 disks.
We would like to add a new host and expand the cluster.
We have 4 pools
- 3 replicated pools with replication factors of 5 and 3
- 1 erasure coded pool with k=5, m=3
So my concern is: are there any precautions that are needed to
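A commonly suggested approach when expanding is to hold back data
movement while the new OSDs come up and to throttle recovery once it
starts. The commands below are standard ceph flags, but treat this as a
sketch rather than a recipe for this specific cluster; the injected
values are assumptions:

$ ceph osd set norebalance       # hold off data movement while OSDs are created
$ ceph osd set nobackfill
# ... create and activate the OSDs on the new host ...
$ ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
$ ceph osd unset nobackfill
$ ceph osd unset norebalance     # let backfill proceed at a throttled pace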
Hi All,
I am testing Luminous 12.2.2 and found some strange behavior in my cluster.
I was testing my cluster throughput by using fio on a mounted RBD with
the following fio parameters:
fio -directory=fiotest -direct=1 -thread -rw=write -ioengine=libaio
-size=200G -group_reporting -bs=1m -i
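For reference, a complete fio invocation in this style might look like
the sketch below; the iodepth, numjobs, runtime, and job name are
assumptions, not the poster's actual settings:

fio -directory=fiotest -direct=1 -thread -rw=write -ioengine=libaio \
    -size=200G -group_reporting -bs=1m -iodepth=32 -numjobs=4 \
    -runtime=300 -time_based -name=rbd-write-test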
I like to avoid adding disks from more than 1 failure domain at a time in
case some of the new disks are bad. In your example of only adding 1 new
node, I would say that adding all of the disks at the same time is the
better way to do it.
Adding only 1 disk in the new node at a time would actually
Hi David,
Thank you for your response.
The failure domain for the EC profile is 'host'. So I guess it is okay to add a
node and activate 5 disks at a time?
$ ceph osd erasure-code-profile get profile5by3
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
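With a 'host' failure domain, one way to double-check the rule before
adding the node is sketched below; the pool and rule names are
placeholders:

$ ceph osd pool get <ec-pool-name> crush_rule   # find which rule the EC pool uses
$ ceph osd crush rule dump <rule-name>          # the choose/chooseleaf step should show "type": "host"
$ ceph osd tree                                 # sanity-check the host/OSD layout before and after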
Hi,
it is possible to configure the RGW logging to a unix socket; with
this you are able to consume a JSON stream.
In a PoC we put events into a Redis cache to do async processing.
Sadly I can't find the needed config lines at the moment.
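The options involved are most likely the RGW ops-log settings; a
minimal ceph.conf sketch, with a made-up section name and socket path,
to be verified against the docs for your release:

[client.rgw.gateway1]
    # emit one JSON record per request on a local unix socket
    rgw enable ops log = true
    rgw ops log socket path = /var/run/ceph/rgw-ops.sock

A small consumer reading that socket can then push the records into
Redis or similar for async processing.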
hope it helps,
Ansgar
Hi,
This is just a tip, and I do not know if it actually applies to you, but
some SSDs decrease their write throughput on purpose so they do not
wear out the cells before the warranty period is over.
Denes.
On 12/17/2017 06:45 PM, shadow_lin wrote:
Hi All,
I am testing luminous 12.2.
hi John
thanks for your answer.
Under normal conditions, I can run "ceph mds fail" before a reboot.
But if the host reboots by itself for some reason, I can do nothing!
If this happens, data will be lost.
So, is there any other way to prevent data from being lost?
thanks
13605702...@163.com
Fr
On Mon, Dec 18, 2017 at 9:24 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi John
>
> thanks for your answer.
>
> Under normal conditions, I can run "ceph mds fail" before a reboot.
> But if the host reboots by itself for some reason, I can do nothing!
> If this happens, data will be lost.
>
hi Yan
1. run "ceph mds fail" before rebooting the host
2. the host reboots by itself for some reason
You mean no data gets lost in BOTH conditions?
In my test, I echo the date string every second into a file under the cephfs dir;
when I reboot the master MDS, 15 lines get lost.
thanks
136
On Mon, Dec 18, 2017 at 10:10 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi Yan
>
> 1. run "ceph mds fail" before rebooting the host
> 2. the host reboots by itself for some reason
>
Was the cephfs client also on the rebooted host?
> You mean no data gets lost in BOTH conditions?
>
> in my te
hi Yan
Was the cephfs client also on the rebooted host?
NO, the cephfs client is an independent VM.
13605702...@163.com
From: Yan, Zheng
Date: 2017-12-18 10:36
To: 13605702...@163.com
CC: John Spray; ceph-users
Subject: Re: Re: [ceph-users] cephfs miss data for 15s when master mds rebooting
On Mo
On Mon, Dec 18, 2017 at 10:10 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi Yan
>
> 1. run "ceph mds fail" before rebooting the host
> 2. the host reboots by itself for some reason
>
> You mean no data gets lost in BOTH conditions?
>
> In my test, I echo the date string every second into the
Tried restarting all OSDs. Still no luck.
Will adding a new disk to any of the servers force a rebalance and fix it?
Karun Josy
On Sun, Dec 17, 2017 at 12:22 PM, Cary wrote:
> Karun,
>
> Could you paste in the output from "ceph health detail"? Which OSD
> was just added?
>
> Cary
> -Dynamic
>
On Fri, Dec 15, 2017 at 6:08 PM, John Spray wrote:
> On Fri, Dec 15, 2017 at 1:45 AM, 13605702...@163.com
> <13605702...@163.com> wrote:
>> hi
>>
>> I used 3 nodes to deploy MDS (each node also has a mon on it)
>>
>> my config:
>> [mds.ceph-node-10-101-4-17]
>> mds_standby_replay = true
>> mds_stand
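For context, a typical Luminous-era standby-replay MDS section carries
options like the sketch below; the daemon name is hypothetical and this
is not the poster's full config:

[mds.standby-node]
    mds_standby_replay = true
    mds_standby_for_rank = 0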
hi Yan
my test script:
#!/bin/sh
rm -f /root/cephfs/time.txt
while true
do
echo `date` >> /root/cephfs/time.txt
sync
sleep 1
done
I run this script and then reboot the master MDS.
From the file /root/cephfs/time.txt, I can see that more than 15 lines got
lost:
Mon Dec 18 03:07:
On Mon, Dec 18, 2017 at 11:11 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi Yan
>
> my test script:
>
> #!/bin/sh
>
> rm -f /root/cephfs/time.txt
>
> while true
> do
> echo `date` >> /root/cephfs/time.txt
> sync
> sleep 1
> done
>
> I run this script and then reboot the master
hi Yan
> Mon Dec 18 03:07:47 UTC 2017 <-- reboot
> Mon Dec 18 03:08:05 UTC 2017 <-- mds failover works
This is caused by the write stall,
but the data below got lost; is this normal?
Mon Dec 18 03:07:48 UTC 2017
Mon Dec 18 03:07:49 UTC 2017
Mon Dec 18 03:07:50 UTC 2017
Mon Dec 18 03:07:51 UTC 2017
Maybe try outing the disk that should have a copy of the PG, but doesn't.
Then mark it back in. It might check that it has everything properly and
pull a copy of the data it's missing. I dunno.
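In command form, that suggestion is roughly the sketch below; the OSD
id and PG id are placeholders:

$ ceph health detail             # note the PG and the OSD that is missing a copy
$ ceph osd out <osd-id>          # let the cluster backfill the PG elsewhere
# wait for backfill/recovery of the affected PG to complete, then
$ ceph osd in <osd-id>
$ ceph pg <pg-id> query          # check that all replicas are now reported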
On Sun, Dec 17, 2017, 10:00 PM Karun Josy wrote:
> Tried restarting all osds. Still no luck.
>
> Will
On Mon, Dec 18, 2017 at 11:34 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi Yan
>
>> Mon Dec 18 03:07:47 UTC 2017 <-- reboot
>> Mon Dec 18 03:08:05 UTC 2017 <-- mds failover works
>
> This is caused by the write stall,
>
> but the data below got lost; is this normal?
your script never wri
The lines might not be in the file, but did the thing writing to the file
say it succeeded in writing, or did it fail? I'm guessing the latter,
which means you should check that the write was successful rather than
just assuming it was before continuing on.
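For example, a variant of the earlier test script that only treats a
line as written when both the write and the sync return success might
look like this sketch:

#!/bin/sh
rm -f /root/cephfs/time.txt
while true
do
    # count the line only when both the write and the sync report success
    if ! { echo `date` >> /root/cephfs/time.txt && sync; }
    then
        echo "write or sync failed at `date`" >&2
    fi
    sleep 1
done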
On Sun, Dec 17, 2017, 10:07 PM Wei Jin
Hi Yan,
Sorry for the late reply; it is the kernel client and Ceph version 10.2.3.
It's not reproducible in other mounts.
Regards
Prabu GJ
On Thu, 14 Dec 2017 12:18:52 +0530 Yan, Zheng wrote
On Thu, Dec 14, 2017 at 2:14 PM, gjprabu
I am testing Luminous 12.2.2 and found some strange behavior in my cluster.
Check your block.db usage. Luminous 12.2.2 is affected by
http://tracker.ceph.com/issues/22264
[root@ceph-osd0]# ceph daemon osd.46 perf dump | jq '.bluefs' | grep -E
'(db|slow)'
"db_total_bytes": 30064762880,
"db_used_
hi Yan
You are right, the data didn't get lost; it is caused by the write stall.
thanks
13605702...@163.com
From: Yan, Zheng
Date: 2017-12-18 12:01
To: 13605702...@163.com
CC: John Spray; ceph-users
Subject: Re: Re: [ceph-users] cephfs miss data for 15s when master mds rebooting
On Mon, Dec 18,