Hello,
Does each rack work on different trees, or is everything parallelized?
Would the metadata pools be distributed over racks 1, 2, 4 and 5?
If they are distributed, then even if the addressed MDS is on the same switch as
the client, that MDS will still consult/write to (NVMe) OSDs on the other racks.
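(For reference, a quick way to see where a pool actually lands is to look at its CRUSH rule; the pool and rule names below are placeholders, not taken from this thread:)

ceph osd pool get cephfs_metadata crush_rule   # which rule the metadata pool uses
ceph osd crush rule dump <rule_name>           # the buckets/racks that rule selects from
ceph osd crush tree                            # the rack/host hierarchy itself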
>
> I'm designing a new Ceph storage from scratch and I want to increase CephFS
> speed and decrease latency.
> Usually I always build (WAL+DB on NVMe with SAS/SATA SSDs)
Just go with pure-NVMe servers. NVMe SSDs shouldn't cost much, if anything, more
than the few remaining SATA or especially SAS SSDs.
Hello folks!
I'm designing a new Ceph storage from scratch and I want to increase CephFS
speed and decrease latency.
Usually I always build (WAL+DB on NVMe with SAS/SATA SSDs) and I deploy the
MDSs and MONs on the same servers.
This time a weird idea came to my mind and I think it has great potential.
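(As an aside, a minimal sketch of the usual WAL+DB-on-NVMe layout mentioned above, using ceph-volume directly; the device paths are hypothetical, and a cephadm OSD service spec would be the more common route today:)

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1   # SAS/SATA data device, DB (and WAL) on an NVMe partition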
Once recovery is underway, simply restarting the RGWs should be enough to
reset them and get your object store back up.
Bloomberg doesn't use CephFS, so hopefully David's suggestions work, or perhaps
someone else in the community can chip in for that part.
Sent from Bloomberg Professional for iPhone
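(A minimal sketch of that restart, assuming a cephadm-managed cluster; the RGW service name is a placeholder:)

ceph orch ps --daemon-type rgw        # find the RGW daemons and their service name
ceph orch restart rgw.<service_name>  # restart all daemons of that RGW service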
If rebalancing tasks have been launched it's not a big deal, but I don't
think it's the priority.
The priority is to get the MDS back on its feet.
I haven't seen an answer to this question: can you stop/unmount the CephFS
clients or not?
There are other solutions, but as you are not comfortable I a
Thank you so much, Matthew. Please keep an eye on my thread.
You and Mr Anthony made my day.
___
Thank you so much, Sir. You made my day T.T
___
> Low space hindering backfill (add storage if this doesn't resolve
> itself): 21 pgs backfill_toofull
^^^ Ceph even told you what you need to do ;)
If you have recovery taking place and the numbers of misplaced objects and
*full PGs/pools keep decreasing, then yes, wait.
As for ge
sudo watch ceph -s
You should see stats on the recovery and see PGs transition from all the
backfill* states to active+clean.
Once you get everything active+clean, then we can focus on your RGWs and MDSs.
Sent from Bloomberg Professional for iPhone
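(A couple of closely related views, if helpful; both are standard Ceph CLI commands:)

watch -n 10 ceph -s   # refresh the full status every 10 seconds
ceph pg stat          # one-line summary of PG states, handy for watching the backfill* counts shrink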
- Original Message -
From: nguyenvand
Thank you Matthew.
I'm following the guidance from Mr Anthony and now my recovery speed is
much faster.
I will update my case day by day.
Thank you so much.
___
Hi Mr Anthony,
Forget it, the OSD is UP and the recovery speed is 10x faster.
Amazing!
And now we just wait, right?
___
Yes, Sir.
We added a 10 TiB disk to the cephosd02 node. Now the OSD is IN, but in the DOWN state.
What should we do now :(
In addition, the recovery speed is 10x faster :)
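(If it helps, a rough sketch of how to see why an OSD shows up as in-but-down; the OSD id is a placeholder, and the systemd unit name differs under cephadm, where it is ceph-<fsid>@osd.<id>:)

ceph osd tree | grep -i down        # which OSD id is down and on which host
ceph orch ps --daemon-type osd      # how cephadm sees the OSD daemons (if cephadm is used)
journalctl -u ceph-osd@<id> -n 100  # on that host: recent log lines for the OSD daemon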
___
Anthony is correct; this is what I was getting at as well when seeing your ceph
-s output. More detail in the Ceph docs here if you want to understand
why you need to balance your nodes.
https://docs.ceph.com/en/quincy/rados/operations/monitoring-osd-pg/
But you need to get you
Your recovery is stuck because there are no OSDs that have enough space to
accept data.
Your second OSD host appears to only have 9 OSDs currently, so you should be
able to add a 10TB OSD there without removing anything.
That will enable data to move to all three of your 10TB OSDs.
> On Feb 24
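(A minimal sketch of adding that OSD with cephadm; the host name follows the thread, the device path is a placeholder:)

ceph orch device ls cephosd02                 # confirm the new disk shows up as available
ceph orch daemon add osd cephosd02:/dev/sdX   # create an OSD on it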
Holy sh***!
First, we changed mon_max_pg_per_osd to 1000.
About adding a disk to cephosd02: in more detail, what is "TO", sir? I'll talk
it over with my boss. To be honest, I'm worried that the volume recovery
progress will run into problems...
___
You aren’t going to be able to finish recovery without having somewhere to
recover TO.
> On Feb 24, 2024, at 10:33 AM, nguyenvand...@baoviet.com.vn wrote:
>
> Thank you, Sir. But I think I'll wait for the PG BACKFILLFULL to finish; my boss is
> very angry now and will not allow me to add one more disk
You also might want to increase mon_max_pg_per_osd since you have a wide spread
of OSD sizes.
Default is 250. Set it to 1000.
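(One way to do that, assuming a release with the centralized config database, which any cephadm cluster has:)

ceph config set global mon_max_pg_per_osd 1000   # raise the per-OSD PG limit cluster-wide
ceph config get mon mon_max_pg_per_osd           # verify the new value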
> On Feb 24, 2024, at 10:30 AM, Anthony D'Atri wrote:
>
> Add a 10tb HDD to the third node as I suggested, that will help your cluster.
>
>
>> On Feb 24, 2024, at 10
Thank you, Sir. But I think I'll wait for the PG BACKFILLFULL to finish; my boss is
very angry now and will not allow me to add one more disk (he thinks that would
make Ceph take more time for recovering and rebalancing). We want to wait for
the volume recovery to finish.
___
Add a 10 TB HDD to the third node as I suggested; that will help your cluster.
> On Feb 24, 2024, at 10:29 AM, nguyenvand...@baoviet.com.vn wrote:
>
> I will correct some small things:
>
> we have 6 nodes: 3 OSD nodes and 3 gateway nodes (which run the RGW, MDS and
> NFS services)
> you are correct, 2 of the 3 OSD nodes have one new 10 TiB disk
And sure, we have one more 10 TiB disk, which cephosd02 will get.
___
I will correct some small things:
we have 6 nodes: 3 OSD nodes and 3 gateway nodes (which run the RGW, MDS and NFS
services)
you are correct, 2 of the 3 OSD nodes have one new 10 TiB disk
About your suggestion to add another OSD host: we will. But we need to end this
nightmare first; my NFS folder, which has 10 TiB of data, i
# ceph osd dump | grep ratio
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
Read the four sections here:
https://docs.ceph.com/en/quincy/rados/operations/health-checks/#osd-out-of-order-full
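(If you do temporarily raise the thresholds to let backfill proceed, the usual knobs are the ones below; the values are only an example, keep full_ratio well below 1.0 and revert once recovery completes:)

ceph osd set-nearfull-ratio 0.90
ceph osd set-backfillfull-ratio 0.92
ceph osd set-full-ratio 0.96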
> On Feb 24, 2024, at 10:12 AM, nguyenvand...@baoviet.com.vn wrote:
>
> Hi Mr Anthony, could you tell me more details about raising the full and backfillfull thresholds?
There ya go.
You have 4 hosts, one of which appears to be down and to have a single OSD that is
so small as to not be useful. Whatever cephgw03 is, it looks like a mistake.
OSDs much smaller than, say, 1 TB often aren't very useful.
Your pools appear to be replicated, size=3.
So each of your cep
Hi Mr Anthony, could you tell me more details about raising the full and
backfillfull thresholds?
Is it:
ceph tell 'osd.*' injectargs --osd-max-backfills=2 --osd-recovery-max-active=6
??
___
Hi Mr Anthony,
please check the output:
https://anotepad.com/notes/s7nykdmc
___
Hi Matthew,
1) We had 2 MDS services running before this nightmare. Now we are trying to
deploy MDS on 3 nodes, but all of them stop within 2 minutes.
2) You are correct. We just added two 10 TiB disks to the cluster (which
currently has 27 x 4 TiB disks); all of them have weight 1.0.
About volume recov
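(If useful, one standard command that shows size, CRUSH weight and utilization side by side, which makes a mixed 4 TiB / 10 TiB setup easy to sanity-check:)

ceph osd df tree   # per-OSD size, weight, %use and PG count, laid out over the CRUSH tree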
Hi David,
I'll follow your suggestion. Do you have Telegram? If yes, could you please add
my Telegram, +84989177619. Thank you so much.
___
So I have done some further digging. It seems similar to this: Bug #54172: ceph version 16.2.7 PG scrubs not progressing - RADOS - Ceph.
Apart from:
1/ I have restarted all OSDs / forced a re-peer and the issue is still there
2/ Setting noscrub stops the scrubs "appearing"
Checking a PG, it seems it's jus
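(For reference, the scrub flags mentioned above are set and cleared cluster-wide like this, with standard CLI commands:)

ceph osd set noscrub          # pause shallow scrubs
ceph osd set nodeep-scrub     # pause deep scrubs
ceph osd unset noscrub        # resume
ceph osd unset nodeep-scrub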
>
> 2) It looks like you might have an interesting crush map. Allegedly you have
> 41 TiB of space, but you can't finish recovering because you have lots of PGs
> stuck as their destination is too full. Are you running homogeneous hardware
> or do you have different drive sizes? Are all the weights set correctly?
It looks like you have quite a few problems; I'll try to address them one by
one.
1) It looks like you had a bunch of crashes; from the ceph -s output it looks
like you don't have enough MDS daemons running for a quorum. So you'll need to
restart the crashed containers (a rough sketch of those commands follows this
list).
2) It looks like you might have an interesting crush map. Allegedly you have
41 TiB of space, but you can't finish recovering because you have lots of PGs
stuck as their destination is too full. Are you running homogeneous hardware
or do you have different drive sizes? Are all the weights set correctly?
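(A minimal sketch for point 1, assuming a cephadm deployment; the daemon name is a placeholder taken from the ceph orch ps output:)

ceph orch ps --daemon-type mds          # see which MDS daemons exist and which are in error/stopped
ceph orch daemon restart mds.<daemon>   # restart a specific crashed MDS daemon
ceph fs status                          # confirm the filesystem gets its active and standby MDS back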
Do you have the possibility to stop/unmount the CephFS clients?
If so, do that and restart the MDS.
It should restart.
Have the clients restart one by one and check that the MDS does not crash
(by monitoring the logs).
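(A small sketch of how to see who is still connected while doing this; the MDS name is a placeholder taken from ceph fs status or ceph orch ps:)

ceph fs status                    # which MDS is active and in what state
ceph tell mds.<name> client ls    # list the client sessions still attached to that MDS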
Best regards,
David C
On 22/02/2024 at 18:07:51 +0300, Konstantin Shalygin wrote:
Hi,
Thanks.
>
> Yes you can, this is controlled by the option
>
>
> client quota df = false
But I'm unable to make it work. Is this the correct syntax?
[global]
fsid = ***
mon_host = [v2:10
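(For what it's worth, a hedged alternative to editing ceph.conf: client_quota_df is a client-side option, so if your clients read the centralized config (ceph-fuse / libcephfs-based mounts), it can also be set as shown below; whether the kernel client honours it at all is a separate question:)

ceph config set client client_quota_df false   # applies to clients that pull options from the mons; remount/restart clients afterwards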
Hi Matthew,
Please check my ceph -s:
ceph -s
  cluster:
    id:     258af72a-cff3-11eb-a261-d4f5ef25154c
    health: HEALTH_WARN
            3 failed cephadm daemon(s)
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 nearfull osd(s)