[ceph-users] Re: ceph health "overall_status": "HEALTH_WARN"

2022-07-25 Thread Monish Selvaraj
Hi all, Recently I deployed ceph orch (Pacific) on my nodes with 5 mons, 5 mgrs, 238 OSDs and 5 RGWs. Yesterday, 4 OSDs went out and 2 RGWs went down, so I restarted the whole RGW service with "ceph orch restart rgw.rgw". After two minutes, all the RGW nodes went down. Then I brought the 4 OSDs back up and also waited
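For reference, a rough sketch of the commands involved, assuming a cephadm/ceph orch deployment; "rgw.rgw" is the service name taken from the post:

    # restart every daemon of the RGW service named rgw.rgw
    ceph orch restart rgw.rgw
    # verify which RGW daemons came back and on which hosts
    ceph orch ps --daemon-type rgw
    # overall cluster state, including down OSDs
    ceph -s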

[ceph-users] External RGW always down

2022-09-09 Thread Monish Selvaraj
Hi all, I have one critical issue in my prod cluster. When the customer's data comes in at 600 MiB, my OSDs go down, *8 to 20 out of 238*. Then I manually bring my OSDs up. After a few minutes, all my RGWs crash. We did some troubleshooting but nothing works. When we upgraded ceph from 17.2.0 to 17.2.1
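A minimal sketch of how the down OSDs could be identified before bringing them back up (standard Ceph CLI, no assumptions beyond a running cluster):

    # how many OSDs are up/in
    ceph osd stat
    # list only the OSDs currently marked down, grouped by host
    ceph osd tree down
    # detailed health output, including degraded/inactive PGs
    ceph health detail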

[ceph-users] Re: External RGW always down

2022-09-09 Thread Monish Selvaraj
FYI. On Sat, Sep 10, 2022 at 11:23 AM Monish Selvaraj wrote: > Hi all, I have one critical issue in my prod cluster. When the customer's data comes in at 600 MiB, my OSDs go down, *8 to 20 out of 238*. Then I manually bring my OSDs up. After a few minutes, all my

[ceph-users] Re: External RGW always down

2022-09-09 Thread Monish Selvaraj
On Sat, Sep 10, 2022 at 11:25 AM Monish Selvaraj wrote: > FYI. On Sat, Sep 10, 2022 at 11:23 AM Monish Selvaraj wrote: >> Hi all, I have one critical issue in my prod cluster. When the customer's data comes in at 600 MiB

[ceph-users] Re: External RGW always down

2022-09-10 Thread Monish Selvaraj
Is the cluster a new installation with cephadm or an older cluster upgraded to Quincy? Quoting Monish Selvaraj: > Hi all, I have one critical issue in my prod cluster. When the customer's data comes in at 600 MiB.
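A short sketch of how the deployment type and running version could be checked (both commands exist in Pacific and Quincy; interpretation of the output is left to the operator):

    # Ceph versions per daemon type (also shows a mixed cluster mid-upgrade)
    ceph versions
    # whether the orchestrator backend (cephadm) is configured and responding
    ceph orch status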

[ceph-users] Re: External RGW always down

2022-09-26 Thread Monish Selvaraj
would RGW prevent from starting? I'm assuming that if you fix your OSDs the RGWs would start working again. But then again, we still don't know anything about the current situation. Quoting Monish Selvaraj: > Hi Eugen, Below is the log
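One possible way to pull the logs of a single RGW daemon on a cephadm cluster; the daemon name below is a placeholder and would come from the 'ceph orch ps' output:

    # list RGW daemons and note the exact daemon name
    ceph orch ps --daemon-type rgw
    # on the host running that daemon, dump its journal (placeholder name)
    cephadm logs --name rgw.rgw.host1.abcdef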

[ceph-users] Re: External RGW always down

2022-09-26 Thread Monish Selvaraj
> I don't know why it is happening. But maybe the RGWs are running on separate machines. Does this cause the issue? I don't know how that should > Quoting Monish Selvaraj: > Hi Eugen, Yes, I have inactive PGs when the OSDs go down.
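A brief sketch for confirming the inactive PGs (plain Ceph CLI, no assumptions about the pool layout):

    # health detail lists inactive/down PGs and the OSDs they are waiting for
    ceph health detail
    # PGs stuck in an inactive state
    ceph pg dump_stuck inactive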

[ceph-users] Re: External RGW always down

2022-09-27 Thread Monish Selvaraj
the mailing list archives for that; setting 'ceph osd set nodown' might help during the migration. But are the OSDs fully saturated ('iostat -xmt /dev/sd* 1')? If updating helps, just stay on that version and maybe report a tracker issue with your findings.
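A sketch of the suggestion above; 'nodown' only stops OSDs from being marked down and should be unset again once the migration or maintenance is done:

    # keep flapping OSDs from being marked down during the migration
    ceph osd set nodown
    # check whether the disks are saturated (extended stats, MB/s, timestamps, 1s interval)
    iostat -xmt /dev/sd* 1
    # remove the flag afterwards
    ceph osd unset nodown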

[ceph-users] Re: External RGW always down

2022-09-27 Thread Monish Selvaraj
to sustain the failure of three hosts without client impact, but if multiple OSDs across more hosts fail (holding PGs of the same pool(s)) you would have inactive PGs, as you already reported. Quoting Monish Selvaraj: > Hi Eugen, Thanks for
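A small sketch for checking how many host failures an EC pool actually tolerates; the pool and profile names are placeholders:

    # pools with their erasure-code profile, size and min_size
    ceph osd pool ls detail
    # k/m and crush-failure-domain of the profile (placeholder name)
    ceph osd erasure-code-profile get myprofile
    # PGs of the pool go inactive once fewer than min_size shards remain
    ceph osd pool get mypool min_size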

[ceph-users] Re: External RGW always down

2022-09-28 Thread Monish Selvaraj
> As I already said, it's possible that your inactive PGs prevent the RGWs from starting. You can turn on debug logs for the RGWs; maybe they reveal more. > Quoting Monish Selvaraj: > Hi Eugen, The OSDs fail because of RAM/CPU overload
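A possible way to raise the RGW debug level, assuming the RGWs read their options from the 'client.rgw' section of the config database; revert afterwards because the logs get very verbose:

    # increase RGW and messenger debug logging
    ceph config set client.rgw debug_rgw 20
    ceph config set client.rgw debug_ms 1
    # back to defaults once the crash has been captured
    ceph config rm client.rgw debug_rgw
    ceph config rm client.rgw debug_ms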

[ceph-users] Re: Increase the recovery throughput

2022-12-12 Thread Monish Selvaraj
to increase osd_recovery_max_active and osd_max_backfills. What are the current values in your cluster? Quoting Monish Selvaraj: > Hi, Our ceph cluster consists of 20 hosts and 240 OSDs. We used the erasure-coded pool
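A sketch of inspecting and raising the recovery/backfill limits; note that on Quincy's mclock scheduler these options may be ignored unless explicitly overridden:

    # current values
    ceph config get osd osd_max_backfills
    ceph config get osd osd_recovery_max_active
    # raise them cluster-wide for all OSDs
    ceph config set osd osd_max_backfills 2
    ceph config set osd osd_recovery_max_active 4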