Hi Mark,
I've been experimenting with the reweight on 3 of the OSDs (BTW, each OSD
is backed by an HDD, with one SSD holding all 4 journals on each host), and
the slower ones were given reweights of 0.5, 0.66 and 0.66.
From what I gathered the reweight would also reduce the number of I/O
directed at
On 08/06/2014 03:43 AM, Luis Periquito wrote:
Hi,
In the last few days I've had some issues with the radosgw in which all
requests would just stop being served.
After some investigation I suspect a single slow OSD. Whenever I
restarted that OSD, everything went back to working. Every
You can use the
ceph osd perf
command to get recent queue latency stats for all OSDs. With a bit
of sorting this should quickly tell you if any OSDs are going
significantly slower than the others.
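
The sorting step can be sketched in Python as follows. This is only a rough sketch: it assumes the plain-text output has one header line followed by three whitespace-separated integer columns (OSD id, commit latency in ms, apply latency in ms). The helper names `parse_osd_perf` and `slow_osds`, the factor of 3 over the median, and the `SAMPLE` figures are all made up for illustration and are not taken from a real cluster.

```python
import statistics


def parse_osd_perf(text):
    """Parse plain-text `ceph osd perf` output into (osd_id, commit_ms, apply_ms) tuples.

    Assumes three whitespace-separated integer columns per OSD; any other
    line (such as the header) is skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not all(f.isdigit() for f in fields):
            continue
        osd_id, commit_ms, apply_ms = (int(f) for f in fields)
        rows.append((osd_id, commit_ms, apply_ms))
    return rows


def slow_osds(rows, factor=3.0):
    """Return rows whose apply latency exceeds `factor` times the median, slowest first."""
    if not rows:
        return []
    median = statistics.median(r[2] for r in rows)
    # Floor the median at 1 ms so an idle cluster does not flag every OSD.
    threshold = factor * max(median, 1)
    slow = [r for r in rows if r[2] > threshold]
    return sorted(slow, key=lambda r: r[2], reverse=True)


# Hypothetical sample in the assumed column layout.
SAMPLE = """\
osd fs_commit_latency(ms) fs_apply_latency(ms)
  0                     2                    5
  1                     3                    6
  2                    40                  310
  3                     2                    4
"""

for osd_id, commit_ms, apply_ms in slow_osds(parse_osd_perf(SAMPLE)):
    print(f"osd.{osd_id}: commit={commit_ms}ms apply={apply_ms}ms")
```

To run it against a live cluster, the captured output of `ceph osd perf` would be passed to `parse_osd_perf` in place of `SAMPLE`. Comparing against the median rather than a fixed number of milliseconds is a deliberate choice here, since "slow" is relative to the other OSDs in the same cluster.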
We'd like to automate this in calamari or perhaps even in the monitor, but
it is not immediate
Hi Wido,
As the backing disk is running a deep scrub, it's constantly 100% busy; no
errors though...
I'm running everything on XFS.
I had a similar feeling that it was the OSD slowing down those requests.
Which pool would be affected? ".rgw"?
Thanks,
On 6 August 2014 10:08, Wido den Hollander
On 08/06/2014 10:43 AM, Luis Periquito wrote:
Hi,
In the last few days I've had some issues with the radosgw in which all
requests would just stop being served.
After some investigation I suspect a single slow OSD. Whenever I
restarted that OSD, everything went back to working. Every
Hi,
In the last few days I've had some issues with the radosgw in which all
requests would just stop being served.
After some investigation I suspect a single slow OSD. Whenever I restarted
that OSD, everything went back to working. Every single time there
was a deep scrub running on th