Hello,
We've got some Intel DC S3610s 800GB in operation on cache tiers.
On the ones with G2010150 firmware we've seen _very_ infrequent SATA bus
resets [1], on the order of once per year, and these are fairly busy
critters averaging 400 IOPS with peaks much higher than that.
Funnily eno
Hi,
On 22.03.19 17:14, Robert Sander wrote:
> we created a bunch of new OSDs on three new nodes this morning. Now,
> after roughly 6 hours they still are empty and the cluster did not
> rebalance any objects resp. placement groups to them.
A new data point in this story emerged today as we had t
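(The reply above is cut off. Not from the thread itself, but a common first
check when new OSDs stay empty is to confirm they actually received a CRUSH
weight and landed under the expected host buckets -- host and OSD names here
are just placeholders:)
  ceph osd tree   # new OSDs should show a non-zero weight under their host
  ceph osd df     # per-OSD utilisation; weight-0 OSDs will never receive PGs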
Hi Rhian,
Not sure if you found an answer already.
I believe that in Luminous and Mimic it is only possible to extract compression
information at the OSD device level. According to the recent announcement of
Nautilus, this seems to get better in the future.
If you want to check if anything is compressed
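(The message above is truncated. As a sketch of what checking "on OSD device
level" can look like -- this is my assumption, not quoted from the thread --
the per-OSD BlueStore perf counters show how much data actually got
compressed; osd.0 is just an example:)
  ceph daemon osd.0 perf dump | grep -E 'bluestore_compressed'
  # non-zero bluestore_compressed, bluestore_compressed_allocated and
  # bluestore_compressed_original mean compression is taking effect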
I am now trying to run tests to see how the mclock_client queue works on Mimic.
But when I tried to configure the (r, w, l) tag for each client, I found there
are no options to distinguish different clients.
All I got are following options for mclock_opclass, which are used to
distinguish different types of o
On Fri, Mar 22, 2019 at 8:38 AM Vikas Rana wrote:
>
> Hi Jason,
>
> Thank you for your help and support.
>
>
> One last question: after the demotion and promotion, when you do a resync
> again, does it copy the whole image again or send just the changes since
> the last journal update?
R
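(The answer above is cut off. For reference, and as an assumption on my part
rather than part of the quoted reply, the resync itself is requested on the
cluster whose copy should be discarded and rebuilt from the current primary --
pool and image names are made up:)
  rbd mirror image resync mypool/myimage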
Thanks Brad! I completely forgot about that trick! I copied the output and
modified the command as suggested and the monitor came up. So at least that
does work, now I just need to figure out why the normal service setup is
borked. I was quite concerned that it wouldn’t come back at all and
So I do not think mclock_client queue works the way you’re hoping it does. For
categorization purposes it joins the operation class and the client identifier,
with the intent that operations will be executed more evenly across clients
(i.e., it won’t favor one client over another).
However, it w
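(Continuing the point above with a sketch that is not from the original mail:
in Mimic the mclock tunables are still defined per operation class, so a
ceph.conf fragment like the one below -- option names as I remember them from
the Mimic docs, values arbitrary -- sets the reservation/weight/limit tag for
the whole "client" class, with no per-client-ID knobs:)
  [osd]
  osd op queue = mclock_client
  osd op queue mclock client op res = 100
  osd op queue mclock client op wgt = 500
  osd op queue mclock client op lim = 0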
We did an upgrade from Luminous to Nautilus.
After upgrading the three monitors we found that all our PGs were inactive:
  cluster:
    id:     5bafad08-31b2-4716-be77-07ad2e2647eb
    health: HEALTH_ERR
            noout flag(s) set
            1 scrub errors
            Reduced data availability:
Have you upgraded any OSDs?
On a test cluster I saw the same, and as I upgraded / restarted the OSDs
the PGs started to show online till it was 100%.
I know it says not to change anything to do with pools during the upgrade,
so I am guessing there is a code change that causes this till all is on
On 26-3-2019 16:39, Ashley Merrick wrote:
> Have you upgraded any OSDs?
No, we didn't go through with the OSDs.
> On a test cluster I saw the same, and as I upgraded / restarted the
> OSDs the PGs started to show online till it was 100%.
> I know it says not to change anything to do with pools du
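(Not from the thread, but a quick way to see which daemons are still on the
old release while an upgrade like this is in flight:)
  ceph versions   # lists mon/mgr/osd/mds counts per running ceph version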
Hi
[root@ceph1 ~]# ceph version
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
We've run into a "No space left on device" issue when trying to delete a
file, despite there being free space:
[root@ceph1 ~]# ceph df
GLOBAL:
SIZE  AVAIL  RAW USED  %R
See http://tracker.ceph.com/issues/38849
As an immediate workaround you can increase `mds bal fragment size
max` to 200000 (which will increase the max number of strays to 2
million).
(Try injecting that option to the mds's -- I think it is read at runtime).
And you don't need to stop the mds's a
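(Also not part of the message above: to confirm strays really are the limit
being hit, the MDS perf counters can be checked -- mds.ceph1 is just the name
that appears later in this thread:)
  ceph daemon mds.ceph1 perf dump mds_cache | grep num_strays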
Hi Dan
Thanks!
ceph tell mds.ceph1 config set mds_bal_fragment_size_max 200000
got us running again.
Cheers
toby
On 26/03/2019 16:56, Dan van der Ster wrote:
> See http://tracker.ceph.com/issues/38849
>
> As an immediate workaround you can increase `mds bal fragment size
> max` to 20 (w
More or less followed the install instructions with modifications as
needed; but I'm suspecting that either a dependency was missed in the
F29 package or something else is up. I don't see anything obvious; any
ideas?
When I try to start setting up my first node I get the following:
[root@odin
CEPH 12.2.11, pool size 3, min_size 2.
One node went down today (the private network interface started flapping, and
after a while the OSD processes crashed). No big deal, the cluster recovered,
but not completely: 1 PG is stuck in the active+clean+remapped state.
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACE
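(The output above is cut off. As a generic next step, not taken from the
thread, the stuck PG can be queried to see its up/acting sets and why CRUSH
cannot place the remapped copy -- 1.2f below is a placeholder PG id:)
  ceph pg map 1.2f     # current up and acting OSD sets for the PG
  ceph pg 1.2f query   # full peering/recovery state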
Hi,
I have an RBD in a cache tier setup which I need to extend. The question is: do
I resize it through the cache pool or directly on the slow/storage pool? Or
doesn't that matter at all?
Thanks for feedback and regards, Götz
When using cache pools (which are essentially deprecated functionality
BTW), you should always reference the base tier pool. The fact that a
cache tier sits in front of a slower base tier is transparently
handled.
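(A minimal sketch of what that looks like in practice, with made-up pool and
image names: the resize is issued against the base/storage pool, and the cache
tier in front of it follows along transparently.)
  rbd resize --size 2T rbd-storage/myimage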
On Tue, Mar 26, 2019 at 5:41 PM Götz Reinicke
wrote:
>
> Hi,
>
> I have a rbd in a
http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
Did you try repairing the pg?
On Tue, Mar 26, 2019 at 9:08 AM solarflow99 wrote:
>
> yes, I know it's old. I intend to have it replaced but that's a few months
> away and was hoping to get past this. The other OSDs appe
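(Not quoted from the thread: the repair suggested above is issued per PG, so
1.2f below is a placeholder for whatever PG id ceph health detail reports as
inconsistent:)
  ceph health detail
  ceph pg repair 1.2f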
https://bugzilla.redhat.com/show_bug.cgi?id=1662496
On Wed, Mar 27, 2019 at 5:00 AM Andrew J. Hutton
wrote:
>
> More or less followed the install instructions with modifications as
> needed; but I'm suspecting that either a dependency was missed in the
> F29 package or something else is up. I don