My understanding is that when using direct=1 on a raw block device, FIO (that is, you) has to
handle all of the sector alignment itself, or the request will get buffered to perform the
alignment.
Try adding the --blockalign=512b option to your jobs, or better yet just use the
native FIO RBD engine.
Someth
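For reference, a minimal job file along those lines might look roughly like this (two alternative
jobs shown for comparison; the device path, pool and image names are just placeholders, and the
rbd job assumes fio was built with RBD support):

    [global]
    direct=1
    bs=4k
    rw=randwrite
    runtime=60

    [raw-krbd]
    ioengine=libaio
    blockalign=512
    filename=/dev/rbd0

    [native-rbd]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=fio-test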
Thanks for the explanation. I guess this case you outlined explains why the
Ceph developers chose to make this a ‘safe’ default.
Two OSDs are transiently down and the third fails hard. The PGs on the third OSD
with no remaining replicas are marked unfound. You bring OSDs 1 and 2 back up and these PGs
will remai
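In case it helps anyone following the thread, that state can be inspected with something like
the following (PG id is hypothetical, command names from memory):

    ceph health detail
    ceph pg 2.1f query            # hypothetical PG id
    ceph pg 2.1f list_unfound     # lists objects with no readable copy left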
Hi Wido,
Just curious: how does blocking IO to the final replica provide protection from data
loss? I’ve never really understood why this is a Ceph best practice. In my
head all 3 replicas would be on devices that have roughly the same odds of
physically failing or getting logically corrupted in a
Check your MTU. I think OSPF has issues when fragmenting. Try setting your
interface MTU to something obnoxiously small to ensure that anything upstream
isn't fragmenting - say 1200. If that works, try a saner value like 1496, which
accounts for any VLAN headers.
If you're running in a spine/leaf
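A rough sketch of that test on Linux (interface name and peer address are placeholders):

    ip link set dev eth0 mtu 1200
    # ping with don't-fragment set; 1172 bytes of payload + 28 bytes of headers = 1200
    ping -M do -s 1172 10.0.0.2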
From my experience noin doesn't stop new OSDs from being marked in. noin only
works on OSDs already in the crushmap. To accomplish the behavior you want
I've injected "mon osd auto mark new in = false" into the MONs. This also seems to
set their OSD weight to 0 when they are created.
> On Nov
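Roughly what I mean by injecting it (syntax from memory; keep it in ceph.conf under [mon] as
well so it survives restarts):

    ceph tell mon.* injectargs '--mon-osd-auto-mark-new-in=false'

    # ceph.conf
    [mon]
    mon osd auto mark new in = false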
Hi Vincent,
When I did a similar upgrade I found that having mixed-version OSDs caused
issues much like yours. My advice to you is to power through the upgrade as
fast as possible. I'm pretty sure this is related to an issue/bug discussed here
previously around excessive load on the monitors in mix
Start with a rolling restart of just the OSDs one system at a time, checking
the status after each restart.
On Nov 1, 2016, at 6:20 PM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
thanks for the suggestion.
is a rolling reboot sufficient? or must all OSDs be down at the same time?
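To be concrete about the rolling restart suggested above: something like this on each OSD host
in turn, assuming systemd-managed OSDs (unit names vary by release and init system):

    systemctl restart ceph-osd.target
    # wait for recovery to settle / HEALTH_OK before moving to the next host
    ceph -s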
Strangely enough, I’m also seeing similar user issues - an unusually high volume
of corrupt instance boot disks.
At this point I’m attributing it to the fact that our Ceph cluster is patched 9
months ahead of our Red Hat OSP Kilo environment. However, that’s a total guess
at this point...
From:
Just out of curiosity, did you recently upgrade to Jewel?
From: ceph-users on behalf of
"keynes_...@wistron.com"
Date: Tuesday, October 25, 2016 at 10:52 PM
To: "ceph-users@lists.ceph.com"
Subject: [EXTERNAL] [ceph-users] Instance filesystem corrupt
We are using OpenStack + Ceph.
Recently we
Because you do not have segregated networks, the cluster traffic is most likely
drowning out the FIO user traffic. This is especially exacerbated by the fact
that there is only a 1Gb link between the cluster nodes.
If you are planning on using this cluster for anything other than testing,
you’ll
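For reference, segregating the traffic is done with the cluster network option in ceph.conf,
roughly like this (subnets are placeholders):

    [global]
    public network  = 192.168.1.0/24
    cluster network = 10.10.10.0/24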
What does your network setup look like? Do you have a separate cluster network?
Can you explain how you are performing the FIO test? Are you mounting a volume
through krbd and testing that from a different server?
On Oct 5, 2016, at 3:11 AM, Mario Rodríguez Molins <mariorodrig...@tuenti.
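For example, the kind of krbd-based test I was asking about would look roughly like this
(pool/image names, size and fio parameters are just placeholders):

    rbd create rbd/fio-test --size 10240
    rbd map rbd/fio-test                  # maps to e.g. /dev/rbd0
    fio --name=krbd-test --filename=/dev/rbd0 --ioengine=libaio \
        --direct=1 --rw=randwrite --bs=4k --iodepth=32 --runtime=60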
Just went through this upgrading a ~400 OSD cluster. I was in the EXACT spot
you were in. The faster you can get all OSDs to the same version as the MONs,
the better. We decided to power forward and the performance got better for
every OSD node we patched.
Additionally, I discovered your Le
Sorry, make that 'ceph tell osd.* version'
> On Sep 19, 2016, at 2:55 PM, WRIGHT, JON R (JON R)
> wrote:
>
> When you say client, we're actually doing everything through Openstack vms
> and cinder block devices.
>
> librbd and librados are:
>
> /usr/lib/librbd.so.1.0.0
>
> /usr/lib/librados.
Do you still have OSDs that aren't upgraded?
What does a 'ceph tell osd.* show' ?
> On Sep 19, 2016, at 2:55 PM, WRIGHT, JON R (JON R)
> wrote:
>
> When you say client, we're actually doing everything through Openstack vms
> and cinder block devices.
>
> librbd and librados are:
>
> /usr/li
How many PGs do you have - and how many are you increasing it to?
Increasing PG counts can be disruptive if you are increasing by a large
proportion of the initial count, because of all the PG peering involved. If you
are doubling the number of PGs it might be good to do it in stages to minimize
p
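i.e. something along these lines, waiting for things to settle between steps (pool name and
counts are only an example):

    ceph osd pool set mypool pg_num 1024
    # wait for the new PGs to be created and peer, then:
    ceph osd pool set mypool pgp_num 1024
    # repeat in further steps until you reach the final target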
I'm working with some teams who would like not only to create ACLs within
RADOSGW at a tenant level, but also to tailor ACLs to users within that
tenant. After trial and error, I can only seem to get ACLs to stick at a
tenant level using the Keystone tenant ID UUID.
Is this expected beh
Bah, what Waldo said. Forgot the MONs don’t use the cluster net. Do what
he said and you’ll be fine.
On 8/17/15, 8:41 PM, "Will.Boege" wrote:
>Thinking this through, pretty sure you would need to take your cluster
>offline to do this. I can't think of a scenario where you could reliably
>keep quo
Thinking this through, pretty sure you would need to take your cluster
offline to do this. I can't think of a scenario where you could reliably
keep quorum as you swap your monitors to use the cluster network.
On 8/10/15, 8:59 AM, "Daniel Marks" wrote:
>Hi all,
>
>we just found out that our cep
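For what it's worth, the only offline procedure I can think of is rewriting the monmap while
the monitors are stopped, roughly as follows (monitor IDs and addresses are placeholders, and
I'd test this in a lab first; ceph.conf mon host entries need updating to match):

    ceph mon getmap -o /tmp/monmap            # while the cluster is still up
    # stop all monitors, then for each one:
    monmaptool --rm mon-a /tmp/monmap
    monmaptool --add mon-a 10.10.10.1:6789 /tmp/monmap
    ceph-mon -i mon-a --inject-monmap /tmp/monmap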
In my experience I have seen something like this happen twice - the first
time there were unclean PGs because Ceph was down to one replica of a PG.
When that happens, Ceph blocks IO to the remaining replicas when the number
falls below the 'min_size' parameter. That will manifest as blocked ops.
Second
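For reference, those settings can be checked and changed per pool (pool name is a placeholder,
and dropping min_size to 1 trades safety for availability):

    ceph osd pool get rbd size
    ceph osd pool get rbd min_size
    ceph osd pool set rbd min_size 1    # only as a temporary measure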
Does the ceph health detail show anything about stale or unclean PGs, or
are you just getting the blocked ops messages?
On 7/13/15, 5:38 PM, "Deneau, Tom" wrote:
>I have a cluster where over the weekend something happened and successive
>calls to ceph health detail show things like below.
>What
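A few commands that are useful for narrowing down blocked ops like the ones quoted above
(OSD id is hypothetical; the daemon command has to run on the host that carries that OSD):

    ceph health detail
    ceph pg dump_stuck unclean
    ceph daemon osd.3 dump_ops_in_flight    # hypothetical OSD id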