Hi,
I have some mysterious deep-scrub errors and can't find the reason.
The system contains 5 nodes with 20 OSDs in total and everything works
fine, except for these scrubbing errors.
Sometimes the deep-scrub finds inconsistencies, but it's not exactly clear
why. The content of the objects is exactly
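For anyone digging into similar scrub errors, the usual first steps look roughly like this (a sketch; the PG and OSD ids below are only placeholders):
ceph health detail                                 # lists the PGs marked inconsistent after deep-scrub
ceph pg 2.5 query                                  # placeholder PG id; shows which OSDs hold the PG
grep 'deep-scrub' /var/log/ceph/ceph-osd.3.log     # placeholder OSD id; the primary's log records what mismatched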
On 11/23/2015 07:19 PM, Gregory Farnum wrote:
> On Fri, Nov 20, 2015 at 11:33 AM, Simon Engelsman wrote:
>> [cut]
> In addition to what Robert said, it sounds like you've done something
> strange with your CRUSH map. Do you have separate trees for the SSDs
> and hard drives, or are they both und
Hi!
>I think the only time we've seen this was when there was some kind of
>XFS corruption that accidentally extended the size of the file on
>disk, and the object info was correct with its shorter size. But
>perhaps not, in which case I've no idea how this could have happened.
We use ext4 filesy
Hello Robert,
Sorry for the late answer.
Thanks for your reply. I updated to infernalis and I applied all your
recommendations but it doesn't change anything, with or without cache
tiering :-/
I also compared XFS to EXT4 and BTRFS, but it doesn't make a
difference.
The fio command from Seba
Hello Hugo,
Yes, you're right. With Sebastien Han's fio command I managed to see that my
disks can in fact handle 100K IOPS, so the theoretical value is then: 2
x 2 x 100 / 2 = 200K.
I put the journal on the SSDSC2BX016T4R, which is supposed to double
my IOPS, but that's not the case.
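For reference, the fio journal test from Sebastien Han's blog post is usually run along these lines (a sketch; the device name and runtime are placeholders and the flags may differ slightly from the original post):
fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test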
Rémi
On 24/11/2015 21:48, Gregory Farnum wrote:
>
> Yeah, this is the old "two copies in one rack, a third copy elsewhere"
> replication scheme that lots of stuff likes but CRUSH doesn't really
> support. Assuming new enough clients and servers (some of the older
> ones barf when you do this), you ca
On 25/11/2015 14:37, Emmanuel Lacour wrote:
On 24/11/2015 21:48, Gregory Farnum wrote:
Yeah, this is the old "two copies in one rack, a third copy elsewhere"
replication scheme that lots of stuff likes but CRUSH doesn't really
support. Assuming new enough clients and servers (some of the
On 25/11/2015 14:47, Loris Cuoghi wrote:
>
> Take the default root -> root
> Take two racks -> 2 racks
> For each rack, pick two hosts. -> 4 hosts
> Now, pick a leaf in each host: that would be 4 OSDs, but we are
> cutting short to 3 (see min_size/max_size 3) -> 3 OSDs
>
of course, thx for thi
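For reference, a CRUSH rule implementing the placement walked through above might look roughly like this (a sketch, not tested; the rule name and ruleset number are illustrative):
rule replicated_two_racks {
        ruleset 1
        type replicated
        min_size 3
        max_size 3
        step take default
        step choose firstn 2 type rack
        step chooseleaf firstn 2 type host
        step emit
}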
On Wed, Nov 25, 2015 at 12:30 AM, Mike Miller wrote:
> Hi Greg,
>
> thanks very much. This is clear to me now.
>
> As for 'MDS cluster', I thought that this was not recommended at this stage?
> I would very much like to have more than one MDS in my cluster as this
> would probably help very much
Hi all,
I am reinstalling our test cluster and run into problems when running
ceph-deploy mon create-initial
It fails stating:
[ceph_deploy.gatherkeys][WARNIN] Unable to find
/var/lib/ceph/bootstrap-osd/ceph.keyring on ceph1
[ceph_deploy][ERROR ] KeyNotFoundError: Could not find keyring file:
Hi,
While discussing some design questions we came across the question of failover
in Ceph's network configuration.
If I just have a public network, all traffic crosses that LAN.
With separate public and cluster networks I can split the traffic and get some
benefits.
What if one of the networks fails? e.g
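For context, the split being discussed is configured in ceph.conf along these lines (the subnets are placeholders):
[global]
public network = 192.168.1.0/24
cluster network = 192.168.2.0/24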
On Wed, Nov 25, 2015 at 8:37 AM, Götz Reinicke - IT Koordinator
wrote:
> Hi,
>
> discussing some design questions we came across the failover possibility
> of cephs network configuration.
>
> If I just have a public network, all traffic is crossing that lan.
>
> With public and cluster network I c
Hi all,
Well, after repeating the procedure a few times I once ran ceph-deploy
forgetkeys and voila, that did it.
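For anyone hitting the same gatherkeys error, the fix described here amounted to something like (a sketch based on the description above; hostnames omitted):
ceph-deploy forgetkeys
ceph-deploy mon create-initial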
Sorry for the noise,
--
J.Hofmüller
A literary masterpiece is just a dictionary out of order.
- Jean Cocteau
Hi,
Currently we have OK, WARN and ERR as states for a Ceph cluster.
Now, it could happen that while a Ceph cluster is in WARN state certain
PGs are not available due to being in peering or any non-active+? state.
When monitoring a Ceph cluster you usually want to see OK and not worry
when a clu
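As a stop-gap for that kind of monitoring, the PGs that are not active can already be listed explicitly (a sketch; exact output varies by release):
ceph pg dump_stuck inactive
ceph pg dump_stuck unclean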
On Wed, 25 Nov 2015, Nick Fisk wrote:
> Presentation from the performance meeting.
>
> I seem to be unable to post to Ceph-devel, so can someone please repost
> there if useful.
Copying ceph-devel. The problem is just that your email is
HTML-formatted. If you send it in plaintext vger won't rej
Since the copy that is different is not on your primary for the PG,
pg repair is safe.
-
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, Nov 25, 2015 at 2:42 AM, Major Csaba wrote:
> Hi,
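For reference, the repair itself is a one-liner once the inconsistent PG is known (the PG id below is a placeholder):
ceph health detail | grep inconsistent
ceph pg repair 3.f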
I'm really surprised that you are getting 100K IOPS from the Intel
S3610s. We are already in the process of ordering some to test
alongside other drives, so I should be able to verify that as well. With
the S3700 and S3500, I was able to only get 20
Hi Sage
> -----Original Message-----
> From: Sage Weil [mailto:s...@newdream.net]
> Sent: 25 November 2015 17:38
> To: Nick Fisk
> Cc: 'ceph-users' ; ceph-de...@vger.kernel.org;
> 'Mark Nelson'
> Subject: Re: Cache Tiering Investigation and Potential Patch
>
> On Wed, 25 Nov 2015, Nick Fisk wro
On Wed, 25 Nov 2015, Nick Fisk wrote:
> Hi Sage
>
> > -----Original Message-----
> > From: Sage Weil [mailto:s...@newdream.net]
> > Sent: 25 November 2015 17:38
> > To: Nick Fisk
> > Cc: 'ceph-users' ; ceph-de...@vger.kernel.org;
> > 'Mark Nelson'
> > Subject: Re: Cache Tiering Investigation and
> -----Original Message-----
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Sage Weil
> Sent: 25 November 2015 19:41
> To: Nick Fisk
> Cc: 'ceph-users' ; ceph-de...@vger.kernel.org;
> 'Mark Nelson'
> Subject: RE: Cache Tiering Investigation and
I think if it is not too much, we would like N out of M.
I don't know specifically about only building the one package, but I
build locally with make to shake out any syntax bugs, then I run
make-debs.sh which takes about 10-15 minutes to build to i
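The local build step described above is roughly the following (paths and job count are illustrative):
cd ceph                # a checkout of the Ceph source tree
make -j4               # shake out syntax/compile errors first
./make-debs.sh         # then build the Debian packages (10-15 minutes here)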
On Wed, 25 Nov 2015, Nick Fisk wrote:
> > > Yes I think that should definitely be an improvement. I can't quite
> > > get my head around how it will perform in instances where you miss 1
> > > hitset but all others are a hit. Like this:
> > >
> > > H H H M H H H H H H H H
> > >
> > > And recency is
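For context, the recency being discussed maps to a pool-level setting that can be changed online (the pool name and value below are illustrative):
ceph osd pool set hot-pool min_read_recency_for_promote 2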
To ease the load on clients you can change osd_backfill_scan_min and
osd_backfill_scan_max to 1. It's possible to change this online:
ceph tell osd.\* injectargs '--osd_backfill_scan_min 1'
ceph tell osd.\* injectargs '--osd_backfill_scan_max 1'
2015-11-24 16:52 GMT+01:00 Joe Ryner :
> Hi,
>
> Last night
On Wed, Nov 25, 2015 at 11:09 AM, Wido den Hollander wrote:
> Hi,
>
> Currently we have OK, WARN and ERR as states for a Ceph cluster.
>
> Now, it could happen that while a Ceph cluster is in WARN state certain
> PGs are not available due to being in peering or any non-active+? state.
>
> When mon
I don't have any comment on Greg's specific concerns, but I agree that,
conceptually, distinguishing between states that are likely to resolve
themselves and ones that require intervention would be a nice addition.
QH
On Wed, Nov 25, 2015 at 2:46 PM, Gregory Farnum wrote:
> On Wed, Nov 25,
Hi,
After our trouble with the ext4/xattr soft lockup kernel bug we started
moving some of our OSDs to XFS; we're using the Ubuntu 14.04 3.19 kernel
and Ceph 0.94.5.
We have two out of 28 rotational OSDs running XFS, and
they both get restarted regularly because they're terminating with
"ENOSPC":
2015-11-2
Posting again as it seems the attachment was too large.
Uploaded to DocDroid, thanks to Stephen for the pointer.
http://docdro.id/QMHXDPl
From: Nick Fisk [mailto:n...@fisk.me.uk]
Sent: 25 November 2015 17:07
To: 'ceph-users'
Cc: 'Sage Weil' ; 'Mark Nelson'
Subject: Cache Tiering Investigation an
I don't think this does what you think it does.
This will almost certainly starve the client of IO. This is the number
of seconds between backfills, not the number of objects being scanned
during a backfill. Setting these to higher values will m
Hi, colleagues!
I have a small 4-node Ceph cluster (0.94.2); all pools have size 3, min_size 1.
Last night one host failed and the cluster was unable to rebalance, saying
there are a lot of undersized PGs.
root@slpeah002:[~]:# ceph -s
cluster 78eef61a-3e9c-447c-a3ec-ce84c617d728
health HEALTH_W
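The usual first checks in that situation are along these lines (output is cluster-specific):
ceph osd tree                           # confirm which OSDs/host went down
ceph health detail | grep undersized    # list the undersized PGs and their acting sets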
On 11/24/2015 08:48 PM, Somnath Roy wrote:
> Hi Yehuda/RGW experts,
>
> I have one cluster with RGW up and running in the customer site.
>
> I did some heavy performance testing on that with CosBench and as a
> result wrote a significant amount of data to showcase performance on it.
>
> Over t
On 11/25/2015 10:46 PM, Gregory Farnum wrote:
> On Wed, Nov 25, 2015 at 11:09 AM, Wido den Hollander wrote:
>> Hi,
>>
>> Currently we have OK, WARN and ERR as states for a Ceph cluster.
>>
>> Now, it could happen that while a Ceph cluster is in WARN state certain
>> PGs are not available due to be
Hi,
can someone please help me with this error?
$ ceph tell mds.0 version
Error EPERM: problem getting command descriptions from mds.0
Tell is not working for me on the MDS.
Version: infernalis - trusty
Thanks and regards,
Mike
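One thing worth checking (a guess, not a confirmed fix) is whether the key the command runs with carries MDS caps, e.g.:
ceph auth get client.admin   # look for an 'mds' capability such as 'allow' or 'allow *'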
Hi!
>After our trouble with ext4/xattr soft lockup kernel bug we started
>moving some of our OSD to XFS, we're using ubuntu 14.04 3.19 kernel
>and ceph 0.94.5.
It was a rather serious bug, but there is a small patch at kernel.org:
https://bugzilla.kernel.org/show_bug.cgi?id=107301
https://bugzill