This is more or less the same behaviour I have in my environment.
By any chance, is anyone running their OSDs and their hypervisors on the
same machine?
And could a high workload, like starting 40 - 60 or more virtual machines,
have an effect on this problem?
On Thursday, 27 October 2016, wrote:
It is not more than a three-line script. You will also need leveldb's
code (the Python leveldb module) in your working directory:
```
#!/usr/bin/python2
# Repair a corrupted LevelDB store (here, the OSD's ./omap directory)
# using LevelDB's built-in repair routine.
import leveldb

leveldb.RepairDB('./omap')
```
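For the cautious, a minimal sketch that copies the omap directory aside before attempting the repair; the ./omap and ./omap.backup paths here are just illustrative:
```
#!/usr/bin/python2
# Sketch: keep an untouched copy of the LevelDB directory before repairing it.
import shutil

import leveldb

SRC = './omap'            # LevelDB directory to repair (illustrative path)
BACKUP = './omap.backup'  # safety copy, in case the repair makes things worse

shutil.copytree(SRC, BACKUP)  # preserve the original store as-is
leveldb.RepairDB(SRC)         # rebuild LevelDB's internal structures in place
```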
I totally agree that we need more repair tools to be officially
available, and also tools that provide better insight into compo
Strangely enough, I'm also seeing similar user issues: an unusually high volume
of corrupt instance boot disks.
At this point I'm attributing it to the fact that our Ceph cluster is patched 9
months ahead of our Red Hat OSP Kilo environment. However, that's a total guess
at this point...
Hello,
On Wed, 26 Oct 2016 15:40:00 + Ashley Merrick wrote:
> Hello All,
>
> Currently running a CEPH cluster connected to KVM via the KRBD and used only
> for this purpose.
>
> Is working perfectly fine, however would like to look at increasing / helping
> with random write performance
Most of the filesystem corruption caused instances to crash; we saw that after a
shutdown / restart
( triggered by the OpenStack portal buttons or by OS commands in the
instances ).
Some cases were detected early: we saw filesystem errors in the OS logs on the instances.
Then we ran a filesystem check ( FSCK / chk
Hmm, seems we have that in common.
We use
rbd snap create to make snapshots of instance volumes, and the
rbd export and rbd export-diff commands to make daily backups.
We currently have 29 instances and 33 volumes.
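For reference, a minimal sketch of how that daily snapshot / export-diff cycle could be scripted around the rbd commands above; the pool name, image name, snapshot naming scheme and backup directory are assumptions for illustration only:
```
#!/usr/bin/python2
# Hypothetical daily backup driver around the rbd commands mentioned above.
import datetime
import subprocess

POOL = 'volumes'        # assumed pool name
IMAGE = 'volume-0001'   # assumed image name
BACKUP_DIR = '/backup'  # assumed destination for exported diffs

today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)
snap_today = 'daily-%s' % today.strftime('%Y%m%d')
snap_yesterday = 'daily-%s' % yesterday.strftime('%Y%m%d')
spec = '%s/%s@%s' % (POOL, IMAGE, snap_today)

# 1) Take today's snapshot of the volume.
subprocess.check_call(['rbd', 'snap', 'create', spec])

# 2) Export only the changes since yesterday's snapshot (incremental backup).
#    The very first run would use a full `rbd export` instead.
subprocess.check_call(['rbd', 'export-diff', '--from-snap', snap_yesterday,
                       spec, '%s/%s-%s.diff' % (BACKUP_DIR, IMAGE, snap_today)])
```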
Keynes Lee 李俊賢
> Collectd and graphite look really nice.
Also look into Grafana, and of course RHSC.
- On 26 Oct 2016 at 21:25, Haomai Wang hao...@xsky.com wrote:
> On Thu, Oct 27, 2016 at 2:10 AM, Trygve Vea
> wrote:
>> - On 26 Oct 2016 at 16:37, Sage Weil s...@newdream.net wrote:
>>> On Wed, 26 Oct 2016, Trygve Vea wrote:
- On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
>>>
On Thu, Oct 27, 2016 at 2:10 AM, Trygve Vea
wrote:
> - On 26 Oct 2016 at 16:37, Sage Weil s...@newdream.net wrote:
>> On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
>>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> >> Hi,
>>> >>
>>> >> We
Just before your response, I decided to take the chance of restarting the
primary osd for the pg (153).
At this point, the MDS trimming error is gone and I'm in a warning state
now. The pg has moved from peering+remapped
to active+degraded+remapped+backfilling.
I'd say we're probably nearly back
> On 26 October 2016 at 20:44, Brady Deetz wrote:
>
>
> Summary:
> This is a production CephFS cluster. I had an OSD node crash. The cluster
> rebalanced successfully. I brought the down node back online. Everything
> has rebalanced except 1 hung pg and MDS trimming is now behind. No hardware
Summary:
This is a production CephFS cluster. I had an OSD node crash. The cluster
rebalanced successfully. I brought the down node back online. Everything
has rebalanced except 1 hung pg and MDS trimming is now behind. No hardware
failures have become apparent yet.
Questions:
1) Is there a way to
- On 26 Oct 2016 at 16:37, Sage Weil s...@newdream.net wrote:
> On Wed, 26 Oct 2016, Trygve Vea wrote:
>> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>> >> Hi,
>> >>
>> >> We have two Ceph-clusters, one exposing pools both for RGW and
Hello All,
Currently running a CEPH cluster connected to KVM via the KRBD and used only
for this purpose.
It is working perfectly fine; however, I would like to look at increasing / helping
with random write performance and latency, especially from multiple VMs hitting
the spinning disks at the same tim
On Wed, Oct 26, 2016 at 9:57 PM, Trygve Vea
wrote:
> - On 26 Oct 2016 at 15:36, Haomai Wang hao...@xsky.com wrote:
>> On Wed, Oct 26, 2016 at 9:09 PM, Trygve Vea
>> wrote:
>>>
>>> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
>>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> >> H
On Wed, 26 Oct 2016, Trygve Vea wrote:
> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
> > On Wed, 26 Oct 2016, Trygve Vea wrote:
> >> Hi,
> >>
> >> We have two Ceph-clusters, one exposing pools both for RGW and RBD
> >> (OpenStack/KVM) pools - and one only for RBD.
> >>
> >> Aft
> On 26 October 2016 at 15:51, J David wrote:
>
>
> On Wed, Oct 26, 2016 at 8:55 AM, Andreas Davour wrote:
> > If there are 1 MON in B, that cluster will have quorum within itself and
> > keep running, and in A the MON cluster will vote and reach quorum again.
>
> Quorum requires a majority
Actually I have the same problem when starting an instance backed by
librbd.
But this only happens when trying to start 60+ instances.
But I decided that this is due to the fact that we are using old hardware
that is not able to respond to the high demand.
Could that be the same issue that you are fa
- On 26 Oct 2016 at 15:36, Haomai Wang hao...@xsky.com wrote:
> On Wed, Oct 26, 2016 at 9:09 PM, Trygve Vea
> wrote:
>>
>> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>> >> Hi,
>> >>
>> >> We have two Ceph-clusters, one exposing pools
On Wed, Oct 26, 2016 at 8:55 AM, Andreas Davour wrote:
> If there are 1 MON in B, that cluster will have quorum within itself and
> keep running, and in A the MON cluster will vote and reach quorum again.
Quorum requires a majority of all monitors. One monitor by itself (in
a cluster with at lea
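To put numbers on that, a small illustrative sketch (not taken from the thread) of the strict-majority rule being discussed:
```
#!/usr/bin/python2
# Illustrative only: monitors keep quorum only while a strict majority
# of ALL monitors in the monmap can reach each other.

def monitors_needed_for_quorum(total_mons):
    """Smallest strict majority of a monitor cluster of the given size."""
    return total_mons // 2 + 1

# Example: 3 monitors split into site A (2 mons) and site B (1 mon).
# Quorum needs 2 of 3, so only site A keeps running; site B does not.
for site, mons in (('A', 2), ('B', 1)):
    print('site %s: %d of 3 mons, quorum kept: %s'
          % (site, mons, mons >= monitors_needed_for_quorum(3)))
```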
I am not aware of any similar reports against librbd on Firefly. Do you use
any configuration overrides? Does the filesystem corruption appear while
the instances are running, or only after a shutdown / restart of the
instance?
On Wed, Oct 26, 2016 at 12:46 AM, wrote:
> No , we are using Firefly
On Wed, Oct 26, 2016 at 9:09 PM, Trygve Vea
wrote:
>
> - On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
> > On Wed, 26 Oct 2016, Trygve Vea wrote:
> >> Hi,
> >>
> >> We have two Ceph-clusters, one exposing pools both for RGW and RBD
> >> (OpenStack/KVM) pools - and one only for RBD.
If you're clustering something important, do it at the application level. For
example, financial transactions are replicated at the application level for just
this reason.
As far as Ceph goes, I'm not an expert yet. Even with all the filesystem wizardry
in the world, some things need to be handled outsi
- On 26 Oct 2016 at 14:41, Sage Weil s...@newdream.net wrote:
> On Wed, 26 Oct 2016, Trygve Vea wrote:
>> Hi,
>>
>> We have two Ceph-clusters, one exposing pools both for RGW and RBD
>> (OpenStack/KVM) pools - and one only for RBD.
>>
>> After upgrading both to Jewel, we have seen a significantl
On Wed, 26 Oct 2016, Trygve Vea wrote:
> Hi,
>
> We have two Ceph-clusters, one exposing pools both for RGW and RBD
> (OpenStack/KVM) pools - and one only for RBD.
>
> After upgrading both to Jewel, we have seen a significantly increased CPU
> footprint on the OSDs that are a part of the cluste
Hi,
We have two Ceph-clusters, one exposing pools both for RGW and RBD
(OpenStack/KVM) pools - and one only for RBD.
After upgrading both to Jewel, we have seen a significantly increased CPU
footprint on the OSDs that are a part of the cluster which includes RGW.
This graph illustrates this: h
served with 405
>> Method Not Allowed
>>
>> DEBUG: Sending request method_string='PUT', uri='/?website',
>> headers={'x-amz-content-sha256':
>> '3fcf37205b114f03a910d11d74206358f1681381f0f9498
7;x-amz-content-sha256':
> '3fcf37205b114f03a910d11d74206358f1681381f0f9498b25aa1cc65e168937',
> 'Authorization': 'AWS4-HMAC-SHA256
> Credential=V4NZ37SLP3VOPR2BI5UW/20161026/US/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=4cbd6a7
Hi Christian,
Thanks for the reply / suggestion!
MJ
On 10/24/2016 10:02 AM, Christian Balzer wrote:
Hello,
On Mon, 24 Oct 2016 09:41:37 +0200 mj wrote:
Hi,
We have been running xfs on our servers for many years, and we are used
to running a scheduled xfs_fsr during the weekend.
Lately we hav
f0f9498b25aa1cc65e168937',
'Authorization': 'AWS4-HMAC-SHA256
Credential=V4NZ37SLP3VOPR2BI5UW/20161026/US/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=4cbd6a7c26dc149fc8fb352dae2d42c27e9bdc254cecc467802941cfc0e200a2',
'x-amz-d
2016-10-11 9:20 GMT+02:00 Дробышевский, Владимир :
> It may look like a boys' club, but I believe that sometimes, for
> proof-of-concept projects or at the beginning of a commercial project
> without a lot of investment, it is worth considering used hardware. For example,
> it's possible to find us
> On 26 October 2016 at 10:44, Sage Weil wrote:
>
>
> On Wed, 26 Oct 2016, Dan van der Ster wrote:
> > On Tue, Oct 25, 2016 at 7:06 AM, Wido den Hollander wrote:
> > >
> > >> On 24 October 2016 at 22:29, Dan van der Ster wrote:
> > >>
> > >>
> > >> Hi Wido,
> > >>
> > >> This seems
On Wed, 26 Oct 2016, Dan van der Ster wrote:
> On Tue, Oct 25, 2016 at 7:06 AM, Wido den Hollander wrote:
> >
> >> On 24 October 2016 at 22:29, Dan van der Ster wrote:
> >>
> >>
> >> Hi Wido,
> >>
> >> This seems similar to what our dumpling tunables cluster does when a few
> >> particular osds
On Tue, Oct 25, 2016 at 7:06 AM, Wido den Hollander wrote:
>
>> On 24 October 2016 at 22:29, Dan van der Ster wrote:
>>
>>
>> Hi Wido,
>>
>> This seems similar to what our dumpling tunables cluster does when a few
>> particular osds go down... Though in our case the remapped pgs are
>> correctly