On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
wrote:
> Hi,
>
> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>> wrote:
>>> Hi,
>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
wrote:
>
> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>> wrote:
>>> Hi,
>>>
>>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
> wrote:
>>
>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>>> wrote:
On Thu, Mar 1, 2018 at 10:40 AM, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
>> wrote:
>>>
>>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>>>
On Thu, Mar 1, 2018 at 1:08 PM, Stefan Priebe - Profihost AG
wrote:
> nice thanks will try that soon.
>
> Can you tell me how to change the log level to info for the balancer module?
debug mgr = 4/5
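(A minimal sketch of where that goes: set it in ceph.conf on the mgr host
and restart ceph-mgr. The admin-socket variant below is an assumption on my
part and the daemon name is a placeholder.)

  [mgr]
  debug mgr = 4/5

  ceph daemon mgr.$(hostname -s) config set debug_mgr 4/5   # runtime, on the active mgr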
-- dan
late evening, every day, the cluster is back to HEALTH_OK.
Cheers, Dan
> Stefan
>
> Excuse my typo sent from my mobile phone.
>
> On 01.03.2018 at 13:12, Dan van der Ster wrote:
>
> On Thu, Mar 1, 2018 at 1:08 PM, Stefan Priebe - Profihost AG
> wrote:
>
> nice than
Hi all,
What is the purpose of
ceph mds set max_mds
?
We just used that by mistake on a cephfs cluster when attempting to
decrease from 2 to 1 active mds's.
The correct command to do this is of course
ceph fs set max_mds
So, is `ceph mds set max_mds` useful for something? If not, should it be removed?
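(For the record, a hedged sketch of the intended sequence on luminous; the
fs name and rank are examples and the deactivate syntax may differ by release:)

  ceph fs set cephfs max_mds 1
  ceph mds deactivate cephfs:1   # stop the now-surplus rank 1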
On Wed, Mar 7, 2018 at 2:29 PM, John Spray wrote:
> On Wed, Mar 7, 2018 at 10:11 AM, Dan van der Ster wrote:
>> Hi all,
>>
>> What is the purpose of
>>
>>ceph mds set max_mds
>>
>> ?
>>
>> We just used that by mistake on a cephfs
Hi all,
On our luminous v12.2.4 ceph-fuse clients / mds the rctime is not
tracking the latest inode ctime, but only the latest directory ctimes.
Initial empty dir:
# getfattr -d -m ceph . | egrep 'bytes|ctime'
ceph.dir.rbytes="0"
ceph.dir.rctime="1521043742.09466372697"
Create a file, rctime is
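(A minimal way to repeat the check described above, from inside a ceph-fuse
mount; the file name is just an example:)

  touch somefile
  getfattr -n ceph.dir.rctime .   # rctime after creating a file
  chmod 600 somefile              # changes only the file's ctime
  getfattr -n ceph.dir.rctime .   # the reported issue: rctime does not follow this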
On Wed, Mar 14, 2018 at 11:43 PM, Patrick Donnelly wrote:
> On Wed, Mar 14, 2018 at 9:22 AM, Dan van der Ster wrote:
>> Hi all,
>>
>> On our luminous v12.2.4 ceph-fuse clients / mds the rctime is not
>> tracking the latest inode ctime, but only the latest directory
Hi,
Do you see any split or merge messages in the osd logs?
I recall some surprise filestore splitting on a few osds after the luminous
upgrade.
.. Dan
On Mar 15, 2018 6:04 PM, "David Turner" wrote:
I upgraded a [1] cluster from Jewel 10.2.7 to Luminous 12.2.2 and last week
I added 2 nodes t
store splitting), but actually
segfaulting and restarting.
On Thu, Mar 15, 2018 at 4:08 PM Dan van der Ster wrote:
> Hi,
>
> Do you see any split or merge messages in the osd logs?
> I recall some surprise filestore splitting on a few osds after the
> luminous upgrade.
>
>
Hi,
Which versions were those MDS's before and after the restarted standby MDS?
Cheers, Dan
On Wed, Mar 28, 2018 at 11:11 AM, adrien.geor...@cc.in2p3.fr
wrote:
> Hi,
>
> I just had the same issue with our 12.2.4 cluster but not during the
> upgrade.
> One of our 3 monitors restarted (the one
version.
>
> Adrien
>
>
> On 28/03/2018 at 14:47, Dan van der Ster wrote:
>>
>> Hi,
>>
>> Which versions were those MDS's before and after the restarted standby
>> MDS?
>>
>> Cheers, Dan
>>
>>
>>
>> On Wed, Mar
On Thu, Mar 29, 2018 at 10:31 AM, Robert Sander
wrote:
> On 29.03.2018 09:50, ouyangxu wrote:
>
>> I'm using Ceph 12.2.4 with CentOS 7.4, and trying to use cephfs for
>> MariaDB deployment,
>
> Don't do this.
> As the old saying goes: If it hurts, stop doing it.
Why not? Let's find out where and w
Guys,
Ceph does not have a concept of "osd quorum" or "electing a primary
PG". The mons are in a PAXOS quorum, and the mon leader decides which
OSD is primary for each PG. No need to worry about a split OSD brain.
-- dan
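(Both are easy to inspect if it helps; the pgid below is a placeholder:)

  ceph quorum_status -f json-pretty | grep -E 'quorum_leader_name|quorum_names'
  ceph pg map 1.0   # shows the up/acting OSD sets and the primary for that PG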
On Thu, Mar 29, 2018 at 2:51 PM, Peter Linder
wrote:
>
>
> On 2018-03-2
I keep seeing these threads where adding nodes has such an impact on the
cluster as a whole, that I wonder what the rest of the cluster looks like.
Normally I’d just advise someone to put a limit on the number of concurrent
backfills, and `osd max backfills` already defaults to 1. Could
Hi Steven,
There is only one bench. Could you show multiple benches of the different
scenarios you discussed? Also provide hardware details.
Hans
On Apr 19, 2018 13:11, "Steven Vacaroaia" wrote:
Hi,
Any idea why 2 servers with one OSD each provide better performance
than 3?
Servers are
4194304
> Bandwidth (MB/sec): 44.0793
> Stddev Bandwidth: 55.3843
> Max bandwidth (MB/sec): 232
> Min bandwidth (MB/sec): 0
> Average IOPS: 11
> Stddev IOPS: 13
> Max IOPS: 58
> Min IOPS: 0
> Average Latency(s
DB (on separate SSD or same HDD)
Thanks
Steven
On Thu, 19 Apr 2018 at 12:06, Hans van den Bogert
wrote:
> I take it that the first bench is with replication size 2, the second
> bench is with replication size 3? Same for the 4 node OSD scenario?
>
> Also please let us know how you
Write Cache : Disk's Default
> Adapter 0-VD 2(target id: 2): Disk Write Cache : Disk's Default
> Adapter 0-VD 3(target id: 3): Disk Write Cache : Disk's Default
>
>
> On Thu, 19 Apr 2018 at 14:22, Hans van den Bogert
> wrote:
>
>> I see, the second one i
That "nicely exporting" thing is a logging issue that was apparently
fixed in https://github.com/ceph/ceph/pull/19220. I'm not sure if that
will be backported to luminous.
Otherwise the slow requests could be due to either slow trimming (see
previous discussions about mds log max expiring and mds
Hi Scott,
Multi MDS just assigns different parts of the namespace to different
"ranks". Each rank (0, 1, 2, ...) is handled by one of the active
MDSs. (You can query which parts of the name space are assigned to
each rank using the jq tricks in [1]). If a rank is down and there are
no more standby
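(Two hedged ways to look at the rank-to-subtree mapping; the daemon name is
a placeholder and the subtree field names are from memory, so they may differ
per release:)

  ceph fs status
  ceph daemon mds.$(hostname -s) get subtrees | jq '.[] | [.dir.path, .auth_first]'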
Shouldn't Steven see some data being written to the block/wal for object
metadata? Though that might be negligible with 4MB objects.
On 27-04-18 16:04, Serkan Çoban wrote:
rados bench uses a 4MB block size for IO. Try with an IO size of 4KB and
you will see the SSD being used for write operations.
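(As a hedged example of that suggestion, with an arbitrary pool name and runtime:)

  rados bench -p testpool 60 write -b 4096 -t 16 --no-cleanup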
Hi Nick,
Our latency probe results (4kB rados bench) didn't change noticeably
after converting a test cluster from FileStore (sata SSD journal) to
BlueStore (sata SSD db). Those 4kB writes take 3-4ms on average from a
random VM in our data centre. (So bluestore DB seems equivalent to
FileStore journal.)
Hi Valery,
Did you eventually find a workaround for this? I *think* we'd also
prefer rgw to fall back to external plugins, rather than checking them
before local. But I never understood the reasoning behind the change
from jewel to luminous.
I saw that there is work towards a cache for ldap [1] an
>
> We agreed in upstream RGW to make this change. Do you intend to
> submit this as a PR?
>
> regards
>
> Matt
>
> On Fri, May 4, 2018 at 10:57 AM, Dan van der Ster wrote:
>> Hi Valery,
>>
>> Did you eventually find a workaround for this? I *think* we
On Tue, May 8, 2018 at 7:35 PM, Vasu Kulkarni wrote:
> On Mon, May 7, 2018 at 2:26 PM, Maciej Puzio wrote:
>> I am an admin in a research lab looking for a cluster storage
>> solution, and a newbie to ceph. I have setup a mini toy cluster on
>> some VMs, to familiarize myself with ceph and to tes
Hi Adrian,
Is there a strict reason why you *must* upgrade the tunables?
It is normally OK to run with old (e.g. hammer) tunables on a luminous
cluster. The crush placement won't be state of the art, but that's not
a huge problem.
We have a lot of data in a jewel cluster with hammer tunables. We
Hi,
It still isn't clear whether you're using the fuse or kernel client.
Do you `mount -t ceph` or something else?
-- Dan
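For reference, the two variants look roughly like this (host names, paths and
keyring locations are placeholders):

  mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
  ceph-fuse -m mon1:6789 /mnt/cephfs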
On Wed, May 16, 2018 at 8:28 PM Donald "Mac" McCarthy
wrote:
> CephFS. 8 core atom C2758, 16 GB ram, 256GB ssd, 2.5 GB NIC (supermicro
microblade node).
> Read test:
> dd if
Hi all,
We have an intermittent issue where bluestore osds sometimes fail to
start after a reboot.
The osds all fail the same way [see 2], failing to open the superblock.
On one particular host, there are 24 osds and 4 SSDs partitioned for
the block.db's. The affected non-starting OSDs all have b
On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > Hi all,
> >
> > We have an intermittent issue where bluestore osds sometimes fail to
> > start after a reboot.
> > The osds all fail the same way [see 2], fai
On Thu, Jun 7, 2018 at 4:31 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 10:23 AM, Dan van der Ster wrote:
> > Hi all,
> >
> > We have an intermittent issue where bluestore osds sometimes fail to
> > start after a reboot.
> > The osds all fail the s
On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > Hi all,
> > > >
> > >
On Thu, Jun 7, 2018 at 5:16 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 10:54 AM, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> >>
> >> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> >> > On Thu, Jun 7, 2018 at 4:33
On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil
On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
>
> On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> >
> > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> > > >
> > > > On Thu, 7 Jun
On Thu, Jun 7, 2018 at 6:01 PM Dan van der Ster wrote:
>
> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
> >
> > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > >
On Thu, Jun 7, 2018 at 6:09 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
> > >
> > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> > > >
> > > > On Thu
On Thu, Jun 7, 2018 at 6:33 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > Wait, we found something!!!
> > > >
> > > > In the 1st 4k on the block we found the block.db pointing at the wrong
> > > > device (/dev/sd
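(For anyone following along: that label in the first 4k can be dumped without
hexdumping by hand; the OSD path below is just an example:)

  ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-19/block
  ls -l /var/lib/ceph/osd/ceph-19/block.db   # compare with the symlink the deploy tool created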
On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote:
> > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> >> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster
> >> wrote:
> >> >
> >>
On Thu, Jun 7, 2018 at 8:58 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 2:45 PM, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote:
> >>
> >> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote:
> >> > On Thu, 7 Jun 20
See this thread:
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-April/000106.html
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-June/000113.html
(Wido -- should we kill the ceph-large list??)
-- dan
On Wed, Jun 13, 2018 at 12:27 PM Marc Roos wrote:
>
>
> Shit, I added th
Hello,
We recently upgraded Ceph from version 12.2.2 to version 12.2.5. Since the
upgrade we've been having performance issues which seem to relate to when
deep-scrub actions are performed.
Most of the time deep-scrub actions only take a couple of seconds at most;
however, occasionally it takes
porarily disabled this. Could
this somehow be related?
Thanks
Sander
From: Gregory Farnum
Sent: Thursday, June 14, 2018 19:45
To: Sander van Schie / True
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues with deep-scrub since upgrading
the issue for us.
Sander
From: Gregory Farnum
Sent: Thursday, June 14, 2018 22:45
To: Sander van Schie / True
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues with deep-scrub since upgrading
from v12.2.2 to v12.2.5
Yes. Deep scrub o
Hello,
We've run into some problems with dynamic bucket index resharding. After an upgrade
from Ceph 12.2.2 to 12.2.5, which fixed an issue with the resharding when using
tenants (which we do), the cluster was busy resharding for 2 days straight,
resharding the same buckets over and over again.
Af
Hi,
One way you can see exactly what is happening when you write an object
is with --debug_ms=1.
For example, I write a 100MB object to a test pool: rados
--debug_ms=1 -p test put 100M.dat 100M.dat
I pasted the output of this here: https://pastebin.com/Zg8rjaTV
In this case, it first gets the cl
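(If you only want the OSD traffic out of that stream, a grep on the message
names works; hedged, the exact log format may vary by release:)

  rados --debug_ms=1 -p test put 100M.dat 100M.dat 2>&1 | grep -E 'osd_op|osd_op_reply'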
:6789/0,ngfdv078=128.55.xxx.xx:6789/0}
>
> election epoch 4, quorum 0,1 ngfdv076,ngfdv078
>
> osdmap e280: 48 osds: 48 up, 48 in
>
> flags sortbitwise,require_jewel_osds
>
> pgmap v117283: 3136 pgs, 11 pools, 25600 MB data, 510 objects
>
>
Any help would be greatly appreciated.
Thanks,
Sander
From: Sander van Schie / True
Sent: Friday, June 15, 2018 14:19
To: ceph-users@lists.ceph.com
Subject: RGW Dynamic bucket index resharding keeps resharding all buckets
Hello,
We're into some proble
Thanks, I created the following issue: https://tracker.ceph.com/issues/24551
Sander
82489 osd.19 up 1.0 1.0
>> >
>> > 23 21.82489 osd.23 up 1.0 1.0
>> >
>> > 27 21.82489 osd.27 up 1.0 1.0
>> >
>> > 31 21.82489 osd.3
And BTW, if you can't make it to this event, we're in the early days of
planning a dedicated Ceph + OpenStack Days at CERN around May/June
2019.
More news on that later...
-- Dan @ CERN
On Tue, Jun 19, 2018 at 10:23 PM Leonardo Vaz wrote:
>
> Hey Cephers,
>
> We will join our friends from OpenSt
On Thu, Jun 21, 2018 at 2:41 PM Kai Wagner wrote:
>
> On 20.06.2018 17:39, Dan van der Ster wrote:
> > And BTW, if you can't make it to this event we're in the early days of
> > planning a dedicated Ceph + OpenStack Days at CERN around May/June
> > 2019.
> >
Hi all,
Quick question: does an IO with an unfound object result in an IO
error or should the IO block?
During a jewel to luminous upgrade some PGs passed through a state
with unfound objects for a few seconds. And this seems to match the
times when we had a few IO errors on RBD attached volumes.
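(For anyone hitting this, the state is easy to spot while it lasts; the pgid
below is a placeholder and the field name is from memory:)

  ceph health detail | grep -i unfound
  ceph pg 1.2ab query | grep -A2 might_have_unfound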
there is no live ceph-osd who has a
> copy. In this case, IO to those objects will block, and the cluster will hope
> that the failed node comes back soon; this is assumed to be preferable to
> returning an IO error to the user."
>
> On 22.06.2018, at 16:16, Dan van der Ster w
viour of virtio-blk vs
virtio-scsi: the latter has a timeout but blk blocks forever.
On 5000 attached volumes we saw around 12 of these IO errors, and this
was the first time in 5 years of upgrades that an IO error happened...
-- dan
> -Greg
>
>>
>>
>> On 22.06.2018, at 1
On Thu, Jun 7, 2018 at 8:40 PM Dan van der Ster wrote:
>
> On Thu, Jun 7, 2018 at 6:33 PM Sage Weil wrote:
> >
> > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > > Wait, we found something!!!
> > > > >
> > > > > In the
Hammer or jewel? I've forgotten which thread pool is handling the snap
trim nowadays -- is it the op thread yet? If so, perhaps all the op
threads are stuck sleeping? Just a wild guess. (Maybe increasing # op
threads would help?).
-- Dan
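(If someone wants to test that wild guess, it's just a config bump plus an OSD
restart; defaults differ per release, so treat the value as an example:)

  [osd]
  osd op threads = 8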
On Thu, Jan 12, 2017 at 3:11 PM, Nick Fisk wrote:
> Hi,
>
o:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Nick Fisk
>> Sent: 13 January 2017 20:38
>> To: 'Dan van der Ster'
>> Cc: 'ceph-users'
>> Subject: Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?
>>
>> We're o
Hi All,
I'm clueless as to why an OSD crashed. I have a log at [1]. If anyone can
explain how this should be interpreted, then please let me know. I can only see
generic errors, probably triggered by a failed assert. Restarting the OSD fails
with the same errors as in [1]. It seems like, though co
Hi,
This is interesting. Do you have a bit more info about how to identify
a server which is suffering from this problem? Is there some process
(xfs* or kswapd?) that we'll see as busy in top or iotop?
Also, which kernel are you using?
Cheers, Dan
On Tue, Feb 7, 2017 at 6:59 PM, Thorvald Natvig wr
On Mon, Mar 13, 2017 at 10:35 AM, Florian Haas wrote:
> On Sun, Mar 12, 2017 at 9:07 PM, Laszlo Budai wrote:
>> Hi Florian,
>>
>> thank you for your answer.
>>
>> We have already set the IO scheduler to cfq in order to be able to lower the
>> priority of the scrub operations.
>> My problem is tha
On Sat, Mar 11, 2017 at 12:21 PM, wrote:
>
> The next and biggest problem we encountered had to do with the CRC errors on
> the OSD map. On every map update, the OSDs that were not yet upgraded got
> that CRC error and asked the monitor for a full OSD map instead of just a
> delta update. At f
Hi John,
Last week we updated our prod CephFS cluster to 10.2.6 (clients and
server side), and for the first time today we've got an object info
size mismatch:
I found this ticket you created in the tracker, which is why I've
emailed you: http://tracker.ceph.com/issues/18240
Here's the detail of
On Mon, Mar 13, 2017 at 1:35 PM, John Spray wrote:
> On Mon, Mar 13, 2017 at 10:28 AM, Dan van der Ster
> wrote:
>> Hi John,
>>
>> Last week we updated our prod CephFS cluster to 10.2.6 (clients and
>> server side), and for the first time today we've got an ob
Hi,
This sounds familiar: http://tracker.ceph.com/issues/17939
I found that you can get the updated quota on node2 by touching the
base dir. In your case:
touch /shares/share0
-- Dan
On Tue, Mar 14, 2017 at 10:52 AM, yu2xiangyang wrote:
> Dear cephers,
> I met a problem when using ce
On Tue, Mar 14, 2017 at 5:55 PM, John Spray wrote:
> On Tue, Mar 14, 2017 at 2:10 PM, Andras Pataki
> wrote:
>> Hi John,
>>
>> I've checked the MDS session list, and the fuse client does appear on that
>> with 'state' as 'open'. So both the fuse client and the MDS agree on an
>> open connection.
On Wed, Mar 22, 2017 at 8:24 AM, Marcus Furlong wrote:
> Hi,
>
> I'm experiencing the same issue as outlined in this post:
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013330.html
>
> I have also deployed this jewel cluster using ceph-deploy.
>
> This is the message I see
Can't help, but just wanted to say that the upgrade worked for us:
# ceph health
HEALTH_OK
# ceph tell mon.* version
mon.p01001532077488: ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)
mon.p01001532149022: ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)
mon.p01001532
Dear ceph-*,
A couple of weeks ago I wrote this simple tool to measure the round-trip
latency of a shared filesystem.
https://github.com/dvanders/fsping
In our case, the tool is to be run from two clients who mount the same
CephFS.
First, start the server (a.k.a. the ping reflector) on one mach
Hi,
The mons on my test luminous cluster do not start after upgrading
from 12.0.1 to 12.0.2. Here is the backtrace:
0> 2017-04-25 11:06:02.897941 7f467ddd7880 -1 *** Caught signal
(Aborted) **
in thread 7f467ddd7880 thread_name:ceph-mon
ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935
__ << " loading creating_pgs e" <<
creating_pgs.last_scan_epoch << dendl;
}
...
Cheers, Dan
On Tue, Apr 25, 2017 at 11:15 AM, Dan van der Ster wrote:
> Hi,
>
> The mon's on my test luminous cluster do not start after upgrading
> from 12.0.1 to 12.0.2. Here is the b
Created ticket to follow up: http://tracker.ceph.com/issues/19769
On Tue, Apr 25, 2017 at 11:34 AM, Dan van der Ster wrote:
> Could this change be the culprit?
>
> commit 973829132bf7206eff6c2cf30dd0aa32fb0ce706
> Author: Sage Weil
> Date: Fri Mar 31 09:33:19 2017 -0
Hi Blair,
We use cpu_dma_latency=1, because it was in the latency-performance profile.
And indeed by setting cpu_dma_latency=0 on one of our OSD servers,
powertop now shows the package as 100% in turbo mode.
So I suppose we'll pay for this performance boost in energy.
But more importantly, can th
On Wed, May 3, 2017 at 9:13 AM, Blair Bethwaite
wrote:
> We did the latter using the pmqos_static.py, which was previously part of
> the RHEL6 tuned latency-performance profile, but seems to have been dropped
> in RHEL7 (don't yet know why),
It looks like el7's tuned natively supports the pmqos i
On Wed, May 3, 2017 at 10:32 AM, Blair Bethwaite
wrote:
> On 3 May 2017 at 18:15, Dan van der Ster wrote:
>> It looks like el7's tuned natively supports the pmqos interface in
>> plugins/plugin_cpu.py.
>
> Ahha, you are right, but I'm sure I tested tuned an
On Wed, May 3, 2017 at 10:52 AM, Blair Bethwaite
wrote:
> On 3 May 2017 at 18:38, Dan van der Ster wrote:
>> Seems to work for me, or?
>
> Yeah now that I read the code more I see it is opening and
> manipulating /dev/cpu_dma_latency in response to that option, so the
> TOD
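For completeness, a minimal sketch of what that /dev/cpu_dma_latency handling
boils down to outside of tuned, assuming root and that the device takes a raw
32-bit value which only holds while the file descriptor stays open:

  exec 3> /dev/cpu_dma_latency
  printf '\x01\x00\x00\x00' >&3   # request 1 us (little-endian s32)
  # keep this shell, and therefore fd 3, open for as long as the setting should apply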
I am currently pricing out some DCS3520's, for OSDs. Word is that the
price is going up, but I don't have specifics, yet.
I'm curious, does your real usage show that the 3500 series don't
offer enough endurance?
Here's one of our DCS3700's after 2.5 years of RBD + a bit of S3:
Model Family:
On Wed, May 17, 2017 at 11:29 AM, Dan van der Ster wrote:
> I am currently pricing out some DCS3520's, for OSDs. Word is that the
> price is going up, but I don't have specifics, yet.
>
> I'm curious, does your real usage show that the 3500 series don't
> offer
On Thu, May 18, 2017 at 3:11 AM, Christian Balzer wrote:
> On Wed, 17 May 2017 18:02:06 -0700 Ben Hines wrote:
>
>> Well, ceph journals are of course going away with the imminent bluestore.
> Not really, in many senses.
>
But we should expect far fewer writes to pass through the RocksDB and
its WAL
Hi Sage,
We need named clusters on the client side. RBD or CephFS clients, or
monitoring/admin machines all need to be able to access several clusters.
Internally, each cluster is indeed called "ceph", but the clients use
distinct names to differentiate their configs/keyrings.
Cheers, Dan
On J
Hi Bryan,
On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell wrote:
> This has come up quite a few times before, but since I was only working with
> RBD before I didn't pay too close attention to the conversation. I'm
> looking
> for the best way to handle existing clusters that have buckets with a
On Fri, Jun 9, 2017 at 5:58 PM, Vasu Kulkarni wrote:
> On Fri, Jun 9, 2017 at 6:11 AM, Wes Dillingham
> wrote:
>> Similar to Dan's situation we utilize the --cluster name concept for our
>> operations. Primarily for "datamover" nodes which do incremental rbd
>> import/export between distinct clus
Dear ceph users,
Today we had O(100) slow requests which were caused by deep-scrubbing
of the metadata log:
2017-06-14 11:07:55.373184 osd.155
[2001:1458:301:24::100:d]:6837/3817268 7387 : cluster [INF] 24.1d
deep-scrub starts
...
2017-06-14 11:22:04.143903 osd.155
[2001:1458:301:24::100:d]:6837/
Hi Patrick,
We've just discussed this internally and I wanted to share some notes.
First, there are at least three separate efforts in our IT dept to
collect and analyse SMART data -- it's clearly a popular idea and
simple to implement, but this leads to repetition and begs for a
common, good solution.
On Thu, Jun 15, 2017 at 7:56 PM, Casey Bodley wrote:
>
> On 06/14/2017 05:59 AM, Dan van der Ster wrote:
>>
>> Dear ceph users,
>>
>> Today we had O(100) slow requests which were caused by deep-scrubbing
>> of the metadata log:
>>
>> 2017-06-14 11:
'leveldb compact on mount = true' to the osd
> config and restarting.
>
> Casey
>
>
> On 06/19/2017 11:01 AM, Dan van der Ster wrote:
>>
>> On Thu, Jun 15, 2017 at 7:56 PM, Casey Bodley wrote:
>>>
>>> On 06/14/2017 05:59 AM, Dan van der Ster
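(In other words, something like this in ceph.conf on the affected OSD, followed
by a restart, and removed again afterwards; the osd id is just the one from this
thread:)

  [osd.155]
  leveldb compact on mount = true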
On Thu, Dec 10, 2015 at 5:06 AM, Christian Balzer wrote:
>
> Hello,
>
> On Wed, 9 Dec 2015 15:57:36 + MATHIAS, Bryn (Bryn) wrote:
>
>> to update this, the error looks like it comes from updatedb scanning the
>> ceph disks.
>>
>> When we make sure it doesn’t, by putting the ceph mount points in
On Wed, Dec 9, 2015 at 1:25 PM, Jacek Jarosiewicz
wrote:
> 2015-12-09 13:11:51.171377 7fac03c7f880 -1
> filestore(/var/lib/ceph/osd/ceph-5) Error initializing leveldb : Corruption:
> 29 missing files; e.g.: /var/lib/ceph/osd/ceph-5/current/omap/046388.sst
Did you have .ldb files? If so, this shou
On Wed, Jan 20, 2016 at 8:01 PM, Zoltan Arnold Nagy
wrote:
>
> Wouldn’t actually blowing away the other monitors then recreating them from
> scratch solve the issue?
>
> Never done this, just thinking out loud. It would grab the osdmap and
> everything from the other monitor and form a quorum, w
Thanks for this thread. We just made the same mistake (rmfailed) on our
hammer cluster, which broke it similarly. The addfailed patch worked
for us too.
-- Dan
On Fri, Jan 15, 2016 at 6:30 AM, Mike Carlson wrote:
> Hey ceph-users,
>
> I wanted to follow up, Zheng's patch did the trick. We re-added
On Mon, Feb 8, 2016 at 8:10 PM, Sage Weil wrote:
> On Mon, 8 Feb 2016, Karol Mroz wrote:
>> On Mon, Feb 08, 2016 at 01:36:57PM -0500, Sage Weil wrote:
>> > I didn't find any other good K names, but I'm not sure anything would top
>> > kraken anyway, so I didn't look too hard. :)
>> >
>> > For L,
Hi,
Thanks for linking to a current update on this problem [1] [2]. I
really hope that new Ceph installations aren't still following that
old advice... it's been known to be a problem for around a year and a
half [3].
That said, the "-n size=64k" wisdom was really prevalent a few years
ago, and I
On Thu, Feb 18, 2016 at 3:46 PM, Jens Rosenboom wrote:
> 2016-02-18 15:10 GMT+01:00 Dan van der Ster :
>> Hi,
>>
>> Thanks for linking to a current update on this problem [1] [2]. I
>> really hope that new Ceph installations aren't still following that
>>
Thanks Sage, looking forward to some scrub randomization.
Were binaries built for el6? http://download.ceph.com/rpm-hammer/el6/x86_64/
Cheers, Dan
On Tue, Feb 23, 2016 at 5:01 PM, Sage Weil wrote:
> This Hammer point release fixes a range of bugs, most notably a fix for
> unbounded growth of t
I can reproduce it and have updated the ticket. (I only upgraded the client,
not the server).
It seems to be related to the new --no-verify option, which is giving
strange results -- see the ticket.
-- Dan
On Fri, Feb 26, 2016 at 11:48 AM, Alexey Sheplyakov
wrote:
> Christian,
>
>> Note that "rand" wo
0.94.6 Hammer released
>
> Hi all,
>
> should we build el6 packages ourselves, or is it hoped that these packages will
> be built officially by the community?
>
>
> Regards,
>
> Vladislav Odintsov
>
If it can help, it's really very little work for me to send the hammer
SRPM to our Koji build system.
I think the real work will come if people start asking for jewel
builds on el6 and other old platforms. In that case, if a reputable
organisation offers to maintain the builds (+ deps), then IM