I'll be the first to admit that most of my comments are anecdotal. But I
suspect that when it comes to storage, many of us don't require a lot to get
scared back into our dark corners. In short, it seems the dev team
should get better at selecting features and delivering on the existing
scheduled cadence.
It crashes with SimpleMessenger as well (ms_type = simple)
I've also tried with and without these two settings, but it still crashes.
bluestore cache size = 536870912
bluestore cache kv max = 268435456
When using SimpleMessenger, it tells me it is crashing (Segmentation
Fault) in 'thread_name:ms_
On 23 September 2017 at 11:58, Sage Weil wrote:
> I'm *much* happier with 2 :) so no complaint from me. I just heard a lot
> of "2 years" and 2 releases (18 months) doesn't quite cover it. Maybe
> it's best to start with that, though? It's still an improvement over the
> current ~12 months.
FW
On Fri, 22 Sep 2017, Gregory Farnum wrote:
> On Fri, Sep 22, 2017 at 3:28 PM, Sage Weil wrote:
> > Here is a concrete proposal for everyone to summarily shoot down (or
> > heartily endorse, depending on how your friday is going):
> >
> > - 9 month cycle
> > - enforce a predictable release schedule
On Fri, Sep 22, 2017 at 3:28 PM, Sage Weil wrote:
> Here is a concrete proposal for everyone to summarily shoot down (or
> heartily endorse, depending on how your friday is going):
>
> - 9 month cycle
> - enforce a predictable release schedule with a freeze date and
> a release date. (The actua
Hi everyone, in the case where I’ve lost the entire directory below that
contains a bluestore OSD’s config and metadata, but all the bluestore devices
are intact (block, block.db, block.wal), how can I get the OSD up and running
again?
I tried to do a ceph-osd --mkfs again, which seemed to regen
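(One possible direction, sketched here only as a guess: recreate an empty OSD
data directory and let ceph-bluestore-tool repopulate it from the device
labels. The paths and the OSD id below are placeholders, and this assumes the
prime-osd-dir subcommand is available on the release in use.)

mkdir -p /var/lib/ceph/osd/ceph-0
ceph-bluestore-tool prime-osd-dir --dev /dev/ceph-vg/block-0 \
    --path /var/lib/ceph/osd/ceph-0
chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
systemctl start ceph-osd@0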
On 22.09.2017 at 23:56, Gregory Farnum wrote:
> On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf
> wrote:
>> On 22.09.2017 at 22:59, Gregory Farnum wrote:
>> [..]
>>> This is super cool! Is there anything written down that explains this
>>> for Ceph developers who aren't familiar with the workings of Dovecot?
Here is a concrete proposal for everyone to summarily shoot down (or
heartily endorse, depending on how your friday is going):
- 9 month cycle
- enforce a predictable release schedule with a freeze date and
a release date. (The actual .0 release of course depends on no blocker
bugs being o
On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf wrote:
> On 22.09.2017 at 22:59, Gregory Farnum wrote:
> [..]
>> This is super cool! Is there anything written down that explains this
>> for Ceph developers who aren't familiar with the workings of Dovecot?
>> I've got some questions I see going thr
On 22.09.2017 at 22:59, Gregory Farnum wrote:
[..]
> This is super cool! Is there anything written down that explains this
> for Ceph developers who aren't familiar with the workings of Dovecot?
> I've got some questions I see going through it, but they may be very
> dumb.
>
> *) Why are indexes
On Thu, Sep 21, 2017 at 1:40 AM, Wido den Hollander wrote:
> Hi,
>
> A tracker issue has been out there for a while:
> http://tracker.ceph.com/issues/12430
>
> Storing e-mail in RADOS with Dovecot, the IMAP/POP3/LDA server with a huge
> market share.
>
> It took a while, but last year Deutsche Telekom
On Thu, Sep 21, 2017 at 3:02 AM, Sean Purdy wrote:
> On Wed, 20 Sep 2017, Gregory Farnum said:
>> That definitely sounds like a time sync issue. Are you *sure* they matched
>> each other?
>
> NTP looked OK at the time. But see below.
>
>
>> Is it reproducible on restart?
>
> Today I did a straigh
I have a few running ceph clusters. I built a new cluster using luminous, and
I also upgraded a cluster running hammer to luminous. In both cases, I have a
HEALTH_WARN that I can't figure out. The cluster appears healthy except for
the HEALTH_WARN in overall status. For now, I’m monitoring h
If you have a random number generator rand() and variables A, B:
A = rand()
B = rand()
and you loop 100 times to see which is bigger, A or B, then on average A will
win 50 times and B will win 50 times.
Now assume you want to make A win twice as many times; you can add a
weight:
A = 3 x rand()
B = 1 x rand()
If yo
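(For anyone who wants to check the effect of such a weight empirically, here
is a small stand-alone simulation sketch - not from the thread, and the
iteration count and the use of the C library rand() are purely illustrative:)

#include <stdio.h>
#include <stdlib.h>

/* count how often A beats B when A's draw is scaled by 3 and B's by 1 */
int main(void)
{
    int a_wins = 0, b_wins = 0;

    srand(42);
    for (int i = 0; i < 100000; i++) {
        double A = 3.0 * ((double)rand() / RAND_MAX);
        double B = 1.0 * ((double)rand() / RAND_MAX);
        if (A > B)
            a_wins++;
        else
            b_wins++;
    }
    printf("A won %d times, B won %d times\n", a_wins, b_wins);
    return 0;
}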
^C[root@mon01 ceph]# ceph status
cluster 55ebbc2d-c5b7-4beb-9688-0926cefee155
health HEALTH_WARN
2 requests are blocked > 32 sec
monmap e1: 3 mons at
{mon01=##:6789/0,mon02=##:6789/0,mon03=##:6789/0}
election epoch 74, quorum 0,1,2 mon0
It shows that the blocked requests also reset and are now only a few
minutes old instead of nearly a full day. What is your full `ceph status`?
The blocked requests are referring to missing objects.
On Fri, Sep 22, 2017 at 12:09 PM Matthew Stroud
wrote:
> Got one to clear:
>
>
>
> 2017-09-22 10
Got one to clear:
2017-09-22 10:06:23.030648 osd.3 [WRN] 2 slow requests, 1 included below;
oldest blocked for > 120.959814 secs
2017-09-22 10:06:23.030657 osd.3 [WRN] slow request 120.959814 seconds old,
received at 2017-09-22 10:04:22.070785: osd_op(client.301013529.0:2418
7.e637a4b3 measure
Thanks! I still have a question. Like the code in bucket_straw2_choose below:
u = crush_hash32_3(bucket->h.hash, x, ids[i], r);
u &= 0xffff;
ln = crush_ln(u) - 0x1000000000000ll;
draw = div64_s64(ln, weights[i]);
Because x, id, and r don't change, the ln won't change for an old
bucket, add
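(A minimal sketch, not the real Ceph code, of the property this question relies
on: crush_hash32_3 and crush_ln are declared extern here with assumed
signatures as stand-ins for the CRUSH helpers quoted above. For a fixed
(x, id, r) the ln term is constant, so a weight change only rescales that one
item's draw.)

#include <stdint.h>

/* assumed stand-ins for the CRUSH helpers quoted above */
extern uint32_t crush_hash32_3(uint32_t hash, uint32_t a, uint32_t b, uint32_t c);
extern uint64_t crush_ln(unsigned int xin);

/* straw2-style draw for a single item: the ln term depends only on
 * (hash, x, id, r); the item's weight only enters as the divisor. */
static int64_t straw2_draw(uint32_t hash, int x, int id, int r, int64_t weight)
{
    uint32_t u = crush_hash32_3(hash, x, id, r) & 0xffff;
    int64_t ln = (int64_t)crush_ln(u) - 0x1000000000000ll;  /* <= 0 */
    return ln / weight;  /* larger weight -> closer to 0 -> more likely to be the max */
}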
Does the request remain blocked if you issue `ceph osd down 2`? Marking the
offending OSD as down usually clears up blocked requests for me... at least
it resets the timer on it and the requests start blocking again if the OSD
is starting to fail.
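(For reference, a rough sketch of the usual sequence - the OSD id here is just
whichever one `ceph health detail` reports as holding the slow requests:)

ceph health detail | grep -i blocked   # find which OSD holds the slow requests
ceph osd down 2                        # mark that OSD down; it re-registers on its own
ceph -w                                # watch whether the slow request warning clears or returns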
On Fri, Sep 22, 2017 at 11:51 AM Matthew Stroud
wrot
It appears I have three stuck IOs after switching my tunables to optimal. We
are running 10.2.9 and the offending pool is for gnocchi (which has caused us
quite a bit of pain at this point). Here are the stuck IOs:
2017-09-22 09:05:40.095125 osd.2 ##:6802/1453572 164 : cluster [WRN] 3
slow
Thanks again Ronny,
OCFS2 is working well so far.
I have 3 nodes sharing the same 7 TB MSA FC LUN. Hoping to add 3 more...
James Okken
Lab Manager
Dialogic Research Inc.
4 Gatehall Drive
Parsippany
NJ 07054
USA
Tel: 973 967 5179
Email: james.ok...@dialogic.com
Web: www.dialogic.com - Th
Just a follow-up here:
I'm chasing down a bug with memory accounting. On my luminous cluster I
am seeing lots of memory usage that is triggered by scrub. Pretty sure
this is a bluestore cache mempool issue (making it use more memory than it
thinks it is); hopefully I'll have a fix shortly.
Reco
Hi all,
Recently, we've noticed a strange behaviour on one of our test clusters.
The cluster was configured to serve RGW and is running Luminous. Our
standard procedure is to create blind (non-indexed) buckets so that our
software manages the metadata by itself and we get less load on the index
p
Hi John,
Yes, it was working for some time and then I tried updating the run_dir on the
cluster for another reason so I had to restart the cluster. Now I get the
issue with the socket creation.
I tried reverting the run_dir configuration to default and restarted but the
issue persists.
> On 22 September 2017 at 8:03, Adrian Saul wrote:
>
>
>
> Thanks for bringing this to attention, Wido - it's of interest to us as we are
> currently looking to migrate mail platforms onto Ceph using NFS, but this
> seems far more practical.
>
Great! Keep in mind this is still in a very
I asked the same question a couple of weeks ago. No response I got
contradicted the documentation, but nobody actively confirmed that the
documentation was correct on this subject either; my end state was that
I was relatively confident I wasn't making some horrible mistake by
simply specifying a bi
On Fri, Sep 22, 2017 at 9:49 AM, Dietmar Rieder
wrote:
> Hmm...
>
> not sure what happens if you lose 2 disks in 2 different rooms - isn't
> there a risk that you lose data?
yes, and that's why I don't really like the profile...
Hmm...
not sure what happens if you lose 2 disks in 2 different rooms - isn't
there a risk that you lose data?
Dietmar
On 09/22/2017 10:39 AM, Luis Periquito wrote:
> Hi all,
>
> I've been trying to think what will be the best erasure code profile,
> but I don't really like the one I came
Hi all,
I've been trying to think what will be the best erasure code profile,
but I don't really like the one I came up with...
I have 3 rooms that are part of the same cluster, and I need to design it
so we can lose any one of the 3.
As this is a backup cluster, I was thinking of doing a k=2 m=1 co
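(For concreteness, such a profile could be created roughly like this - the
profile and pool names are made up, the PG count is arbitrary, and on
pre-Luminous releases the key is ruleset-failure-domain rather than
crush-failure-domain:)

ceph osd erasure-code-profile set ec-3rooms k=2 m=1 crush-failure-domain=room
ceph osd erasure-code-profile get ec-3rooms
ceph osd pool create backup-ec 1024 1024 erasure ec-3rooms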
Per section 3.4.4, the default bucket type straw computes the hash of (PG
number, replica number, bucket id) for all buckets using the Jenkins
integer hashing function, then multiplies this by the bucket weight (for OSD
disks a weight of 1 corresponds to 1 TB; for higher levels it is the sum of
the contained weights
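(Schematically, the selection described above reduces to a loop like the one
below. This is only a sketch, not the actual CRUSH code - the real
implementation precomputes per-item straw lengths from the weights rather than
multiplying by the raw weight, and straw2 replaces this with the log-based
draw quoted elsewhere in this digest; jenkins_hash3 is a hypothetical stand-in
for the Jenkins integer hash.)

#include <stdint.h>

/* hypothetical stand-in for the Jenkins integer hash CRUSH uses */
extern uint32_t jenkins_hash3(uint32_t pg, uint32_t replica, uint32_t id);

/* pick the item whose weighted pseudo-random draw is largest */
int straw_like_choose(uint32_t pg, uint32_t replica,
                      const uint32_t *ids, const double *weights, int n)
{
    int best = -1;
    double best_draw = -1.0;

    for (int i = 0; i < n; i++) {
        double draw = (double)jenkins_hash3(pg, replica, ids[i]) * weights[i];
        if (draw > best_draw) {
            best_draw = draw;
            best = i;
        }
    }
    return best;  /* index of the chosen item */
}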
On Thu, Sep 21, 2017 at 3:29 PM, Bryan Banister
wrote:
> I’m not sure what happened, but the dashboard module can no longer start up
> now:
I'm curious about the "no longer" part -- from the log it looks like
you only just enabled the dashboard module ("module list changed...")?
Was it definitely