Hi Eric,
Our team has developed a QoS feature for Ceph using the dmclock library from
the community. We treat an RBD image as the dmclock client instead of a pool.
We tested our code and the results are confusing.
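Conceptually, the per-image mapping looks like this (a minimal Python sketch
only; the real dmclock library is C++, and the names and numbers below are
made up for illustration):

# Sketch: one dmclock client per RBD image instead of per pool.
# ClientInfo mirrors dmclock's reservation/weight/limit triple.
from collections import namedtuple

ClientInfo = namedtuple('ClientInfo', ['reservation', 'weight', 'limit'])

qos_clients = {}

def client_for_image(image_id):
    # register each RBD image as its own dmclock client on first use
    if image_id not in qos_clients:
        qos_clients[image_id] = ClientInfo(reservation=100.0, weight=1.0,
                                           limit=500.0)
    return qos_clients[image_id]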
Testing environment: a single server with 16 cores, 32 GB of RAM, and 8 non-system
disks, each runs
Hello,
On Fri, 2 Jun 2017 14:30:56 +0800 jiajia zhong wrote:
> christian, thanks for your reply.
>
> 2017-06-02 11:39 GMT+08:00 Christian Balzer :
>
> > On Fri, 2 Jun 2017 10:30:46 +0800 jiajia zhong wrote:
> >
> > > hi guys:
> > >
> > > Our ceph cluster is working with a cache tier.
> > If
Hello gurus:
My name is Will. I have just started studying Ceph and have a lot of
interest in it. We are using Ceph 0.94.10, and I am trying to tune the
performance of Ceph to satisfy our requirements. We are using it as an
object store now. I have tried some different
configurations. But I sti
Thank you for your guidance :), it makes sense.
2017-06-02 16:17 GMT+08:00 Christian Balzer :
>
> Hello,
>
> On Fri, 2 Jun 2017 14:30:56 +0800 jiajia zhong wrote:
>
> > christian, thanks for your reply.
> >
> > 2017-06-02 11:39 GMT+08:00 Christian Balzer :
> >
> > > On Fri, 2 Jun 2017 10:30:46 +0
Hi David,
If I understand correctly, your suggestion is the following:
If we have, for instance, 12 servers grouped into 3 racks (4 per rack), then you
would build a CRUSH map saying that you have 6 (virtual) racks, with 2 servers in
each of them, right?
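So something like this with the ceph CLI, if I understand (bucket and host
names here are made up):

$ ceph osd crush add-bucket vrack1 rack
$ ceph osd crush add-bucket vrack2 rack
$ ceph osd crush move vrack1 root=default
$ ceph osd crush move vrack2 root=default
$ ceph osd crush move server01 rack=vrack1
$ ceph osd crush move server02 rack=vrack1
# ...and so on, until each physical rack of 4 is split into two virtual racks of 2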
In this case if we are setting the failure
On 06/01/17 17:12, koukou73gr wrote:
> Hello list,
>
> Today I had to create a new image for a VM. This was the first time,
> since our cluster was updated from Hammer to Jewel. So far I was just
> copying an existing golden image and resizing it as appropriate. But this
> time I used rbd create.
>
On 06/02/17 11:59, Peter Maloney wrote:
> On 06/01/17 17:12, koukou73gr wrote:
>> Hello list,
>>
>> Today I had to create a new image for a VM. This was the first time,
>> since our cluster was updated from Hammer to Jewel. So far I was just
>> copying an existing golden image and resizing it as app
Thanks for the reply.
Easy?
Sure, it happens reliably every time I boot the guest with
exclusive-lock on :)
I'll need some walkthrough on the gcore part though!
-K.
On 2017-06-02 12:59, Peter Maloney wrote:
> On 06/01/17 17:12, koukou73gr wrote:
>> Hello list,
>>
>> Today I had to create a new
On 06/02/17 12:06, koukou73gr wrote:
> Thanks for the reply.
>
> Easy?
> Sure, it happens reliably every time I boot the guest with
> exclusive-lock on :)
If it's that easy, also try with only exclusive-lock, and neither object-map
nor fast-diff. And then also with one or the other of those.
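For reference, toggling those per image is (pool/image names are placeholders):

$ rbd feature disable <pool>/<image> object-map fast-diff
$ rbd feature enable <pool>/<image> object-map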
>
> I'll need
On 2017-06-02 13:01, Peter Maloney wrote:
>> Is it easy for you to reproduce it? I had the same problem, and the same
>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for
>> a gcore dump of a hung process but I wasn't able to get one. Can you do
>> that, and when you reply, C
On 2017-06-02 13:22, Peter Maloney wrote:
> On 06/02/17 12:06, koukou73gr wrote:
>> Thanks for the reply.
>>
>> Easy?
>> Sure, it happens reliably every time I boot the guest with
>> exclusive-lock on :)
> If it's that easy, also try with only exclusive-lock, and neither object-map
> nor fast-diff. And
On 06/02/17 12:25, koukou73gr wrote:
> On 2017-06-02 13:01, Peter Maloney wrote:
>>> Is it easy for you to reproduce it? I had the same problem, and the same
>>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for
>>> a gcore dump of a hung process but I wasn't able to get one.
Hi Will,
Few people have tried rocksdb as the k/v store for filestore, since we
never really started supporting it for production use (we ended up
deciding to move on to BlueStore). I suspect it will be faster than
leveldb, but I don't think anyone has actually tested filestore+rocksdb
to any
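If you want to experiment regardless, the filestore omap backend is selectable
in ceph.conf (I believe from Jewel onwards), e.g.:

[osd]
filestore omap backend = rocksdb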
I'm thinking you have erasure coding in cephfs and only use cache tiering
because you have to, correct? What is your use case for repeated file
accesses? How much data is written into cephfs at a time?
For me, my files are infrequently accessed after they are written or read
from the EC back-end po
You wouldn't be able to guarantee that the cluster will not use 2 servers
from the same rack. The problem with 3 failure domains, however, is that if you
lose a full failure domain, Ceph can do nothing to maintain 3 copies of your
data. It leaves you in a position where you need to rush to the datacenter
Hello,
I am playing around with ceph (ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I built a test
setup:
$ ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01497 root default
-2 0.00499 host af-staging-ceph01
You only have 3 OSDs, hence with one down you only have 2 left for replication
of 3 objects.
There is no spare OSD to place the 3rd object on; if you were to add a 4th node,
the issue would be resolved.
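You can confirm the settings in play with, e.g. (pool name as an example;
output abbreviated):

$ ceph osd pool get cephfs_data size
size: 3
$ ceph osd pool get cephfs_data min_size
min_size: 2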
,Ashley
On 2 Jun 2017, at 10:31 PM, Oleg Obleukhov wrote:
Hello,
I am pl
I think it's because af-staging-ceph02's data can only be moved to
af-staging-ceph01/3, which already have the data.
There is no acceptable place to create the third replica of the data.
Etienne
From: ceph-users on behalf of Oleg
Obleukhov
Sent: Friday, June 2,
Hi,
On 06/02/2017 04:15 PM, Oleg Obleukhov wrote:
Hello,
I am playing around with ceph (ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I
built a test setup:
$ ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.014
What you're saying is that if we only have 3 failure domains, then Ceph can do
nothing to maintain 3 copies in case an entire failure domain is lost; that is
correct.
BUT if you're losing 2 replicas out of 3 of your data, and your min_size is set
to 2 (the recommended minimum), then you have an e
Also, your min_size is set to 2. What this means is that you need at least
2 copies of your data up to be able to access it. You do not want to have
min_size of 1. If you had min_size of 1 and only 1 copy of your
data receiving writes, and then that copy goes down as well... What is to
s
I agree that running with min_size of 1 is worse than running with only 3
failure domains. Even if it's just for a short time and you're monitoring
it closely... it takes mere seconds before you could have corrupt data with
min_size of 1 (depending on your use case). That right there is the key.
Wh
Thanks to everyone,
problem is solved by:
ceph osd pool set cephfs_metadata size 2
ceph osd pool set cephfs_data size 2
Best, Oleg.
> On 2 Jun 2017, at 16:15, Oleg Obleukhov wrote:
>
> Hello,
> I am playing around with ceph (ceph version 10.2.7
> (50e863e0f4bc8f4b9e31156de690d765af245185)) on D
That's good for testing at small scale. For production I would revisit
using size 3. Glad you got it working.
On Fri, Jun 2, 2017 at 11:02 AM Oleg Obleukhov wrote:
> Thanks to everyone,
> problem is solved by:
> ceph osd pool set cephfs_metadata size 2
> ceph osd pool set cephfs_data size
But what would be best? If we have 3 servers, how many OSDs?
Thanks!
> On 2 Jun 2017, at 17:09, David Turner wrote:
>
> That's good for testing at small scale. For production I would revisit
> using size 3. Glad you got it working.
>
> On Fri, Jun 2, 2017 at 11:02 AM Oleg Obleukhov
I got a chance to run this by Josh and he had a good thought. Just to
make sure that it's not IO backing up on the device, it probably makes
sense to repeat the test and watch what the queue depth and service
times look like. I like using collectl for it:
"collectl -sD -oT"
The queue depth
I'm seeing this again on two OSDs after adding another 20 disks to my
cluster. Is there some way I can maybe determine which snapshots the
recovery process is looking for? Or maybe find and remove the objects
it's trying to recover, since there's apparently a problem with them?
Thanks!
-Steve
On 0
Coming back to this, with Jason's insight it was quickly revealed that
my problem was in reality a cephx authentication permissions issue.
Specifically, exclusive-lock requires a cephx user with class-write
access to the pool where the image resides. This wasn't clear in the
documentation and the
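For anyone hitting the same thing, caps along these lines should satisfy the
class-write requirement (client name and pool are examples, not my exact setup):

$ ceph auth caps client.libvirt \
    mon 'allow r' \
    osd 'allow class-read object_prefix rbd_children, allow rwx pool=rbd'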
All very true and worth considering, but I feel compelled to mention the
strategy of setting mon_osd_down_out_subtree_limit carefully to prevent
automatic rebalancing.
*If* the loss of a failure domain is temporary, i.e. something you can fix
fairly quickly, it can be preferable to not start tha
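In ceph.conf that is, for example (pick the bucket type to match the failure
domain you want to protect; with 'host' an entire host going down is not
automatically marked out):

[mon]
mon osd down out subtree limit = host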
Hi Graham.
We are on Kraken and have the same problem with "lifecycle". Various (other)
tools like s3cmd or CyberDuck do show the applied "expiration" settings, but
objects seem never to be purged.
If you come across any new findings or hints, PLEASE share / let me know.
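For reference, setting and inspecting the rule with s3cmd goes along these
lines for us (bucket name is an example; the rule shows up, but objects never
seem to expire):

$ s3cmd expire s3://mybucket --expiry-days=1
$ s3cmd getlifecycle s3://mybucket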
Thanks a lot!
Anton
Ges
Have you opened a ceph tracker issue, so that we don't lose track of
the problem?
Thanks,
Yehuda
On Fri, Jun 2, 2017 at 3:05 PM, wrote:
> Hi Graham.
>
> We are on Kraken and have the same problem with "lifecycle". Various (other)
> tools like s3cmd or CyberDuck do show the applied "expiration"
Well, what's "best" really depends on your needs and use-case. The general
advice which has been floated several times now is to have at least N+2
entities of your failure domain in your cluster.
So for example if you run with size=3 then you should have at least 5 OSDs
if your failure domain is OS
David,
2017-06-02 21:41 GMT+08:00 David Turner :
> I'm thinking you have erasure coding in cephfs and only use cache tiering
> because you have to, correct? What is your use case for repeated file
> accesses? How much data is written into cephfs at a time?
>
these days, up to tens of millions of tiny
Hi,
I found two bugs when testing out the Lua object class. I am running Ceph
11.2.0. Can anybody take a look at them?
Zheyuan
Bug 1: I cannot get the returned output in the first script. "data" is always
empty.
import rados, json
cluster = rados.Rados(conffile='')
cluster.connect()
ioctx
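# (Continuation sketch, not my exact script; pool/object/handler names are
# placeholders, and I believe Ioctx.execute is the relevant binding call.)
ioctx = cluster.open_ioctx('mypool')   # placeholder pool name
lua_script = '...'                     # the Lua handler, elided here
cmd = json.dumps({'script': lua_script, 'handler': 'my_handler', 'input': ''})
ret, data = ioctx.execute('test_obj', 'lua', 'eval_json', cmd.encode())
print(ret, data)                       # "data" always comes back empty (bug 1)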