Hi Peter, yes just restart the OSD for the setting to take effect.
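For reference, a minimal sketch of the two usual ways to apply an OSD config change; the option name below is only a placeholder, not necessarily the one discussed in this thread:
# persist the option in ceph.conf, then restart the OSD so it is picked up
sudo systemctl restart ceph-osd@12        # systemd-based installs
sudo /etc/init.d/ceph restart osd.12      # sysvinit-based installs
# many (but not all) options can also be injected at runtime without a restart
ceph tell osd.12 injectargs '--osd_max_backfills 1'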
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Peter
Kerdisle
Sent: 10 May 2016 19:06
To: Nick Fisk
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure pool performance expectations
Th
Hi,
We plan to create an image with this format; what do you think about it? Thanks.
rbd create myimage --size 102400 --order 25 --stripe-unit 4K
--stripe-count 32 --image-feature layering --image-feature striping
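For reference, my reading of those striping flags (not advice from this thread), with the stripe unit spelled out in bytes in case your rbd version does not accept the K suffix:
# --order 25        -> object size = 2^25 bytes = 32 MiB
# --stripe-unit 4K  -> 4 KiB is written to one object before moving to the next
# --stripe-count 32 -> writes rotate across 32 objects
# => one full stripe is 4 KiB * 32 = 128 KiB spread over 32 objects
rbd create myimage --size 102400 --order 25 --stripe-unit 4096 \
    --stripe-count 32 --image-feature layering --image-feature striping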
2016-05-10 21:19 GMT+08:00 :
> Hi,
>
>
> Am 2016-05-10 05:48, schrieb Geocast:
>
>>
Hello,
not sure if the Cc: to the users ML was intentional or not, but either way.
The issue seen in the tracker:
http://tracker.ceph.com/issues/15763
and what you have seen (and I have as well) feels a lot like a lack of
parallelism towards the end of rebuilds.
This becomes even more obvious whe
Hello again,
I was looking at the patches sent to the repository and I found one
that makes the OSD check for cluster health before starting up.
Could this patch be the source of all my problems?
Best regards,
On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado <
gaguilar.delg...@gmai
> Op 11 mei 2016 om 8:38 schreef M Ranga Swami Reddy :
>
>
> Hello,
> I wanted to resize an image using the 'rbd' resize option, but it should
> be done without data loss.
> For example: I have an image with 100 GB size (thin provisioned), and this
> image has data of 10GB only. Here I wanted to resize this image t
Thanks, I tried that earlier but so far I am still getting slow requests.
Although I also found I didn't have writeback enabled on my hardware
controller. It seems that after changing that and setting the max bytes
things are a bit more stable, with fewer slow requests popping up. The fact that
the write
Hello,
On Wed, 11 May 2016 15:16:14 +0800 Geocast Networks wrote:
> Hi,
>
> We plan to create an image with this format; what do you think about it?
> Thanks.
>
Not really related to OSD formatting.
> rbd create myimage --size 102400 --order 25 --stripe-unit 4K
> --stripe-count 32 --image-feature
> -Original Message-
> From: Eric Eastman [mailto:eric.east...@keepertech.com]
> Sent: 10 May 2016 18:29
> To: Nick Fisk
> Cc: Ceph Users
> Subject: Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on
> lockfile
>
> On Tue, May 10, 2016 at 6:48 AM, Nick Fisk wrote:
> >
> >
> >
Thank you.
but fstrim can only be used within a mounted partition... But I wanted to do this as the
cloud admin...
I have a few users with a high volume size (i.e. capacity) allotted, but
only 5% of the capacity used. So I wanted to reduce the size to 10% of
that using the rbd resize command. But in this process, if a
cus
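A hedged sketch of shrinking a thin-provisioned image; the pool, image, device and sizes below are placeholders, and whatever sits on top of the image (filesystem, LVM, ...) must be shrunk first, otherwise everything beyond the new size is silently discarded:
# inside the guest: unmount, check and shrink the filesystem first (ext4 example)
umount /mnt/data
e2fsck -f /dev/vdb
resize2fs /dev/vdb 10G
# then shrink the RBD image itself; rbd refuses to shrink without --allow-shrink
rbd resize --size 10240 --allow-shrink mypool/myimage    # --size is in MB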
Hello,
I have a production Ceph cluster running the latest Hammer Release.
We are not planning to upgrade to Jewel soon.
However, I would like to upgrade just the Rados Gateway to Jewel,
because I want to test the new Swift compatibility improvements.
Is it supported to run the system with thi
>>but fstrim can only be used within a mounted partition... But I wanted to do this as the
>>cloud admin...
if you use qemu, you can trigger fstrim through the guest agent:
http://dustymabe.com/2013/06/26/enabling-qemu-guest-agent-and-fstrim-again/
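A minimal sketch of that guest-agent route, assuming a libvirt-managed guest with the qemu-guest-agent package installed inside it, an agent channel defined in the domain XML, and a disk attached with discard support (e.g. virtio-scsi with discard='unmap'); the domain name is a placeholder:
# trim all mounted filesystems in the guest from the hypervisor
virsh domfstrim vm01
# or talk to the agent directly
virsh qemu-agent-command vm01 '{"execute": "guest-fstrim"}'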
- Mail original -
De: "M Ranga Swami Reddy"
À: "Wido den
Hello,
On Wed, 11 May 2016 13:33:44 +0200 (CEST) Alexandre DERUMIER wrote:
> >>but fstrim can only be used within a mounted partition... But I wanted to do this as the
> >>cloud admin...
>
> if you use qemu, you can launch fstrim through guest-agent
>
This of course assumes that qemu/kvm is using a disk method
Hi,
We are using Ceph RBD with a CephFS mounted file system. When we run an Ant copy
task within the Ceph shared directory, the file is copied properly, but after a few
seconds the content becomes empty. Is there any solution for this issue?
Regards
Prabu GJ
Hi,
I’m looking for some help in figuring out why there are 2 PGs in our cluster
in 'active+undersized+degraded' state. They don’t seem to get assigned a third
OSD to place data on. I’m not sure why; everything looks ‘ok’ to me. Our Ceph
cluster consists of 3 nodes and has been upgraded from fir
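Not advice from this thread, but the usual first diagnostics for PGs stuck like that (the PG id is a placeholder):
ceph health detail                # lists the stuck PGs and their acting sets
ceph pg 2.5 query | less          # look at "up", "acting" and "recovery_state"
ceph osd tree                     # check weights of the OSDs CRUSH could pick
ceph pg dump_stuck unclean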
1. First scenario: it is only a 4-node setup, and since it is chassis-level
replication, the single node remaining on the chassis takes all the traffic.
That seems to be the bottleneck, as with host-level replication on a
similar setup the recovery time is much less (data is not in this table).
2. In the s
> -Original Message-
> From: Mark Nelson [mailto:mnel...@redhat.com]
> Sent: 11 May 2016 13:16
> To: Somnath Roy ; Nick Fisk
> ; Ben England ; Kyle Bader
>
> Cc: Sage Weil ; Samuel Just ; ceph-
> us...@lists.ceph.com
> Subject: Re: Weighted Priority Queue testing
>
> > 1. First scenario,
Hi Guys,
we spent some time over the past week looking at hammer vs jewel RBD
performance in HDD only, HDD+NVMe journal, and NVMe cases with the
default filestore backend. We ran into a number of issues during
testing and I don't want to get into everything, but we were eventually
able to ge
Hi Max,
I encountered the same error with my 3-node cluster a few days ago. When I
added a fourth node to the cluster, the PGs came back to a healthy
state. It seems to be a corner case of the CRUSH algorithm which hits only
in a small cluster.
Quoting from another Ceph user: "yes the pg should get remap
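A hedged aside on that corner case: in very small clusters CRUSH can run out of retries before finding enough distinct hosts. Two knobs people commonly try, both of which trigger data movement and require client support for the chosen tunables profile, so check compatibility first:
ceph osd crush tunables optimal        # switch to the newer tunables profile
# or raise the retry budget by hand
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt    # raise (or add) "tunable choose_total_tries 50" to a higher value
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new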
Hello there,
Our setup is with Ceph Hammer (latest release).
We want to publish in our Object Storage some Scientific Datasets.
These are collections of around 100K objects and total size of about
200 TB.
For Object Storage we use the RadosGW with S3 API.
For the initial testing we are using a
Hello everyone,
We experienced a strange scenario last week of unfound objects and
inconsistent reports from ceph tools. We solved it with the help from
Sage, and we wanted to share our experience and to see if it can be of
any use for developers too.
After OSDs segfaulting randomly, our cluster
Awesome work Mark! Comments / questions inline below:
On Wed, May 11, 2016 at 9:21 AM, Mark Nelson wrote:
> There are several commits of interest that have a noticeable effect on 128K
> sequential read performance:
>
>
> 1) https://github.com/ceph/ceph/commit/3a7b5e3
>
> This commit was the firs
> Op 11 mei 2016 om 15:51 schreef Simon Engelsman :
>
>
> Hello everyone,
>
> We experienced a strange scenario last week of unfound objects and
> inconsistent reports from ceph tools. We solved it with the help from
> Sage, and we wanted to share our experience and to see if it can be of
> any
Thank you.
It is indeed a problem with multipart.
I tried two clients (s3cmd and rclone). When you upload a file to
S3 using multipart, you are no longer able to read this object with
the Swift API because the md5 check fails.
Saverio
2016-05-09 12:00 GMT+02:00 Xusangdi :
> Hi,
>
> I'm
On 05/11/2016 08:52 AM, Jason Dillaman wrote:
Awesome work Mark! Comments / questions inline below:
On Wed, May 11, 2016 at 9:21 AM, Mark Nelson wrote:
There are several commits of interest that have a noticeable effect on 128K
sequential read performance:
1) https://github.com/ceph/ceph/
It does not work the other way around either:
if I upload a file with the swift client using the -S option to force
swift to do a multipart upload:
swift upload -S 100 multipart 180.mp4
Then I am not able to read the file with S3
s3cmd get s3://multipart/180.mp4
download: 's3://multipart/180.mp4' -> './18
On Wed, May 11, 2016 at 10:07 AM, Mark Nelson wrote:
> Perhaps 0024677 or 3ad19ae introduced another regression that was being
> masked by c474e4 and when 66e7464 improved the situation, the other
> regression appeared?
0024677 is in Hammer as 7004149 and 3ad19ae is in Hammer as b38da480.
I opene
Can anybody help shed some light on this error I’m getting from radosgw?
2016-05-11 10:09:03.471649 7f1b957fa700 1 -- 172.16.129.49:0/3896104243 -->
172.16.128.128:6814/121075 -- osd_op(client.111957498.0:726 27.4742be4b
97c56252-6103-4ef4-b37a-42739393f0f1.113770300.1_interfaces [create 0~0
[
On 05/11/2016 09:19 AM, Jason Dillaman wrote:
On Wed, May 11, 2016 at 10:07 AM, Mark Nelson wrote:
Perhaps 0024677 or 3ad19ae introduced another regression that was being
masked by c474e4 and when 66e7464 improved the situation, the other
regression appeared?
0024677 is in Hammer as 7004149
On Wed, May 11, 2016 at 2:04 AM, Nick Fisk wrote:
>> -Original Message-
>> From: Eric Eastman [mailto:eric.east...@keepertech.com]
>> Sent: 10 May 2016 18:29
>> To: Nick Fisk
>> Cc: Ceph Users
>> Subject: Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on
>> lockfile
>>
>> On
Hi,
For your information, and for everyone in the same situation as me:
I found that the release notes explain very well the case where the
server is down but the controller doesn't know about it. It can be caused by
the upgrades done in Ceph across several releases. In this case, Firefly,
there are some ins
Hi,
How can I make Ceph still work if the cluster loses all OSD hosts except one in a
disaster, where the capacity of the single host is bigger than the total used data?
I need to minimize downtime while recovering/reinstalling the lost hosts.
More generally, how can I make Ceph choose the highest number of types afte
On Tue, May 10, 2016 at 10:40:08AM +0200, Yoann Moulin wrote:
:RadosGW (S3 and maybe Swift for hadoop/spark) will be the main usage. Most of
:the access will be in read only mode. Write access will only be done by the
:admin to update the datasets.
No one seems to have pointed this out, but if yo
On 05/11/2016 04:00 PM, Wido den Hollander wrote:
>
>> Op 11 mei 2016 om 15:51 schreef Simon Engelsman :
>>
>>
>> Hello everyone,
>>
>> We experienced a strange scenario last week of unfound objects and
>> inconsistent reports from ceph tools. We solved it with the help from
>> Sage, and we want
I bumped up the backfill/recovery settings to match Hammer. It is
unlikely that the long tail latency is a parallelism issue; if it were, the entire recovery
would be suffering, not just the tail. It's probably a prioritization issue.
I will start looking and update my findings.
I can't add devl be
Hi George,
Which version of Ceph is this?
I've never had incomplete PGs stuck like this before. AFAIK it means
that osd.52 would need to be brought up before you can restore those
PGs.
Perhaps you'll need ceph-objectstore-tool to help dump osd.52 and
bring up its data elsewhere. A quick check on t
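A hedged sketch of that ceph-objectstore-tool route (PG id, OSD numbers and paths are placeholders, and both OSDs must be stopped while the tool runs):
# export the PG from the problem OSD's data directory
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-52 \
    --journal-path /var/lib/ceph/osd/ceph-52/journal \
    --pgid 1.2f --op export --file /tmp/pg1.2f.export
# import it into another (stopped) OSD that should host the PG
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-10 \
    --journal-path /var/lib/ceph/osd/ceph-10/journal \
    --op import --file /tmp/pg1.2f.export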
Yes Mark, I have the following io profile going on during recovery.
[recover-test]
ioengine=rbd
clientname=admin
pool=mypool
rbdname=<>
direct=1
invalidate=0
rw=randrw
norandommap
randrepeat=0
rwmixread=40
rwmixwrite=60
iodepth=256
numjobs=6
end_fsync=0
bssplit=512/4:1024/1:1536/1:2048/1:2560/1:30
Hey Dan,
This is on Hammer 0.94.5. osd.52 was always on a problematic machine, and when
this happened it had less data on its local disk than the other OSDs. I've tried
adapting that blog post's solution to this situation to no avail.
I've tried things like looking at all probing OSDs in the query
Hey cephers,
Just a reminder, the Ceph Developer Monthly for May was rescheduled
from last week and will be happening tonight at 9p EST.
http://wiki.ceph.com/Planning
If you have ongoing work in Ceph, please join us for a review and
discussion. Thanks!
--
Best Regards,
Patrick McGarry
Direct
Hi all,
I tried to upgrade from Infernalis to the master branch. I see the following
error:
ssd@OptiPlex-9020-1:~/src/jewel-master$ ceph -s
Traceback (most recent call last):
File "/home/ssd/src/jewel-master/ceph-install/bin/ceph", line 118, in
import rados
ImportError:
/home/ssd/src/jew
While I'm usually not fond of blaming the client application, this is
really a swift command-line tool issue. It tries to be smart by
comparing the md5sum of the object's content with the object's etag,
and that breaks with multipart objects. A multipart object's etag is calculated
differently (md5sum of t
In S3 multipart uploads, the checksum algorithm is different from the one in
Swift.
I'm including the following page for your convenience.
http://stackoverflow.com/questions/12186993/what-is-the-algorithm-to-compute-the-amazon-s3-etag-for-a-file-larger-than-5gb
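For illustration, a small sketch that recomputes an S3-style multipart ETag for a local file, following the algorithm described on that page; it assumes you know the part size the client used for the upload (s3cmd defaults to 15 MB chunks):
FILE="$1"
PART_SIZE=$((15 * 1024 * 1024))    # adjust to the multipart chunk size actually used
TMPDIR=$(mktemp -d)
split -b "$PART_SIZE" "$FILE" "$TMPDIR/part_"
# md5 of the concatenated binary md5 digests of every part...
ETAG=$(for p in "$TMPDIR"/part_*; do md5sum "$p" | awk '{print $1}'; done \
       | tr -d '\n' | xxd -r -p | md5sum | awk '{print $1}')
# ...followed by a dash and the number of parts
NPARTS=$(ls "$TMPDIR"/part_* | wc -l)
echo "${ETAG}-${NPARTS}"
rm -rf "$TMPDIR"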
Original Message
From: Saverio protoziopr...@gmail.co
Hello,
On Wed, 11 May 2016 11:27:28 -0400 Jonathan D. Proulx wrote:
> On Tue, May 10, 2016 at 10:40:08AM +0200, Yoann Moulin wrote:
>
> :RadosGW (S3 and maybe Swift for hadoop/spark) will be the main usage.
> Most of :the access will be in read only mode. Write access will only be
> done by the
On Wed, 11 May 2016 16:10:06 + Somnath Roy wrote:
> I bumped up the backfill/recovery settings to match Hammer. It is
> unlikely that the long tail latency is a parallelism issue; if it were,
> the entire recovery would be suffering, not just the tail. It's probably a
> prioritization issue. Wi
It looks like the ETag computation for Swift multipart doesn't add the
dash character, which makes
s3cmd regard the file as a regular one (otherwise it would just skip the ETag
checking step). You may also take
a look at this:
https://github.com/s3tools/s3cmd/blob/master/S3/S3.py#L1509
Hi,
my ceph df output is as follows:
# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
911T 911T 121G 0.01
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
block7 0 0 303T 0
image
> Op 11 mei 2016 om 15:42 schreef Saverio Proto :
>
>
> Hello there,
>
> Our setup is with Ceph Hammer (latest release).
>
> We want to publish in our Object Storage some Scientific Datasets.
> These are collections of around 100K objects and total size of about
> 200 TB.
>
> For Object Stora
Sadly not. RGW generally requires updates to the OSD-side object class code
for a lot of its functionality and isn't expected to work against older
clusters. :(
On Wednesday, May 11, 2016, Saverio Proto wrote:
> Hello,
>
> I have a production Ceph cluster running the latest Hammer Release.
>
> We
Yes, it's intentional. All ceph CLI operations are idempotent.
On Tuesday, May 10, 2016, Swapnil Jain wrote:
> Hi
>
> I am using Infernalis 9.2.1. While creating a bucket, if the bucket already
> exists, it still returns 0 as the exit status. Is that intentional for some
> reason, or a bug?
>
>
>
> r