Hi,
You need to monitor latency rather than peak figures. As Ceph writes to
two other nodes when you have 3 replicas, that is 4x extra latency
compared to one round trip from the client to the first OSD. So smaller and more
IOs mean more pain in latency.
And the worst thing is that there is nothin
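For monitoring, two commands expose per-OSD latency; this is only a rough sketch, and it assumes admin keyring access (and, for the second command, a shell on the node hosting the example OSD, osd.0):
# per-OSD commit/apply latency in milliseconds
ceph osd perf
# more detailed counters from a single daemon's admin socket
ceph daemon osd.0 perf dump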
Hello,
On Tue, 24 May 2016 07:03:25 + Josef Johansson wrote:
> Hi,
>
> You need to monitor latency rather than peak figures. As Ceph writes to
> two other nodes when you have 3 replicas, that is 4x extra latency
> compared to one round trip from the client to the first OSD. So smaller and
> m
Hi,
On Mon, May 23, 2016 at 8:24 PM, Anthony D'Atri wrote:
>
>
> Re:
>
>> 2. Inefficient chown documentation - The documentation states that one
>> should "chown -R ceph:ceph /var/lib/ceph" if one is looking to have ceph-osd
>> run as user ceph and not as root. Now, this command would run a ch
It may be possible to do it with civetweb, but I use apache because of
HTTPS config.
On Tue, May 24, 2016 at 5:49 AM, fridifree wrote:
> What does Apache give that civetweb doesn't?
> Thank you
>
> On May 23, 2016 11:49 AM, "Anand Bhat" wrote:
>>
>> For performance, civetweb is better as fastcgi module
Hi all,
I'm mid-upgrade on a large cluster now. The upgrade is not going smoothly
-- it looks like the ceph-mons are getting bombarded by so many of these
crc error warnings that they go into elections.
Did anyone upgrade a large cluster from 0.94.6 to 0.94.7? If not, I'd
advise waiting until th
Right now I am using Apache. There isn't any comparison (performance and
features) between them on the internet.
If someone can share their experience, it would be great.
2016-05-24 10:51 GMT+03:00 Luis Periquito :
> It may be possible to do it with civetweb, but I use apache because of
> HTT
Hello List
To confirm what Christian has said: we have been playing with a 3-node
cluster with 4 SSDs (3610) per node. Putting the journals on the OSD SSDs, we
were getting 770 MB/s sustained with large sequential writes, and 35
MB/s and about 9200 IOPS with small random writes. Putting an NVMe as
journa
On Tue, May 24, 2016 at 08:51:13AM +0100, Luis Periquito wrote:
> It may be possible to do it with civetweb, but I use apache because of
> HTTPS config.
Civetweb should be able to handle ssl just fine:
rgw_frontends = civetweb port=7480s ssl_certificate=/path/to/some_cert.pem
--
Regards,
Karol
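For reference, the same frontend line in a full ceph.conf section might look like the sketch below; the RGW instance name and certificate path are only examples, and civetweb expects the private key and certificate concatenated into one PEM file:
[client.rgw.gateway1]
rgw_frontends = civetweb port=443s ssl_certificate=/etc/ceph/private/rgw-key-and-cert.pem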
Hi,
Im looking for a guide/formula/ways to measure the total IOPS of a given
Ceph cluster.
/vlad
I'm using mod_rewrite and mod_expires.
I don't know if it can be done via civetweb; however, my installation is older.
On 24/05/2016 10:49, Karol Mroz wrote:
> On Tue, May 24, 2016 at 08:51:13AM +0100, Luis Periquito wrote:
>> It may be possible to do it with civetweb, but I use apache because of
>> HTTPS
I’m using this guide for FIO rbd
https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
Hope it is helpful
Adir
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Vlad
Blando
Sent: Tuesday, May 24, 2016 11:57 AM
To: ceph-users
Subject: [cep
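A minimal fio invocation against an rbd image, roughly along the lines of that guide, is sketched below; the pool, image, and client names are only examples and need to exist on your cluster:
# 4k random-write test through librbd for 60 seconds
fio --name=rbd-randwrite --ioengine=rbd --clientname=admin \
    --pool=rbd --rbdname=test-image \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
    --runtime=60 --time_based --group_reporting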
Hi again,
We just finished the upgrade (5 mons, 1200 OSDs). As I mentioned, we
had loads of monitor elections and slow requests during the upgrades.
perf top showed the leader spending lots of time in LogMonitor::preprocess_log:
43.79% ceph-mon [.] LogMonitor::preprocess_log
To m
Hi,
> On 24 May 2016, at 09:16, Christian Balzer wrote:
>
>
> Hello,
>
> On Tue, 24 May 2016 07:03:25 + Josef Johansson wrote:
>
>> Hi,
>>
>> You need to monitor latency rather than peak figures. As Ceph writes to
>> two other nodes when you have 3 replicas, that is 4x extra latency
>
Hi,
I’m diagnosing a problem where monitors fall out of quorum now and then. It
seems that when two monitors hold a new election, one answer is not received
until 5 minutes later. I checked ntpd on the servers, and all of them are spot
on, no sync problems. This is happening a couple of times ever
This would be useful. Thanks.
Regards,
Vladimir FS Blando
Cloud Operations Manager
www.morphlabs.com
On Tue, May 24, 2016 at 5:35 PM, Adir Lev wrote:
> I’m using this guide for FIO rbd
>
>
> https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
>
>
>
> Hope it
If you are not tweaking ceph.conf settings when using NVRAM as the journal, I
would highly recommend trying the following.
1. Since you have a very small journal, try to reduce
filestore_max_sync_interval/min_sync_interval significantly (a sketch follows
below).
2. If you are using Jewel, there are a bunch of filestore thro
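As a sketch only, in ceph.conf; the values are illustrative, not recommendations for any particular hardware:
[osd]
# sync more often so the small journal never has to absorb much dirty data
filestore_min_sync_interval = 0.001
filestore_max_sync_interval = 1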
Hi Anthony,
>
>> 2. Inefficient chown documentation - The documentation states that one should
>> "chown -R ceph:ceph /var/lib/ceph" if one is looking to have ceph-osd ran as
>> user ceph and not as root. Now, this command would run a chown process one
>> osd
>> at a time. I am considering my c
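One hedged workaround is to chown each OSD directory as its own process rather than one recursive pass over all of /var/lib/ceph; the parallelism of 8 below is arbitrary, and the second line should be adjusted to whatever other directories exist on the node:
# chown the OSD data directories in parallel, then the remaining ceph state
ls /var/lib/ceph/osd | xargs -P 8 -I{} chown -R ceph:ceph /var/lib/ceph/osd/{}
chown -R ceph:ceph /var/lib/ceph/mon /var/lib/ceph/bootstrap-* 2>/dev/null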
Hi All,
Has anyone tested 0.94.7 on Debian Jessie? I've heard that the most
recent Jewel releases for Jessie were missing pieces (systemd files) so I am a
little more hesitant than usual.
Thanks!
Chad.
Hammer doesn't use systemd unit files, so it's working fine.
(Jewel/Infernalis are still missing the systemd .target files.)
- Original Message -
From: "Chad William Seys"
To: "ceph-users"
Sent: Tuesday, 24 May 2016 17:11:27
Subject: [ceph-users] is 0.94.7 packaged well for Debian Jessie
Hi All,
Has a
Thanks!
Hammer doesn't use systemd unit files, so it's working fine.
(Jewel/Infernalis are still missing the systemd .target files.)
Having some issues with blocked ops on a small cluster. Running
0.94.5 with cache tiering. 3 cache nodes with 8 SSDs each and 3
spinning nodes with 12 spinning disks and journals. All the pools are
3x replicated.
Started experiencing problems with OSDs in the cold tier consuming the
entirety of th
Hello Team,
I set up a Ceph cluster, version 0.94.5, for testing and am trying to connect to
the RADOS gateway with the s3cmd tool. I'm getting error 400 when trying to list
buckets:
DEBUG: Command: ls
DEBUG: CreateRequest: resource[uri]=/
DEBUG: Using signature v2
DEBUG: SignHeaders: 'GET\n\n\n\nx-amz-date
Hi Sean,
Thank you for the hint. I just tried with the config file below:
# s3cmd --configure
New settings:
Access Key: XXX49NCDM95R0EHV
Secret Key: LGXc0G4oGfX6rQ35a3dPgI4Ov
Default Region: US
Encryption password: password
Path to GPG program: /usr/bin/gpg
Use HTTPS protoc
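The ~/.s3cfg fields that usually matter for RGW are the endpoint ones; a hedged example follows, where the gateway hostname and port are placeholders, and it is worth checking that rgw_dns_name on the gateway matches the host used in host_bucket:
access_key = XXX49NCDM95R0EHV
secret_key = LGXc0G4oGfX6rQ35a3dPgI4Ov
host_base = rgw.example.com:7480
host_bucket = %(bucket)s.rgw.example.com:7480
use_https = False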
Hello!
On Mon, May 23, 2016 at 02:34:37PM +, Somnath.Roy wrote:
> You need to build ceph code base to use jemalloc for OSDs..LD_PRELOAD won't
> work..
Is it true for Xenial too or only for Trusty? I don't want to rebuild Jewel on
xenial hosts...
--
WBR, Max A. Krasilnikov
>>Is it true for Xenial too or only for Trusty? I don't want to rebuild Jewel on
>>xenial hosts...
Yes, for Xenial (and Debian wheezy/jessie too).
I don't know why the LD_PRELOAD lines are commented out in /etc/default/ceph,
because it really does nothing if tcmalloc is present.
You could try to
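You can check which allocator a running OSD has actually mapped, e.g.:
# look for tcmalloc/jemalloc in the memory map of the first ceph-osd process
grep -E 'tcmalloc|jemalloc' /proc/$(pidof ceph-osd | awk '{print $1}')/maps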
Hi Sylvain,
this is probably related to the fact that the systemd unit file for the RGW is
configured to run as user ceph. As ceph is not a privileged user, it cannot
bind to lower port numbers.
Modify the ceph-radosgw unit file and make sure the user is set to root.
To verify this is th
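A sketch of doing that with a systemd drop-in instead of editing the packaged unit directly; the instance name "rgw.gateway1" is an example, and you should copy the ExecStart line from your installed ceph-radosgw@.service, changing only the --setuser/--setgroup values:
mkdir -p /etc/systemd/system/ceph-radosgw@.service.d
cat > /etc/systemd/system/ceph-radosgw@.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name client.%i --setuser root --setgroup root
EOF
systemctl daemon-reload
systemctl restart ceph-radosgw@rgw.gateway1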
Hello list:
I've been building my own Ceph Debian Jessie packages (and QEMU too) to get
jemalloc support, and in Infernalis I was explicitly setting a dependency in
the control file which seemed to work. However, that option is gone in Jewel,
replaced with this /etc/default/ceph preload. Withou
Hello!
I have a cluster with 5 SSD drives as OSDs, backed by SSD journals, one per OSD.
One OSD per node.
Data drives are Samsung 850 EVO 1TB, journals are Samsung 850 EVO 250GB; the
journal partition is 24GB, the data partition is 790GB. OSD nodes are connected
by 2x10Gbps Linux bonding for the data/cluster network.
Hi,
I had the 250 GB Samsung PRO. They suck for journals because they are
super slow at the dsync writes Ceph requires.
Have a look at
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
for more information.
I advise you also to drop th
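The short version of that kind of dsync test is sketched below; it is destructive to whatever it writes to, and /dev/sdX is a placeholder for a journal device you can afford to overwrite:
dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync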
Hi everyone,
I am doing a test lab in order to understand how Ceph (version 10.2.1)
works with LUKS, specifically how the OSD dm-crypt key management is done.
I have read [1] and I've found the same scheme so far. However, I have a
problem opening the LUKS partition manually. Of course, before to
Greetings,
I have a small Ceph 10.2.1 test cluster using a 3-replica pool based on 24
SSDs configured with bluestore. I created and wrote an rbd image called
"image1", then deleted the image again.
rbd -p ssd_replica create --size 100G image1
rbd --pool ssd_replica bench-write --io-size 2M
Any chance you are using cache tiering? It's odd that you can see the
objects through "rados ls" but cannot delete them with "rados rm".
On Tue, May 24, 2016 at 4:34 PM, Kevan Rehm wrote:
> Greetings,
>
> I have a small Ceph 10.2.1 test cluster using a 3-replicate pool based on 24
> SSDs configu
Nope, not using tiering.
Also, this is my second attempt; this is repeatable for me. I'm trying to
duplicate a previous occurrence of this same problem to collect useful
debug data. In the previous case, I was eventually able to get rid of the
objects (but have forgotten how), but that was follow
My money is on bluestore. If you can try to reproduce on filestore,
that would rapidly narrow it down.
-Sam
On Tue, May 24, 2016 at 1:53 PM, Kevan Rehm wrote:
> Nope, not using tiering.
>
> Also, this is my second attempt, this is repeatable for me, I'm trying to
> duplicate a previous occurrenc
Okay, will do. If the problem goes away with filestore, I'll switch back
to bluestore again and re-duplicate the problem. In that case, are there
particular things you would like me to collect? Or clues I should look
for in logs?
Thanks, Kevan
On 5/24/16, 4:06 PM, "Samuel Just" wrote:
>My
Having some problems with my cluster. Wondering if I could get some
troubleshooting tips:
Running hammer 0.94.5. Small cluster with cache tiering. 3 spinning
nodes and 3 SSD nodes.
Lots of blocked ops. OSDs are consuming the entirety of the system
memory (128GB) and then falling over. Lots o
On Wed, May 18, 2016 at 6:04 PM, Goncalo Borges
wrote:
> Dear All...
>
> Our infrastructure is the following:
>
> - We use CEPH/CEPHFS (9.2.0)
> - We have 3 mons and 8 storage servers supporting 8 OSDs each.
> - We use SSDs for journals (2 SSDs per storage server, each serving 4 OSDs).
> - We have
On Mon, May 23, 2016 at 12:52 AM, Yan, Zheng wrote:
> To enable quota, you need to pass "--client-quota" option to ceph-fuse
Yeah, this is a bit tricky since the kernel just doesn't respect quota
at all. Perhaps once the kernel does support them we should make this
the default. Or do something li
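For reference, a sketch of using it today; the mount point, directory, and size are examples, and some versions may want the option spelled --client_quota=true:
# mount with client-side quota enforcement, then cap a directory at 100 GiB
ceph-fuse --client-quota /mnt/cephfs
setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/some_dir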
On one of my test clusters that I've upgraded from Infernalis to Jewel
(10.2.1), I'm having a problem where reads are resulting in unfound
objects.
I'm using cephfs on top of an erasure-coded pool with cache tiering, which I
believe is related.
From what I can piece together, here is what the
On Tue, May 24, 2016 at 2:16 PM, Heath Albritton wrote:
> Having some problems with my cluster. Wondering if I could get some
> troubleshooting tips:
>
> Running hammer 0.94.5. Small cluster with cache tiering. 3 spinning
> nodes and 3 SSD nodes.
>
> Lots of blocked ops. OSDs are consuming the
Hello,
On Tue, 24 May 2016 15:32:02 -0700 Gregory Farnum wrote:
> On Tue, May 24, 2016 at 2:16 PM, Heath Albritton
> wrote:
> > Having some problems with my cluster. Wondering if I could get some
> > troubleshooting tips:
> >
> > Running hammer 0.94.5. Small cluster with cache tierin
>>And if we still need to add explicit support, does anyone have any advice for
>>how to achieve this?
Here is a small patch to build the deb with jemalloc:
diff --git a/debian/control b/debian/control
index 3f0f5c2..bb60dba 100644
--- a/debian/control
+++ b/debian/control
@@ -37,7 +37,6 @@ Build-Depe
>>Given the messages in this thread, it seems that the jemalloc library isn't
>>actually being used? But if so, why would it be loaded (and why would
>>tcmalloc *also* be loaded)?
I think this is because rocksdb is statically linked and uses tcmalloc,
and from the jemalloc doc:
https://github.com/jema
I'm sorry, that was not the right map; this map is right:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
de
Hello,
On Tue, 24 May 2016 21:20:49 +0300 Max A. Krasilnikov wrote:
> Hello!
>
> I have cluster with 5 SSD drives as OSD backed by SSD journals, one per
> osd. One osd per node.
>
More details will help identify other potential bottlenecks, such as:
CPU/RAM
Kernel, OS version.
> Data drives i
Dear All
According to commit c3260b276882beedb52f7c77c622a9b77537a63f, bucket info
resides in an instance-specific object. The struct "RGWBucketEntryPoint"
contains a pointer to the bucket instance info. I want to know why
the bucket name and instance were separated at that time.
Thanks for the reply!
sunfc
Thank you Greg...
There is one further thing which is not explained in the release notes
and that may be worthwhile to mention.
The RPM structure (for Red Hat compatible releases) changed in Jewel:
there are now (ceph + ceph-common + ceph-base + ceph-mon/osd/mds
+ others) packages, while i
/usr/bin/ceph is a Python script, so it's not segfaulting, but some binary it
launches is, and there doesn't appear to be much information about it in the
log you uploaded.
Are you able to capture a core file and generate a stack trace from gdb?
The following may help to get some data.
$ ulimit
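A general sketch of that workflow, not necessarily what was elided above; the binary and core paths are placeholders:
ulimit -c unlimited
# reproduce the crash, then pull a backtrace from the resulting core
gdb -batch -ex 'thread apply all bt' /path/to/crashing/binary /path/to/core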
Hello,
Thanks for the update, and I totally agree that it should try to do 2x
replication on the single storage node.
I'll try to reproduce what you're seeing tomorrow on my test cluster; I need
to move some data around first.
Christian
On Wed, 25 May 2016 08:58:54 +0700 Никитенко Виталий wrote:
Not going to attempt threading and apologies for the two messages on
the same topic. Christian is right, though. 3 nodes per tier, 8 SSDs
per node in the cache tier, 12 spinning disks in the cold tier. 10GE
client network with a separate 10GE back side network. Each node in
the cold tier has tw
Hello Ceph Users,
We have a Ceph test cluster that we want to bring into production, and it will
grow rapidly in the future.
Ceph version:
ceph 0.80.7-2+deb8u1 amd64
distributed storage and file system
ceph-common 0.80.7-2+deb