Hi Casper,
Thank you for the response, the problem is solved now. After some searching, it
turned out that, after Luminous, setting mon_osd_backfillfull_ratio
and mon_osd_nearfull_ratio no longer takes effect. This is because
these settings are now read from the OSD map and the commands "
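For the record, presumably the commands in question are the OSD-map ones; a minimal sketch with example ratio values:
```
# Luminous and later: the ratios live in the OSD map, so set them at
# runtime with these commands (0.85/0.90/0.95 are just example values)
ceph osd set-nearfull-ratio 0.85
ceph osd set-backfillfull-ratio 0.90
ceph osd set-full-ratio 0.95

# Verify what is currently stored in the OSD map
ceph osd dump | grep ratio
```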
Hi,
Any idea why 2 servers with one OSD each will provide better performance
than 3 ?
Servers are identical
Performance is impacted irrespective of whether I use an SSD for WAL/DB or not
Basically, I am getting lots of "cur MB/s" readings of zero
The network is separate 10 Gb for public and private
I tested it with iperf
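(Roughly like this - a sketch, the host name is a placeholder:)
```
# On one node, start the iperf3 server
iperf3 -s

# On the other node, run a 30-second test with 4 parallel streams
# (ceph-node2 is a placeholder for the peer's cluster-network address)
iperf3 -c ceph-node2 -P 4 -t 30
```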
Try filestore instead of bluestore ?
- Rado
From: ceph-users On Behalf Of Steven Vacaroaia
Sent: Thursday, April 19, 2018 8:11 AM
To: ceph-users
Subject: [ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?
Hi,
Any idea why 2 servers with one OSD each will provide better performance
If I may guess, because with 3 it reads from 3 and with 2 it reads only
from 2. You should be able to verify this with something like dstat -d
-D sda,sdb,sdc,sdd,sde,sdf,sdg, no?
With replication of 2, objects are still being stored among the 3 nodes.
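For example, running something like this on each OSD node while the bench is going should show whether every disk is actually being hit (the device list is just an example):
```
# Per-disk throughput for the listed devices, refreshed every second
dstat -d -D sda,sdb,sdc,sdd,sde,sdf,sdg 1
```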
This is what I am getting with iperf3 on 10 Gbit:
[ ID] I
Hi Steven,
There is only one bench. Could you show multiple benches of the different
scenarios you discussed? Also provide hardware details.
Hans
On Apr 19, 2018 13:11, "Steven Vacaroaia" wrote:
Hi,
Any idea why 2 servers with one OSD each will provide better performance
than 3 ?
Servers are
Thanks for helping
I thought that with Ceph, the more servers you have, the better the
performance
- that is why I am so confused
Also, I tried to add the 4th server (still no luck - in fact, the rados
bench output I included was from 4 servers, one OSD on each, bluestore,
replication 2)
Here
> I thought that with Ceph, the more servers you have, the better the
> performance
> - that is why I am so confused
You will have better overall performance for your concurrent client
connections, because all
client reads/writes are spread over all disks/servers.
Sure, thanks for your willingness to help.
Identical servers.
Hardware:
DELL R620, 6 cores, 64 GB RAM, 2 x 10 Gb ports,
Enterprise HDD 600 GB (Seagate ST600MM0006), enterprise-grade SSD 340 GB
(Toshiba PX05SMB040Y)
All tests done with the following command
rados bench -p rbd 50 write --no-cleanup &&
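(The full sequence is presumably something like the following - write with --no-cleanup first so the read benches have objects to read back:)
```
# 50-second write bench, keeping the objects for the read tests
rados bench -p rbd 50 write --no-cleanup

# Sequential and random read benches against the objects written above
rados bench -p rbd 50 seq
rados bench -p rbd 50 rand

# Remove the benchmark objects when done
rados -p rbd cleanup
```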
Hi,
Quoting Stefan Kooman (ste...@bit.nl):
> Hi,
>
> TL;DR: we see "used" memory grow indefinitely on our OSD servers,
> until the point that either 1) an OSD process gets killed by the OOM killer,
> or 2) an OSD aborts (probably because malloc cannot provide more RAM). I
> suspect a memory leak of the OS
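Not an answer, but a hedged suggestion: the per-OSD memory breakdown can be inspected at runtime, which may help distinguish bluestore cache growth from a real leak (osd.0 is just an example id):
```
# Memory pool statistics of one OSD via its admin socket
ceph daemon osd.0 dump_mempools

# tcmalloc heap statistics for the same OSD
ceph tell osd.0 heap stats
```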
Hi,
I am building my first Ceph cluster from hardware left over from a previous
project. I have been reading a lot of Ceph documentation but need some help
to make sure I am going the right way.
To set the stage, below is what I have:
Rack-1
1 x HP DL360 G9 with
- 256 GB Memory
- 5 x 300GB HDD
On Thu, Apr 19, 2018 at 11:10 AM, Shantur Rathore
wrote:
> Hi,
>
> I am building my first Ceph cluster from hardware left over from a previous
> project. I have been reading a lot of Ceph documentation but need some help
> to make sure I am going the right way.
> To set the stage, below is what I have:
Hi,
does anyone have experience in changing auth cap in production environments?
I'm trying to add an additional pool with rwx to my client.libvirt (OpenNebula).
ceph auth cap client.libvirt mon 'allow r' mgr 'allow r' osd 'profile rbd,
allow rwx pool=one, allow rwx pool=two'
Does this move ha
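For completeness, this is what I intend to run, plus a check of the result afterwards. As far as I know, "ceph auth cap" replaces the existing caps rather than merging them, so every pool the client needs has to be listed in the one command:
```
# Replace the client's caps (all required pools must be listed here,
# because the previous caps are overwritten, not merged)
ceph auth cap client.libvirt mon 'allow r' mgr 'allow r' \
    osd 'profile rbd, allow rwx pool=one, allow rwx pool=two'

# Verify what the client ended up with
ceph auth get client.libvirt
```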
Hi,
TL;DR there seems to be a problem with quota calculation for rgw in our
Jewel / Ubuntu 16.04 cluster. Our support people suggested we raise it
with upstream directly; before I open a tracker item I'd like to check
I've not missed something obvious :)
Our cluster is running Jewel on Ubunt
I take it that the first bench is with replication size 2, the second bench
is with replication size 3? Same for the 4 node OSD scenario?
Also, please let us know how you set up block.db and WAL - are they on the SSD?
On Thu, Apr 19, 2018, 14:40 Steven Vacaroaia wrote:
> Sure ..thanks for your wil
replication size is always 2
DB/WAL on HDD in this case
I tried OSDs with WAL/DB on SSD - they exhibit the same symptoms
(cur MB/s 0).
In summary, it does not matter
- which server (any 2 will work better than any 3 or 4)
- replication size (I tried with size 2 and 3)
- location of W
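(For reference, the OSDs with WAL/DB on SSD were created roughly like this - a sketch with placeholder device names, assuming ceph-volume as shipped with 12.2.x:)
```
# Bluestore OSD: data on the HDD, DB (which also contains the WAL)
# on an SSD partition -- /dev/sdb and /dev/sdc1 are placeholders
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdc1
```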
On Thu, Apr 19, 2018 at 11:32 AM, Sven Barczyk wrote:
> Hi,
>
>
>
> does anyone have experience in changing auth cap in production
> environments?
>
> I’m trying to add an additional pool with rwx to my client.libvirt
> (OpenNebula).
>
>
>
> ceph auth cap client.libvirt mon ‘allow r’ mgr ‘allow r
I see, the second one is the read bench. Even in the 2 node scenario the
read performance is pretty bad. Have you verified the hardware with micro
benchmarks such as 'fio'? Also try to review storage controller settings.
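For example, a throwaway fio run directly against a device (destructive - the device name is a placeholder):
```
# 4k random-write test against the raw device (DESTROYS data on /dev/sdX)
fio --name=randwrite --filename=/dev/sdX --rw=randwrite --bs=4k \
    --ioengine=libaio --direct=1 --numjobs=4 --iodepth=32 \
    --runtime=60 --time_based --group_reporting
```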
On Apr 19, 2018 5:13 PM, "Steven Vacaroaia" wrote:
replication size is alwa
fio is fine and the megacli settings are as below (the device with WT is the SSD)
Vendor Id : TOSHIBA
Product Id : PX05SMB040Y
Capacity : 372.0 GB
Results
Jobs: 20 (f=20): [W(20)] [100.0% done] [0KB/447.1MB/0KB /s] [0/115K/0 iops]
[eta 00m
The last thing I can come up with is doing a 2 node scenario with at least one
of the nodes being a different one. Maybe you've already done that...
But again, even the read performance in the bench you showed for the 2 node
cluster is pretty bad.
The premise of this thread is that a 2 node cluster does work well,
The rule of thumb is not to have tens of millions of objects in a radosgw
bucket, because reads will be slow. If bucket index sharding is used (with
128 or 256 shards), does that eliminate the concern? Has anyone tried
tens of millions (20-40M) of objects with sharded indexes?
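For what it's worth, if I understand the tooling correctly the shard count for newly created buckets is controlled by rgw_override_bucket_index_max_shards, and existing buckets can be resharded with radosgw-admin (bucket name and shard count below are placeholders):
```
# ceph.conf (rgw section): index shard count applied to new buckets
# rgw_override_bucket_index_max_shards = 128

# Reshard the index of an existing bucket (avoid writes to the bucket
# while this runs; 'mybucket' and 128 are placeholders)
radosgw-admin bucket reshard --bucket=mybucket --num-shards=128
```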
Thank you
Hi All,
Just noticed that on 2 Ceph Luminous 12.2.4 clusters, ceph-mgr spams the syslog
with lots of "mon failed to return metadata for mds" messages every second.
```
2018-04-20 06:06:03.951412 7fca238ff700 1 mgr send_beacon active
2018-04-20 06:06:04.934477 7fca14809700 0 ms_deliver_dispatch: unhandled mes
```
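Not sure of the root cause, but a first thing I would check is whether the mons can return MDS metadata at all, and quiet the mgr log in the meantime (the mgr id below is a placeholder):
```
# Is there MDS metadata for the mons/mgr to return?
ceph mds metadata
ceph fs status

# Lower mgr log verbosity at runtime via its admin socket
# (run on the active mgr host; 'myhost' is a placeholder for the mgr id)
ceph daemon mgr.myhost config set debug_mgr 0/5
```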