Re: [ceph-users] Small-cluster performance issues

fcid Tue, 22 Aug 2017 13:32:10 -0700

Thanks for your advice, Maged and Chris.

I'll answer below.



On 08/22/2017 04:30 PM, Mazzystr wrote:

Also examine your network layout. Any saturation in the private cluster network or client-facing network will be felt in clients / libvirt / virtual machines.
As OSD count increases...

  * Ensure client network / private cluster network separation -
    different NICs, different wires, different switches
  * Add more NICs on both the client side and the private cluster
    network side and aggregate them (LAG).
  * If/When your dept's budget suddenly swells...implement 10 gig-e.

We have different NICs for each network, but they are connected to the same switch, where the two networks are logically separated by VLANs. The switch does not look saturated for now (it is 10 Gbit Ethernet), but sharing one switch may become a problem as the OSD count increases.
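
To put a number on "does not look saturated", here is a minimal Python sketch that estimates per-interface link utilization from two samples of `/proc/net/dev`. It is only an illustration: the function names, the interface name `eth0` and the 10 Gbit/s link speed used in the usage note are assumptions (the thread only states that the switch is 10 Gbit), and it measures the host NIC, not the switch itself.

```python
def parse_net_bytes(net_dev_text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines have no "iface:" prefix
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 9:
            continue  # malformed line
        # Column 0 is received bytes, column 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def link_utilization_percent(before_text, after_text, interval_s, link_gbps):
    """Percent of line rate used by the busier direction of each interface."""
    before = parse_net_bytes(before_text)
    after = parse_net_bytes(after_text)
    # Bytes the link can carry in one direction during the interval.
    capacity_bytes = link_gbps * 1e9 / 8 * interval_s
    result = {}
    for iface, (rx_after, tx_after) in after.items():
        if iface not in before:
            continue
        rx_before, tx_before = before[iface]
        busiest = max(rx_after - rx_before, tx_after - tx_before)
        result[iface] = 100.0 * busiest / capacity_bytes
    return result
```

To use it, read `/proc/net/dev` twice a few seconds apart and pass both texts, the elapsed seconds and the link speed, e.g. `link_utilization_percent(first, second, 5, 10)`. Counter wrap-around and bonded interfaces are not handled.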


Monitor, capacity plan, execute  :)

/Chris C

On Tue, Aug 22, 2017 at 3:02 PM, Maged Mokhtar <mmokh...@petasan.org> wrote:


    It is likely your 2 spinning disks cannot keep up with the load.
    Things are likely to improve if you double your OSDs hooking them
    up to your existing SSD journal. Technically it would be nice to
    run a load/performance tool (either atop/collectl/sysstat) and
    measure how busy your resources are, but it is most likely your 2
    spinning disks will show near 100% busy utilization.

We have a monitoring "stack" made up of collectd/graphite/grafana, and I can see the spinning disks almost saturated when performing IO-heavy tasks on the cluster.
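
For anyone without such a stack, the "busy" figure discussed above can be approximated directly from the kernel counters. The following is a minimal Python sketch, not what collectd or atop actually run: it takes two snapshots of `/proc/diskstats` and reports the share of the interval each device had I/O in flight, which is roughly what tools like `iostat` show as %util. The function names are hypothetical.

```python
def parse_io_ticks(diskstats_text):
    """Map device name -> milliseconds spent doing I/O (io_ticks)."""
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or malformed lines
        # fields: major, minor, name, then the per-device counters;
        # index 12 is "time spent doing I/Os" in milliseconds.
        ticks[fields[2]] = int(fields[12])
    return ticks


def disk_busy_percent(before_text, after_text, interval_s):
    """Percent of the interval each device spent with I/O in flight."""
    before = parse_io_ticks(before_text)
    after = parse_io_ticks(after_text)
    interval_ms = interval_s * 1000.0
    return {
        dev: 100.0 * (after[dev] - before[dev]) / interval_ms
        for dev in after
        if dev in before
    }
```

Read `/proc/diskstats` once, wait for example 5 seconds, read it again and call `disk_busy_percent(first, second, 5)`. Values close to 100 for the spinning disks would match the saturation described here.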


    filestore_max_sync_interval: I do not recommend decreasing this to
    0.1; I would keep it at 5 sec

I'll increase this parameter today, since we have some maintenance work to do.
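
As a small illustration of checking this setting before and after the change, the Python sketch below parses a ceph.conf-style excerpt with the standard `configparser` module and compares `filestore_max_sync_interval` with the 5 seconds suggested above. The `[osd]` section header and the function names are assumptions; the excerpt posted in this thread does not show which section the options live in.

```python
import configparser

# Values copied from the configuration excerpt in this thread. The [osd]
# section header is an assumption made so that configparser can read it.
CEPH_CONF_EXCERPT = """
[osd]
filestore_min_sync_interval = 0.01
filestore_max_sync_interval = 0.1
"""

SUGGESTED_MAX_SYNC_INTERVAL = 5.0  # seconds, the value suggested in the thread


def max_sync_interval(conf_text, section="osd"):
    """Return filestore_max_sync_interval from ceph.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Only the underscore spelling of the option name is looked up here.
    return parser.getfloat(section, "filestore_max_sync_interval")


def needs_increase(conf_text):
    """True if the configured value is below the suggested 5 seconds."""
    return max_sync_interval(conf_text) < SUGGESTED_MAX_SYNC_INTERVAL
```

With the excerpt as posted, `needs_increase(CEPH_CONF_EXCERPT)` is true, since 0.1 is below 5. This only inspects the file text; it does not query or change a running OSD.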


    osd_op_threads: do not increase this unless you have enough cores.

I'll look into this today too.
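
As a rough way to look at this, the Python sketch below computes how many OSD op threads are configured per CPU core on a node. It is only an indicator: what counts as "enough cores" is not defined in this thread, the function name is hypothetical, and the 8-core figure in the usage note is an assumed example because the core count of these nodes was not posted.

```python
import os


def op_threads_per_core(osd_op_threads, osds_per_node, cpu_cores):
    """Configured OSD op threads on a node divided by its CPU cores."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return osd_op_threads * osds_per_node / cpu_cores


def local_cpu_cores():
    """Logical CPU count of the machine running this script (at least 1)."""
    return os.cpu_count() or 1
```

With the posted settings (osd_op_threads = 12, 2 OSDs per node) and an assumed 8 cores, `op_threads_per_core(12, 2, 8)` gives 3 op threads per core. On an OSD node itself, `local_cpu_cores()` can supply the real core count. Other OSD thread pools are not counted.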


    but adding disks is the way to go

    Maged

    On 2017-08-22 20:08, fcid wrote:

    Hello everyone,

    I've been using ceph to provide storage using RBD for 60 KVM
    virtual machines running on proxmox.

    The ceph cluster we have is very small (2 OSDs + 1 mon per node,
    and a total of 3 nodes) and we are having some performance issues,
    like high latency (apply lat: ~0.5 s; commit lat: 0.001 s),
    which gets worse during the weekly deep-scrubs.

    I wonder if doubling the number of OSDs would improve latency,
    or if there is any other configuration tweak recommended
    for such a small cluster. Also, I'm looking forward to reading about
    the experience of other users with a similar configuration.

    Some technical info:

      - Ceph version: 10.2.5

      - OSDs have SSD journal (one SSD disk per 2 OSDs) and have a
    spindle for backend disk.

      - Using CFQ disk queue scheduler

      - OSD configuration excerpt:

    osd_recovery_max_active = 1
    osd_recovery_op_priority = 63
    osd_client_op_priority = 1
    osd_mkfs_options = -f -i size=2048 -n size=64k
    osd_mount_options_xfs = inode64,noatime,logbsize=256k
    osd_journal_size = 20480
    osd_op_threads = 12
    osd_disk_threads = 1
    osd_disk_thread_ioprio_class = idle
    osd_disk_thread_ioprio_priority = 7
    osd_scrub_begin_hour = 3
    osd_scrub_end_hour = 8
    osd_scrub_during_recovery = false
    filestore_merge_threshold = 40
    filestore_split_multiple = 8
    filestore_xattr_use_omap = true
    filestore_queue_max_ops = 2500
    filestore_min_sync_interval = 0.01
    filestore_max_sync_interval = 0.1
    filestore_journal_writeahead = true

    Best regards,



    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
    <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>


--
Fernando Cid O.
Ingeniero de Operaciones
AltaVoz S.A.
 http://www.altavoz.net
Viña del Mar, Valparaiso:
 2 Poniente 355 of 53
 +56 32 276 8060
Santiago:
 San Pío X 2460, oficina 304, Providencia
 +56 2 2585 4264

