Hi Mehmet!
On 08/16/2017 11:12 AM, Mehmet wrote:
:( no suggestions or recommendations on this?
On 14 August 2017 16:50:15 MESZ, Mehmet <c...@elchaka.de> wrote:
Hi friends,
my current hardware setup per OSD node is as follows:
# 3 OSD-Nodes with
- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no
Hyper-Threading
- 64GB RAM
- 12x 4TB HGST 7K4000 SAS2 (6Gb/s) Disks as OSDs
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for
12 Disks (20G Journal size)
- 1x Samsung SSD 840/850 Pro only for the OS
# and 1x OSD Node with
- 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20 Threads)
- 64GB RAM
- 23x 2TB TOSHIBA MK2001TRKB SAS2 (6Gb/s) Disks as OSDs
- 1x SEAGATE ST32000445SS SAS2 (6Gb/s) Disk as OSD
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for
24 Disks (15G Journal size; see the ceph.conf sketch after this list)
- 1x Samsung SSD 850 Pro only for the OS
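For reference, a filestore journal size like the 20G/15G partitions
above is typically controlled by "osd journal size" in ceph.conf before
the OSDs are prepared. A minimal sketch for the 20G case (the value is
in MB; adjust to your own layout):

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
# 20 GB journal partitions carved out of the P3700 (value is in MB)
osd journal size = 20480
EOF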
The single P3700 for 23 spinning disks is pushing it. The P3700 has high
write endurance, but judging by the model number that is the 400GB
version? If you are doing a lot of writes you might wear it out pretty
fast, and it is a single point of failure for the entire node (if it
dies, a lot of data dies with it). Unbalanced setups like this are also
generally trickier to get performing well.
As you can see, I am using one NVMe device (Intel DC P3700 NVMe, 400G),
partitioned, for all of the spinning disks on each OSD node.
When Luminous is available (as the next LTS) I plan to switch from
filestore to bluestore 😊
As far as I have read, bluestore consists of
- "the device" (the main block device)
- "block.db": device that stores RocksDB metadata
- "block.wal": device that stores the RocksDB write-ahead log
Which setup would be useful in my case?
I would set up the disks via "ceph-deploy".
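For what it's worth, a rough sketch of what that could look like with a
Luminous-era ceph-deploy (exact flags depend on your ceph-deploy
version; the device names /dev/sdb, /dev/nvme0n1p1, /dev/nvme0n1p2 and
the hostname "osdnode1" are placeholders, not your actual layout):

ceph-deploy osd create \
    --data /dev/sdb \
    --block-db /dev/nvme0n1p1 \
    --block-wal /dev/nvme0n1p2 \
    osdnode1

repeated once per OSD, each with its own pair of pre-created NVMe
partitions.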
So typically we recommend something like a 1-2GB WAL partition per OSD
on the NVMe drive, and using the remaining space for the DB. If you run
out of DB space, bluestore will start storing KV data on the spinning
disks instead. I suspect this is still the advice you will want to
follow, though at some point having so many WAL and DB partitions on
the NVMe may start to become a bottleneck. Something like 63K sequential
writes to heavily fragmented objects might be worth testing, but in most
cases I suspect DB and WAL on the NVMe will still be faster.
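As a rough sketch of what that carve-up could look like on the 12-OSD
nodes (assuming the 400GB P3700 shows up as /dev/nvme0n1; roughly 2GB
WAL plus 28GB DB per OSD, with sizes and device name illustrative only):

for i in $(seq 1 12); do
    sgdisk --new=0:0:+2G  /dev/nvme0n1   # ~2 GB WAL partition for OSD $i
    sgdisk --new=0:0:+28G /dev/nvme0n1   # ~28 GB DB partition for OSD $i
done

On the 24-OSD node the same 400GB drive only leaves roughly 14-15GB of
DB per OSD (after a 1-2GB WAL each), which is part of why that node is
the tighter fit.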
Thanks in advance for your suggestions!
- Mehmet
------------------------------------------------------------------------
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com