Re: [ceph-users] Ceph bluestore performance on 4kn vs. 512e?

2019-02-26 Thread Martin Verges
Hello Oliver, since a 512e drive has to read the 4k physical block, change the 512 bytes and then write the 4k block back to the disk, it should have a significant performance impact. However, costs are the same, so always choose 4Kn drives. By the way, this might not affect you, as long as you write 4k at
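A quick way to check how a drive presents itself is to compare its logical and physical sector sizes; the device name below is just an example:

# blockdev --getss --getpbsz /dev/sda

A 512e drive reports 512 (logical) and 4096 (physical), while a true 4Kn drive reports 4096 for both.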

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
Hi Igor, On 11/01/2019 20:16, Igor Fedotov wrote: In short - we're planning to support main device expansion for Nautilus+ and to introduce better error handling for the case in Mimic and Luminous. Nautilus PR has been merged, M & L PRs are pending review at the moment: Got it. No problem then

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Igor Fedotov
Hi Hector, just realized that you're trying to expand the main (and exclusive) device, which isn't supported in mimic. Here is the bluestore_tool complaint (pretty confusing and not preventing the partial expansion, though) while expanding: expanding dev 1 from 0x1df2eb0 to 0x3a38120 Ca

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
Sorry for the late reply, Here's what I did this time around. osd.0 and osd.1 should be identical, except osd.0 was recreated (that's the first one that failed) and I'm trying to expand osd.1 from its original size. # ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0 | grep size
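For reference, the expansion being attempted here is roughly the following (paths illustrative; as Igor notes above, on mimic this only works when db/wal live on a separate device):

# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-1 | grep size
# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-1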

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Igor Fedotov
Hector, One more thing to mention - after expansion please run fsck using ceph-bluestore-tool prior to starting the osd daemon, and collect another log using the CEPH_ARGS variable. Thanks, Igor On 12/27/2018 2:41 PM, Igor Fedotov wrote: Hi Hector, I've never tried bluefs-bdev-expand over encrypte
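A sketch of that step, with an illustrative log path and debug levels:

# CEPH_ARGS="--log-file /tmp/osd1-fsck.log --debug-bluestore 20 --debug-bluefs 20" \
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-1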

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Igor Fedotov
Hi Hector, I've never tried bluefs-bdev-expand over encrypted volumes but it works absolutely fine for me in other cases. So it would be nice to troubleshoot this a bit. I suggest doing the following: 1) Back up the first 8K of all OSD.1 devices (block, db and wal) using dd. This will probably al
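A minimal sketch of that backup step, assuming the usual block/block.db/block.wal symlinks exist under the OSD directory:

for dev in block block.db block.wal; do
    dd if=/var/lib/ceph/osd/ceph-1/$dev of=/root/osd1-$dev.first8k bs=4096 count=2
done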

Re: [ceph-users] Ceph Bluestore : Deep Scrubbing vs Checksums

2018-11-25 Thread Ronny Aasen
On 22.11.2018 17:06, Eddy Castillon wrote: Hello dear ceph users: We are running a ceph cluster with Luminous (BlueStore). As you may know, this new ceph version has a new feature called "Checksums". I would like to ask if this feature replaces deep-scrub. In our cluster, we run deep-scrub
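For what it's worth, the checksum algorithm in use and any inconsistencies that scrubbing has already found can be inspected like this (osd id and pg id are illustrative):

# ceph daemon osd.0 config get bluestore_csum_type
# rados list-inconsistent-obj 2.1f --format=json-pretty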

Re: [ceph-users] ceph-bluestore-tool failed

2018-11-01 Thread ST Wong (ITSC)
", "description": "bluefs wal" }, "/var/lib/ceph/osd/ceph-2/block.db": { "osd_uuid": "6d999288-a4a4-4088-b764-bf2379b4492b", "size": 524288000, "btime": "2018-10-18 15:59:06.175997"

Re: [ceph-users] ceph-bluestore-tool failed

2018-10-31 Thread Igor Fedotov
You might want to try the --path option instead of the --dev one. On 10/31/2018 7:29 AM, ST Wong (ITSC) wrote: Hi all, We deployed a testing mimic CEPH cluster using bluestore. We can’t run ceph-bluestore-tool on the OSD; it fails with the following error: --- # ceph-bluestore-tool show-label --dev *device
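In other words, point the tool at the mounted OSD directory rather than the raw device (OSD id illustrative):

# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-2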

Re: [ceph-users] ceph bluestore data cache on osd

2018-07-23 Thread Igor Fedotov
First, I'd suggest inspecting the bluestore performance counters before and after adjusting the cache parameters (and after running the same test suite). Namely: "bluestore_buffer_bytes" "bluestore_buffer_hit_bytes" "bluestore_buffer_miss_bytes" Is the hit ratio (bluestore_buffer_hit_bytes) much diffe
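Those counters can be read from the OSD admin socket, e.g. (osd id illustrative):

# ceph daemon osd.0 perf dump | grep -E 'bluestore_buffer_(bytes|hit_bytes|miss_bytes)'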

Re: [ceph-users] Ceph Bluestore performance question

2018-02-24 Thread Oliver Freyermuth
On 24.02.2018 at 07:00, David Turner wrote: > Your 6.7GB of DB partition for each 4TB osd is on the very small side of > things. It's been discussed a few times in the ML and the general use case > seems to be about 10GB DB per 1TB of osd. That would be about 40GB DB > partition for each of you

Re: [ceph-users] Ceph Bluestore performance question

2018-02-23 Thread David Turner
Your 6.7GB of DB partition for each 4TB osd is on the very small side of things. It's been discussed a few times in the ML and the general use case seems to be about 10GB DB per 1TB of osd. That would be about 40GB DB partition for each of your osds. This general rule covers most things except for
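As a rough sketch of that rule of thumb, provisioning ~40GB of DB per 4TB data disk with ceph-volume could look like this (VG/LV and device names are illustrative):

# lvcreate -L 40G -n db-osd3 vg-nvme
# ceph-volume lvm create --bluestore --data /dev/sdd --block.db vg-nvme/db-osd3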

Re: [ceph-users] Ceph Bluestore performance question

2018-02-22 Thread Oliver Freyermuth
Hi Vadim, many thanks for these benchmark results! This indeed looks extremely similar to what we achieve after enabling connected mode. Our 6 OSD-hosts are Supermicro systems with 2 HDDs (Raid 1) for the OS, and 32 HDDs (4 TB) + 2 SSDs for the OSDs. The 2 SSDs have 16 LVM volumes each (whi

Re: [ceph-users] Ceph Bluestore performance question

2018-02-22 Thread Vadim Bulst
Hi Oliver, I also use Infiniband and Cephfs for HPC purposes. My setup: * 4x Dell R730xd and expansion shelf, 24 OSDs of 8TB each, 128GB RAM, 2x 10-core Intel 4th Gen, Mellanox ConnectX-3, no SSD cache * 7x Dell R630 clients * Ceph cluster running on Ubuntu Xenial and Ceph Jewel deployed with

Re: [ceph-users] Ceph Bluestore performance question

2018-02-20 Thread Oliver Freyermuth
Answering the first RDMA question myself... On 18.02.2018 at 16:45, Oliver Freyermuth wrote: > This leaves me with two questions: > - Is it safe to use RDMA with 12.2.2 already? Reading through this mail > archive, > I gathered it may lead to memory exhaustion and in any case needs some hacks
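For context, the kind of ceph.conf hack being referred to is along these lines; the values are illustrative, RDMA messaging was still experimental in 12.2.x, and the daemons typically also need a raised memlock limit in their systemd units:

[global]
ms_type = async+rdma
ms_async_rdma_device_name = mlx4_0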

Re: [ceph-users] Ceph Bluestore performance question

2018-02-19 Thread Caspar Smit
"I checked and the OSD-hosts peaked at a load average of about 22 (they have 24+24HT cores) in our dd benchmark, but stayed well below that (only about 20 % per OSD daemon) in the rados bench test." Maybe because your dd test uses bs=1M and rados bench is using 4M as default block size? Caspar 20

Re: [ceph-users] Ceph Bluestore performance question

2018-02-18 Thread Oliver Freyermuth
Hi Stijn, > the IPoIB network is not 56gb, it's probably a lot less (20gb or so). > the ib_write_bw test is verbs/rdma based. do you have iperf tests > between hosts, and if so, can you share those results? Wow - indeed, yes, I was completely mistaken about ib_write_bw. Good that I asked! You

Re: [ceph-users] Ceph Bluestore performance question

2018-02-18 Thread Stijn De Weirdt
hi oliver, the IPoIB network is not 56gb, it's probably a lot less (20gb or so). the ib_write_bw test is verbs/rdma based. do you have iperf tests between hosts, and if so, can you share those results? stijn > we are just getting started with our first Ceph cluster (Luminous 12.2.2) and > doing
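For reference, a plain TCP test over the IPoIB interface between two hosts looks like this with iperf3 (hostname illustrative):

# server side
iperf3 -s
# client side
iperf3 -c hostA -P 4 -t 30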

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-08 Thread Marcus Haarmann
", CC: "Wido den Hollander", "ceph-users", "Marcus Haarmann" Sent: Tuesday, 8 August 2017 17:50:44 Subject: Re: [ceph-users] CEPH bluestore space consumption with small objects Marcus, You may want to look at the bluestore_min_alloc_size setting as well

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-08 Thread Pavel Shub
Marcus, You may want to look at the bluestore_min_alloc_size setting as well as the respective bluestore_min_alloc_size_ssd and bluestore_min_alloc_size_hdd. By default bluestore sets a 64k block size for ssds. I'm also using ceph for small objects and I've seen my OSD usage go down from 80% to 20%
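As a sketch, the corresponding ceph.conf fragment would be something like the following; note that these values only apply to OSDs created after the change, so existing OSDs have to be redeployed (the numbers shown are just examples):

[osd]
bluestore_min_alloc_size_ssd = 4096
bluestore_min_alloc_size_hdd = 65536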

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-03 Thread Gregory Farnum
Don't forget that at those sizes the internal journals and rocksdb size tunings are likely to be a significant fixed cost. On Thu, Aug 3, 2017 at 3:13 AM Wido den Hollander wrote: > > > On 2 August 2017 at 17:55, Marcus Haarmann < > marcus.haarm...@midoco.de> wrote: > > > > > > Hi, > > we are

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-03 Thread Wido den Hollander
> On 2 August 2017 at 17:55, Marcus Haarmann > wrote: > > > Hi, > we are doing some tests here with a Kraken setup using the bluestore backend (on > Ubuntu 64 bit). > We are trying to store > 10 million very small objects using RADOS. > (no fs, no rbd, only osd and monitors) > > The setup was

Re: [ceph-users] ceph bluestore RAM over used - luminous

2017-05-14 Thread Benoit GEORGELIN - yulPa
- Original Message - > From: "Benoit GEORGELIN" > To: "ceph-users" > Sent: Saturday, 13 May 2017 19:57:41 > Subject: [ceph-users] ceph bluestore RAM over used - luminous > Hi dear members of the list, > > I'm discovering CEPH and doing some testing. > I came across a strange behavior about the R
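On Luminous, the main knobs bounding BlueStore's cache (and hence most of an OSD's RAM footprint) are the per-device-class cache sizes; the values below are illustrative, and the OSD process needs additional headroom on top of the cache itself:

[osd]
bluestore_cache_size_hdd = 1073741824
bluestore_cache_size_ssd = 3221225472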

Re: [ceph-users] Ceph Bluestore

2017-03-15 Thread Christian Balzer
Hello, On Wed, 15 Mar 2017 09:07:10 +0100 Michał Chybowski wrote: > > Hello, > > > > your subject line has little relevance to your rather broad questions. > > > > On Tue, 14 Mar 2017 23:45:26 +0100 Michał Chybowski wrote: > > > >> Hi, > >> > >> I'm going to set up a small cluster (5 nodes wit

Re: [ceph-users] Ceph Bluestore

2017-03-15 Thread Michał Chybowski
On 15.03.2017 at 09:05, Eneko Lacunza wrote: Hi Michal, On 14/03/17 at 23:45, Michał Chybowski wrote: I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per node) to test if ceph in such small scale is going to perform good enough to put it into production enviro

Re: [ceph-users] Ceph Bluestore

2017-03-15 Thread Michał Chybowski
Hello, your subject line has little relevance to your rather broad questions. On Tue, 14 Mar 2017 23:45:26 +0100 Michał Chybowski wrote: Hi, I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per node) to test if ceph in such small scale is going to perform good enough to

Re: [ceph-users] Ceph Bluestore

2017-03-15 Thread Eneko Lacunza
Hi Michal, On 14/03/17 at 23:45, Michał Chybowski wrote: I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per node) to test if ceph in such small scale is going to perform good enough to put it into production environment (or does it perform well only if there are t

Re: [ceph-users] Ceph Bluestore

2017-03-14 Thread Christian Balzer
Hello, your subject line has little relevance to your rather broad questions. On Tue, 14 Mar 2017 23:45:26 +0100 Michał Chybowski wrote: > Hi, > > I'm going to set up a small cluster (5 nodes with 3 MONs, 2 - 4 HDDs per > node) to test if ceph in such small scale is going to perform good > e