[ceph-users] RocksDB device selection (performance requirements)

2019-11-04 Thread Huseyin Cotuk
Hi all,

The only recommendation I can find in the documentation about DB device 
selection concerns capacity (4% of the data disk). Are there any suggestions 
about technical specs such as throughput, IOPS, and the number of data disks 
per DB device?

When designing an infrastructure with filestore, we sized the journal device 
to meet the requirements of all the disks behind it. In bluestore, however, 
data is written directly to the data device, while metadata goes to RocksDB, 
which lives on the db device via BlueFS.
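
(For illustration, a minimal sketch of how a separate DB device is attached 
when a bluestore OSD is created; device names are placeholders. Per the 4% 
guideline, a 2 TB data disk would call for a DB partition of roughly 80 GB:

    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
)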

I know that it depends on the workload, but are there any best practices or 
recommendations for selecting a db device? 

IMO, reusing the NVMe disks that we used for filestore journals as db devices 
is not meaningful: NVMe disks have minimal latency and extraordinary 
throughput and IOPS, and I am not sure the DB device needs that kind of 
performance. So I would rather use those NVMe disks for an all-flash pool and 
choose other disks for the db device. 

Any suggestion or recommendation would be appreciated. 

Best regards,
Huseyin Cotuk
hco...@gmail.com




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Balancer configuration fails with Error EINVAL: unrecognized config option 'mgr/balancer/max_misplaced'

2019-11-04 Thread Thomas Schneider
Hi,

I want to adjust balancer throttling, but the command I executed returns an
error:
root@ld3955:~# ceph config set mgr mgr/balancer/max_misplaced .01
Error EINVAL: unrecognized config option 'mgr/balancer/max_misplaced'
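
(As an aside: on Nautilus this module option appears to have been replaced by
a global mgr setting; a hedged sketch of the likely equivalent, assuming
14.x, where the default seems to be 5%:

    ceph config set mgr target_max_misplaced_ratio 0.01
)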

root@ld3955:~# ceph balancer status
{
    "active": true,
    "plans": [],
    "mode": "upmap"
}

Ceph is otherwise healthy (apart from a mon disk space warning):
root@ld3955:~# ceph -s
  cluster:
    id: 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
    health: HEALTH_WARN
    mon ld5505 is low on available space

  services:
    mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 19h)
    mgr: ld5507(active, since 23h), standbys: ld5506, ld5505, ld5508
    mds: cephfs:1 {0=ld4464=up:active} 1 up:standby
    osd: 442 osds: 442 up, 442 in

  data:
    pools:   6 pools, 8312 pgs
    objects: 62.07M objects, 237 TiB
    usage:   710 TiB used, 821 TiB / 1.5 PiB avail
    pgs: 8312 active+clean

  io:
    client:   1.3 KiB/s rd, 913 KiB/s wr, 1 op/s rd, 34 op/s wr


Can you please advise how to adjust balancer throttling?

THX
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mgr daemons becoming unresponsive

2019-11-04 Thread Janek Bevendorff

On 02.11.19 18:35, Oliver Freyermuth wrote:

Dear Janek,

in my case, the mgr daemon itself remains "running"; it just stops reporting to 
the mon. It even still serves the dashboard, but with outdated information.


This is not so different. The MGRs in my case are running, but stop 
responding. So I can neither get Prometheus metrics from them, nor do 
they report back to the MONs properly, which sometimes results in the MONs 
regarding them as not running.



I grepped through the logs and could not find any clock skew messages. So it 
seems to be a different issue
(albeit both issues seem to be triggered by the devicehealth module).

Cheers,
Oliver

On 2019-11-02 18:28, Janek Bevendorff wrote:

These issues sound a bit like a bug I reported a few days ago: 
https://tracker.ceph.com/issues/39264 


Related: https://tracker.ceph.com/issues/39264 


On 02/11/2019 17:34, Oliver Freyermuth wrote:

Dear Reed,

yes, the balancer is also on for me - but the instabilities vanished as soon as 
I turned off device health metrics.

Cheers,
Oliver

Am 02.11.19 um 17:31 schrieb Reed Dier:

Do you also have the balancer module on?

I experienced extremely bad stability issues where the MGRs would silently die 
with the balancer module on.
And by on, I mean `active: true` by way of `ceph balancer on`.

Once I disabled the automatic balancer, it seemed to become much more stable.

I can still manually run the balancer without issues (except for one pool), but 
the balancer is what appeared to be my big driver of instability.

Reed


On Nov 2, 2019, at 11:24 AM, Oliver Freyermuth  
wrote:

Hi Thomas,

indeed, I also had the dashboard open at these times - but right now, after 
disabling device health metrics,
I cannot retrigger it even when playing wildly on the dashboard.

So I'll now re-enable health metrics and try to retrigger the issue with 
cranked-up debug levels, as Sage suggested.
Maybe in your case, if you can stand mgr failures, this would also be an 
interesting way to get the dashboard issue debugged?
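
(A hedged sketch of what such cranked-up mgr debug levels could look like, to
be reverted once the issue has been captured:

    ceph config set mgr debug_mgr 20
    ceph config set mgr debug_ms 1
    # reproduce the hang, collect /var/log/ceph/ceph-mgr.*.log, then revert:
    ceph config rm mgr debug_mgr
    ceph config rm mgr debug_ms
)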

Cheers,
Oliver

Am 02.11.19 um 08:23 schrieb Thomas:

Hi Oliver,

I experienced a situation where the MGR "goes crazy", meaning the MGR was 
active but not working.
In the logs of the standby MGR nodes I found an error (after restarting the 
service) that pointed to the Ceph Dashboard.

Since disabling the dashboard my MGRs are stable again.

Regards
Thomas

Am 02.11.2019 um 02:48 schrieb Oliver Freyermuth:

Dear Cephers,

interestingly, after:
   ceph device monitoring off
the mgrs seem to be stable now. The active one still went silent a few minutes 
later, but the standby took over and was stable, and after restarting the 
broken one, it has now been stable for an hour, too. So probably a restart of 
the mgr is needed after disabling device monitoring to get things stable again.
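
(A minimal sketch of that sequence, assuming systemd-managed daemons and
placeholder host names:

    ceph device monitoring off
    systemctl restart ceph-mgr@mon001.service   # on each mgr host
    ceph -s                                     # verify an active mgr is reported again
)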

So it seems to be caused by a problem with the device health metrics. In case 
this is a red herring and the mgrs become unstable again in the next days,
I'll let you know.

Cheers,
 Oliver

Am 01.11.19 um 23:09 schrieb Oliver Freyermuth:

Dear Cephers,

this is a 14.2.4 cluster with device health metrics enabled - for about a day now, all mgr daemons 
have been going "silent" on me after a few hours, i.e. "ceph -s" shows:

cluster:
  id: 269cf2b2-7e7c-4ceb-bd1b-a33d915ceee9
  health: HEALTH_WARN
  no active mgr
  1/3 mons down, quorum mon001,mon002
  services:
  mon: 3 daemons, quorum mon001,mon002 (age 57m), out of quorum: mon003
  mgr: no daemons active (since 56m)
  ...
(the third mon has a planned outage and will come back in a few days)

Checking the logs of the mgr daemons, I find some "reset" messages at the time when it 
goes "silent", first for the first mgr:

2019-11-01 21:34:40.286 7f2df6a6b700  0 log_channel(cluster) log [DBG] : pgmap 
v1798: 1585 pgs: 1585 active+clean; 1.1 TiB data, 2.3 TiB used, 136 TiB / 138 
TiB avail
2019-11-01 21:34:41.458 7f2e0d59b700  0 client.0 ms_handle_reset on 
v2:10.160.16.1:6800/401248
2019-11-01 21:34:42.287 7f2df6a6b700  0 log_channel(cluster) log [DBG] : pgmap 
v1799: 1585 pgs: 1585 active+clean; 1.1 TiB data, 2.3 TiB used, 136 TiB / 138 
TiB avail

and a bit later, on the standby mgr:

2019-11-01 22:18:14.892 7f7bcc8ae700  0 log_channel(cluster) log [DBG] : pgmap 
v1798: 1585 pgs: 166 active+clean+snaptrim, 858 active+clean+snaptrim_wait, 561 
active+clean; 1.1 TiB data, 2.3 TiB used, 136 TiB / 138 TiB avail
2019-11-01 22:18:16.022 7f7be9e72700  0 client.0 ms_handle_reset on 
v2:10.160.16.2:6800/352196
2019-11-01 22:18:16.893 7f7bcc8ae700  0 log_channel(cluster) log [DBG] : pgmap 
v1799: 1585 pgs: 166 active+clean+snaptrim, 858 active+clean+snaptrim_wait, 561 
active+clean; 1.1 TiB data, 2.3 TiB used, 136 TiB / 138 TiB avail

Interestingly, the dashboard still works, but presents outdated information, 
for example showing zero I/O going on.
I believe this st

[ceph-users] Run optimizer to create a new plan on specific pool fails

2019-11-04 Thread Thomas Schneider
Hi,

I want to create an optimizer plan for each pool.
My cluster has multiple crush roots and multiple pools, each
representing a specific drive type (HDD, SSD, NVMe).

Some pools are balanced, some are not.

Therefore I want to run the optimizer to create a new plan for a specific pool.
However, this fails for every pool with this error message:
root@ld3955:~# ceph balancer optimize hdd-plan hdd
Error EALREADY: Unable to find further optimization, or pool(s)' pg_num
is decreasing, or distribution is already perfect
root@ld3955:~# ceph balancer optimize ssd-plan ssd
Error EALREADY: Unable to find further optimization, or pool(s)' pg_num
is decreasing, or distribution is already perfect
root@ld3955:~# ceph balancer optimize hdb_backup-plan hdb_backup
Error EALREADY: Unable to find further optimization, or pool(s)' pg_num
is decreasing, or distribution is already perfect
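
(One way to check whether the balancer really considers a pool already
balanced is its distribution score; a hedged sketch, using the pool names
above:

    ceph balancer eval hdd    # score for one pool
    ceph balancer eval        # score for the whole cluster

A score already close to the optimum would make the EALREADY answer expected;
pools whose pg_num is currently being changed are also skipped.)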

root@ld3955:~# ceph osd pool ls
hdb_backup
hdd
ssd
nvme
cephfs_data
cephfs_metadata

What is causing this error?

THX
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Device Health Metrics on EL 7

2019-11-04 Thread Benjeman Meekhof
Hi Oliver,

The ceph-osd RPM packages include a config in
/etc/sudoers.d/ceph-osd-smartctl that looks something like this:
ceph ALL=NOPASSWD: /usr/sbin/smartctl -a --json /dev/*
ceph ALL=NOPASSWD: /usr/sbin/nvme * smart-log-add --json /dev/*
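
(Once those sudo rules are in place, whether the hosts actually deliver
metrics can be sanity-checked from any admin node; a hedged example with a
placeholder device id:

    ceph device ls
    ceph device get-health-metrics <devid>
)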

If you are using SELinux, you will have to adjust capabilities there as
well. I think we did something similar to what is attached to
this tracker issue:
https://tracker.ceph.com/issues/40683

That seemed to get us as far as hosts being able to report disk health
to the module.

thanks,
Ben



On Sat, Nov 2, 2019 at 11:38 PM Oliver Freyermuth
 wrote:
>
> Dear Cephers,
>
> I went through some of the OSD logs of our 14.2.4 nodes and found this:
> --
> Nov 01 01:22:25  sudo[1087697]: ceph : TTY=unknown ; PWD=/ ; USER=root ; 
> COMMAND=/sbin/smartctl -a --json /dev/sds
> Nov 01 01:22:51  sudo[1087729]: pam_unix(sudo:auth): conversation failed
> Nov 01 01:22:51  sudo[1087729]: pam_unix(sudo:auth): auth could not identify 
> password for [ceph]
> Nov 01 01:22:51  sudo[1087729]: pam_succeed_if(sudo:auth): requirement "uid 
> >= 1000" not met by user "ceph"
> Nov 01 01:22:53  sudo[1087729]: ceph : command not allowed ; TTY=unknown 
> ; PWD=/ ; USER=root ; COMMAND=nvme lvm smart-log-add --json /dev/sds
> --
> It seems that with device health metrics enabled, the OSDs try to run 
> smartctl via "sudo", which expectedly fails, since the ceph user (as a 
> system user) has a UID smaller than 1000.
> Also, it's of course not in /etc/sudoers.
>
> Does somebody have a working setup with device health metrics which could be 
> shared (and documented, or made part of future packaging ;-) ) ?
>
> Cheers,
> Oliver
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph + Rook Day San Diego - November 18

2019-11-04 Thread Mike Perez
Hi Cephers,

I'm happy to announce the availability of the schedule for Ceph + Rook Day
San Diego. Registration for this event will be free until Tuesday 11:59
UTC, so register now:

https://ceph.io/cephdays/ceph-rook-day-san-diego-2019/

We still have some open spots in the schedule, but the lineup is already
looking great. We are looking for more Rook-related topics. If you're
interested, you can submit them through our CFP form for the selection
committee to review:

https://zfrmz.com/hkg8EF9NYb6IvWnvoRUi

We currently have SoftIron, Red Hat and SUSE set as sponsors for this
event; we'd like to invite more sponsors to contact us at eve...@ceph.io:

https://ceph.io/wp-content/uploads/2019/11/Ceph-Day-Partner-Sponsorship.pdf

-- 

Mike Perez

he/him

Ceph Community Manager


M: +1-951-572-2633

494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [ceph-user] Upload objects failed on FIPS enable ceph cluster

2019-11-04 Thread Amit Ghadge
Hi All,

I'm using ceph-14.2.4 and testing on a FIPS-enabled cluster. Downloading
objects works, but Ceph raises a segmentation fault while uploading.

Please help me here, and please suggest debugging steps so that I can
reproduce this in a development environment.
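
(A hedged, general-purpose first step for a radosgw crash like this, nothing
FIPS-specific, with a placeholder gateway name: raise RGW logging in
ceph.conf and capture the backtrace from the gateway log after a failed
upload attempt:

    [client.rgw.gateway1]
        debug rgw = 20
        debug ms = 1
)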

Thanks,
Amit G
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-04 Thread Hermann Himmelbauer
Hi,
I recently upgraded my 3-node cluster to Proxmox 6 / Debian 10 and
recreated my Ceph cluster with a new release (14.2.4, bluestore),
basically hoping to gain some I/O speed.

The installation went flawlessly, and reading is faster than before (~80
MB/s); however, the write speed is still really slow (~3.5 MB/s).

I wonder if I can do anything to speed things up?

My hardware is as follows:

3 nodes, each with a Supermicro X8DTT-HIBQF mainboard,
2 OSDs per node (2 TB SATA hard disks, WDC WD2000F9YZ-0),
interconnected via 40 Gbit InfiniBand

The network should be reasonably fast, I measure ~ 16 GBit/s with iperf,
so this seems fine.

I use Ceph for RBD only, so my measurement is simply a very basic "dd"
read and write test within a virtual machine (Debian 8), like the
following:

read:
dd if=/dev/vdb | pv | dd of=/dev/null
-> 80 MB/s


write:
dd if=/dev/zero | pv | dd of=/dev/vdb
-> 3.5 MB/s

When I do the same in the virtual machine on a disk that is on NFS
storage, I get about 30 MB/s for reading and writing.

If I disable the write cache on all OSD disks via "hdparm -W 0
/dev/sdX", I gain a little bit of performance; the write speed is then 4.3 MB/s.
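
(As an aside on the measurement itself: dd through a pipe without direct I/O
largely measures caching behaviour rather than the cluster; a hedged sketch
of tests that tend to be more representative, with a placeholder pool name:

    # inside the VM: sequential write, large blocks, bypassing the page cache
    dd if=/dev/zero of=/dev/vdb bs=4M count=1000 oflag=direct conv=fdatasync

    # on a Ceph node: raw 30-second write benchmark against a pool
    rados bench -p rbd 30 write
)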

Thanks to help from the list, I plan to install a second Ceph
cluster that is SSD-based (Samsung PM1725b) and should be much
faster; however, I still wonder if there is any way to speed up my
hard-disk-based cluster?

Thank you in advance for any help,

Best Regards,
Hermann


-- 
herm...@qwer.tk
PGP/GPG: 299893C7 (on keyservers)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-04 Thread Martin Verges
Hello,

hard disks are awfully slow. That is normal and expected, given the random
I/O you get in a Ceph cluster.
You can speed up raw bandwidth using EC, but not on such small
clusters and not under high I/O load.

As you mentioned Proxmox: when it comes to VM workloads, spinning media is
never an option, use flash!

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 4. Nov. 2019 um 23:44 Uhr schrieb Hermann Himmelbauer <
herm...@qwer.tk>:

> Hi,
> I recently upgraded my 3-node cluster to proxmox 6 / debian-10 and
> recreated my ceph cluster with a new release (14.2.4 bluestore) -
> basically hoping to gain some I/O speed.
>
> The installation went flawlessly, reading is faster than before (~ 80
> MB/s), however, the write speed is still really slow (~ 3,5 MB/s).
>
> I wonder if I can do anything to speed things up?
>
> My Hardware is as the following:
>
> 3 Nodes with Supermicro X8DTT-HIBQF Mainboard each,
> 2 OSD per node (2TB SATA harddisks, WDC WD2000F9YZ-0),
> interconnected via Infiniband 40
>
> The network should be reasonably fast, I measure ~ 16 GBit/s with iperf,
> so this seems fine.
>
> I use ceph for RBD only, so my measurement is simply doing a very simple
> "dd" read and write test within a virtual machine (Debian 8) like the
> following:
>
> read:
> dd if=/dev/vdb | pv | dd of=/dev/null
> -> 80 MB/s
>
>
> write:
> dd if=/dev/zero | pv | dd of=/dev/vdb
> -> 3.5 MB/s
>
> When I do the same on the virtual machine on a disk that is on a NFS
> storage, I get something about 30 MB/s for reading and writing.
>
> If I disable the write cache on all OSD disks via "hdparm -W 0
> /dev/sdX", I gain a little bit of performance, write speed is then 4.3
> MB/s.
>
> Thanks to your help from the list I plan to install a second ceph
> cluster which is SSD based (Samsung PM1725b) which should be much
> faster, however, I still wonder if there is any way to speed up my
> harddisk based cluster?
>
> Thank you in advance for any help,
>
> Best Regards,
> Hermann
>
>
> --
> herm...@qwer.tk
> PGP/GPG: 299893C7 (on keyservers)
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io