Re: [ceph-users] RBD Block performance vs rbd mount as filesystem

2016-11-05 Thread Alexandre DERUMIER
Here are some tips I use to improve librbd and qemu performance:

- disable cephx auth (see the config sketch after this list)

- disable debug_ms (this takes me from 30k IOPS to 45k IOPS with 4k randread):

[global]

debug ms = 0/0


- compile qemu with jemalloc (--enable-jemalloc)
https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html
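
As a rough sketch, the cephx tip would look something like this in ceph.conf, alongside the debug setting above. Note that disabling authentication is only sensible on a fully trusted network, and every daemon and client must use the same setting:

[global]
# no authentication at all; trusted network only
auth_cluster_required = none
auth_service_required = none
auth_client_required = none

For the jemalloc tip, qemu just needs to be reconfigured with the flag mentioned above (./configure --enable-jemalloc) and rebuilt.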



----- Original Message -----
From: "Jason Dillaman" 
To: "Bill WONG" 
Cc: "aderumier", "ceph-users" 
Sent: Tuesday, 1 November 2016 02:06:22
Subject: Re: [ceph-users] RBD Block performance vs rbd mount as filesystem

For better or worse, I can repeat your "ioping" findings against a 
qcow2 image hosted on a krbd-backed volume. The "bad" news is that it 
actually isn't even sending any data to the OSDs -- which is why your 
latency is shockingly low. When performing a "dd ... oflag=dsync" 
against the krbd-backed qcow2 image, I can see lots of IO being 
coalesced from 4K writes into larger writes, which is artificially 
inflating the stats. 
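
To illustrate the kind of test being discussed here, a rough sketch (paths are placeholders; point the write test at a scratch file you can afford to lose):

# latency probe from inside the guest
ioping -c 10 /mnt/test

# synchronous 4K writes; oflag=dsync forces each write to be flushed,
# so it should have to reach the OSDs rather than be coalesced
dd if=/dev/zero of=/mnt/test/ddtest bs=4k count=1000 oflag=dsync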



On Mon, Oct 31, 2016 at 11:08 AM, Bill WONG  wrote: 
> Hi Jason, 
> 
> It looks like the situation is the same, no difference. My ceph.conf is below; 
> any comments or suggested improvements? 
> --- 
> [global] 
> fsid = 106a12b0-5ed0-4a71-b6aa-68a09088ec33 
> mon_initial_members = ceph-mon1, ceph-mon2, ceph-mon3 
> mon_host = 192.168.8.11,192.168.8.12,192.168.8.13 
> auth_cluster_required = cephx 
> auth_service_required = cephx 
> auth_client_required = cephx 
> filestore_xattr_use_omap = true 
> osd pool default size = 3 
> osd pool default min size = 1 
> osd pool default pg num = 4096 
> osd pool default pgp num = 4096 
> osd_crush_chooseleaf_type = 1 
> mon_pg_warn_max_per_osd = 0 
> max_open_files = 131072 
> 
> [mon] 
> mon_data = /var/lib/ceph/mon/ceph-$id 
> 
> mon clock drift allowed = 2 
> mon clock drift warn backoff = 30 
> 
> [osd] 
> osd_data = /var/lib/ceph/osd/ceph-$id 
> osd_journal_size = 2 
> osd_mkfs_type = xfs 
> osd_mkfs_options_xfs = -f 
> filestore_xattr_use_omap = true 
> filestore_min_sync_interval = 10 
> filestore_max_sync_interval = 15 
> filestore_queue_max_ops = 25000 
> filestore_queue_max_bytes = 10485760 
> filestore_queue_committing_max_ops = 5000 
> filestore_queue_committing_max_bytes = 1048576 
> journal_max_write_bytes = 1073714824 
> journal_max_write_entries = 1 
> journal_queue_max_ops = 5 
> journal_queue_max_bytes = 1048576 
> osd_max_write_size = 512 
> osd_client_message_size_cap = 2147483648 
> osd_deep_scrub_stride = 131072 
> osd_op_threads = 8 
> osd_disk_threads = 4 
> osd_map_cache_size = 1024 
> osd_map_cache_bl_size = 128 
> osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier" 
> osd_recovery_op_priority = 4 
> osd_recovery_max_active = 10 
> osd_max_backfills = 4 
> rbd non blocking aio = false 
> 
> [client] 
> rbd_cache = true 
> rbd_cache_size = 268435456 
> rbd_cache_max_dirty = 134217728 
> rbd_cache_max_dirty_age = 5 
> --- 
> 
> 
> 
> On Mon, Oct 31, 2016 at 9:20 PM, Jason Dillaman  wrote: 
>> 
>> On Sun, Oct 30, 2016 at 5:40 AM, Bill WONG  wrote: 
>> > any ideas or comments? 
>> 
>> Can you set "rbd non blocking aio = false" in your ceph.conf and retry 
>> librbd? This will eliminate at least one context switch on the read IO 
>> path -- which results in increased latency under extremely low queue 
>> depths. 
>> 
>> -- 
>> Jason 
> 
> 



-- 
Jason 



Re: [ceph-users] suddenly high memory usage for ceph-mon process

2016-11-05 Thread mj

Hi Igor and David,

Thanks for your replies. There are no ceph-mds processes running in our 
cluster.


I'm guessing David's reply applies to us, and we just need to set up 
additional monitoring for memory usage, so we get notified in case it 
happens again.
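
Something as simple as a periodic check of the monitor's resident memory would probably do, for example (the alert threshold and the wiring into a monitoring system are left open):

# resident set size of the ceph-mon process, in KiB
ps -C ceph-mon -o pid,rss,cmd --no-headers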


Anyway: we learned that this can happen, so next time we know where to 
look first.


Thanks to you both for your replies,

MJ

On 11/04/2016 03:26 PM, igor.podo...@ts.fujitsu.com wrote:

Maybe you hit this: https://github.com/ceph/ceph/pull/10238 (still waiting to be 
merged).

This will only occur if you have a ceph-mds process in your cluster that is not 
configured (you do not need to actually use MDS; it is enough that the process is 
running on some node).

Check your monitor logs for lines like "up but filesystem disabled" and see how 
many of them you have.
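
For example (the log path may differ on your installation):

grep -c "up but filesystem disabled" /var/log/ceph/ceph-mon.*.log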

Regards,
Igor.

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
mj
Sent: Friday, November 4, 2016 2:06 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] suddenly high memory usage for ceph-mon process

Hi,

Running ceph 0.94.9 on jessie (proxmox): three hosts, 4 OSDs per host, SSD
journal, 10G cluster network. Hosts have 65G RAM. The cluster is generally not
very busy.

Suddenly we were getting HEALTH_WARN today, with two OSDs (both on the
same server) being slow. Looking into this, we noticed very high memory
usage on that host: 75% of memory used by ceph-mon!

(normally here ceph-mon uses around 1% - 2%)

I restarted ceph-mon on that host, and that seems to have brought things
back to normal immediately.

I don't see anything out of the ordinary in /var/log/syslog on that server, and
the cluster is generally HEALTH_OK. No config changes in the last several
weeks, and the last time I applied updates and rebooted was 30 days ago.

No idea what could have caused this. Any ideas what to check, where to
look? What would typically cause such high memory usage for the ceph-mon
process?

MJ
