Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Ray Sun
Yes, I am trying to do all the tests today, including SSD journal + all-SSD pools. Thanks, I will give it a try. Best Regards -- Ray On Mon, Jul 6, 2015 at 7:54 AM, Somnath Roy wrote: > Are you using SSD journal? > I would say try tuning with the following parameters first ...

Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Somnath Roy
Are you using an SSD journal? I would say try tuning with the following parameters first: journal_max_write_entries, journal_max_write_bytes, journal_queue_max_ops, journal_queue_max_bytes, filestore_max_sync_interval, filestore_min_sync_interval, f...
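A sketch of where those knobs live, assuming the usual [osd] section of ceph.conf; the values below are purely illustrative placeholders to show the syntax, not tuning recommendations:

    [osd]
    journal_max_write_entries = 1000          # illustrative value
    journal_max_write_bytes = 1073741824      # illustrative value (1 GB)
    journal_queue_max_ops = 3000              # illustrative value
    journal_queue_max_bytes = 1073741824      # illustrative value (1 GB)
    filestore_max_sync_interval = 10          # illustrative value (seconds)
    filestore_min_sync_interval = 1           # illustrative value (seconds)

Changes take effect on OSD restart, or can be injected at runtime with 'ceph tell osd.* injectargs' if you want to experiment before editing the config.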

Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Ray Sun
Roy, I tried to use Grafana to make a trend graph of apply_latency for my osd 0, and it seems to grow all the time. So, as I understand it, both of the keys (avgcount, sum) are cumulative values. So the correct arithmetic should be: current apply latency = (avgcount current - avgcount previous)/(sum cur...
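A minimal sketch of that interval calculation in Python, assuming (as the counter layout discussed in this thread suggests) that sum accumulates total time in seconds and avgcount accumulates the number of ops, so the average latency over an interval comes out as delta(sum)/delta(avgcount); the osd id and the 10-second sample interval are illustrative:

    import json, subprocess, time

    def apply_latency_counter(osd_id):
        # query the OSD admin socket and return the cumulative (avgcount, sum) pair
        out = subprocess.check_output(["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
        c = json.loads(out)["filestore"]["apply_latency"]
        return c["avgcount"], c["sum"]

    prev_count, prev_sum = apply_latency_counter(0)
    time.sleep(10)                               # illustrative sample interval
    cur_count, cur_sum = apply_latency_counter(0)

    ops = cur_count - prev_count
    if ops > 0:
        # average apply latency (seconds) over the last interval, not since the osd started
        print("interval avg apply_latency: %.6f s" % ((cur_sum - prev_sum) / ops))
    else:
        print("no ops completed in this interval")

Feeding that interval average (rather than the raw cumulative sum) to Grafana should give a curve that rises and falls with load instead of growing forever.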

Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Somnath Roy
Yes, similar values are reported by 'ceph osd perf'.. But here commit_latency means the journal commit latency from the admin socket perf dump.. Also, remember, these values (latencies) are not stable (that's why you will be seeing spiky write performance) and are very difficult to correlate, I guess ...
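For reference, the two views being compared here come from different commands (osd.0 is just an illustrative id):

    ceph osd perf                  # cluster-wide table of fs_commit_latency / fs_apply_latency per osd
    ceph daemon osd.0 perf dump    # per-daemon counters from the admin socket, including the filestore latencies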

Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Ray Sun
Roy, this is really helpful. So, per your description, for ops/second I can use avgcount/sum. Is this the same value as "ceph osd perf"?

    osd  fs_commit_latency(ms)  fs_apply_latency(ms)
      0                     23                    85
      1                     22                     2...

Re: [ceph-users] EC cluster design considerations

2015-07-05 Thread Adrien Gillard
The virtualized MON will be on a completely different platform and won't use any of the ceph cluster resources (either compute or disk). I was thinking of putting the 'master' MON (the one with the lowest IP) on a dedicated server because I read that the load is heavier on this one (lots of logs in part...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Tuomas Juntunen
I would say this is a problem with the NTFS mount.. I found another way to do this, so that's that. Thanks for noticing. Br, T -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 5 July 2015 20:45 To: Tuomas Juntunen Cc: ceph-users Subject: Re: [ceph-users] RBD mo...

Re: [ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Somnath Roy
Hi Ray, here is the description of the different latencies under the filestore perf counters. journal_latency: this is the latency of putting the ops into the journal. A write is acknowledged after that (well, a bit after that; there is one context switch after this). commitcycle_latency ...
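A small sketch of pulling those averages out of a saved dump, assuming it was captured with 'ceph daemon osd.0 perf dump > dump.json' and that the counters sit under the filestore section as described in this thread; the counter list is limited to the two named earlier, and the lifetime average is simply sum/avgcount:

    import json

    with open("dump.json") as f:        # output of: ceph daemon osd.0 perf dump
        filestore = json.load(f)["filestore"]

    for name in ("journal_latency", "apply_latency"):
        c = filestore[name]
        if c["avgcount"]:
            # lifetime average in seconds: total accumulated time / number of ops
            print("%s: %.6f s over %d ops" % (name, c["sum"] / c["avgcount"], c["avgcount"]))

Note that this is the average since the OSD started; for the spiky short-term behaviour Somnath mentions, the interval calculation sketched above in this digest is more informative.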

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 7:57 PM, Tuomas Juntunen wrote: > Hi > > Is there any other kernel that would work? Has anyone else had this kind of > problem with rbd map? Well, 4.1 is the latest and therefore the easiest to debug, assuming this is a kernel client problem. What is the output of /sys/kernel...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Tuomas Juntunen
Hi, is there any other kernel that would work? Has anyone else had this kind of problem with rbd map? Br, T -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 5 July 2015 19:42 To: Tuomas Juntunen Cc: ceph-users Subject: Re: [ceph-users] RBD mounted image on lin...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 7:37 PM, Tuomas Juntunen wrote: > A couple of times the same, and the whole rbd mount is hung > > can't df or ls. > > umount -l and rbd unmap take 10-20 mins to get rid of it and then I can mount > again and try the transfer > > I have 20TB of stuff that needs to be cop...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Tuomas Juntunen
A couple of times the same, and the whole rbd mount is hung; can't df or ls. umount -l and rbd unmap take 10-20 mins to get rid of it, and then I can mount again and try the transfer. I have 20TB of stuff that needs to be copied to that partition. Br, T -Original Message- From: Ilya ...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 6:58 PM, Tuomas Juntunen wrote: > Hi > > That's the only error that comes from this; there's nothing else. Is it repeated? Thanks, Ilya

Re: [ceph-users] EC cluster design considerations

2015-07-05 Thread Paul Evans
On Jul 4, 2015, at 2:44 PM, Adrien Gillard <gillard.adr...@gmail.com> wrote: Lastly, regarding cluster throughput: EC seems to require a bit more CPU and memory than straight replication, which raises the question: how much RAM and CPU are you putting into the chassis? With proper amounts ...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Tuomas Juntunen
Hi, that's the only error that comes from this; there's nothing else. Br, T -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 5 July 2015 18:30 To: Tuomas Juntunen Cc: ceph-users Subject: Re: [ceph-users] RBD mounted image on linux server kernel error and ...

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 6:18 PM, Tuomas Juntunen wrote: > Hi > > We are experiencing the following > > - Hammer 0.94.2 > - Ubuntu 14.04.1 > - Kernel 3.16.0-37-generic > - 40TB NTFS disk mounted through RBD > > The first 50GB goes fine, but then ...

[ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Tuomas Juntunen
Hi, we are experiencing the following:
- Hammer 0.94.2
- Ubuntu 14.04.1
- Kernel 3.16.0-37-generic
- 40TB NTFS disk mounted through RBD
The first 50GB goes fine, but then this happens: Jul 5 16:56:01 cephclient kernel: [110581.046141] kworker/u65: ...

[ceph-users] Meaning of ceph perf dump

2015-07-05 Thread Ray Sun
Cephers, are there any documents or code definitions that explain ceph perf dump? I am a little confused about the output; for example, under filestore there's journal_latency and apply_latency, and each of them has avgcount and sum. I am not quite sure what the unit and meaning of the numbers are. How ...
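For readers following along, these counters come from the OSD admin socket, e.g. 'ceph daemon osd.0 perf dump'; a trimmed fragment of the output, with made-up values, looks roughly like this (sum being the accumulated time in seconds and avgcount the number of ops, per the replies above):

    "filestore": {
        "journal_latency": { "avgcount": 123456, "sum": 87.654 },
        "apply_latency":   { "avgcount": 123456, "sum": 456.789 }
    }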

Re: [ceph-users] problem with cache tier

2015-07-05 Thread Shinobu Kinjo
It's quite clear to me now. > this is my test setup - that's why I'm trying to break it and fix it (best way to learn) Thanks for your feedback!! Kinjo On Sun, Jul 5, 2015 at 8:51 PM, Jacek Jarosiewicz <jjarosiew...@supermedia.pl> wrote: > Well, the docs say that when your osds get full you should ...

Re: [ceph-users] problem with cache tier

2015-07-05 Thread Jacek Jarosiewicz
Well, the docs say that when your osds get full you should add another osd, and the cluster should redistribute the data by itself: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/#no-free-drive-space This is my test setup - that's why I'm trying to break it and fix it (best way to learn).

Re: [ceph-users] problem with cache tier

2015-07-05 Thread Shinobu Kinjo
That's good! So was the root cause that the osd was full? What are your thoughts on that? Was there any reason to delete any files? Kinjo On Sun, Jul 5, 2015 at 6:51 PM, Jacek Jarosiewicz <jjarosiew...@supermedia.pl> wrote: > ok, I got it working... > > first I manually deleted some files ...

Re: [ceph-users] problem with cache tier

2015-07-05 Thread Jacek Jarosiewicz
OK, I got it working... First I manually deleted some files from the full osd, set the noout flag, and restarted the osd daemon. Then I waited a while for the cluster to backfill pgs, and after that the 'rados -p cache cache-try-flush-evict-all' command went OK. I'm wondering though, because t...
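For anyone repeating this recovery on a similar setup, the steps described above map roughly onto the commands below; the osd id and the sysvinit-style restart are illustrative assumptions (Hammer on Ubuntu 14.04 may use upstart instead), and deleting files by hand from an OSD's data directory should be a last resort:

    ceph osd set noout                          # keep the cluster from marking the restarted osd out
    service ceph restart osd.12                 # restart the (formerly) full osd after freeing some space; id is illustrative
    ceph -w                                     # watch backfill until the PGs are healthy again
    rados -p cache cache-try-flush-evict-all    # then flush/evict the cache pool, as in the post
    ceph osd unset noout                        # restore normal behaviour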