@John, can you clarify which values would suggest that my metadata pool is too slow? I have added a link below that includes values for "op_active" & "handle_client_request", gathered in a crude fashion, but it should hopefully give enough data to paint a picture of what is happening.
http://pastebin.com/5zAG8VXT

thanks in advance,
Bob

On Thu, Aug 6, 2015 at 1:24 AM, Bob Ababurko <b...@ababurko.net> wrote:

> I should probably have condensed my findings over the course of the day
> into one post but, I guess that's just not how I'm built...
>
> Another data point. I ran `ceph daemon mds.cephmds02 perf dump` in a
> while loop w/ a 1 second sleep, grepping out the stats John mentioned, and
> at times (~every 10-15 seconds) I see some large objecter.op_active
> values. After the high values hit, there are 5-10 seconds of zero values.
>
> "handle_client_request": 5785438,
> "op_active": 2375,
> "handle_client_request": 5785438,
> "op_active": 2444,
> "handle_client_request": 5785438,
> "op_active": 2239,
> "handle_client_request": 5785438,
> "op_active": 1648,
> "handle_client_request": 5785438,
> "op_active": 1121,
> "handle_client_request": 5785438,
> "op_active": 709,
> "handle_client_request": 5785438,
> "op_active": 235,
> "handle_client_request": 5785572,
> "op_active": 0,
> ...............
>
> Should I be concerned about these "op_active" values? I also see that, in my
> narrow slice of output, "handle_client_request" does not increment. What
> is happening there?
>
> thanks,
> Bob
>
> On Wed, Aug 5, 2015 at 11:43 PM, Bob Ababurko <b...@ababurko.net> wrote:
>
>> I found a way to get the stats you mentioned, mds_server.handle_client_request
>> & objecter.op_active. I can see these values when I run:
>>
>> ceph daemon mds.<id> perf dump
>>
>> I recently restarted the mds server so my stats reset, but I still have
>> something to share:
>>
>> "mds_server.handle_client_request": 4406055
>> "objecter.op_active": 0
>>
>> Should I assume that op_active might be read or write operations that
>> are queued? I haven't been able to find anything describing what these
>> stats actually mean, so if anyone knows where that is documented, please
>> advise.
>>
>> On Wed, Aug 5, 2015 at 4:59 PM, Bob Ababurko <b...@ababurko.net> wrote:
>>
>>> I have installed diamond (built by ksingh, found at
>>> https://github.com/ksingh7/ceph-calamari-packages) on the MDS node and
>>> I am not seeing the mds_server.handle_client_request OR objecter.op_active
>>> metrics being sent to graphite. Mind you, this is not the graphite that is
>>> part of the calamari install but our own internal graphite cluster.
>>> Perhaps that is the reason? I could not get calamari working correctly on
>>> hammer/CentOS 7.1, so I have put it on pause for now to concentrate on the
>>> cluster itself.
>>>
>>> Ultimately, I need to find a way to get hold of these metrics to
>>> determine the health of my MDS so I can justify moving forward on an SSD
>>> based cephfs metadata pool.
>>>
>>> On Wed, Aug 5, 2015 at 4:05 PM, Bob Ababurko <b...@ababurko.net> wrote:
>>>
>>>> Hi John,
>>>>
>>>> You are correct in that my expectations may be incongruent with what is
>>>> possible with ceph(fs).
>>>> I'm currently copying many small files (images)
>>>> from a netapp to the cluster, ~35k sized files to be exact, and the number
>>>> of objects/files copied thus far is fairly significant (in bold below):
>>>>
>>>> [bababurko@cephmon01 ceph]$ sudo rados df
>>>> pool name                KB      objects  clones  degraded  unfound       rd       rd KB         wr       wr KB
>>>> cephfs_data      3289284749  *163993660*       0         0        0        0           0  328097038  3369847354
>>>> cephfs_metadata      133364       524363       0         0        0  3600023  5264453980   95600004  1361554516
>>>> rbd                       0            0       0         0        0        0           0          0           0
>>>>   total used     9297615196    164518023
>>>>   total avail   19990923044
>>>>   total space   29288538240
>>>>
>>>> Yes, that looks like ~164 million objects copied to the cluster. I
>>>> would assume this will potentially be a burden to the MDS, but I have yet to
>>>> confirm that with `ceph daemonperf mds.<id>`. I cannot seem to run it on the
>>>> mds host, as it doesn't seem to know about that command:
>>>>
>>>> [bababurko@cephmds01]$ sudo ceph daemonperf mds.cephmds01
>>>> no valid command found; 10 closest matches:
>>>> osd lost <int[0-]> {--yes-i-really-mean-it}
>>>> osd create {<uuid>}
>>>> osd primary-temp <pgid> <id>
>>>> osd primary-affinity <osdname (id|osd.id)> <float[0.0-1.0]>
>>>> osd reweight <int[0-]> <float[0.0-1.0]>
>>>> osd pg-temp <pgid> {<id> [<id>...]}
>>>> osd in <ids> [<ids>...]
>>>> osd rm <ids> [<ids>...]
>>>> osd down <ids> [<ids>...]
>>>> osd out <ids> [<ids>...]
>>>> Error EINVAL: invalid command
>>>>
>>>> This fails in a similar manner on all the hosts in the cluster. I'm
>>>> very green w/ ceph and I'm probably missing something obvious. Is there
>>>> something I need to install to get access to the 'ceph daemonperf' command
>>>> on hammer?
>>>>
>>>> thanks,
>>>> Bob
>>>>
>>>> On Wed, Aug 5, 2015 at 2:43 AM, John Spray <jsp...@redhat.com> wrote:
>>>>
>>>>> On Tue, Aug 4, 2015 at 10:36 PM, Bob Ababurko <b...@ababurko.net> wrote:
>>>>> > My writes are not going as I would expect wrt IOPS (50-1000 IOPS) & write
>>>>> > throughput (~25MB/s max). I'm interested in understanding what it takes to
>>>>> > create an SSD pool that I can then migrate the current cephfs_metadata pool
>>>>> > to. I suspect that the spinning disk metadata pool is a bottleneck and I
>>>>> > want to try to get the max performance out of this cluster to prove that we
>>>>> > would build out a larger version. One caveat is that I have copied about 4
>>>>> > TB of data to the cluster via cephfs and don't want to lose the data, so I
>>>>> > obviously need to keep the metadata intact.
>>>>>
>>>>> I'm a bit suspicious of this: your IOPS expectations sort of imply
>>>>> doing big files, but you're then suggesting that metadata is the
>>>>> bottleneck (i.e. a small file workload).
>>>>>
>>>>> There are lots of statistics that come out of the MDS; you may be
>>>>> particularly interested in mds_server.handle_client_request and
>>>>> objecter.op_active, to work out if there really are lots of RADOS
>>>>> operations getting backed up on the MDS (which would be the symptom of
>>>>> a too-slow metadata pool). "ceph daemonperf mds.<id>" may be some
>>>>> help if you don't already have graphite or similar set up.
>>>>>
>>>>> > If anyone has done this OR understands how this can be done, I would
>>>>> > appreciate the advice.
>>>>>
>>>>> You could potentially do this in a two-phase process where you
>>>>> initially set a crush rule that includes both SSDs and spinners, and
>>>>> then finally set a crush rule that just points to SSDs.
>>>>> Obviously that'll do lots of data movement, but your metadata is
>>>>> probably a fair bit smaller than your data, so that might be acceptable.
>>>>>
>>>>> John
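
To make John's two-phase suggestion concrete, here is a minimal sketch of what the final phase might look like on a hammer-era cluster. It assumes a CRUSH root (bucket) named "ssd" already exists and contains only the SSD OSDs; the rule name and root name are illustrative, not taken from this cluster:

    # Create a replicated rule that places copies on distinct hosts under
    # the (assumed) "ssd" root. The intermediate rule John describes, spanning
    # both spinners and SSDs, would be created the same way against a root
    # that contains both kinds of OSDs.
    ceph osd crush rule create-simple ssd-metadata ssd host

    # Look up the numeric ruleset id assigned to the new rule...
    ceph osd crush rule dump ssd-metadata

    # ...and point the metadata pool at it (the pool option is named
    # crush_ruleset on hammer):
    ceph osd pool set cephfs_metadata crush_ruleset <ruleset-id>

Changing the ruleset takes effect immediately and the pool's PGs begin remapping onto the new rule, which is the data movement John refers to.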
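For reference, the crude collection loop mentioned at the top of this thread can be approximated with the sketch below. It assumes it is run on the MDS host, that the daemon is named mds.cephmds02 as in the quoted output, and that the perf dump JSON is printed one counter per line (which is what the grep relies on):

    # Poll the two counters John pointed at, once per second.
    while true; do
        sudo ceph daemon mds.cephmds02 perf dump 2>/dev/null \
            | grep -E '"(handle_client_request|op_active)"'
        sleep 1
    done

Sustained large op_active values in that output would be the symptom John describes of RADOS operations backing up on the MDS behind a too-slow metadata pool.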