Thanks Yehuda,

The hint about the admin socket was useful; the problem is now fixed.
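For the archives, the check looks roughly like the commands below. The socket path is illustrative; the real one depends on the client name your gateway runs as, so look under /var/run/ceph/ first.

```shell
# List the commands the running radosgw's admin socket supports
# (socket path is an example -- check /var/run/ceph/ on your host)
ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok help

# Dump the perf counters; in the throttle-* sections, a "wait"
# counter with a non-zero count means requests are being throttled
ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok perf dump
```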

Cheers

2013/12/25 Yehuda Sadeh <yeh...@inktank.com>

> On Tue, Dec 24, 2013 at 8:16 AM, Kuo Hugo <tonyt...@gmail.com> wrote:
> > Hi folks,
> >
> > After some more tests, I still cannot identify the bottleneck; it is
> > never CPU-bound.
> >
> >
> > OSD op threads : 60
> > rgw_thread_pool_size : 300
> > pg = 2000
> > pool size = 3
> >
> > Trying to find the maximum 1KB-write concurrency of this cluster:
> > rados bench at 1000~5000+ concurrency gives about 3000 reqs/sec.
> >
> > mon.0 [INF]  2053 kB/s wr, 2053 op/s
> > mon.0 [INF]  2093 kB/s wr, 2093 op/s
> > mon.0 [INF]  3380 kB/s wr, 3380 op/s
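(For the record, the rados bench invocation behind these numbers was along these lines; the pool name, duration, and exact flags here are illustrative:)

```shell
# 1KB writes for 60 seconds at concurrency 100 against a test pool
# (-b sets the object size in bytes, -t the number of concurrent ops)
rados bench -p testpool 60 write -b 1024 -t 100 --no-cleanup
```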
> >
> > So I think the IOPS limit for a single pool is around 3000 op/s.
> >
> >
> >
> > With 30 OSDs and journals on 3 SSDs, measured with ssbench:
> >
> > 1KB result: 1200 reqs/sec
> >
> > 4MB objects: 150MB/sec
> >
> >
> > With 30 OSDs and journals on the 30 HDDs themselves:
> >
> > 1KB ssbench result: 450 reqs/sec
> >
> > mon.0 [INF] 1650 kB/s rd, 413 kB/s wr, 5783 op/s
> > mon.0 [INF] 1651 kB/s rd, 409 kB/s wr, 5760 op/s
> > mon.0 [INF] 1708 kB/s rd, 423 kB/s wr, 5959 op/s
> >
> >
> > 1KB rados bench result: 900 reqs/sec
> >
> > mon.0 [INF] 803 kB/s wr, 803 op/s
> > mon.0 [INF] 806 kB/s wr, 806 op/s
> > mon.0 [INF] 911 kB/s wr, 911 op/s
> >
> >
> > 4MB objects: 350MB/s throughput
> >
> > Based on the above results, SSDs help for small objects but not much
> > for objects over 1MB.
> >
> >
> > Why does the 1KB benchmark from ssbench generate many more ops than
> > rados bench?
> > From my perspective,
> >
> > 1. Every request produces auth/bucket/object-put operations from
> > RadosGW to Rados.
> > 2. Need to read bucket data
> >
>
> Writes generate many more operations, whereas reads should be roughly
> the same. A write needs to take care of the index update and the old
> object's cleanup, whereas a read goes directly to the object, unless
> you have the cache disabled.
>
> >
> > How can the performance be improved?
> >
> > 1. Higher concurrency reduces RadosGW performance:
> >
> >     Concurrency 100
> >
> > Count: 13283 (    0 error;     0 retries:  0.00%)  Average requests per
> > second: 436.1
> > mon.0 [INF] 1650 kB/s rd, 413 kB/s wr, 5783 op/s
> > mon.0 [INF] 1651 kB/s rd, 409 kB/s wr, 5760 op/s
> > mon.0 [INF] 1708 kB/s rd, 423 kB/s wr, 5959 op/s
> >
> >
> >     Concurrency 200
> >
> > Count:  7027 (   17 error;   475 retries:  6.76%)  Average requests per
> > second: 190.0
> > mon.0 [INF] 2001 kB/s rd, 492 kB/s wr, 6959 op/s
> > mon.0 [INF] 1877 kB/s rd, 457 kB/s wr, 6498 op/s
> > mon.0 [INF] 1332 kB/s rd, 330 kB/s wr, 4647 op/s
> >
> >     Higher concurrency produces more errors and retries.
> >     I'm not sure whether the bottleneck in this case is the HTTP
> > server or the Rados cluster's maximum IOPS.
> >     Is there any chance to make it faster by tuning Apache's
> > settings? The CPU util on this node
> >
> > 2. To reach the maximum network bandwidth, will more HDDs or SSD
> > journals help?
> >
> >
> > Any suggestion would be appreciated ~
> >
> >
>
> Try bumping up the 'objecter inflight op bytes' configurable.
> Currently it's set to 100M, so try setting it to something like 1G
> (1073741824).
> You can also try playing with the max inflight ops ('objecter
> inflight ops'), which currently defaults to 1024.
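For the record, this amounts to something like the following in ceph.conf on the gateway host. The section name is an assumption (use whatever client name your gateway actually runs as), and the 8192 is just an illustrative bump, not a suggested value from Yehuda:

```ini
; ceph.conf on the RadosGW host (section name is an example)
[client.radosgw.gateway]
    objecter inflight op bytes = 1073741824   ; default is 100M
    objecter inflight ops = 8192              ; default is 1024
```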
> One way to see whether your gateway is throttling is to connect to
> the admin socket (ceph --admin-daemon=<path to admin socket> help)
> and dump the perf counters. If any throttle shows wait times on it,
> then it's throttling.
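That check is easy to script. Here is a small illustrative sketch: the sample dump below is made up, and the counter layout (throttle-* sections with a "wait" sub-counter) is assumed from the admin socket's JSON output, so adjust it to what your version actually emits:

```python
import json

# Illustrative, trimmed-down stand-in for the JSON returned by
# `ceph --admin-daemon <socket> perf dump` (field names assumed;
# check your own dump's structure).
sample_dump = json.loads("""
{
  "throttle-objecter_bytes": {"val": 104857600, "max": 104857600,
                              "wait": {"avgcount": 5321, "sum": 12.7}},
  "throttle-msgr_dispatch_throttler": {"val": 0, "max": 104857600,
                                       "wait": {"avgcount": 0, "sum": 0.0}}
}
""")

def throttled(dump):
    """Return the names of throttles that have made callers wait."""
    return [name for name, counters in dump.items()
            if name.startswith("throttle-")
            and counters.get("wait", {}).get("avgcount", 0) > 0]

print(throttled(sample_dump))  # ['throttle-objecter_bytes']
```

In the made-up dump above, the objecter byte throttle has accumulated wait time, which is exactly the "it's throttling" signal Yehuda describes.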
>
> Yehuda
>
>
> >
> >
> >
> >
> > 2013/12/24 Kuo Hugo <tonyt...@gmail.com>
> >>
> >> Hi folks,
> >>
> >>
> >> There are 30 HDDs across three 24-thread servers. Each server has
> >> two 10G NICs, one for the public network and one for the cluster
> >> network. A dedicated 32-thread server runs RadosGW.
> >>
> >> My setup aims for the same availability as Swift, so pool size=3
> >> and min_size=2 for all RadosGW-related pools. Each pool's pg count
> >> is set to 2000.
> >>
> >> Everything is working well except for performance.
> >>
> >> Round 1) Journals all on one SSD, with 10 partitions, on each server.
> >>
> >> It's faster for small objects (1KB): 1100 reqs/sec at
> >> concurrency=100. But there's a problem: total throughput is only
> >> 150MB/sec.
> >>
> >>
> >> Round 2) Journals on the HDDs themselves.
> >>
> >> Better throughput this way: rados bench shows 300~400MB/sec.
> >> But 1KB performance is really bad, about 400 reqs/sec.
> >>
> >>
> >> And ... reqs/sec drops as concurrency rises.
> >> For example, at 500 concurrency it can only handle 120 reqs/sec.
> >>
> >> Does anyone use RadosGW for high-concurrency cases in production?
> >> Could you please let me know which HTTP server you are running for
> >> RadosGW?
> >> How would you leverage all this equipment to build the most
> >> efficient Rados+RadosGW cluster with the Swift API?
> >>
> >> For reference, with the same hardware and a similar setup, Swift
> >> gets 1600 reqs/sec at 1000 concurrency.
> >>
> >>
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>