Thanks for the response, Yehuda.

[ more text below ]

On Aug 5, 2014, at 05:33, Yehuda Sadeh wrote:
On Fri, Aug 1, 2014 at 9:49 AM, Osier Yang <agedos...@gmail.com> wrote:
[ correct the URL ]


On Aug 2, 2014, at 00:42, Osier Yang wrote:
Hi, list,

Over the past several days I managed to set up radosgw in a testing
environment to see whether it's stable/mature enough for production use.
In the meantime, I tried to read the radosgw source code to understand how
it actually manages the underlying storage.

The testing results show that write performance to a bucket is not good.
As far as I understand from the code, this is because there is only *one*
bucket index object for a single bucket, which is not nice in principle.
Moreover, requests to the whole bucket can be blocked if the corresponding
bucket index object happens to be in the recovering or backfilling process.
This is not acceptable for production use. Although I saw that Guang Yang
did some work (the prototype patches [1]) to try to resolve the problem
with bucket index sharding, I'm not quite confident it solves the problem
at the root: radosgw would still be managing millions or billions of
objects in one bucket through the index, and I'm a bit worried about that
even with index sharding supported.

Another problem I encountered: when I upgraded radosgw to the latest
version (Firefly), radosgw-admin works well and read requests work well
too, but all write requests fail. Note that I didn't change anything in
the config files, which suggests a compatibility problem (a client of the
new version fails to talk to a ceph cluster of the old version). The
errors look like:

2014-07-31 10:13:10.045921 7fdb40ddd700 0 ERROR: can't read user header:
ret=-95
2014-07-31 10:13:10.045930 7fdb40ddd700 0 ERROR: sync_user() failed,
user=osier ret=-95
2014-07-31 17:00:56.075066 7fe514fe6780 0 ceph version 0.80.5
(38b73c67d375a2552d8ed67843c8a65c2c0feba6), process radosgw, pid 19974
2014-07-31 17:00:56.197659 7fe514fe6780 0 framework: fastcgi
2014-07-31 17:00:56.197666 7fe514fe6780 0 starting handler: fastcgi
2014-07-31 17:00:56.198941 7fe4f8ff9700 0 ERROR: FCGX_Accept_r returned -9
2014-07-31 17:00:56.211176 7fe4f9ffb700 0 ERROR: can't read user header:
ret=-95
2014-07-31 17:00:56.211197 7fe4f9ffb700 0 ERROR: sync_user() failed,
user=Bob Dylon ret=-95
2014-07-31 17:00:56.212306 7fe4f9ffb700 0 ERROR: can't read user header:
ret=-95
2014-07-31 17:00:56.212325 7fe4f9ffb700 0 ERROR: sync_user() failed,
user=osier ret=-95
Did you upgrade the osds? Did you restart the osds after upgrade?

No, I didn't upgrade the osds, and didn't restart them. What I did was
simply use a newer-version radosgw against a ceph cluster that is still
running the old version.

So it sounds like using a newer radosgw requires newer osds?


With these two experiences, I started to wonder whether radosgw really is
stable/mature enough yet. It seems that DreamHost is the only one running
radosgw as a public service, though there appear to be use cases in
private environments, judging by what Google turns up. I have no way to
demonstrate whether it's stable and mature enough for production use
except by trying to understand how it works; however, I guess everybody
knows how hard it is to go back once a distributed system is already in
production. So I'm asking here to see if I can get some advice/thoughts/
suggestions from people who have already managed to set up radosgw for
production use.

In case this mail is already long/boring enough, I'm summarizing my
questions here:

1) Is radosgw stable/mature enough for production use?
We consider it stable and mature for production use.

2) How it behaves in performance (especially on writing) in practice?
Different use cases and patterns have different performance
characteristics. As you mentioned, objects going to the same bucket
will contend on the bucket index. In the future we will be able to
shard that and it will mitigate the problem a bit. Other ideas are to
drop the bucket index altogether for use cases where object listing is
not really needed.
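The sharding idea mentioned above can be sketched roughly like this (my illustration of the concept only, with a made-up shard-object naming scheme and shard count; not radosgw's actual code):

```python
import hashlib

NUM_SHARDS = 32  # hypothetical per-bucket shard count

def bucket_index_shard(bucket_id: str, object_name: str) -> str:
    """Hash the object name to pick one of NUM_SHARDS index objects,
    so concurrent writes update one of NUM_SHARDS rados objects
    instead of all contending on a single bucket index object."""
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    return f".dir.{bucket_id}.{h % NUM_SHARDS}"

# Different objects spread across different shard objects:
print(bucket_index_shard("bucket0", "photo-001.jpg"))
print(bucket_index_shard("bucket0", "photo-002.jpg"))
```

Bucket listing then has to merge entries from all shards, which is presumably why dropping the index entirely only makes sense for use cases where listing isn't needed.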

Bucket listing is important for us too; disabling it would just drive me crazy. :-)

I did the performance testing yesterday, and I'd like to share the results here:

1) Testing environment

  ceph cluster:  3 nodes, 1 monitor and 2 osds on each node.  These 3 nodes
are relatively cheap PCs (I don't even want to mention their CPU and memory
specs here).

  ceph version:  Emperor

  radosgw version:  Emperor too (I didn't manage to successfully test with a
newer radosgw).

  radosgw instance:  1; VM; memory/4G; CPU/1

  Client:  VM; memory/1G; CPU/1

  Internal network bandwidth: 1G

I executed the testing commands on the "client vm". Since the testing
environment is far worse than a production environment, the absolute numbers
are somewhat meaningless on their own, so I ran both "rest-bench" against
radosgw and "rados bench" directly; comparing the two tells how much
performance is eaten up by radosgw (mainly by the single bucket index object).

2) radosgw config (only the options which *might* affect performance are listed)

rgw thread pool size = 1000

rgw enable usage log = true
rgw usage log tick interval = 30
rgw usage log flush threshold = 1024
rgw usage max shards = 32
rgw usage max user shards = 1

# Operation logs are disabled
#rgw enable ops log = true
#rgw ops log rados = true

#debug rgw = 20
#debug ms = 1

rgw cache lru size = 100000 # the default is 10000

3) Testing commands:

root@testing-bob:~# rados --cluster=s3test0 -p osier_test bench 20 write -b $size -t 50

root@testing-bob:~# rest-bench --api-host=testing-s3gw0 --access-key=L6K3FF1OOXO4EY1FH9RF --secret="/pYIF3jc3NSkVCWPklSM+BIf7IVr74MSnSvbc4Ac" --protocol=http --uri_style=path --bucket=bob0 --seconds=20 --concurrent-ios=50 --block-size=$size --show-time write

As the commands above show, I tested with 50 concurrent threads for 20 seconds
with both "rest-bench" and "rados bench".

4) Testing results

NOTE:
* The data sizes I used for testing are: 10 bytes, 1KiB, 10KiB, 100KiB, 500KiB,
    1MiB, 5MiB, 20MiB, and 40MiB.

* Pay attention to the "finished" operation count rather than "Total writes made", and to the total run time, since "rest-bench" doesn't always finish in 20 seconds.

* As the results show, for small writes the maximum performance loss through radosgw is nearly 80%, compared with writing through rados directly.

* As the data size grows, the performance loss shrinks; the best result is a 50% loss. However, once the data size grows past some point (see the 40MiB results), we can see clearly that the writes are serialized, and the performance loss becomes huge again.

* In summary, the single bucket index object hurts performance a lot.
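To make the comparison concrete, this is roughly how the loss can be computed from the raw output (a small helper of my own; the sample figures are the 10-byte "finished" counts and total run times from the tables below, and the exact percentage varies with which pair of runs you compare):

```python
def ops_per_sec(finished, total_time):
    # effective operation rate: completed writes / total run time
    return finished / total_time

def loss_pct(rados_rate, rgw_rate):
    # percentage of throughput lost by going through radosgw
    return 100.0 * (1.0 - rgw_rate / rados_rate)

# 10-byte run: rados finished 4894 ops in 20.646440s,
# radosgw finished 801 ops in 26.759887s
rados_rate = ops_per_sec(4894, 20.646440)
rgw_rate = ops_per_sec(801, 26.759887)
print(f"rados {rados_rate:.1f} ops/s, radosgw {rgw_rate:.1f} ops/s, "
      f"loss {loss_pct(rados_rate, rgw_rate):.0f}%")
```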

[ 10Bytes ]

== rados ==

2014-08-04 22:34:04.151231 min lat: 0.008503 max lat: 1.72593 avg lat: 0.197445
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      4944      4894  0.00233314  0.000705719  0.250708 0.197445
 Total time run:         20.646440
Total writes made:      4945
Write size:             10
Bandwidth (MB/sec):     0.002

Stddev Bandwidth:       0.00130826
Max bandwidth (MB/sec): 0.00367165
Min bandwidth (MB/sec): 0
Average Latency:        0.208751
Stddev Latency:         0.223234
Max latency:            1.72593
Min latency:            0.008503

== radosgw ==

2014-08-04 23:51:41.327835 min lat: 0.055295 max lat: 3.28076 avg lat: 1.20371
2014-08-04 23:51:41.327835    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-04 23:51:41.327835     20      50       849       799  0.000380758  0.000324249   1.51293  1.20371
2014-08-04 23:51:42.328136     21      50       850       800  0.000363086  9.53674e-06   3.06026  1.20603
2014-08-04 23:51:43.328345     22      50       850       800  0.000346588            0         -  1.20603
2014-08-04 23:51:44.328556     23      50       850       800  0.000331524            0         -  1.20603
2014-08-04 23:51:45.328769     24      50       850       800  0.000317716            0         -  1.20603
2014-08-04 23:51:46.328989     25      50       850       800  0.000305011            0         -  1.20603
2014-08-04 23:51:47.329214     26      49       850       801   0.00029365  1.90735e-06   6.33663  1.21244
2014-08-04 23:51:48.329488 Total time run:         26.759887
Total writes made:      850
Write size:             10
Bandwidth (MB/sec):     0.000

Stddev Bandwidth:       0.000185797
Max bandwidth (MB/sec): 0.000543594
Min bandwidth (MB/sec): 0
Average Latency:        1.56037
Stddev Latency:         1.54032
Max latency:            9.433
Min latency:            0.055295



[ 1Kib ]

== rados ==

2014-08-04 22:38:12.770177 min lat: 0.006787 max lat: 2.03145 avg lat: 0.191323
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      5196      5146  0.251217 0.0810547  0.951518 0.191323
 Total time run:         20.694827
Total writes made:      5197
Write size:             1024
Bandwidth (MB/sec):     0.245

Stddev Bandwidth:       0.209637
Max bandwidth (MB/sec): 0.999023
Min bandwidth (MB/sec): 0
Average Latency:        0.199098
Stddev Latency:         0.263302
Max latency:            2.03145
Min latency:            0.006787

== radosgw ==

2014-08-04 22:39:16.448663 min lat: 0.058305 max lat: 5.84678 avg lat: 1.55047
2014-08-04 22:39:16.448663    sec Cur ops   started  finished  avg MB/s    cur MB/s  last lat  avg lat
2014-08-04 22:39:16.448663     20      50       666       616  0.0300611   0.0449219  0.611505  1.55047
2014-08-04 22:39:17.448976     21      45       667       622  0.0289088  0.00585938   1.15587  1.54779
2014-08-04 22:39:18.449173     22      45       667       622  0.0275952           0         -  1.54779
2014-08-04 22:39:19.449449 Total time run:         22.759544
Total writes made:      667
Write size:             1024
Bandwidth (MB/sec):     0.029

Stddev Bandwidth:       0.0214045
Max bandwidth (MB/sec): 0.0722656
Min bandwidth (MB/sec): 0
Average Latency:        1.69066
Stddev Latency:         1.11007
Max latency:            5.89554
Min latency:            0.058305

[ 10Kib ]

== rados ==

2014-08-04 22:41:01.756019 min lat: 0.003658 max lat: 1.33439 avg lat: 0.1979
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      5075      5025   2.45309   2.94922   0.13717 0.1979
 Total time run:         20.132932
Total writes made:      5076
Write size:             10240
Bandwidth (MB/sec):     2.462

Stddev Bandwidth:       1.57143
Max bandwidth (MB/sec): 6.03516
Min bandwidth (MB/sec): 0
Average Latency:        0.198299
Stddev Latency:         0.206342
Max latency:            1.33439
Min latency:            0.003658

== radosgw ==

2014-08-04 23:59:08.103196 min lat: 0.081382 max lat: 4.20567 avg lat: 1.21503
2014-08-04 23:59:08.103196    sec Cur ops   started  finished  avg MB/s   cur MB/s  last lat  avg lat
2014-08-04 23:59:08.103196     20      50       832       782  0.381632   0.273438   1.16601  1.21503
2014-08-04 23:59:09.103480     21      47       833       786  0.365322  0.0390625   3.06067  1.21989
2014-08-04 23:59:10.103669     22      45       833       788   0.34961  0.0195312   1.74456  1.22298
2014-08-04 23:59:11.103882     23      45       833       788  0.334413          0         -  1.22298
2014-08-04 23:59:12.104123 Total time run:         23.028529
Total writes made:      833
Write size:             10240
Bandwidth (MB/sec):     0.353

Stddev Bandwidth:       0.189902
Max bandwidth (MB/sec): 0.546875
Min bandwidth (MB/sec): 0
Average Latency:        1.36834
Stddev Latency:         0.93226
Max latency:            5.58653
Min latency:            0.081382


[ 100Kib ]

== rados ==

2014-08-04 22:43:12.878546 min lat: 0.00586 max lat: 1.92724 avg lat: 0.215224
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      4600      4550    22.212   19.3359  0.111257 0.215224
 Total time run:         20.668664
Total writes made:      4601
Write size:             102400
Bandwidth (MB/sec):     21.739

Stddev Bandwidth:       13.2295
Max bandwidth (MB/sec): 60.0586
Min bandwidth (MB/sec): 0
Average Latency:        0.224462
Stddev Latency:         0.226179
Max latency:            1.92724
Min latency:            0.00586

== radosgw ==

2014-08-04 23:54:52.136557 min lat: 0.121387 max lat: 5.76303 avg lat: 1.35267
2014-08-04 23:54:52.136557    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-04 23:54:52.136557     20      50       699       649   3.16721  0.341797   2.90547  1.35267
2014-08-04 23:54:53.136860     21      44       700       656   3.04896  0.683594    3.0446  1.37228
2014-08-04 23:54:54.137117 Total time run:         21.477508
Total writes made:      700
Write size:             102400
Bandwidth (MB/sec):     3.183

Stddev Bandwidth:       2.13648
Max bandwidth (MB/sec): 7.03125
Min bandwidth (MB/sec): 0
Average Latency:        1.5243
Stddev Latency:         1.20656
Max latency:            5.76303
Min latency:            0.121387


[ 500Kib ]

== rados ==

2014-08-04 22:45:55.736963 min lat: 0.028845 max lat: 3.00344 avg lat: 0.386006
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      1816      1766   43.1048         0         - 0.386006
    21      50      1816      1766   41.0521         0         - 0.386006
    22      50      1816      1766   39.1861         0         - 0.386006
    23      50      1816      1766   37.4825         0         - 0.386006
    24      50      1817      1767   35.9411 0.0697545   11.6738 0.392395
 Total time run:         24.548976
Total writes made:      1817
Write size:             512000
Bandwidth (MB/sec):     36.140

Stddev Bandwidth:       34.3547
Max bandwidth (MB/sec): 80.0781
Min bandwidth (MB/sec): 0
Average Latency:        0.675502
Stddev Latency:         1.76753
Max latency:            12.673
Min latency:            0.028845

== radosgw ==

2014-08-04 22:46:51.406535 min lat: 0.199692 max lat: 6.7487 avg lat: 1.95867
2014-08-04 22:46:51.406535    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-04 22:46:51.406535     20      50       490       440   10.7355   6.83594   4.42465  1.95867
2014-08-04 22:46:52.406851     21      50       491       441   10.2476  0.488281   4.87118  1.96528
2014-08-04 22:46:53.407047     22      50       491       441   9.78203         0         -  1.96528
2014-08-04 22:46:54.407243     23      50       491       441   9.35689         0         -  1.96528
2014-08-04 22:46:55.407438     24      50       491       441   8.96716         0         -  1.96528
2014-08-04 22:46:56.407725 Total time run:         24.562034
Total writes made:      491
Write size:             512000
Bandwidth (MB/sec):     9.761

Stddev Bandwidth:       7.0541
Max bandwidth (MB/sec): 23.9258
Min bandwidth (MB/sec): 0
Average Latency:        2.48753
Stddev Latency:         2.01542
Max latency:            10.482
Min latency:            0.199692


[ 1Mib ]

== rados ==

2014-08-04 22:48:05.348669 min lat: 0.059332 max lat: 5.30887 avg lat: 0.895115
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      50      1099      1049   51.2076   11.7188   3.94854 0.895115
 Total time run:         20.426123
Total writes made:      1100
Write size:             1024000
Bandwidth (MB/sec):     52.590

Stddev Bandwidth:       32.4589
Max bandwidth (MB/sec): 87.8906
Min bandwidth (MB/sec): 0
Average Latency:        0.927253
Stddev Latency:         0.945177
Max latency:            5.30887
Min latency:            0.059332

== radosgw ==

2014-08-05 00:01:57.779506 min lat: 0.291824 max lat: 11.4166 avg lat: 3.84634
2014-08-05 00:01:57.779506    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-05 00:01:57.779506     20      50       266       216   10.5389         0         -  3.84634
2014-08-05 00:01:58.779846     21      50       266       216   10.0373         0         -  3.84634
2014-08-05 00:01:59.780046     22      49       267       218   9.66998  0.651042   5.49671  3.86582
2014-08-05 00:02:00.780275     23      49       267       218   9.24974         0         -  3.86582
2014-08-05 00:02:01.780540 Total time run:         23.846098
Total writes made:      267
Write size:             1024000
Bandwidth (MB/sec):     10.934

Stddev Bandwidth:       8.56096
Max bandwidth (MB/sec): 35.1562
Min bandwidth (MB/sec): 0
Average Latency:        4.44906
Stddev Latency:         2.73214
Max latency:            11.4166
Min latency:            0.291824


[ 5Mib ]

== rados ==

2014-08-04 22:50:20.362025 min lat: 1.95706 max lat: 8.57053 avg lat: 3.77608
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      49       290       241   58.8244   107.422   2.18729 3.77608
    21      17       291       274   63.6944   161.133   1.15035 3.50209
    22      17       291       274   60.7992         0         - 3.50209
    23      15       291       276   58.5804   4.88281   3.83261 3.50451
 Total time run:         23.359546
Total writes made:      291
Write size:             5120000
Bandwidth (MB/sec):     60.827

Stddev Bandwidth:       52.3299
Max bandwidth (MB/sec): 161.133
Min bandwidth (MB/sec): 0
Average Latency:        3.51905
Stddev Latency:         1.97284
Max latency:            8.57053
Min latency:            0.995771

== radosgw ==

2014-08-05 00:03:36.219328 min lat: 4.10734 max lat: 16.7196 avg lat: 8.66002
2014-08-05 00:03:36.219328    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-05 00:03:36.219328     20      50       138        88   21.4675   9.76562   7.59011  8.66002
2014-08-05 00:03:37.219679     21      50       138        88   20.4456         0         -  8.66002
2014-08-05 00:03:38.219884     22      50       139        89   19.7386   2.44141   12.5832  8.70411
2014-08-05 00:03:39.220072     23      50       139        89   18.8808         0         -  8.70411
2014-08-05 00:03:40.220275     24      50       139        89   18.0945         0         -  8.70411
2014-08-05 00:03:41.220484     25      50       139        89   17.3711         0         -  8.70411
2014-08-05 00:03:42.220688     26      44       139        95   17.8293   7.32422   8.48089  8.92694
2014-08-05 00:03:43.220929 Total time run:         26.178957
Total writes made:      139
Write size:             5120000
Bandwidth (MB/sec):     25.926

Stddev Bandwidth:       18.0919
Max bandwidth (MB/sec): 68.3594
Min bandwidth (MB/sec): 0
Average Latency:        9.39576
Stddev Latency:         3.19278
Max latency:            18.8883
Min latency:            4.10734


[ 20Mib ]

== rados ==

2014-08-04 22:52:59.584809 min lat: 9999 max lat: 0 avg lat: 0
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    20      34        34         0         0         0 -         0
    21      36        36         0         0         0 -         0
    22      38        38         0         0         0 -         0
    23      39        39         0         0         0 -         0
    24      41        41         0         0         0 -         0
    25      42        42         0         0         0 -         0
    26      42        42         0         0         0 -         0
    27      43        43         0         0         0 -         0
    28      46        46         0         0         0 -         0
    29      48        48         0         0         0 -         0
    30      41        50         9   5.81711   5.85938   28.2801 29.4592
    31       4        50        46   28.7793   722.656   3.37337 18.5771
    32       1        50        49   29.7045   58.5938   3.48328 17.6497
 Total time run:         32.246896
Total writes made:      50
Write size:             20480000
Bandwidth (MB/sec):     30.284

Stddev Bandwidth:       125.863
Max bandwidth (MB/sec): 722.656
Min bandwidth (MB/sec): 0
Average Latency:        17.3433
Stddev Latency:         9.29014
Max latency:            30.1162
Min latency:            2.33119

== radosgw ==

2014-08-04 22:54:03.379562 min lat: 13.3435 max lat: 19.4876 avg lat: 17.8951
2014-08-04 22:54:03.379562    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-04 22:54:03.379562     20      49        64        15   14.5565   136.719   19.4876  17.8951
2014-08-04 22:54:04.379954     21      50        65        15   13.8672         0         -  17.8951
2014-08-04 22:54:05.380137     22      50        65        15   13.2404         0         -  17.8951
2014-08-04 22:54:06.380336     23      50        65        15   12.6677         0         -  17.8951
2014-08-04 22:54:07.380551     24      50        65        15   12.1426         0         -  17.8951
2014-08-04 22:54:08.380742     25      50        65        15   11.6593         0         -  17.8951
2014-08-04 22:54:09.380915     26      50        65        15   11.2129         0         -  17.8951
2014-08-04 22:54:10.381107     27      50        65        15   10.7995         0         -  17.8951
2014-08-04 22:54:11.381314     28      50        65        15   10.4155         0         -  17.8951
2014-08-04 22:54:12.381502     29      50        65        15   10.0579         0         -  17.8951
2014-08-04 22:54:13.381701     30      48        65        17   11.0205   3.90625   29.8248  19.2985
2014-08-04 22:54:14.381900     31      47        65        18   11.2938   19.5312   16.1941  19.1261
2014-08-04 22:54:15.382104     32      47        65        18   10.9422         0         -  19.1261
2014-08-04 22:54:16.538449 Total time run:         32.852659
Total writes made:      65
Write size:             20480000
Bandwidth (MB/sec):     38.643

Stddev Bandwidth:       24.8996
Max bandwidth (MB/sec): 136.719
Min bandwidth (MB/sec): 0
Average Latency:        24.9568
Stddev Latency:         8.35865
Max latency:            32.807
Min latency:            13.2119

[ 40Mib ]

== rados ==

2014-08-04 22:56:17.539373 min lat: 9999 max lat: 0 avg lat: 0
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat avg lat
    40      49        49         0         0         0 -         0
    41      49        49         0         0         0 -         0
    42       1        50        49   45.5638   45.5729   3.07695 21.9831
 Total time run:         42.303319
Total writes made:      50
Write size:             40960000
Bandwidth (MB/sec):     46.170

Stddev Bandwidth:       6.9498
Max bandwidth (MB/sec): 45.5729
Min bandwidth (MB/sec): 0
Average Latency:        21.6088
Stddev Latency:         12.1152
Max latency:            41.2854
Min latency:            3.07695

== radosgw ==

2014-08-04 23:04:17.740359 min lat: 9999 max lat: 0 avg lat: 0
2014-08-04 23:04:17.740359    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat  avg lat
2014-08-04 23:04:17.740359     40      50        50         0         0         0         -        0
2014-08-04 23:04:18.740650     41      50        50         0         0         0         -        0
2014-08-04 23:04:19.740852     42      50        50         0         0         0         -        0
2014-08-04 23:04:20.741063     43      50        50         0         0         0         -        0
2014-08-04 23:04:21.741239     44      50        50         0         0         0         -        0
2014-08-04 23:04:22.742059     45      49        51         2   1.73223   1.73611   44.3911  44.3332
2014-08-04 23:04:23.742235     46      49        51         2   1.69465         0         -  44.3332
2014-08-04 23:04:24.742429     47      49        51         2   1.65866         0         -  44.3332
2014-08-04 23:04:25.742675 Total time run:         47.742303
Total writes made:      51
Write size:             40960000
Bandwidth (MB/sec):     41.728

Stddev Bandwidth:       0.250586
Max bandwidth (MB/sec): 1.73611
Min bandwidth (MB/sec): 0
Average Latency:        46.4062
Stddev Latency:         6.17721
Max latency:            47.7395
Min latency:            3.36783

Regards,
Osier
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
