Hello,

I have 3 servers, each with 3 HDD OSDs (4 TB WD RE). I use the cluster primarily through radosgw. Every .rgw.* pool has 3 replicas. Every server also runs a RADOS gateway with apache2 + fastcgi (the Ceph-patched version). Server type: SuperMicro SSG-6047R-E1R36L.
I have 10 client machines that upload objects in parallel. Object size varies, but is typically between 50 and 500 KiB. haproxy is installed on every client machine, and the upload script connects to localhost (haproxy), which distributes requests round-robin across the three rgw servers.

Typical upload times:

Best: 260 ms
Avg: 775 ms
Worst: 1826 ms

Uploads that take 10 seconds are not rare, even for objects of 74 KiB (or similarly small).

Sample log entry from radosgw:

2 req 157744:0.000124::PUT /snd1_rec/20140111_human.mp4::initializing
2 req 157744:0.000214:s3:PUT /snd1_rec/20140111_human.mp4::getting op
2 req 157744:0.000230:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:authorizing
2 req 157744:0.000518:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:reading permissions
2 req 157744:0.000662:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:init op
2 req 157744:0.000679:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:verifying op mask
2 req 157744:0.000689:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:verifying op permissions
2 req 157744:0.000699:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:verifying op params
2 req 157744:0.000727:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:executing
2 req 157744:11.025770:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:http status=200

There are 11 seconds between the last two log lines, that is, between "executing" and the HTTP status being returned. I don't think radosgw itself is the slowest part, but I don't know what is. I'm monitoring many resources on the servers (vmstat, disk throughput, disk I/O, CPU...) and nothing appears to be exhausted.

I don't know how important this is, but containers with 4 million objects are not rare.

Ubuntu 12.04 x64 with Ceph 0.72.1-1precise.

Thank you,
Mihaly
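P.S. To find which stage a slow request spends its time in, the per-request timestamps in the radosgw log can be compared pairwise. The following is only a minimal sketch, assuming every relevant line has the same shape as the sample entry above (`req <id>:<elapsed seconds>:<api>:<method> <path>:<op>:<stage>`) and that object paths contain no colon. The names `LINE_RE`, `stage_gaps` and `SAMPLE` are made up for this illustration and are not part of Ceph.

```python
import re

# Assumed line shape, taken from the sample entry in this mail:
#   2 req 157744:0.000727:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:executing
# Assumption: the object path itself contains no ':' character.
LINE_RE = re.compile(
    r"req (?P<req>\d+):(?P<elapsed>\d+\.\d+):(?P<api>[^:]*):"
    r"(?P<method>[A-Z]+) (?P<path>[^:]+):(?P<op>[^:]*):(?P<stage>.*)$"
)


def stage_gaps(lines):
    """Return (gap_seconds, req_id, from_stage, to_stage) for each pair of
    consecutive log lines that belong to the same request id."""
    last = {}  # req id -> (elapsed seconds, stage name) of the previous line
    gaps = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a request-stage line, skip it
        req = m.group("req")
        elapsed = float(m.group("elapsed"))
        stage = m.group("stage").strip()
        if req in last:
            prev_elapsed, prev_stage = last[req]
            gaps.append((elapsed - prev_elapsed, req, prev_stage, stage))
        last[req] = (elapsed, stage)
    return gaps


# Three of the lines from the sample entry above.
SAMPLE = [
    "2 req 157744:0.000699:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:verifying op params",
    "2 req 157744:0.000727:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:executing",
    "2 req 157744:11.025770:s3:PUT /snd1_rec/20140111_human.mp4:put_obj:http status=200",
]

if __name__ == "__main__":
    # Print the largest gap first.
    for gap, req, src, dst in sorted(stage_gaps(SAMPLE), reverse=True):
        print(f"req {req}: {gap:.6f} s between '{src}' and '{dst}'")
```

Run over a whole log file (for example `stage_gaps(open(path))`, with `path` pointing at the radosgw log), this would show whether the long gaps always fall between "executing" and the HTTP status line, as they do in the sample.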