Good afternoon Charlie,

So the issue we’re having is only with bucket listing.
alxndrmlr@alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion
DIR s3://bonfirehub-resources-can-east-doc-conversion/organizations/

real 2m0.747s
user 0m0.076s
sys 0m0.030s

whereas a listing further down the tree is much faster:

alxndrmlr@alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals
DIR s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals/

real 0m10.262s
user 0m0.075s
sys 0m0.028s

This bucket contains a lot of very small files (basically, for each PDF we receive I split it into one .JPG per page and store them here). Based on my latest counts, it looks like we have around 170,000 .JPG files in that bucket.

Here’s a snippet from the HAProxy log for the 504 timeouts:

Aug 12 16:01:34 localhost.localdomain haproxy[4718]: 192.0.223.236:48457 [12/Aug/2014:16:01:24.454] riak_cs~ riak_cs_backend/riak3 161/0/0/-1/10162 504 194 - - sH-- 0/0/0/0/0 0/0 {bonfirehub-resources-can-east-doc-conversion.bf-riakcs.com} "GET /?delimiter=/ HTTP/1.1"

I’ve put together a video showing the top output on each of the 5 Riak nodes while running

$ time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion

https://dl.dropboxusercontent.com/u/5723659/RiakCS%20ls%20monitoring%20results.mov

Now, I’ve had a hunch that this is just a fundamentally expensive operation which exceeds the 5000ms response time threshold set in our HAProxy config (which I raised during the video to illustrate what’s going on). After reading

http://www.quora.com/Riak/Is-it-really-expensive-for-Riak-to-list-all-buckets-Why

and

http://www.paperplanes.de/2011/12/13/list-all-of-the-riak-keys.html

I’m feeling like this is just a fundamental issue with the data structure in Riak.
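
As an aside, for anyone reading the HAProxy line above: the timers and termination flags can be pulled apart with a short script. This is only a rough sketch. It assumes HAProxy's default HTTP log format, where the five slash-separated numbers are Tq/Tw/Tc/Tr/Tt in milliseconds and the four-character field after the cookies is the termination state. The function name parse_haproxy_line and the dictionary keys are made up for illustration and are not part of any HAProxy tooling.

```python
import re

# Sample line copied verbatim from the HAProxy log quoted above.
LOG_LINE = (
    'Aug 12 16:01:34 localhost.localdomain haproxy[4718]: '
    '192.0.223.236:48457 [12/Aug/2014:16:01:24.454] riak_cs~ '
    'riak_cs_backend/riak3 161/0/0/-1/10162 504 194 - - sH-- '
    '0/0/0/0/0 0/0 '
    '{bonfirehub-resources-can-east-doc-conversion.bf-riakcs.com} '
    '"GET /?delimiter=/ HTTP/1.1"'
)

# Assumed layout (default HTTP log format):
#   Tq/Tw/Tc/Tr/Tt  status  bytes  req-cookie  resp-cookie  termination-state
PATTERN = re.compile(
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/(?P<tt>\+?\d+)'
    r' (?P<status>\d{3}) (?P<bytes>\+?\d+) \S+ \S+ (?P<state>\S{4}) '
)


def parse_haproxy_line(line):
    """Extract timers, status and termination state from one log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like an HAProxy HTTP log entry")
    state = match.group("state")
    return {
        "tq_ms": int(match.group("tq")),   # time to receive the full request
        "tw_ms": int(match.group("tw")),   # time spent waiting in queues
        "tc_ms": int(match.group("tc")),   # time to connect to the server
        "tr_ms": int(match.group("tr")),   # server response time, -1 if none
        "tt_ms": int(match.group("tt")),   # total session time
        "status": int(match.group("status")),
        "state": state,
        # 's' = server-side timeout expired, 'H' = waiting for response headers
        "server_header_timeout": state[0] == "s" and state[1] == "H",
    }


if __name__ == "__main__":
    for key, value in parse_haproxy_line(LOG_LINE).items():
        print(f"{key}: {value}")
```

Read this way, the entry says the backend was reached immediately (Tc of 0) but no response headers came back before the server timeout fired (Tr of -1, state sH), which fits a slow listing on the Riak CS side better than a connection problem.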
Based on that reading, I’m thinking that the cost of this type of query is only going to get worse over time as we add more keys to this bucket (unless secondary indexes can be added). Or am I totally out to lunch here and there’s some other underlying problem?

I’ve cc’d the mailing list on this as suggested.

Alex Millar, CTO
Office: 1-800-354-8010 ext. 704
Mobile: 519-729-2539
GoBonfire.com

From: Charlie Voiselle <cvoise...@basho.com>
Reply: Charlie Voiselle <cvoise...@basho.com>
Date: August 13, 2014 at 10:36:51 AM
To: Alex Millar <a...@gobonfire.com>
Cc: Tad Bickford <tbickf...@basho.com>
Subject: Fwd: RiakCS 504 Timeout on s3cmd for certain keys