Dear Riak users,

we are considering storing a large number (50,000,000) of binary objects (images) in a Riak cluster. Each image is 648 KB in size, and we want to store 3 copies of each image.
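The raw storage requirement implied by these figures can be checked with a short sketch. It assumes binary units (1 KB = 1024 bytes, 1 TB = 1024^3 KB) and ignores any storage backend or reorganisation overhead; the function name is made up for illustration.

```python
# Raw storage estimate for the image set described above.
# Assumption: binary units (1 TB = 1024**3 KB); no backend overhead included.
IMAGE_COUNT = 50_000_000
IMAGE_SIZE_KB = 648
REPLICAS = 3  # number of copies kept of each image


def raw_storage_tb(count: int, size_kb: int, replicas: int) -> float:
    """Return the raw storage requirement in TB (binary units)."""
    total_kb = count * size_kb * replicas
    return total_kb / 1024**3  # KB -> MB -> GB -> TB


if __name__ == "__main__":
    tb = raw_storage_tb(IMAGE_COUNT, IMAGE_SIZE_KB, REPLICAS)
    print(f"raw storage: {tb:.1f} TB")
```

In decimal units (1 TB = 10^9 KB) the same figures come to 97.2 TB, so the result depends on which convention the disks are specified in.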
In this case I need to store 50000000 * 648 KB * 3 = 90.5 TB of data. This calculation does not include any overhead for reorganisation and other things.

The other consideration is the network. I ran some benchmarks on a 4-node cluster, each node with a 1 Gbps interface. In addition to the benchmarks I made some calculations.

Some information about the benchmark:
- I use the same interface for cluster communication and benchmarking.
- I use the Riak HTTP API.
- time curl -s HTTP://interface:8098/buckets/test-01/keys/[10001-20000].jpg > /dev/null

In theory, a 1 Gbps interface provides 125 MB per second. In my calculation I only use 50 percent of the theoretically available bandwidth (62.5 MB per second). This fits my benchmarks very well. I experimented for a while with the bucket property '{"props":{"r":X}}'.

Calculation “r=2”:
available bandwidth = 62.5 MB per second / (3 * 648 KB) = about 33 requests per second per node = 132 requests per second over the cluster.

Calculation “r=1”:
available bandwidth = 62.5 MB per second / (2 * 648 KB) = about 50 requests per second per node = 200 requests per second over the cluster.

In this second case I see some strange effects in the network. My send and receive queues grow very fast, and after the benchmark finishes there is a lot of traffic between the Riak nodes for a while.

Does anyone have experience with data sets like this and can give a few hints about a possible setup? The goal is to process at least 500 requests per second.

Another point I am considering is the time required for a reorganisation after a new node is added to the cluster or a node has been replaced.

Many thanks for your reply and your attention.

Kind regards
Sebastian

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
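For reference, the bandwidth estimate in the message above can be reproduced with a minimal sketch. It assumes that each request moves (r + 1) image-sized transfers over a node's interface, which is one reading of the factors 3 and 2 in the calculation, and that 1 MB = 1024 KB. The node count for 500 requests per second assumes throughput scales linearly with nodes, which a real cluster may not do. All names are made up for illustration.

```python
import math

# Figures taken from the message; the (r + 1) transfer model is an assumption.
LINK_MB_PER_S = 125.0   # theoretical throughput of a 1 Gbps interface
USABLE_FRACTION = 0.5   # only half of the link is assumed to be usable
IMAGE_SIZE_KB = 648
NODES = 4


def requests_per_node(r: int) -> float:
    """Estimated requests per second one node can serve for a given r."""
    usable_kb_per_s = LINK_MB_PER_S * USABLE_FRACTION * 1024
    transfers_per_request = r + 1  # assumed: r replica reads plus 1 response
    return usable_kb_per_s / (transfers_per_request * IMAGE_SIZE_KB)


def nodes_needed(target_rps: float, r: int) -> int:
    """Nodes required for target_rps, assuming purely linear scaling."""
    return math.ceil(target_rps / requests_per_node(r))


if __name__ == "__main__":
    for r in (2, 1):
        per_node = requests_per_node(r)
        print(
            f"r={r}: {per_node:.1f} req/s per node, "
            f"{per_node * NODES:.1f} req/s on {NODES} nodes, "
            f"{nodes_needed(500, r)} nodes for 500 req/s"
        )
```

Under this model the network interface alone puts the 500 requests per second target well beyond a 4-node cluster with a single shared 1 Gbps link per node.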