On Thu, Jul 28, 2011 at 12:00 PM, Kunal Nawale <knaw...@verivue.com> wrote:
> Hi,
> I am evaluating Luwak to be used as a redundant file storage server. I
> am trying to find out which backend will better suit my purpose. Each
> of my servers has sixteen 1TB drives (4 servers total), 48GB RAM, and
> one 10Gb network interface.
> The file sizes that will be stored range from 1GB-20GB, with an
> average size of 3GB.
>
> Here are some observations/questions I had regarding this.
>
> 1) With the bitcask backend, I tried uploading a 6GB file. The upload
> and download worked fine for this file. But when I tried to upload a
> 17GB file, it took a very long time (more than 20 mins). I tried to
> download it but did not succeed; the download always came back with a
> size of 1,000,000 bytes.
There are a few things that can cause these troubles. Have you checked
the logs to see whether there were any errors during any of these
operations?

On the upload side, it's possible that 6GB stands on one side of a
boundary and 17GB on the other. I'd suggest searching the size space in
a binary fashion: does 11.5GB work? If there is a boundary, this is a
good way to find it (there's a sketch of such a probe loop at the end
of this reply). It might be worth trying this both in the case where
the Riak cluster is cleaned out after each attempt, and where it is
left running with all data for all attempts. Does changing the
"block_size" Luwak parameter (controlled by the X-Luwak-Block-Size HTTP
header when you create the file) change where the boundary is?

Still on the upload side, where was the upload client running? That
client may have been applying extra memory pressure to one of your
nodes if it was on the same machine as the cluster.

On the download side, if you haven't modified the block_size of your
Luwak files, 1,000,000 indicates that there is exactly one block in the
file. If this was an existing file of 1MB, then this just means that
your 17GB upload failed before flushing the tree for the new data.
We've also noticed that some clients (like Firefox) have trouble
parsing Luwak's chunked response, due to an error in gzip encoding -
try explicitly setting Accept-Encoding to only identity (the probe
sketch at the end of this reply sets both this header and the block
size).

> 2) I also tried fs_backend, but it turned out to be quite slow; the
> upload of a 6GB file took considerably longer. The download never
> succeeded; it always returned a chunk of that file, not the whole
> file.

The fs_backend was written as proof/testing code. It is not optimized
for any variety of speed. Best to stick with bitcask for your use case,
I think.

> 3) Are there any performance measurements available about the
> read/write bandwidths?

None directly, but you should be able to estimate the write speed:
Luwak creates a Riak object for every N bytes of your file (N is known
as the "block size"). Luwak will not be able to write these objects
faster than any other Riak client. For example, with the default
1,000,000-byte block size, a 3GB file is roughly 3,000 Riak object
writes, so your cluster's sustained write throughput bounds the upload
speed.

> 5) Can an object be read simultaneously while it is being written,
> with a lag between the read and write pointers in the range of 60
> MBytes?

Questions 4 and 5 are related, and I think it's easier to answer 5
first. The simple answer is that *new* Luwak files cannot be read while
being written. This has to do with the fact that the HTTP interface
does not expose a way to flush the file's tree to Riak before finishing
the upload. The data for the file is being persisted, but the root
pointer is not modified until the end. This also means that *existing*
Luwak files *can* be read while being modified, but modifications made
after the root pointer is found will be invisible.

> 4) Are there any latency numbers available? I am specifically looking
> at the time difference between the first byte read and the last byte
> written for an object.

I'm interpreting this question as, "After I finish writing, how long
will it be before I can begin reading what I just wrote?" because of
the tree-flushing behavior I described above. The answer depends on the
backlog to the Luwak writer process (on the Riak/server side), and the
depth of the resulting file tree. Once the writer has written the final
block to an object, it must then flush the tree pointing to that
object. Flushing the tree requires writing, at least, the root node and
the "tld" object (where the metadata about the file is stored). The
block_size and tree_order parameters (1,000,000 bytes and 250, by
default) determine how many other nodes must be written between the
root and the block. Each node is an additional Riak object write.
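To put rough numbers on the tree-flush cost just described, here is a
quick back-of-the-envelope sketch in Python. The depth formula is my
own approximation of the description above, not something lifted from
Luwak's source, so treat the output as an order-of-magnitude guide
only:

    import math

    def flush_writes(file_size, block_size=1000000, tree_order=250):
        """Riak object writes needed to flush the tree after the last
        block: one per tree level on the root-to-block path, plus the
        "tld" metadata object."""
        blocks = max(1, math.ceil(file_size / block_size))
        # Depth of an order-`tree_order` tree wide enough to point at
        # `blocks` leaf blocks.
        levels = max(1, math.ceil(math.log(blocks, tree_order)))
        return levels + 1  # path nodes + the tld object

    for gb in (1, 3, 6, 17):
        size = gb * 10**9
        print("%2d GB -> %5d blocks, ~%d writes to flush the tree"
              % (gb, math.ceil(size / 1000000), flush_writes(size)))

Note how shallow the tree stays even at 17GB; the backlog to the writer
process mentioned above will usually dominate the latency you observe.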
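And circling back to the boundary search suggested at the top: here is
a minimal probe sketch, assuming a node at localhost:8098 with Luwak's
HTTP interface at its default /luwak location, and using the
third-party Python "requests" library. The probe-<size>.bin keys are
made up for this test; only the X-Luwak-Block-Size and Accept-Encoding
headers come from the discussion above.

    import requests

    BASE = "http://localhost:8098/luwak"
    BLOCK_SIZE = 1000000  # Luwak's default block size, in bytes

    def chunks(total, piece=64 * 1024 * 1024):
        """Yield `total` bytes of dummy data without holding it all
        in RAM."""
        sent = 0
        while sent < total:
            n = min(piece, total - sent)
            yield b"\0" * n
            sent += n

    def probe(size):
        """Upload `size` bytes to a fresh key, then download it and
        compare sizes."""
        url = "%s/probe-%d.bin" % (BASE, size)
        put = requests.put(url, data=chunks(size), headers={
            "Content-Type": "application/octet-stream",
            # Only takes effect when the file is created:
            "X-Luwak-Block-Size": str(BLOCK_SIZE),
        })
        if not put.ok:
            return False
        # Ask for an un-gzipped body, to sidestep the chunked/gzip
        # issue described above.
        resp = requests.get(url, stream=True,
                            headers={"Accept-Encoding": "identity"})
        if not resp.ok:
            return False
        got = 0
        for part in resp.iter_content(chunk_size=1 << 20):
            got += len(part)
        resp.close()
        return got == size

    # Bisect between the known-good 6GB upload and the failing 17GB
    # one, stopping once the interval is down to a single block.
    low, high = 6 * 10**9, 17 * 10**9
    while high - low > BLOCK_SIZE:
        mid = (low + high) // 2
        if probe(mid):
            low = mid    # mid works; any boundary is above it
        else:
            high = mid   # mid fails; the boundary is at or below it
    print("uploads succeed up to ~%d bytes" % low)

Running it against a scratch cluster (once with the cluster wiped
between probes, once without, as suggested above) should show whether
there is a hard size boundary and roughly where it sits.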
I hope that helps,
Bryan