I am currently load testing Riak using riak_0.14.2-1_amd64.deb with fs.file-max set to 503840 for all users.
I have a reasonably large set of data (hundreds of millions of documents, many terabytes in size) that is currently stored in a combination of PostgreSQL+Redis and Disco/DDFS — the first pair for key/value access and the second for map/reduce — to satisfy the full set of user requirements. I am trying to consolidate these data sources, so I am evaluating a variety of data stores that could potentially satisfy both usage patterns.

With Riak, my main challenge is getting this data loaded. Using the PHP library I am able to push 100-200 documents/sec. Is there a recommended approach to bulk loading data? At that pace it would take a couple of months to load everything. That is not necessarily a deal breaker, but I wanted to sniff around for better options.

Related to this, I did attempt to break up my records and load them with a bunch of concurrently running loaders. This actually seems to work fairly well, with not much of a penalty in documents/sec on any single loader process. But once I reach 4-5 loaders running concurrently, I consistently get the "Could not contact Riak Server" error and all of my loader processes die simultaneously. If I wait a few seconds, the Riak server does begin to respond again.

Any ideas for approaching this differently? Is attempting to run many loaders concurrently a bad idea with Riak? I am running a single server right now while I test, with the bucket's n_val set to 1.
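For reference, a rough sketch of what a single loader process does is below; the host, port, bucket name, and backoff values are illustrative rather than copied from my actual code. It simply PUTs each document to Riak's HTTP interface and backs off with retries when the node stops responding — which is the failure mode I am hitting with multiple concurrent loaders.

<?php
// Minimal sketch of one loader process, assuming Riak's HTTP interface on the
// default port and a hypothetical "documents" bucket. Stores each document with
// an HTTP PUT and retries with exponential backoff when the node is unreachable.

$riakUrl    = 'http://127.0.0.1:8098/riak/documents';  // hypothetical bucket URL
$maxRetries = 5;

function storeDocument($url, $key, $json, $maxRetries) {
    $delay = 1;                                        // seconds; doubled on each retry
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $ch = curl_init("$url/" . rawurlencode($key));
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
        curl_setopt($ch, CURLOPT_POSTFIELDS, $json);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $errno  = curl_errno($ch);
        curl_close($ch);

        if ($errno === 0 && ($status == 200 || $status == 204)) {
            return true;                               // stored successfully
        }
        sleep($delay);                                 // node unreachable or error: back off
        $delay *= 2;
    }
    return false;                                      // give up on this document
}

// Each loader reads key<TAB>json pairs from stdin, one document per line.
$in = fopen('php://stdin', 'r');
while (($line = fgets($in)) !== false) {
    list($key, $json) = explode("\t", rtrim($line, "\n"), 2);
    if (!storeDocument($riakUrl, $key, $json, $maxRetries)) {
        fwrite(STDERR, "failed to store $key after retries\n");
    }
}
fclose($in);

Reusing a single curl handle across requests (HTTP keep-alive) would cut per-request overhead, but the retry/backoff loop is the part relevant to the "Could not contact Riak Server" error above.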