So, essentially, creation of new keys/objects is much slower than an update to already existing keys/objects? Just want to make sure I'm following your description properly.

Jonathan Langevin
Systems Administrator
Loom Inc.
Wilmington, NC: (910) 241-0433 - jlange...@loomlearning.com - www.loomlearning.com - Skype: intel352

On Tue, Aug 30, 2011 at 12:30 PM, Kresten Krab Thorup <k...@trifork.com> wrote:

> If you can do the inserts in sorted (ascending) key order, then innostore
> will be significantly faster.
>
> Kresten
>
> Mobile: + 45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
> Trifork A/S | Margrethepladsen 4 | DK- 8000 Aarhus C | Phone : +45 8732 8787 | www.trifork.com
>
> Trifork organizes the world class conference on software development: GOTO
> Aarhus - check it out!
>
> On Aug 30, 2011, at 6:14 PM, David Koblas wrote:
>
> > I'm currently working on importing a very large dataset (800M) into Riak
> > and running into some serious performance problems. Hopefully this is just
> > configuration issues and nothing deeper...
> >
> > Hardware -
> > * 8 proc box
> > * 32 Gb ram
> > * 5TB disk - RAID10
> >
> > Have a cluster of 4 for these boxes all running riak - riak configuration
> > options that are different from stock:
> >
> > * Listening on all IP address "0.0.0.0"
> > * {storage_backend, riak_kv_innostore_backend},
> > * innostore section - {buffer_pool_size, 17179869184}, %% 16GB
> > * innostore section - {flush_method, "O_DIRECT"}
> >
> > What I see is that the performance of my import script runs at about
> > 200...300 keys per/second for keys that it's seen recently (e.g. re-runs)
> > then drops to 20ish keys per/sec for new keys.
> >
> > STATS: 1000 keys handled in 3 seconds 250.75 keys/sec
> > STATS: 1000 keys handled in 3 seconds 258.20 keys/sec
> > STATS: 1000 keys handled in 4 seconds 240.11 keys/sec
> > STATS: 1000 keys handled in 5 seconds 177.63 keys/sec
> > STATS: 1000 keys handled in 4 seconds 246.26 keys/sec
> > STATS: 1000 keys handled in 5 seconds 184.79 keys/sec
> > STATS: 1000 keys handled in 5 seconds 195.95 keys/sec
> > STATS: 1000 keys handled in 47 seconds 21.02 keys/sec
> > STATS: 1000 keys handled in 44 seconds 22.63 keys/sec
> > STATS: 1000 keys handled in 42 seconds 23.64 keys/sec
> > STATS: 1000 keys handled in 43 seconds 22.88 keys/sec
> > STATS: 1000 keys handled in 45 seconds 22.12 keys/sec
> > STATS: 1000 keys handled in 43 seconds 22.83 keys/sec
> > STATS: 1000 keys handled in 43 seconds 23.11 keys/sec
> >
> > Of course with 800M records to import a performance of 20 keys/sec is not
> > useful, plus as time goes on having an insert rate at that level is going to
> > be problematic.
> >
> > Questions -
> > Is there additional things to change for imports and datasets on this
> > scale?
> > Is there a way to get additional debugging to see where the performance
> > issues are?
> >
> > Thanks,
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
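
For what it's worth, here is a rough sketch of what Kresten's suggestion (feeding the inserts in ascending key order) could look like on the import-script side. Nothing here comes from the thread itself: `iter_sorted`, `import_sorted`, and the `store` callback are hypothetical names, and `store` stands in for whatever call the import script already uses to write one object to Riak. Since 800M records are unlikely to fit in memory, it sorts fixed-size chunks, spills them to temp files, and merges them. It assumes keys contain no tabs or newlines, values contain no newlines, and that plain string ordering matches the key order the backend sees.

```python
import heapq
import os
import tempfile
from itertools import islice


def _write_sorted_chunk(chunk):
    """Sort one in-memory chunk by key and spill it to a temp file."""
    chunk.sort(key=lambda kv: kv[0])
    fd, path = tempfile.mkstemp(suffix=".chunk")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for key, value in chunk:
            f.write(f"{key}\t{value}\n")
    return path


def _read_chunk(path):
    """Yield (key, value) pairs back from a spilled chunk file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t", 1)
            yield key, value


def iter_sorted(records, chunk_size=100_000):
    """Yield (key, value) pairs in ascending key order (external merge sort)."""
    records = iter(records)
    paths = []
    try:
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            paths.append(_write_sorted_chunk(chunk))
        # Merge the already-sorted chunk files into one ascending stream.
        readers = [_read_chunk(p) for p in paths]
        yield from heapq.merge(*readers, key=lambda kv: kv[0])
    finally:
        for p in paths:
            os.remove(p)


def import_sorted(records, store, chunk_size=100_000):
    """Call store(key, value) for every record, in ascending key order."""
    count = 0
    for key, value in iter_sorted(records, chunk_size):
        store(key, value)
        count += 1
    return count
```

Usage would be something like `import_sorted(read_dump(), put_object)`, where both names are placeholders for the script's own reader and writer. Whether this recovers the 200-300 keys/sec rate on fresh keys is something only a test run on the actual cluster can show.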