I will be running informatics and modeling experiments in the future that may involve tens or hundreds of millions of objects. Given how easy elephant has been to use so far, it would be great to use it as the persistent store and avoid creating too many custom data structures.

I have recently run up against some performance bottlenecks when using elephant to work with very large datasets (hundreds of millions of objects). Using the Sleepycat backend, I am able to import data very quickly using a DB_CONFIG file with the following contents:

set_lk_max_locks 500000
set_lk_max_objects 500000
set_lk_max_lockers 500000
set_cachesize 1 0 0

I can import data very quickly until the 1 GB cache becomes too small to keep the working set of the database in memory; at that point disk I/O makes additional writes much slower. (I have also tried increasing the cache beyond 1 GB, but the database fails to open if the cache is too large, e.g. 2 GB. I have 1.25 GB of physical memory and 4 GB of swap, so the constraint seems to be physical memory.) The set_lk_max_* lines allow a transaction to hold hundreds of thousands of individual locks, which keeps lock exhaustion from becoming the transaction-throughput bottleneck.
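(A side note I have not verified: set_cachesize takes three arguments, gigabytes, bytes, and the number of cache regions, so a larger cache can in principle be split across several regions rather than allocated as one contiguous block, e.g.

set_cachesize 2 0 2

but I don't know whether that would avoid the open failure I'm seeing, given how little physical memory this machine has.)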

What are the technical restrictions on writing several million objects to the datastore? Is it feasible to create a batch import feature to allow large datasets to be imported using reasonable amounts of memory for a desktop computer?
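To make the question concrete, here is roughly the kind of thing I imagine a batch import doing (an untested sketch; batch-import, chunk-size, and the key/value list format are just invented for illustration): commit a transaction every N writes so that no single transaction needs to hold an enormous number of locks.

(defun batch-import (pairs btree &key (chunk-size 10000))
  "Insert PAIRS, a list of (key . value) conses, into BTREE,
   committing after every CHUNK-SIZE insertions so that each
   transaction only holds locks for one chunk at a time."
  (loop while pairs
        do (elephant:with-transaction ()
             (loop repeat chunk-size
                   while pairs
                   do (destructuring-bind (key . value) (pop pairs)
                        (setf (elephant:get-value key btree) value))))))

;; Hypothetical usage:
;;   (elephant:open-store '(:BDB "/path/to/env"))
;;   (let ((bt (elephant:with-transaction () (elephant:make-btree))))
;;     (elephant:add-to-root "imported-data" bt)
;;     (batch-import *pairs* bt :chunk-size 10000))

Even at 10,000 writes per transaction the lock table only has to cover one chunk at a time, rather than the whole import.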

I hope this email is at least amusing!

Thanks again,
red daly