On 6/21/2011 7:23 AM, David F. Skoll wrote:
On Tue, 21 Jun 2011 07:06:11 -0700
Marc Perkel <supp...@junkemailfilter.com> wrote:

Trying to get MySQL Bayes working in a high-volume environment.
Dedicated MySQL server with SSD drives. Can someone send me a sample
my.cnf file and make other suggestions to keep it running without
database corruption and other MySQL "features"? Or - should I be
using some other DB?
We've tried various ways of storing Bayes data (we have our own Bayes
implementation, so this discussion may not correspond exactly with the
SA implementation.)  After trying Berkeley DB files and PostgreSQL---we
would never use MySQL for any data we care about---we finally settled
on Dan Bernstein's CDB format.  It has by far the best performance.
See: http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/
Take a look at the "Random Reads" timings.  CDB is 6 times faster than
Berkeley DB!

CDB is read-only, which means when you want to do Bayes training, you
have to rewrite the entire database.  This is not an issue for our
system because of how we do Bayes training, but it may be an issue
with the standard sa-learn.
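
To illustrate what "rewrite the entire database" can look like in practice, here is a minimal sketch of the rebuild-and-swap pattern. It is not the implementation described above and it does not use a real CDB library: JSON stands in for the CDB writer, and the names rebuild_token_db, load_token_db, and the token-count layout are hypothetical. The idea is only that readers keep using the old file until a complete new file replaces it in one step.

```python
import json
import os
import tempfile


def rebuild_token_db(path, token_counts):
    """Write a complete new database file, then atomically swap it in.

    JSON is only a stand-in for a real constant-database writer.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".bayes-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(token_counts, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_token_db(path):
    """Read the whole (read-only) database back into memory."""
    with open(path) as f:
        return json.load(f)
```

With this approach every training run pays the cost of writing the full file, which is why it suits batch training better than per-message updates.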

Regards,

David.




Thanks, David, but I need real-time updating, and the data is spread across multiple servers, so I need PostgreSQL or MySQL.


--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400
