Hi,

If you have a big table, you could also consider Hadoop/HBase or Cassandra, but do not put a large data set in MySQL. I agree with Bill: despite the fact that "lots of people have been able to make it (MySQL) work" (I am another example), there are real issues with it. I have used MySQL for a number of years to handle large databases with large numbers of users, and MySQL was the bottleneck; when running table joins over a large data set, CPU and I/O load shot up.
If you switch to PostgreSQL, 9.1.x is a very good choice for production deployment.

Thanks,
Tony

P.S. Today I ran some stress tests on my PostgreSQL staging server: a) insert 2 billion records into a test table; b) full-scan the table. Here are the results (a sketch for reproducing a similar test is at the end of this message):

Facts:
  Number of records: 2 billion, inserted today
  Full table scan:   about 16.76 minutes to scan 2 billion rows, really AMAZING!
  Database size:     109 GB
  PostgreSQL:        9.2.1
  Physical RAM:      8 GB
  CPU:               i5

########
EXPLAIN ANALYZE SELECT COUNT(*) FROM test;

                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=33849559.60..33849559.61 rows=1 width=0) (actual time=1006476.308..1006476.309 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..28849559.28 rows=2000000128 width=0) (actual time=47.147..903264.427 rows=2000000000 loops=1)
 Total runtime: 1006507.963 ms

On 11 Dec 2012, at 8:27 PM, Bill Moran wrote:

> On Mon, 10 Dec 2012 15:26:02 -0500 (EST), "Mihai Popa" <mi...@lattica.com>
> wrote:
>
>> Hi,
>>
>> I've recently inherited a project that involves importing a large set of
>> Access mdb files into a Postgres or MySQL database.
>> The process is to export the mdb files to comma-separated files, then
>> import those into the final database.
>> We are now at the point where the csv files are all created and amount
>> to some 300 GB of data.
>>
>> I would like to get some advice on the best deployment option.
>>
>> First, the project was started using MySQL. Is it worth switching
>> to Postgres, and if so, which version should I use?
>
> I've been managing a few large databases this year, on both PostgreSQL and
> MySQL.
>
> Don't put your data in MySQL. Ever. If you feel like you need to use
> something like MySQL, just go straight to a system that was designed with
> no constraints right off the bat, like Mongo or something.
>
> Don't put large amounts of data in MySQL. There are lots of issues with it.
> Despite the fact that lots of people have been able to make it work (me,
> for example), it's a LOT harder to keep running well than it is on
> PostgreSQL. MySQL just isn't designed to deal with large data. Some
> examples: the lack of CREATE INDEX CONCURRENTLY, the fact that the default
> configuration stores everything in a single file, the fact that any table
> change (including simple things like adding a comment, or seemingly
> unrelated things like adding an index) requires a complete table rebuild,
> and the fact that if you use anything other than INT AUTO_INCREMENT for
> your primary key, you're liable to hit awful inefficiencies.
>
> PostgreSQL has none of these problems.
>
> --
> Bill Moran <wmo...@potentialtech.com>
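
P.P.S. For anyone who wants to reproduce a similar load test, here is a minimal sketch. I did not post my actual table definition, so the single bigint column and the table name "test" below are just illustrative; adjust to your schema. The load is split into batches so each generate_series() result set stays a manageable size:

  -- Hypothetical schema; the real test table may look different.
  CREATE TABLE test (id bigint);

  -- Load 2 billion rows in 20 batches of 100 million each.
  DO $$
  BEGIN
    FOR i IN 0..19 LOOP
      INSERT INTO test
      SELECT g
      FROM generate_series(i::bigint * 100000000 + 1,
                           (i + 1)::bigint * 100000000) AS g;
    END LOOP;
  END $$;

  -- Time the full scan.
  EXPLAIN ANALYZE SELECT COUNT(*) FROM test;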
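
And to illustrate Bill's point about CREATE INDEX CONCURRENTLY: PostgreSQL can build an index without blocking concurrent writes to the table. The names below are again hypothetical, and note that this statement cannot run inside a transaction block:

  -- Builds the index while normal INSERT/UPDATE/DELETE activity continues;
  -- a plain CREATE INDEX would block writes for the duration of the build.
  CREATE INDEX CONCURRENTLY test_id_idx ON test (id);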