Hi, 

If you have very big tables you could also consider Hadoop/HBase or Cassandra, 
but do not put a large data set in MySQL. I agree with Bill that "despite the 
fact that lots of people have been able to make it (MySQL) work" (me too, 
another example), there are issues with it. I have used MySQL for a number of 
years to handle large DBs with large numbers of users, and MySQL was the 
bottleneck: when running table joins over large data sets, the CPU and I/O 
load shot up.

If you switch to PostgreSQL, 9.1.x is a very good choice for production 
deployment.
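
For loading your 300 GB of CSV files, COPY is usually the fastest path. A 
minimal sketch (the table name, columns, and file path below are made up; 
match them to your actual CSV layout):

  -- Hypothetical target table; adjust columns/types to the real CSVs.
  CREATE TABLE mdb_import (
      id    integer,
      name  text,
      value numeric
  );

  -- Server-side bulk load; FORMAT csv handles quoting and escaping.
  COPY mdb_import FROM '/data/csv/part001.csv' WITH (FORMAT csv, HEADER true);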

Thanks
Tony


P.S.  Today I ran some stress tests on my PostgreSQL staging server: a) insert 
2 billion records into a test table, b) fully scan the table. Here are some 
test results:  

Facts:  
Number of records: 2 billion records inserted today
Full table scan: about 16.76 minutes to scan 2 billion rows, really AMAZING!  
Database size: 109GB
PostgreSQL: 9.2.1
Physical RAM: 8GB
CPU: i5 
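
Roughly, the load looked like this (a simplified sketch, not the exact script 
I used):

  -- Sketch: one-column test table, loaded in 100M-row batches so a
  -- single transaction never has to cover all 2 billion rows.
  CREATE TABLE test (id bigint);

  INSERT INTO test
  SELECT g FROM generate_series(1, 100000000) AS g;
  -- ...repeat with shifted ranges until 2 billion rows are loaded.

  -- Table and database sizes can be checked with the standard size functions:
  SELECT pg_size_pretty(pg_total_relation_size('test'));
  SELECT pg_size_pretty(pg_database_size(current_database()));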


########

EXPLAIN ANALYZE SELECT COUNT(*) FROM test;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=33849559.60..33849559.61 rows=1 width=0) (actual time=1006476.308..1006476.309 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..28849559.28 rows=2000000128 width=0) (actual time=47.147..903264.427 rows=2000000000 loops=1)
 Total runtime: 1006507.963 ms
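
Note that COUNT(*) here is a full sequential scan. On 9.2, an index-only scan 
can sometimes avoid reading the whole heap; a sketch (the index name is 
arbitrary):

  -- Build the index without blocking writes, then VACUUM so the
  -- visibility map lets the planner use an index-only scan.
  CREATE INDEX CONCURRENTLY test_id_idx ON test (id);
  VACUUM test;
  EXPLAIN ANALYZE SELECT COUNT(*) FROM test;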





On 11 Dec 2012, at 8:27 PM, Bill Moran wrote:

> On Mon, 10 Dec 2012 15:26:02 -0500 (EST) "Mihai Popa" <mi...@lattica.com> 
> wrote:
> 
>> Hi,
>> 
>> I've recently inherited a project that involves importing a large set of
>> Access mdb files into a Postgres or MySQL database.
>> The process is to export the mdb's to comma separated files than import
>> those into the final database.
>> We are now at the point where the csv files are all created and amount
>> to some 300 GB of data.
>> 
>> I would like to get some advice on the best deployment option.
>> 
>> First, the project has been started using MySQL. Is it worth switching
>> to Postgres and if so, which version should I use?
> 
> I've been managing a few large databases this year, on both PostgreSQL and
> MySQL.
> 
> Don't put your data in MySQL.  Ever.  If you feel like you need to use
> something like MySQL, just go straight to a system that was designed with
> no constraints right off the bat, like Mongo or something.
> 
> Don't put large amounts of data in MySQL.  There are lots of issues with it.
> Despite the fact that lots of people have been able to make it work (me,
> for example) it's a LOT harder to keep running well than it is on
> PostgreSQL.  MySQL just isn't designed to deal with large data.  As some
> examples: lack of CREATE INDEX CONCURRENTLY, the fact that the default
> configuration stores everything in a single file, the fact that any table
> changes (including simple things like adding a comment, or seemingly
> unrelated things like adding an index) require a complete table rebuild,
> and the fact that if you use anything other than INT AUTO_INCREMENT for
> your primary key you're liable to hit on awful inefficiencies.
> 
> PostgreSQL has none of these problems.
> 
> -- 
> Bill Moran <wmo...@potentialtech.com>
> 
> 
