Planning a migration from PostgreSQL to Hadoop/Hive

Marcos Ortiz Wed, 04 May 2011 13:24:55 -0700

We are planning a migration from a large PostgreSQL-based DWH to Hadoop/Hive. The principal reason for this migration is the massive growth of the data to analyze (5.6 TB and growing): PostgreSQL, as an MVCC-based RDBMS, has its pitfalls with heavy updates and with query execution over large quantities of data. (We have done a lot of query tuning and server optimization, with only a minor effect on query latency.)

So we have looked at Hadoop and run some tests combined with Hive and HBase, and the performance we obtained is awesome.


Can you give us some advice on developing a good plan for this?

Environment:
- OS: CentOS 5.5, 64-bit
- Java version: 1.6, Update 20
- Hardware: 8 nodes - AMD Opteron Quad-Core 4130
                      8 GB RAM
                      1 TB HDD

Regards

--
Marcos Luís Ortíz Valmaseda
 Software Engineer (Large-Scaled Distributed Systems)
 University of Information Sciences,
 La Habana, Cuba
 Linux User # 418229
 http://about.me/marcosortiz
