On 05/04/2011 04:14 PM, Alexandre "TAZ" dos Santos Andrade wrote:
Hi Marcos,

I'm doing exactly the same migration. First of all, keep in mind that Hive runs a MapReduce job for every query whose result you don't write to a table. Second, migrating the data is a little annoying: there's no direct connector, so I used a simple dump, stripped the header and footer, and loaded it into the Hive structure.
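
A minimal sketch of the dump-cleaning step described above. It assumes the dump is psql-style text output with a two-line header (column names plus a separator row) and a one-line footer such as "(N rows)". The function name, the line counts, and the sample data are hypothetical and not taken from the actual migration.

```python
import io


def strip_header_footer(lines, header_lines=2, footer_lines=1):
    """Return only the data rows of a text dump.

    Assumption: the dump starts with `header_lines` lines of header and
    ends with `footer_lines` lines of footer, possibly followed by blank
    lines. Adjust both counts to match the real dump format.
    """
    rows = [line.rstrip("\n") for line in lines]
    # Drop trailing blank lines so the footer count applies to real text.
    while rows and not rows[-1].strip():
        rows.pop()
    end = len(rows) - footer_lines
    # max() keeps the slice empty when the input is shorter than expected.
    return rows[header_lines:max(end, header_lines)]


# Hypothetical sample dump, pipe-delimited.
sample = """id|name
--+----
1|alice
2|bob
(2 rows)

"""

cleaned = strip_header_footer(io.StringIO(sample))
for row in cleaned:
    print(row)
```

The cleaned rows can then be written to a file and loaded into a Hive table whose field delimiter matches the dump (here "|"), for example with Hive's LOAD DATA statement.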

I hope this helps.

Alexandre dos Santos Andrade

2011/5/4 Marcos Ortiz <mlor...@uci.cu>

    We are planning a migration from a large PostgreSQL-based DWH to
    Hadoop/Hive. The main reason for this migration is the massive
    growth of the data to analyze (5.6 TB and growing), where
    PostgreSQL, as an MVCC-based RDBMS, has its pitfalls with heavy
    updates and with query execution over large quantities of data.
    (We have done a lot of query tuning and server optimization, with
    only a minor effect on query latency.)

    So we have looked at Hadoop and run some tests combined with Hive
    and HBase, and the performance we obtained is awesome.

    Can you give us some advice on developing a good plan for this?

    Environment:
    - OS: CentOS 5.5, 64-bit
    - Java version: 1.6 Update 20
    - Hardware: 8 nodes - AMD Opteron quad-core 4130, 8 GB RAM, 1 TB HDD

    Regards

-- Marcos Luís Ortíz Valmaseda
     Software Engineer (Large-Scaled Distributed Systems)
     University of Information Sciences,
     La Habana, Cuba
     Linux User # 418229
    http://about.me/marcosortiz




Thanks a lot, Alexandre.
Did you use Sqoop to load the data from PostgreSQL to Hive?



--
Marcos Luís Ortíz Valmaseda
 Software Engineer (Large-Scaled Distributed Systems)
 University of Information Sciences,
 La Habana, Cuba
 Linux User # 418229
 http://about.me/marcosortiz
