I had trouble with Sqoop, so here's what I do (Perl):

    $cmd = qq#echo "select * from $tableName where $dateColumn >= '$dayStart 00:00:00' and $dateColumn < '$dayEnd 00:00:00'" \\
    | mysql -h $dwIP --quick -B --skip-column-names --user=$USER --password=$PASS $databaseName \\
    | ssh hdfs\@$hadoopIP "cat | hadoop fs -put - /user/hive/warehouse/$tableName/$dayStart/datafile"#;
    system ($cmd);
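
One caveat: a plain shell reports only the exit status of the last stage of
a pipeline, so a failed mysql can go unnoticed while an empty file still
lands in HDFS. Here is a minimal sketch of a stricter wrapper, assuming bash
is installed on the machine running the script. The helper name run_pipeline
is hypothetical and not part of the snippet above.

```perl
use strict;
use warnings;

# Run a shell pipeline under bash with pipefail enabled, so that a
# failure in any stage (not only the last) gives a non-zero status.
# Returns true only if every stage exited with status 0.
# Assumption: bash is on the PATH.
sub run_pipeline {
    my ($cmd) = @_;
    my $rc = system('bash', '-c', "set -o pipefail; $cmd");
    return $rc == 0;
}

# Demonstration with harmless commands: the first stage fails in the
# first pipeline and succeeds in the second.
print run_pipeline('false | cat') ? "ok\n" : "failed\n";
print run_pipeline('true | cat')  ? "ok\n" : "failed\n";
```

In the import script this would replace the bare system call, for example
run_pipeline($cmd) or die "import failed: $?".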

Then, when I create my Hive table, I point each partition at
/user/hive/warehouse/$tableName/$dayStart, et voilà! Everything works just
fine and I have full control over the import.
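
A minimal sketch of that partition step, assuming the table is partitioned
by a single string column (called dt here, which is a placeholder) and was
declared with tab-delimited fields, since mysql -B writes tab-separated
rows. The helper name add_partition_sql and the sample table name orders
are hypothetical.

```perl
use strict;
use warnings;

# Build the HiveQL that attaches one day's directory as a partition.
# The partition column name is an assumption; use whatever column the
# table was declared with in PARTITIONED BY.
sub add_partition_sql {
    my ($tableName, $dayStart, $partitionColumn) = @_;
    my $location = "/user/hive/warehouse/$tableName/$dayStart";
    return "ALTER TABLE $tableName ADD IF NOT EXISTS "
         . "PARTITION ($partitionColumn='$dayStart') "
         . "LOCATION '$location'";
}

my $sql = add_partition_sql('orders', '2011-05-06', 'dt');
print "$sql\n";

# To apply it, hand the statement to the Hive CLI, for example:
# system('hive', '-e', $sql) == 0 or die "hive failed: $?";
```

Running this once per day after the upload makes the new directory visible
to queries without moving any data.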


On Fri, May 6, 2011 at 3:57 PM, bichonfrise74 <[email protected]> wrote:

> Thanks Zoltan. I will check it out.
>
>
> 2011/5/6 Zoltan Prekopcsak <[email protected]>
>
>>  Hi,
>>
>> I think this is what Sqoop is for:
>> http://www.cloudera.com/blog/2009/06/introducing-sqoop/
>> http://www.cloudera.com/downloads/sqoop/
>>
>> Best, Zoltan
>>
>>
>> On 5/6/11 7:45 PM, bichonfrise74 wrote:
>>
>>
>>> Hi,
>>>
>>> I want to be able to extract data from MySQL to Hadoop without writing
>>> the data to disk. I was thinking along the lines of piping the extract
>>> and loading it into Hadoop.
>>>
>>> Something like this:
>>>
>>> mysql <extract_query> | hive -e 'load data <via_pipe> into table ...
>>>
>>> Has anyone done this before?
>>>
>>>
>>
>


-- 
Tim
