Re: Mysql - Hive Sync

Stephen Sprague Sat, 06 Sep 2014 06:54:55 -0700

interesting. thanks Muthu.

a colleague of mine pointed out this one too, LinkedIn's Databus (
https://github.com/linkedin/databus/wiki). this one looks extremely
heavyweight and again i'm not sure it's worth the headache.


i like the idea of a trigger on the mysql table and then broadcasting the
data to another app via udp message.

cf. https://code.google.com/p/mysql-message-api/

the thing is you'll need to batch the records over, say, 5 minutes (or
whatever), then write each batch as one file to hdfs.

This seems infinitely simpler and more maintainable to me. :)
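
A minimal sketch of the batching step, to make the idea concrete. Everything named here is an assumption for illustration: the `Batcher` class, the `serve` loop, the port 9999, and the one-record-per-datagram text format are hypothetical and are not taken from mysql-message-api. It writes each batch to a local file; copying that file into HDFS is left out.

```python
import os
import socket
import time


class Batcher:
    """Buffer records and write each time window out as a single file."""

    def __init__(self, out_dir, interval_s=300, clock=time.time):
        self.out_dir = out_dir
        self.interval_s = interval_s  # 300 s = the "5 minutes (or whatever)"
        self.clock = clock            # injectable so the window can be tested
        self.records = []
        self.window_start = clock()
        self.seq = 0

    def add(self, record):
        """Buffer one record; return the batch file path if a flush happened."""
        self.records.append(record)
        return self.maybe_flush()

    def maybe_flush(self):
        """Flush only if the current window has run for interval_s seconds."""
        if self.clock() - self.window_start < self.interval_s:
            return None
        return self.flush()

    def flush(self):
        """Write buffered records as one newline-delimited file."""
        now = self.clock()
        self.window_start = now
        if not self.records:
            return None
        self.seq += 1
        name = "batch-%d-%04d.txt" % (int(now), self.seq)
        path = os.path.join(self.out_dir, name)
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            for record in self.records:
                f.write(record + "\n")
        os.rename(tmp, path)  # readers never see a half-written batch
        self.records = []
        return path


def serve(out_dir, host="127.0.0.1", port=9999):
    """Receive one record per UDP datagram (assumed format) and batch them."""
    os.makedirs(out_dir, exist_ok=True)
    batcher = Batcher(out_dir)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(1.0)  # wake up regularly so idle windows still flush
    while True:
        try:
            data, _addr = sock.recvfrom(65535)
            path = batcher.add(data.decode("utf-8", "replace").rstrip("\n"))
        except socket.timeout:
            path = batcher.maybe_flush()
        if path:
            print("batch ready:", path)  # hand off to the HDFS upload here
```

Each finished file could then be copied into the table's directory, for example with `hdfs dfs -put`. Note that udp gives no delivery guarantee, so a sketch like this can silently drop records.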




On Fri, Sep 5, 2014 at 11:53 PM, Muthu Pandi <muthu1...@gmail.com> wrote:

> Yeah, installing the MySQL Hadoop Applier took a lot of time when building
> and installing GCC 4.6. It's working, but it's not serving the exact purpose,
> so now I am trying my own Python scripting.
>
> The idea is to read the insert queries from the binlog, save them under the
> Hive warehouse as a table, and query from there.
>
>
>
> *Regards Muthupandi.K*
>
>
>
>
> On Sat, Sep 6, 2014 at 4:47 AM, Stephen Sprague <sprag...@gmail.com>
> wrote:
>
>> great find, Muthu.  I would be interested in hearing about any
>> successes or failures using this adapter. almost sounds too good to be true.
>>
>> After reading the blog (
>> http://innovating-technology.blogspot.com/2013/04/mysql-hadoop-applier-part-2.html)
>> about it i see it comes with caveats and it looks a little rough around the
>> edges for installing.  Not sure i'd bet the farm on this product but YMMV.
>>
>> Anyway, curious to know how it works out for you.
>>
>>
>>
>> On Tue, Sep 2, 2014 at 11:03 PM, Muthu Pandi <muthu1...@gmail.com> wrote:
>>
>>> This can't be done since insert, update, and delete are not supported in Hive.
>>>
>>> The MySQL Applier for Hadoop package serves the same purpose as the
>>> prototype tool which I intended to develop.
>>>
>>> link for "Mysql Applier for Hadoop"
>>> http://dev.mysql.com/tech-resources/articles/mysql-hadoop-applier.html
>>>
>>>
>>>
>>> *Regards Muthupandi.K*
>>>
>>>
>>>
>>>
>>> On Wed, Sep 3, 2014 at 10:35 AM, Muthu Pandi <muthu1...@gmail.com>
>>> wrote:
>>>
>>>> Yeah, but we can't make it work in near real time. Also, my table
>>>> doesn't have a column like 'ID' to use for --check-column, which is why
>>>> I opted out of sqoop.
>>>>
>>>>
>>>>
>>>> *Regards Muthupandi.K*
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Sep 3, 2014 at 10:28 AM, Nitin Pawar <nitinpawar...@gmail.com>
>>>> wrote:
>>>>
>>>>> have you looked at sqoop?
>>>>>
>>>>>
>>>>> On Wed, Sep 3, 2014 at 10:15 AM, Muthu Pandi <muthu1...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear All
>>>>>>
>>>>>>      I am developing a prototype for syncing tables from MySQL to Hive
>>>>>> using Python and JDBC. Is it a good idea to use JDBC for this purpose?
>>>>>>
>>>>>> My use case is generating sales reports in Hive from data
>>>>>> pulled from MySQL by the prototype tool. My data will be around
>>>>>> 2GB/day.
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Regards Muthupandi.K*
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nitin Pawar
>>>>>
>>>>
>>>>
>>>
>>
>
