I created a wiki page that lists all the MySQL replication options that people posted, plus a couple of others. People may or may not find it useful.
https://github.com/wushujames/mysql-cdc-projects/wiki

I wasn't sure where to host it, so I put it up on a GitHub wiki.

-James

On Mar 17, 2015, at 11:09 PM, Xiao <lixiao1...@gmail.com> wrote:

> LinkedIn's Gobblin compaction tool is using Hive to perform the compaction.
> Does that mean Lumos has been replaced?
>
> Confused…
>
> On Mar 17, 2015, at 10:00 PM, Xiao <lixiao1...@gmail.com> wrote:
>
>> Hi, all,
>>
>> Do you know whether LinkedIn plans to open source Lumos in the near future?
>>
>> I found the answer in Lin Qiao's post about replication from Oracle/MySQL
>> to Hadoop:
>>
>> - https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease
>>
>> On the source side, it can be Databus-based or file-based.
>>
>> On the target side, Lumos rebuilds the snapshots, since Hadoop cannot do
>> in-place updates/deletes.
>>
>> The slides about Lumos:
>> http://www.slideshare.net/Hadoop_Summit/th-220p230-cramachandranv1
>> The talk about Lumos:
>> https://www.youtube.com/watch?v=AGlRjlrNDYk
>>
>> Event publishing is different from database replication. Kafka is used for
>> change publishing, and maybe also for sending changes (recorded in files).
>>
>> Thanks,
>>
>> Xiao Li
>>
>> On Mar 17, 2015, at 7:26 PM, Arya Ketan <ketan.a...@gmail.com> wrote:
>>
>>> AFAIK, LinkedIn uses Databus to do the same. Aesop is built on top of
>>> Databus, extending its beautiful capabilities to MySQL and HBase.
>>>
>>> On Mar 18, 2015 7:37 AM, "Xiao" <lixiao1...@gmail.com> wrote:
>>>
>>>> Hi, all,
>>>>
>>>> Do you know how the LinkedIn team publishes changed rows in Oracle to
>>>> Kafka? I believe they already know the whole problem very well.
>>>>
>>>> Using triggers? Directly parsing the log? Or using some Oracle
>>>> GoldenGate interfaces?
>>>>
>>>> Any lessons, or any standard message format? Could the LinkedIn people
>>>> share them with us? I believe it could help us a lot.
>>>>
>>>> Thanks,
>>>>
>>>> Xiao Li
>>>>
>>>> On Mar 17, 2015, at 12:26 PM, James Cheng <jch...@tivo.com> wrote:
>>>>
>>>>> This is a great set of projects!
>>>>>
>>>>> We should put this list of projects on a site somewhere so people can
>>>>> more easily see and refer to it. These aren't Kafka-specific, but most
>>>>> seem to be "MySQL CDC." Does anyone have a place where they can host a
>>>>> page? Preferably a wiki, so we can keep it up to date easily.
>>>>>
>>>>> -James
>>>>>
>>>>> On Mar 17, 2015, at 8:21 AM, Hisham Mardam-Bey
>>>>> <hisham.mardam...@gmail.com> wrote:
>>>>>
>>>>>> Pretty much a hijack / plug as well (=
>>>>>>
>>>>>> https://github.com/mardambey/mypipe
>>>>>>
>>>>>> "MySQL binary log consumer with the ability to act on changed rows and
>>>>>> publish changes to different systems with emphasis on Apache Kafka."
>>>>>>
>>>>>> Mypipe currently encodes events using Avro before pushing them into
>>>>>> Kafka, and is Avro schema repository aware. The project is young, and
>>>>>> patches for improvements are appreciated (=
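For anyone who wants to see that pattern concretely: below is a minimal sketch
in Java of the "row change, Avro-encode, publish to Kafka" flow mypipe
describes. It is not mypipe's actual code; the RowChange schema, the per-table
topic naming, and the field names are invented for illustration, and it assumes
only the standard Avro GenericRecord API and the Kafka Java producer.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroChangePublisher {

    // Illustrative schema for a single changed row. A real tool derives a
    // schema per table and registers it in a schema repository, as mypipe does.
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"RowChange\",\"fields\":["
        + "{\"name\":\"database\",\"type\":\"string\"},"
        + "{\"name\":\"table\",\"type\":\"string\"},"
        + "{\"name\":\"op\",\"type\":\"string\"},"
        + "{\"name\":\"rowJson\",\"type\":\"string\"}]}");

    private final KafkaProducer<byte[], byte[]> producer;

    public AvroChangePublisher(String brokers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokers);
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        producer = new KafkaProducer<>(props);
    }

    // Avro-encode one row change ("insert", "update", or "delete") and
    // publish it to a per-table topic, keyed by table name so all changes
    // to one table land in the same partition and stay ordered.
    public void publish(String db, String table, String op, String rowJson)
            throws IOException {
        GenericRecord rec = new GenericData.Record(SCHEMA);
        rec.put("database", db);
        rec.put("table", table);
        rec.put("op", op);
        rec.put("rowJson", rowJson);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(rec, enc);
        enc.flush();

        producer.send(new ProducerRecord<>(db + "." + table,
            table.getBytes(StandardCharsets.UTF_8), out.toByteArray()));
    }
}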
>>>>>> On Mon, Mar 16, 2015 at 10:35 PM, Arya Ketan <ketan.a...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Great work.
>>>>>>> Sorry for kinda hijacking this thread, but I thought I'd mention that
>>>>>>> we had built something for MySQL binlog event propagation and wanted
>>>>>>> to share it. You guys can also look into Aesop
>>>>>>> (https://github.com/Flipkart/aesop). It's a change propagation
>>>>>>> framework. It has relays which listen to the binlogs of MySQL and
>>>>>>> keep track of SCNs, and it has consumers which can then transform/map
>>>>>>> (or interpret as-is) the binlog event to a destination. Consumers
>>>>>>> also keep track of SCNs, and a slow consumer can go back to a
>>>>>>> previous SCN if it wants to re-listen to events (similar to Kafka's
>>>>>>> consumer model).
>>>>>>>
>>>>>>> All the producers/consumers are extensible, and you can write your
>>>>>>> own custom consumer and feed the data to it.
>>>>>>>
>>>>>>> Common use-cases:
>>>>>>> a) Archive MySQL-based data into, say, HBase
>>>>>>> b) Move MySQL-based data to, say, a search store for serving reads
>>>>>>>
>>>>>>> It has a decent (not an awesome :) ) console too, which gives a nice
>>>>>>> human-readable view of where the producers and consumers are.
>>>>>>>
>>>>>>> Currently supported producers are MySQL binlogs and HBase WAL edits.
>>>>>>>
>>>>>>> Further insights/reviews/feature requests/pull requests/advice are
>>>>>>> all welcome.
>>>>>>>
>>>>>>> --
>>>>>>> Arya
>>>>>>>
>>>>>>> On Tue, Mar 17, 2015 at 1:48 AM, Gwen Shapira <gshap...@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Really really nice!
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> On Mon, Mar 16, 2015 at 7:18 AM, Pierre-Yves Ritschard
>>>>>>>> <p...@spootnik.org> wrote:
>>>>>>>>
>>>>>>>>> Hi kafka,
>>>>>>>>>
>>>>>>>>> I just wanted to mention I published a very simple project which
>>>>>>>>> can connect as a MySQL replication client and stream replication
>>>>>>>>> events to Kafka: https://github.com/pyr/sqlstream
>>>>>>>>>
>>>>>>>>> When you don't have control over an application, it can provide a
>>>>>>>>> simple way of consolidating SQL data in Kafka.
>>>>>>>>>
>>>>>>>>> This is an early release and there are a few caveats (mentioned in
>>>>>>>>> the README): mostly the poor partitioning, which I'm going to
>>>>>>>>> evolve quickly, and the reconnection strategy, which doesn't try
>>>>>>>>> to keep track of the binlog position. Other than that, it should
>>>>>>>>> work as advertised.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> - pyr
>>>>>>
>>>>>> --
>>>>>> Hisham Mardam-Bey
>>>>>> http://hisham.cc/
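The approach sqlstream takes (connect to MySQL as if you were a replica, then
forward the binlog event stream to Kafka) looks roughly like the following in
Java; sqlstream itself is written in Clojure, so this is a sketch of the
technique rather than sqlstream's code. It assumes the open-source
mysql-binlog-connector-java library and the Kafka Java producer; the host,
credentials, and topic name are placeholders. Like the early sqlstream release
above, it publishes to a single unkeyed topic (the "poor partitioning" caveat)
and does not persist the binlog position across reconnects.

import java.io.Serializable;
import java.util.Arrays;
import java.util.Properties;

import com.github.shyiko.mysql.binlog.BinaryLogClient;
import com.github.shyiko.mysql.binlog.event.Event;
import com.github.shyiko.mysql.binlog.event.EventType;
import com.github.shyiko.mysql.binlog.event.WriteRowsEventData;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BinlogToKafka {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        final KafkaProducer<String, String> producer =
            new KafkaProducer<>(props);

        // Connect as a replication client: MySQL streams its binlog to us
        // exactly as it would to a replica.
        BinaryLogClient client =
            new BinaryLogClient("localhost", 3306, "repl_user", "repl_pass");
        client.registerEventListener(new BinaryLogClient.EventListener() {
            @Override
            public void onEvent(Event event) {
                EventType type = event.getHeader().getEventType();
                // Inserts only, for brevity; updates and deletes arrive as
                // UPDATE_ROWS / DELETE_ROWS events and are handled similarly.
                if (type == EventType.WRITE_ROWS
                        || type == EventType.EXT_WRITE_ROWS) {
                    WriteRowsEventData data = event.getData();
                    for (Serializable[] row : data.getRows()) {
                        // Single unkeyed topic: simple, but ordering is only
                        // guaranteed within a partition.
                        producer.send(new ProducerRecord<String, String>(
                            "mysql.changes", Arrays.toString(row)));
                    }
                }
            }
        });
        client.connect(); // blocks and consumes the binlog stream
    }
}

A fuller implementation would persist the client's current binlog filename and
position so a restart can resume where it left off, and would key records by
table (as in the earlier Avro sketch) to get sane partitioning, which addresses
the two caveats pyr mentions.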